Wednesday, May 13, 2015

What is a Data Scientist

Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we tweeted with Dr. Michael Wu, the Chief Scientist at Lithium, where he applies data-driven methodologies to investigate the complex dynamics of the social web.

Michael works with big data and has developed many predictive and prescriptive social analytics with actionable insights. His R&D won him the recognition as a 2010 Influential Leader by CRM Magazine.

You can see all tweets and resources here:
http://www.experian.com/blogs/news/about/data-scientists/
Published in: Data & Analytics


 Transcript

  • 1. #DataTalk: What is a Data Scientist? LIVE TWEETCHAT FEATURING: Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
  • 2. Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we tweeted with Dr. Michael Wu, the Chief Scientist at Lithium, where he applies data-driven methodologies to investigate the complex dynamics of the social web. Check out all tweets from this Twitter chat: ex.pn/scientist
  • 3. What type of work does a data scientist do?
  • 4. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “A data scientist’s work includes everything from data infrastructure (capture, store, process) to data service (retrieval).” #DataTalk ex.pn/datatalk
  • 5. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “A data scientist converts data into business intelligence.” #DataTalk ex.pn/datatalk
  • 6. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “A data scientist’s work includes: decision science, business intelligence, customer analytics, marketing analytics, fraud, security, etc.” #DataTalk ex.pn/datatalk
  • 7. What are attributes of a good data scientist?
  • 8. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “To be a data scientist, you need the technical expertise in computer science, statistics, and knowledge/experience with large data sets.” #DataTalk ex.pn/datatalk
  • 9. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Data scientists should have good intuition, strong coding capability, solid training in statistics & machine learning.” #DataTalk ex.pn/datatalk
  • 10. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Educational background for data scientists can be computational genomics, astrophysicists, fluid dynamics, chemistry, biophysics (like me) ...” #DataTalk ex.pn/datatalk
  • 11. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “To be a good data scientist, you need more than tech expertise. You must be a good communicator to explain complex data/analysis.” #DataTalk ex.pn/datatalk
  • 12. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Good data scientists also need to be passionate about data. I’d highly value curiosity, creativity and perseverance when hiring one.” #DataTalk ex.pn/datatalk
  • 13. What kinds of companies have (or need) data scientists?
  • 14. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Any company that already is invested in modern big data infrastructure will need data scientists to crunch the data.” #DataTalk ex.pn/datatalk
  • 15. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “All companies need to have data scientists to stay competitive.” #DataTalk ex.pn/datatalk
  • 16. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “All businesses use data, and data will grow so big that our brain and databases eventually can’t handle … they all need data scientists eventually.” #DataTalk ex.pn/datatalk
  • 17. What types of teams do data scientists work with?
  • 18. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “It all depends how the data organization is structured within the enterprise: independent team, hub & spoke, or silo in dept.” #DataTalk ex.pn/datatalk
  • 19. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Data scientists can work in R&D, product development and support business operations.” #DataTalk ex.pn/datatalk
  • 20. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “SMBs use hub and spoke data teams where they report to different departments, but collaborate and work together, so data expertise is shared.” #DataTalk ex.pn/datatalk
  • 21. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Large companies typically have entire teams of data scientists within each department and they usually don’t collaborate.” #DataTalk ex.pn/datatalk
  • 22. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Personally, I work internally with engineering, product, marketing, best practice, service, consulting, strategy, even sales and human resources.” #DataTalk ex.pn/datatalk
  • 23. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “I process data, build models, engage with clients, and facilitate collaboration among Experian Data Labs.” #DataTalk ex.pn/datatalk
  • 24. What are some big challenges that data scientists face?
  • 25. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “One of the biggest challenges for data scientists is communication. Many data scientists speak tech & stats, but they don’t speak business.” #DataTalk ex.pn/datatalk
  • 26. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Challenges for data scientists: data governance and what data can be used for” #DataTalk ex.pn/datatalk
  • 27. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Other challenges for data scientists: Data access, data integration, and motivation.” #DataTalk ex.pn/datatalk
  • 28. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “If a company is starting a data science initiative, their data scientist may not have access to all data due to security and compliance.” #DataTalk ex.pn/datatalk
  • 29. Is there an art and science to working with big data?
  • 30. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Absolutely! Good intuition and domain knowledge are the keys for successful big data projects.” #DataTalk ex.pn/datatalk
  • 31. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “There’s definitely science to working with big data… there are rigorous stats and implementation details you learn from statistics and computer science.” #DataTalk ex.pn/datatalk
  • 32. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “There’s also art in working with big data, and this only comes with years of experience on working with big data.” #DataTalk ex.pn/datatalk
  • 33. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Picking a good problem is sort of an art, choosing right features from an infinite number of features, too (feature engineering).” #DataTalk ex.pn/datatalk
  • 34. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Exploratory data analysis (EDA): getting a feel or a hunch for how the data behaves is definitely an art.” #DataTalk ex.pn/datatalk
  • 35. How can data scientists make a big impact in their business?
  • 36. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “First, data scientists need to learn about the business, so they have the context to interpret the data and result of models/analyses.” #DataTalk ex.pn/datatalk
  • 37. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Second, they need to pick a good problem: the most impactful problem that can be addressed with data they have access to.” #DataTalk ex.pn/datatalk
  • 38. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Third, they must communicate effectively and make businesses understand the analysis and business implication of the insight they found.” #DataTalk ex.pn/datatalk
  • 39. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “To be impactful, data scientists need to keep an open mind and concentrate efforts on most impactful problems.” #DataTalk ex.pn/datatalk
  • 40. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “A good data scientist must be a good communicator, storyteller, teacher, etc. who can simplify complex data science for business.” #DataTalk ex.pn/datatalk
  • 41. What are some big data trends?
  • 42. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Big data will be embraced by more and more businesses. More decisions will be driven by data and analytics.” #DataTalk ex.pn/datatalk
  • 43. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “In past 5 years, most of the big data tech operate at the infrastructure layer. Now more people are focused on the algorithm layer.” #DataTalk ex.pn/datatalk
  • 44. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “If there’s a big data trend, it’s the shift from infrastructure to analytics/algorithms on people’s big data asset.” #DataTalk ex.pn/datatalk
  • 45. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “It used to be that data scientists can do everything about any data, now there is data engineering, algorithm, decision science.” #DataTalk ex.pn/datatalk
  • 46. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Now there’s experts in natural language processing, image analysis, video/audio processing, streaming data, etc.” #DataTalk ex.pn/datatalk
  • 47. Why should companies invest more in data science?
  • 48. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Businesses invested in big data wisely will have a huge competitive advantage over their peers.” #DataTalk ex.pn/datatalk
  • 49. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Companies should invest more in data science. They will need to eventually anyway.” #DataTalk ex.pn/datatalk
  • 50. Any final tips for those who want to work in data science?
  • 51. Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong): “Tips for new data scientists: Keep an open mind, think outside the box, and work hard. The future is bright.” #DataTalk ex.pn/datatalk
  • 52. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Final Tips: Learn the tech and stats, learn the business context, learn to communicate the tech/stats to business to bridge the gap” #DataTalk ex.pn/datatalk
  • 53. Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu): “Be patient, follow your passion (which should be data), and pick a good problem to solve.” #DataTalk ex.pn/datatalk
  • 55. Join our #DataTalk on Twitter on Thursdays at 5 p.m. ET. experian.com/datatalk