What is Data Science? January 23, 2020 Data Science CSCI 1951A - PowerPoint PPT Presentation


  1. What is Data Science? January 23, 2020 Data Science CSCI 1951A Brown University Instructor: Ellie Pavlick HTAs: Josh Levin, Diane Mutako, Sol Zitter

  2. Your Phenomenal Staff! Karlly Feng Shunjia Zhu Diane Mutako Sol Zitter Josh Levin Shash Sinha Maggie Wu Neil Sehgal Huay JP Natalie Ben Jonathan Champa Delworth Gershuny Weisskoff Nazem Aldroubi Will Glaser Sunny Deng Ben Vu Mounika Dandu Marcin Arvind Kolaszewski Juho Choi Yalavarti Nam Do Minna Kimura-

  3. Waitlist • If you are not registered, make sure you are on the waitlist (link is on the course webpage) • We have a *little* wiggle room in the enrollment cap • We will prioritize fairly (i.e., graduating and need this to graduate > graduating > not graduating…)

  4. What is Data Science?

  5. Moneyball! https://en.wikipedia.org/wiki/Moneyball

  6. Obama Campaign http://crowdsourcing-class.org/slides/ab-testing.pdf

  7. Google’s “40 Shades of Blue” Why Google has 200m reasons to put engineers over designers. The Guardian. The Origin of A/B Testing. Nicolai Kramer Jakobsen.
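The Obama campaign and "40 Shades of Blue" slides both point at A/B testing. As a rough illustration of the statistics behind such a comparison, here is a minimal sketch of a two-proportion z-test. The function name and all counts are made up for illustration and do not come from the cited sources, and the sketch assumes large samples and a pooled rate strictly between 0 and 1.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants share one conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B behave identically.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts (NOT real campaign or Google data):
# variant A: 200 sign-ups out of 2500 visitors, variant B: 260 out of 2500.
z, p = two_proportion_z_test(conv_a=200, n_a=2500, conv_b=260, n_b=2500)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value here only says the observed gap would be unusual if the two variants truly performed the same; it relies on visitors having been assigned to variants at random.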

  8. Data Science = Magic

  9. Data Science!

  10. The Scientific Method https://en.wikipedia.org/wiki/Scientific_method

  14. The Scientific Method • Data Analytics, Visualization, Presentation • Data Collection, Sampling, Cleaning and Processing • Machine Learning, Forecasting, Modeling

  15. The Scientific Method 👎 👎 👎 👎

  17. What is Data Science?

  19. Data “Science”

  20. Data “Science” https://www.dailydot.com/unclick/state-googled-2017 http://nerdgeeks.co/us-state-words-map

  21. Data “Science” Natalie Delworth https://www.dailydot.com/unclick/state-googled-2017 http://nerdgeeks.co/us-state-words-map

  22. Data “Science” So many maps! https://xkcd.com/1845/

  23. Data “Science” • To be fair… • Intuition plays a huge role in the scientific method (“make observations” is Step 1). • Exploratory analysis is necessary; it's okay to not be all rigor all the time. • But! • Exploratory analysis (even when it involves the biggest of data) is meant to *form* a hypothesis, not test one. • Good experimental design and rigorous statistics are essential if we want to make claims about how the world works.

  26. Data “Science” “Eyeballing it” 13-18 23-29 19-22 30-65 Facebook posts by age group Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach. Schwartz et al. (2013).

  27. Data “Science” “Eyeballing it” Frequent topics observed in 17,000 Science articles Probabilistic Topic Models. Blei (2012).

  28. Data “Science” “Eyeballing it” https://devopedia.org/word-embedding

  32. Data “Science” [Chart: per capita cheese consumption (lbs) correlates with number of people who died by becoming tangled in their bedsheets (deaths), 2000 to 2009; ρ = 0.95. Source: tylervigen.com] https://en.wikipedia.org/wiki/Data_dredging http://www.tylervigen.com/spurious-correlations
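One way to see how a chart like the cheese and bedsheets one can arise is to search many unrelated series for the best match, which is the essence of data dredging. The sketch below is an illustrative simulation, not the method used by tylervigen.com: it assumes each "statistic" is a 10-point Gaussian random walk (one point per year), and the helper names, candidate count, and seed are all hypothetical choices.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def random_walk(rng, length):
    """A series with no real meaning: cumulative sum of Gaussian noise."""
    value, walk = 0.0, []
    for _ in range(length):
        value += rng.gauss(0, 1)
        walk.append(value)
    return walk

def best_spurious_match(n_candidates=1000, length=10, seed=0):
    """Correlate one random target with many unrelated series; keep the strongest."""
    rng = random.Random(seed)
    target = random_walk(rng, length)
    best = 0.0
    for _ in range(n_candidates):
        r = pearson(target, random_walk(rng, length))
        if abs(r) > abs(best):
            best = r
    return best

print(f"strongest correlation found by chance: {best_spurious_match():.2f}")
```

Because every series is generated independently, any strong correlation the search reports is spurious by construction. Testing many pairs and reporting only the winner is exactly what a rigorous experimental design is meant to rule out.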
