10.5.2019 Data science for business Data science for business Sebastian Sauer file:///Users/sebastiansaueruser/Documents/Publikationen/blog_ses/data_se/public/slides/data-science-business/intro-data-science-talk-2019-05-14.html#31 1/27
Five Questions on the use of data science for business
1. What's the meaning of data science, machine learning, and all these fancy terms?
2. What's the best model out there?
3. How do I know my model is doing good or bad?
4. Can you give me a cook book for data science?
5. What are all the core concepts of the field?
1. What's the meaning of data science, machine learning, and all these fancy terms?
statistical models: probability theory
machine learning: algorithmic models
Source: Wikipedia by en:User:RolandH
'data science' is a popular term
[Figure: Google Trends (2019-04-32) of data analysis jargon. x-axis: date (2015 to 2019); y-axis: hits (0 to 100); keywords: artificial intelligence, data mining, Data science, machine learning, Predictive analytics, predictive modeling, statistical modeling]
What's data science? Depends on whom you ask.
Common theme
Art and science of learning from data
$Y = f(X) + \epsilon$
$\hat{Y} = \hat{f}(X)$
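To make "learning from data" concrete, here is a minimal Python sketch that simulates Y = f(X) + ε and then estimates f by ordinary least squares. The linear form of f, its coefficients, the noise level, and the helper name `fit_line` are all assumptions made for illustration; none of them come from the slides.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def true_f(x):
    # Assumed "true" relationship; in practice f is unknown.
    return 2.0 + 0.5 * x

random.seed(42)
xs = [i / 10 for i in range(200)]                   # X: 0.0 .. 19.9
ys = [true_f(x) + random.gauss(0, 1) for x in xs]   # Y = f(X) + epsilon

# Learning step: estimate f from the observed (X, Y) pairs only.
intercept_hat, slope_hat = fit_line(xs, ys)

def f_hat(x):
    """Estimated function: Y-hat = f-hat(X)."""
    return intercept_hat + slope_hat * x

print(f"estimated f: {intercept_hat:.2f} + {slope_hat:.2f} * x")
print(f"prediction at x = 10: {f_hat(10):.2f}")
```

The estimate should land near the assumed coefficients (2.0 and 0.5) but not exactly on them, because the noise term ε cannot be learned.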
Machine learning: Feed the computer data, not rules
Source: Molnar, C. (2019). Interpretable Machine Learning [ePub Book]. Morrisville, NC: Christoph Molnar.
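One way to picture "data, not rules" is the following Python sketch. It contrasts a hand-written rule with a cut-off learned from labelled examples. The spam scenario, the toy data, and the function names `rule_based` and `learn_threshold` are hypothetical and serve only as an illustration.

```python
def rule_based(n_links):
    # Traditional programming: a human writes the rule (threshold picked by hand).
    return n_links > 3

def learn_threshold(values, labels):
    """Pick the cut-off that misclassifies the fewest training examples."""
    best_cut, best_errors = None, len(values) + 1
    for cut in sorted(set(values)):
        # Count examples where "value > cut" disagrees with the known label.
        errors = sum((v > cut) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut

# Made-up training data: number of links in a message, and whether it was spam.
n_links = [0, 1, 1, 2, 5, 6, 7, 9]
is_spam = [False, False, False, False, True, True, True, True]

# Machine learning: the rule is derived from the data.
cut = learn_threshold(n_links, is_spam)

def learned(n):
    return n > cut

print(f"hand-written rule: spam if links > 3; learned rule: spam if links > {cut}")
```

If the data change, the learned cut-off changes with them, whereas the hand-written rule stays fixed until someone edits it.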
2. What's the best model out there?
A lot of models out there
package caret
getModelInfo() %>% names() %>% length()
## [1] 238
name  value
1     ada
2     AdaBag
3     AdaBoost.M1
4     adaboost
5     amdai
Showing 1 to 5 of 238 entries
Wait, tell me which model is best
There is no single best model
Black box models          | "White box" models
Random forests            | Linear regression
Support vector machines   | k-nearest neighbours
Neural networks           | Decision trees
...                       | ...
less interpretable        | more interpretable
more accurate (at times)  | less accurate (at times)
less robust               | more robust
Blackbox models do not explain
Source: Molnar, C. (2019). Interpretable Machine Learning [ePub Book]. Morrisville, NC: Christoph Molnar.
Ensemble learners show a good track record
Source: Sauer, S. (2018). Moderne Datenanalyse mit R: Daten einlesen, aufbereiten, visualisieren und modellieren. Wiesbaden: Springer.
The fit of a model depends on, e.g., the linearity of associations
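A small Python sketch of this point: the same straight-line model is fitted to a linear and to a U-shaped association, and R² is used as the measure of fit. The two toy data sets and the helper name `r_squared_of_line` are assumptions chosen for illustration.

```python
def r_squared_of_line(xs, ys):
    """Share of variance in ys explained by a least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = list(range(-10, 11))
linear_ys = [3 + 2 * x for x in xs]   # straight-line association
curved_ys = [x ** 2 for x in xs]      # U-shaped association, symmetric around 0

r2_linear = r_squared_of_line(xs, linear_ys)
r2_curved = r_squared_of_line(xs, curved_ys)

print(f"R^2, linear association:   {r2_linear:.2f}")
print(f"R^2, U-shaped association: {r2_curved:.2f}")
```

The association in the second data set is just as strong as in the first, yet a linear model cannot capture it. A more flexible model would be needed there.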
3. How do I know my model is doing good or bad?
Short answer: The less error, the better the model
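"Error" has to be quantified before models can be compared. The Python sketch below computes two common error measures, mean absolute error and root mean squared error, for two hypothetical models. The choice of these two measures and all numbers are assumptions for illustration; the slides do not name a specific metric.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Like MAE, but large errors are penalised more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))

# Made-up observed values and the predictions of two hypothetical models.
actual = [10, 12, 15, 20]
model_a = [11, 12, 14, 18]
model_b = [14, 9, 15, 26]

for name, predicted in (("A", model_a), ("B", model_b)):
    mae = mean_absolute_error(actual, predicted)
    rmse = root_mean_squared_error(actual, predicted)
    print(f"model {name}: MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

On both measures model A has the smaller error here, so by the short answer it would be the better model.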
Wait ... Which model do you prefer?
4. Can you give me a cook book for data science?
Step 1: Choose your model(s)
Classify stuff
Estimate stuff
Step 2: Build model fed on historical data
Overfitting
Underfitting
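Overfitting and underfitting can be illustrated with a Python sketch that uses k-nearest-neighbour regression. A very small k memorises the historical data, and a very large k ignores its structure. The sine-shaped data, the noise level, the values of k, and the function names are assumptions chosen for illustration, not the example behind the slide.

```python
import math
import random

def knn_predict(train_x, train_y, x, k):
    """Average the y-values of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def simulate(n):
    # Made-up process: a sine curve plus random noise.
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [math.sin(x) + random.gauss(0, 0.3) for x in xs]
    return xs, ys

random.seed(1)
train_x, train_y = simulate(60)   # "historical" data used to build the model
test_x, test_y = simulate(60)     # data the model has never seen

results = {}
for k in (1, 7, 60):
    train_pred = [knn_predict(train_x, train_y, x, k) for x in train_x]
    test_pred = [knn_predict(train_x, train_y, x, k) for x in test_x]
    results[k] = (rmse(train_y, train_pred), rmse(test_y, test_pred))
    print(f"k={k:2d}  train RMSE={results[k][0]:.2f}  test RMSE={results[k][1]:.2f}")
```

With k = 1 every training point predicts itself, so the training error is zero while the error on unseen data is not. That gap is the signature of overfitting. With k = 60 every prediction is the overall mean, so the model is poor on both data sets, which is underfitting.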
Step 3: Predict the future
Run the model on new data
Step 4: Evaluate the model
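The four steps can be sketched end to end in Python for a classification task. The nearest-centroid classifier, the churn scenario, the simulated spend figures, and all function names are hypothetical choices made to keep the example short. Any other model and evaluation measure would slot into the same four steps.

```python
import random

# Step 1: choose a model. Here: a nearest-centroid classifier (an arbitrary pick).
def fit_centroids(xs, labels):
    """Step 2: build the model from historical data (mean x per class)."""
    centroids = {}
    for label in set(labels):
        members = [x for x, lab in zip(xs, labels) if lab == label]
        centroids[label] = sum(members) / len(members)
    return centroids

def predict(centroids, x):
    """Step 3: assign the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def accuracy(actual, predicted):
    """Step 4: share of correct predictions."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def simulate_customers(n):
    # Made-up data: monthly spend of customers who churned vs. stayed.
    labels = [random.choice(["churn", "stay"]) for _ in range(n)]
    spend = [random.gauss(20 if lab == "churn" else 60, 10) for lab in labels]
    return spend, labels

random.seed(7)
historical_x, historical_y = simulate_customers(200)   # past customers
new_x, new_y = simulate_customers(100)                 # customers not used for fitting

model = fit_centroids(historical_x, historical_y)
new_pred = [predict(model, x) for x in new_x]
print(f"accuracy on new data: {accuracy(new_y, new_pred):.2f}")
```

The evaluation in step 4 uses only the new data, so the reported accuracy reflects how the model handles cases it was not built on.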
Here's one way to get going
Source: https://www.williamrchase.com/slides/slide_img/throw_into_pool.gif
Some literature explaining core concepts of data science
Grolemund, G., & Wickham, H. (2016). R for Data Science. Retrieved from https://books.google.de/books?id=aZRYrgEACAAJ
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 6). New York City, NY: Springer.
Sauer, S. (2019). Moderne Datenanalyse mit R: Daten einlesen, aufbereiten, visualisieren und modellieren (1. Auflage 2019). Wiesbaden: Springer.
Sebastian Sauer
sebastiansauer
https://data-se.netlify.com/
sebastian.sauer@data-divers.com
sauer_sebastian
Get slides here: https://data-se.netlify.com/slides/afd_ecda2019/afd-modeling-ECDA-2019.pdf
CC-BY