
Intro to ML
March 10, 2020
Data Science CSCI 1951A, Brown University
Instructor: Ellie Pavlick
HTAs: Josh Levin, Diane Mutako, Sol Zitter

Announcements: This class is going viral! (Funny? No? Too soon?) Not officially, but starting to […]


1. Features
• Recency: Float
• Words in title: String
• Presence of photo: Boolean
• Reading level: Integer

2. Features

| Clicks | Recency | Reading Level | Photo | Title |
| --- | --- | --- | --- | --- |
| 10 | 1.3 | 11 | 1 | "New Tax Guidelines" |
| 1,000 | 1.7 | 3 | 1 | "This 600lb baby…" |
| 1,000,000 | 2.4 | 2 | 1 | "18 reasons you should never look at this cat unless you…" |
| 1 | 5.9 | 19 | 0 | "The Brothers Karamazov: a neo-post-globalist perspective" |

3. Features — the Clicks column is y, the value we want to predict (table as above).

4. Features — the remaining columns (Recency, Reading Level, Photo, Title) are x, the inputs (table as above).

5. Features — numeric features (Recency: Float, Reading Level: Integer) are defined for (nearly) every row (table as above).

6. Features — boolean features are 0 or 1 ("dummy" variables): here, Photo (table as above).

7. Features — strings = boolean features, 0 or 1 ("dummy" variables): the Title column gets expanded on the next slide (table as above).

8. Features — strings = boolean features, 0 or 1 ("dummy" variables):

| Clicks | Recency | Reading Level | Photo | Title: "new" | Title: "tax" | Title: "this" | Title: "…" | … |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 1.3 | 11 | 1 | 1 | 0 | 0 | 0 | … |
| 1,000 | 1.7 | 3 | 1 | 0 | 0 | 1 | 1 | … |
| 1,000,000 | 2.4 | 2 | 1 | 0 | 0 | 1 | 1 | … |
| 1 | 5.9 | 19 | 0 | 0 | 0 | 0 | 0 | … |

9. Features — "sparse features": 0 for most rows (the Title-word columns above are mostly zeros; table as on the previous slide).
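To make the string-to-dummy-variable expansion concrete, here is a minimal sketch using pandas and scikit-learn's CountVectorizer; the DataFrame layout and column names are illustrative assumptions, not code from the lecture.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Toy rows mirroring the table on the slides.
df = pd.DataFrame({
    "clicks":        [10, 1000, 1000000, 1],
    "recency":       [1.3, 1.7, 2.4, 5.9],
    "reading_level": [11, 3, 2, 19],
    "photo":         [1, 1, 1, 0],
    "title": [
        "New Tax Guidelines",
        "This 600lb baby...",
        "18 reasons you should never look at this cat unless you...",
        "The Brothers Karamazov: a neo-post-globalist perspective",
    ],
})

# binary=True yields 0/1 dummy columns: does this word appear in the title?
vec = CountVectorizer(binary=True)
title_dummies = pd.DataFrame(
    vec.fit_transform(df["title"]).toarray(),
    columns=[f"title_{w}" for w in vec.get_feature_names_out()],
)

# X: numeric + boolean + sparse title-word features; y: clicks.
X = pd.concat([df[["recency", "reading_level", "photo"]], title_dummies], axis=1)
y = df["clicks"]
print(X.shape)  # one column per distinct title word -- mostly zeros, i.e. sparse
```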

10. Clicker Question!

11. Clicker Question!
For the problem set up below, how many features will there be? I.e., how many columns in our X matrix (not including Y)?
• Y: happiness
• X1: day of week ("monday", "tuesday", …, "sunday")
• X2: bank account balance (real value)
• X3: breakfast (yes/no)
• X4: whether you have found your inner peace (yes/no)
• X5: words from last week's worth of tweets (assuming tweets are at most 15 words long and there are 100K words in the English vocabulary)
(a) 100,012   (b) 5   (c) 27   (d) 100,010

12. Clicker Question!
For the problem set up below, how many features will there be? I.e., how many columns in our X matrix (not including Y)?
• Y: happiness
• X1: day of week ("monday", "tuesday", …, "sunday") → 7 features
• X2: bank account balance (real value) → 1 feature
• X3: breakfast (yes/no) → 1 feature
• X4: whether you have found your inner peace (yes/no) → 1 feature
• X5: words from last week's worth of tweets (at most 15 words per tweet, 100K words in the English vocabulary) → 100,000 features
(a) 100,012   (b) 5   (c) 27   (d) 100,010 ✓ (7 + 1 + 1 + 1 + 100,000 = 100,010)
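To see why the categorical day-of-week variable alone contributes 7 columns, a quick pandas sketch (illustrative only; not from the lecture):

```python
import pandas as pd

days = pd.Series(["monday", "tuesday", "wednesday", "thursday",
                  "friday", "saturday", "sunday"])
# One 0/1 dummy column per distinct value -> 7 features for X1.
print(pd.get_dummies(days).shape[1])  # 7
# Total: 7 + 1 + 1 + 1 + 100_000 = 100_010 features, i.e. answer (d).
```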

13. Defining an ML problem
• Task = Increase Consumption
• Data = Reading Habits
• Features = ???
• Model
• Objective/Loss Function = squared difference between predicted total number of clicks and actual total number of clicks

14. Defining an ML problem
• Task = Increase Consumption
• Data = Reading Habits
• Features = {Recency: Float, ReadingLevel: Int, Photo: Bool, Title_New: Bool, Title_Tax: Bool, …}
• Model
• Objective/Loss Function = squared difference between predicted total number of clicks and actual total number of clicks

16–20. Model
ML = Function Approximation

21–22. Model
ML = Function Approximation
You define inputs and outputs. (The really hard part.)

23–24. Model
ML = Function Approximation
The machine will (ideally) learn the function (with a lot of help from you). (The part that gets the most attention.)

25–31. Model
#1: Make assumptions about the problem domain.
• How is the data generated?
• How is the decision-making procedure structured?
• What types of dependencies exist?
• Trending buzzword: "inductive biases"
#2: How to train the model?

32. Model
[Scatter plot: clicks (y-axis) vs. reading level (x-axis); the same plot underlies slides 33–40.]

33. Model
Regression: continuous (infinite) output
f(reading level) = # of clicks

34. Model
Classification: discrete (finite) output
f(reading level) = {clicked, not clicked}
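In code, the regression/classification distinction is just the type of the target. A small sketch (the data and the cutoff for "clicked" are illustrative assumptions, not from the slides):

```python
import numpy as np

# Hypothetical per-article features and click counts.
reading_level = np.array([11, 3, 2, 19])
clicks = np.array([10, 1000, 1000000, 1])

# Regression target: a continuous (unbounded) number of clicks.
y_regression = clicks

# Classification target: a discrete label. The 100-click cutoff is arbitrary,
# chosen only so this toy example produces both classes.
y_classification = np.where(clicks > 100, "clicked", "not clicked")
```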

35. Model
clicks = m(reading_level) + b, with m = -2.4
Linear Regression → the specific "model" we are using here.

36. Model
clicks = m(reading_level) + b, with m = -2.4
clicks → the output / labels / target

37. Model
clicks = m(reading_level) + b, with m = -2.4
reading level → the "feature," which is observed/derived from the data

38. Model
clicks = m(reading_level) + b, with m = -2.4
m and b → the "parameters," which need to be set (by looking at data)

39. Model
clicks = m(reading_level) + b, with m = cov(rl, c) / var(rl)
Estimating the parameters from data goes by many names: "setting parameters," "learning," "training," "estimation."

40. Model
clicks = m(reading_level) + b, with m = -2.4
The parameter values are also called "weights" or "coefficients."
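Slide 39's m = cov(rl, c) / var(rl) is the ordinary least-squares slope. A minimal NumPy sketch with made-up data; the intercept formula b = mean(clicks) − m · mean(reading_level) is the standard OLS companion, which the slides leave implicit:

```python
import numpy as np

# Hypothetical training data: clicks drop as reading level rises (m < 0).
reading_level = np.array([2.0, 3.0, 11.0, 19.0])
clicks = np.array([900.0, 700.0, 80.0, 5.0])

# "Training" = estimating the parameters from data.
# Slide 39: m = cov(rl, c) / var(rl); bias=True matches np.var's 1/N normalization.
m = np.cov(reading_level, clicks, bias=True)[0, 1] / np.var(reading_level)
b = clicks.mean() - m * reading_level.mean()  # standard OLS intercept (assumption)

predicted_clicks = m * reading_level + b  # the fitted line
```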

41. Defining an ML problem
• Task = Increase Consumption
• Data = Reading Habits
• Features = {Recency: Float, ReadingLevel: Int, Photo: Bool, Title_New: Bool, Title_Tax: Bool, …}
• Model = Linear Regression
• Objective/Loss Function = squared difference between predicted total number of clicks and actual total number of clicks

44. Defining an ML problem
• Model = Linear Regression
• Features = {Recency: Float, ReadingLevel: Int, Photo: Bool, Title_New: Bool, Title_Tax: Bool, …}
• Objective/Loss Function = squared difference between predicted total number of clicks and actual total number of clicks
Soooo… how do I know if my model is good?

45. Train/Test Splits

46–47. Train/Test Splits
MSE = 10 (the model fit and evaluated on all of the data)

48–49. Train/Test Splits
Split the data and fit on the Train portion only: Train MSE = 6

50. Train/Test Splits
Now evaluate the same fitted model on the held-out Test portion.

51. Clicker Question!

52–54. Clicker Question!
What should we expect MSE to do on the Test split?
(a) Go up: expected if your model isn't "right" yet (i.e., in practice, most of the time)
(b) Go down
(c) Stay the same (modulo random variation): expected if your model is "right," or is not yet powerful enough (i.e., can't memorize the training data)

55–56. Train/Test Splits
Test MSE = 12 (worse than the Train MSE of 6)

57. Train/Test Splits
The problem gets worse as models get more powerful/flexible: here the more flexible model reaches Train MSE = 4, widening the train/test gap.
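The whole train/test story in one runnable sketch, assuming scikit-learn; the synthetic data, split ratio, and polynomial degree are illustrative choices, not the lecture's:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: clicks fall off linearly with reading level, plus noise.
rng = np.random.default_rng(0)
reading_level = rng.uniform(1, 20, size=(30, 1))
clicks = 1000 - 2.4 * reading_level[:, 0] + rng.normal(0, 30, size=30)

X_train, X_test, y_train, y_test = train_test_split(
    reading_level, clicks, test_size=0.5, random_state=0
)

# Linear model: train and test MSE should be comparable.
linear = LinearRegression().fit(X_train, y_train)
print("linear train MSE:", mean_squared_error(y_train, linear.predict(X_train)))
print("linear test  MSE:", mean_squared_error(y_test, linear.predict(X_test)))

# A far more flexible model can memorize the training set (train MSE drops)
# while the test MSE gets worse -- slide 57's point about powerful models.
flexible = make_pipeline(PolynomialFeatures(degree=9), LinearRegression())
flexible.fit(X_train, y_train)
print("poly-9 train MSE:", mean_squared_error(y_train, flexible.predict(X_train)))
print("poly-9 test  MSE:", mean_squared_error(y_test, flexible.predict(X_test)))
```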
