20/10/2011 1
Anthony Goldbloom CEO, Kaggle
e-mail anthony.goldbloom@kaggle.com twitter @antgoldbloom
Predictive modeling competitions
making data science a sport
- 1. Motivation
- 2. Does it Work?
- 3. Why it Works
- 4. How it Works
- 5. Case Studies
Mismatch between those with data and those with the skills to analyse it
Crowdsourcing
Figure: Tourism Forecasting Competition. Forecast Error (MASE) compared with the existing model at Aug 9, 2 weeks later, 1 month later, and Competition End.
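MASE (mean absolute scaled error), the error measure on the Tourism Forecasting Competition chart, divides a forecast's mean absolute error by the in-sample mean absolute error of a naive forecast, so values below 1 beat the naive baseline. The following is a minimal sketch of the standard definition, not the competition's own scoring code; the function name `mase`, the seasonal period parameter `m`, and the sample numbers are illustrative assumptions.

```python
def mase(train, actual, forecast, m=1):
    """Mean absolute scaled error (sketch of the standard definition).

    train    -- historical values used to scale the error
    actual   -- true values over the forecast horizon
    forecast -- predicted values over the same horizon
    m        -- seasonal period of the naive benchmark (1 = last value)
    """
    if len(train) <= m:
        raise ValueError("train must contain more than m observations")
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")

    # In-sample mean absolute error of the (seasonal) naive forecast.
    naive_mae = sum(
        abs(train[i] - train[i - m]) for i in range(m, len(train))
    ) / (len(train) - m)
    if naive_mae == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")

    # Out-of-sample mean absolute error of the forecast being scored.
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / naive_mae


# Made-up numbers for illustration only.
history = [10, 12, 14, 16]
score = mase(history, actual=[18, 20], forecast=[17, 22])
print(score)
```

A lower score is better, which is why the chart plots competition entries against the existing model's error.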
Figure: dunnhumby Shopping Challenge. % Correctly Predicted Visits (y-axis) against Competition Progress in Weeks (x-axis).
Kaggle’s Dark Matter Competition
“The world’s brightest physicists have been working for decades on solving one of the great unifying problems of our universe”
“In less than a week, Martin O’Leary, a PhD student in glaciology, outperformed the state-of-the-art algorithms”
- Upload
- Submit
- Evaluate & Exchange
Competitions are judged based on predictive accuracy
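Judging on predictive accuracy can be sketched as scoring each submission against held-back true outcomes and ranking entrants by score. This is a hypothetical illustration, not Kaggle's scoring system: the function names, team names, and data are assumptions, and simple classification accuracy stands in for whatever metric a given competition specifies.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that exactly match the true outcomes."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def leaderboard(submissions, actual):
    """Rank entrants from most to least accurate.

    submissions -- dict mapping an entrant's name to a list of predictions
    actual      -- true outcomes, known only to the competition host
    """
    scores = {team: accuracy(preds, actual) for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up outcomes and entries for illustration only.
true_outcomes = [1, 0, 1, 1]
entries = {
    "team_a": [1, 0, 0, 1],
    "team_b": [1, 0, 1, 1],
    "team_c": [0, 1, 0, 0],
}
ranking = leaderboard(entries, true_outcomes)
for team, score in ranking:
    print(team, score)
```

Because the ranking depends only on the score, any entrant can top the leaderboard regardless of background or method.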
2011 $3 million prize
Successful grant applications
Outcomes of a competition to predict the success of grant applications:
- avoid wasting resources on hopeless applications
- convey the characteristics of a successful application to future applicants
Photo by gidzy, www.flickr.com/photos/gidzy
e-mail anthony.goldbloom@kaggle.com phone +1 650 283 9781