Introduction to swimming data
CASE STUDIES IN STATISTICAL THINKING
Justin Bois, Lecturer, Caltech
The 2015 FINA World Championships (photo by Chan-Fan, CC-BY-SA-4.0)
Strokes at the World Championships: freestyle, breaststroke, butterfly, backstroke
Events at the World Championships: defined by gender, distance, and stroke. Example: men's 200 m freestyle
Rounds of events
  Heats: first round
  Semifinals: penultimate round in some events
  Finals: the final round; the winner is champion
Data source: data are freely available from OMEGA at omegatiming.com
Domain-specific knowledge is
  Imperative
  An absolute pleasure
Let's practice!
Do swimmers go faster in the finals?
Event         Time     Venue         Date        Round
100 m free    47.51    Beijing       2008-08-11  Final
200 m free    1:42.96  Beijing       2008-08-12  Final
400 m free    3:47.79  Indianapolis  2005-04-01  Final
100 m back    53.01    Indianapolis  2007-08-03  Final
200 m back    1:54.65  Indianapolis  2007-08-01  Final
100 m breast  1:02.57  Columbia      2008-02-17  Final
200 m breast  2:11.30  San Antonio   2015-08-10  Final
100 m fly     49.82    Rome          2009-08-01  Final
200 m fly     1:51.51  Rome          2009-07-29  Final
Event       Time     Venue           Date        Round
50 m free   23.67    Budapest        2017-07-29  Semifinal
100 m free  51.71    Budapest        2017-07-23  Final
200 m free  1:54.08  Rio de Janeiro  2016-08-09  Final
400 m free  4:06.04  Amiens          2014-03-16  Final
50 m back   27.80    Borås           2017-06-30  Final
100 m back  59.98    Eindhoven       2015-04-05  Final
50 m fly    24.43    Borås           2014-07-05  Final
100 m fly   55.48    Rio de Janeiro  2016-08-07  Final
Your question: Do swimmers swim faster in the finals than in other rounds?
  Individual swimmers, or the whole field?
  Faster than heats? Faster than semifinals?
  For what strokes? For what distances?
Your question: Do individual female swimmers swim faster in the finals compared to the semifinals?
  Events: 50, 100, 200 meter freestyle, breaststroke, butterfly, backstroke
Diff'rent strokes
Fractional improvement: $f = \dfrac{\text{semifinals time} - \text{finals time}}{\text{semifinals time}}$
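The fractional improvement can be computed directly from paired times. The following is a minimal sketch, assuming each swimmer's semifinal and final times are already available in seconds; the function name `fractional_improvement` and the example times are hypothetical and not taken from the championship data.

```python
import numpy as np


def fractional_improvement(semi_times, final_times):
    """Return f = (semifinals time - finals time) / semifinals time."""
    semi_times = np.asarray(semi_times, dtype=float)
    final_times = np.asarray(final_times, dtype=float)
    return (semi_times - final_times) / semi_times


# Hypothetical times in seconds, one entry per swimmer (illustration only)
semi = [53.0, 54.2, 53.8]
final = [52.5, 54.2, 54.0]

f = fractional_improvement(semi, final)
print(f)
```

A positive value of f means the swimmer was faster in the final, zero means no change, and a negative value means the swimmer was slower. Dividing by the semifinal time makes events of different distances comparable.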
Your question(s)
  Original question: Do swimmers swim faster in the finals than in other rounds?
  Sharpened questions:
    What is the fractional improvement of individual female swimmers from the semifinals to the finals?
    Is the observed fractional improvement commensurate with there being no difference in performance in the semifinals and finals?
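One way to approach the second sharpened question is a paired permutation test. The sketch below is an assumption about how such a test could be set up, not necessarily the method used in this course: if there is no difference between rounds, each swimmer's two times can be swapped at random without changing the distribution of the mean fractional improvement. The helper names and the times are hypothetical.

```python
import numpy as np


def swap_random(a, b, rng):
    """Randomly swap paired entries of a and b, each pair with probability 1/2."""
    swap = rng.random(len(a)) < 0.5
    a_out = np.where(swap, b, a)
    b_out = np.where(swap, a, b)
    return a_out, b_out


def mean_frac_improvement(semi, final):
    """Mean of (semi - final) / semi over all swimmers."""
    return np.mean((semi - final) / semi)


rng = np.random.default_rng(42)

# Hypothetical paired times in seconds (illustration only)
semi = np.array([53.0, 54.2, 53.8, 25.1, 24.9, 118.3])
final = np.array([52.5, 54.2, 54.0, 24.9, 24.8, 117.6])

observed = mean_frac_improvement(semi, final)

# Build the null distribution by swapping within each pair
n_reps = 10000
reps = np.empty(n_reps)
for i in range(n_reps):
    s, f = swap_random(semi, final, rng)
    reps[i] = mean_frac_improvement(s, f)

# Fraction of replicates at least as large as the observed value
p_value = np.mean(reps >= observed)
print(observed, p_value)
```

Swapping within pairs, rather than shuffling all times together, keeps each swimmer's two times together, which matters because the question is about individual swimmers.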
Let's practice!
How does the performance of swimmers decline over long events?
More swimming background (photo by Chan-Fan, CC-BY-SA-4.0)
More swimming background. Split: the time it takes to swim one length of the pool
More swimming background (image: Miho NL, CC-BY-3.0)
Slowing down
Quantifying slowdown
  Use women's 800 m freestyle heats
  Omit first and last 100 meters
  Compute mean split time for each split number
  Perform linear regression to get slowdown per split
  Perform hypothesis test: can the slowdown be explained by random variation?
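The steps above can be sketched with NumPy. This is an illustration on synthetic data, not the heats data themselves. It assumes a 50 m pool, so an 800 m race has 16 splits and omitting the first and last 100 meters leaves splits 3 through 14. The array layout (one row per swimmer, one column per split) and all numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Split numbers kept after omitting the first and last 100 m
# (assumes a 50 m pool: 16 splits in 800 m, keep splits 3..14)
split_number = np.arange(3, 15)

# Synthetic split times in seconds: 5 hypothetical swimmers who slow
# by 0.05 s per split, plus random noise (illustration only)
splits = (
    31.0
    + 0.05 * split_number
    + rng.normal(0.0, 0.2, size=(5, len(split_number)))
)

# Mean split time for each split number, averaged over swimmers
mean_splits = splits.mean(axis=0)

# Degree-1 polynomial fit: the slope is the slowdown per split
slowdown, intercept = np.polyfit(split_number, mean_splits, 1)
print(f"slowdown per split: {slowdown:.3f} s")
```

The slope has units of seconds per split, so multiplying it by the number of splits gives a rough estimate of how much slower the last retained length is than the first.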
Hypothesis tests for correlation
  Posit null hypothesis: split time and split number are completely uncorrelated
  Simulate data assuming null hypothesis is true:
      scrambled_split_number = np.random.permutation(split_number)
  Use Pearson correlation, denoted rho, as test statistic:
      rho = dcst.pearson_r(scrambled_split_number, splits)
  Compute p-value as the fraction of replicates that have Pearson correlation at least as large as observed
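Put together, the procedure above might look like the following sketch. To keep it self-contained, it defines a local `pearson_r` with `np.corrcoef` as a stand-in for `dcst.pearson_r`, and it uses synthetic mean split times; the data, the seed, and the number of replicates are assumptions made for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient (local stand-in for dcst.pearson_r)."""
    return np.corrcoef(x, y)[0, 1]


rng = np.random.default_rng(1)

# Synthetic mean split times in seconds for splits 3..14 (illustration only)
split_number = np.arange(3, 15)
mean_splits = (
    31.0 + 0.05 * split_number + rng.normal(0.0, 0.05, size=len(split_number))
)

# Observed test statistic
rho_obs = pearson_r(split_number, mean_splits)

# Simulate the null hypothesis by scrambling the split numbers
n_reps = 10000
perm_reps = np.empty(n_reps)
for i in range(n_reps):
    scrambled_split_number = rng.permutation(split_number)
    perm_reps[i] = pearson_r(scrambled_split_number, mean_splits)

# p-value: fraction of replicates with correlation at least as large as observed
p_value = np.sum(perm_reps >= rho_obs) / n_reps
print(rho_obs, p_value)
```

Scrambling the split numbers breaks any relationship with split time while leaving both sets of values unchanged, which is exactly what the null hypothesis of no correlation describes.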
Let's practice!