Data Mining meets Football (soccer), Ulf Brefeld

  1. Data Mining meets Football (soccer) Ulf Brefeld Knowledge Mining & Assessment TU Darmstadt / DIPF brefeld@cs.tu-darmstadt.de

  2. Data Mining meets Football (soccer) Ulf Brefeld Machine Learning Group Leuphana University of Lüneburg

  3. Machine Learning / Data Mining: Information Extraction & Aggregation, Recommendations, Personalisation. Ulf Brefeld, Knowledge Mining & Assessment Group

  4. Machine Learning / Data Mining: Information Extraction & Aggregation, Sports Analytics, Recommendations, Personalisation

  5. German Bundesliga ๏ On average 43,502 attendees per game ๏ 13.31m attendees per season (image source: http://www.ruhrnachrichten.de/storage/pic/mdhl/artikelbilder/sport/4081417_1_Bayern1.jpg?version=1387208424)

  6. Monetary Aspects (source: http://www.statista.com/topics/1774/bundesliga/) ๏ Revenue of European soccer market: € 19.90bn ๏ Revenue of German Bundesliga: € 2,172.59m ๏ German Bundesliga total value of player assets: € 413.77m ๏ FC Bayern Munich brand value: € 794.60m ๏ FC Bayern Munich profit after tax: € 14.00m

  7. Traditional Sports Analytics ๏ Monetary aspects ๏ Statistics to serve information needs…

  8. Descriptive Statistics [Plots: #players per season; #goals per season]

  9. Distribution of Goals [Plot: home team vs. away team]

  10. Yellow Cards

  11. Average Player Value [Plot: values in € per season; incomplete data]

  12. ๏ Yeah, interesting… but what does it tell us?

  14. “B. Charlton v F. Beckenbauer”, David Marsh. 1966 World Cup Final, England - W. Germany

  15. Trajectories and Tactics ๏ Understanding player movements is a precondition for analysing game strategy (i.e., tactics)

  16. Player Trajectory Data ๏ Cameras capture positions of players and ball (the referee is also tracked and recorded, but that data is usually kept private) ๏ x,y,(z) coordinates ๏ ≥ 24 frames per second ๏ Manually denoised (corners, mass confrontations,…) ๏ Players annotated ๏ Perfect data for analysing movements, coordination, tactics, etc.

  17. Ball touches of Franck Ribery (FCB vs BMG, season 2013/14)

  18. Shots leading to Goals (season 2009/10 - 2013/14)

  19. Goalmouth Coordinates (penalties)

  20. ๏ Hm… still, what does it tell us?

  21. Use Cases ๏ Analyse opponent tactics ๏ Detect strengths/weaknesses in strategy ๏ Automatic game plans ๏ Serious games / training ๏ Player scouting ๏ Improved media coverage ๏ …

  22. Identifying Patterns ๏ Pattern = “interesting” event ๏ E.g., A plays a one-two with B and crosses to C [Diagram: players A, B, and C]

  23. Why is it difficult? ๏ >3 million positions per game ๏ Every player generates ≈ 135,000 positions per game ๏ There are ≈ $135000^{23}$ different candidate patterns (ignoring the fact that patterns are of different lengths) ๏ This is considerably larger than the number of atoms in our galaxy (dark and exotic matter already included) ๏ Explicit enumeration infeasible ๏ What similarity measure to use?
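
The figures on slide 23 can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 25 frames per second over a 90-minute match and 23 tracked objects (22 players plus the ball). Neither value is stated on the slide, but both are consistent with the quoted 135,000 positions and the exponent 23.

```python
# Assumptions (not stated on the slide): 25 fps, 90 minutes, 22 players + ball.
FPS = 25
MATCH_SECONDS = 90 * 60
TRACKED_OBJECTS = 23

# Positions recorded for a single tracked object over one match.
positions_per_object = FPS * MATCH_SECONDS

# Positions recorded for all tracked objects over one match.
positions_per_game = positions_per_object * TRACKED_OBJECTS

# Number of ways to pick one position per tracked object.
candidate_patterns = positions_per_object ** TRACKED_OBJECTS

# Number of decimal digits, i.e. the order of magnitude plus one.
digits = len(str(candidate_patterns))

print(positions_per_object, positions_per_game, digits)
```

Under these assumptions the candidate count has 118 decimal digits, which is why explicit enumeration is ruled out.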

  24. Identifying Patterns ๏ Pattern = “interesting” event ๏ E.g., A plays a one-two with B and crosses to C [Diagram: players A, B, and C]

  25. Identifying Patterns ๏ Pattern = “interesting” event ๏ E.g., A plays a one-two with B and crosses to C ๏ frequent ๏ rare (anomalies/outliers) ๏ predefined (e.g., match plan, training) ๏ … [Diagram: players A, B, and C]

  26. Identifying Patterns ๏ Pattern = “interesting” event ๏ E.g., A plays a one-two with B and crosses to C ๏ frequent ๏ rare (anomalies/outliers) ๏ predefined (e.g., match plan, training) ๏ … [Diagram: players A, B, and C]

  27. Representation ๏ Position = player coordinates on the pitch ๏ A game of soccer = positional data stream ๏ Player trajectory = sequence of consecutive positions ๏ Positions represented by angles with respect to a reference vector $v_{\mathrm{ref}}$ (translation, rotation, and scale invariant), Vlachos et al. (KDD, 2004):

      $$\alpha_i = \operatorname{sign}(v_i, v_{\mathrm{ref}}) \, \cos^{-1}\!\left( \frac{v_i^{\top} v_{\mathrm{ref}}}{\lVert v_i \rVert \, \lVert v_{\mathrm{ref}} \rVert} \right)$$
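
The angle representation on slide 27 might be computed roughly as follows. This is a minimal sketch under several assumptions the slide does not spell out: $v_i$ is taken to be the movement vector between consecutive 2-D positions, `sign` is taken from the 2-D cross product, and `v_ref` defaults to the x-axis. The function names are hypothetical. With a fixed `v_ref` as used here, the angles are unchanged under translation and uniform scaling; rotation invariance depends on how the reference vector is chosen.

```python
import math

def signed_angle(v, v_ref):
    """Signed angle (radians) between v and v_ref; 0.0 if either vector is zero."""
    norm = math.hypot(*v) * math.hypot(*v_ref)
    if norm == 0:
        return 0.0  # assumption: a standing player contributes angle 0
    dot = v[0] * v_ref[0] + v[1] * v_ref[1]
    cos_value = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    cross = v_ref[0] * v[1] - v_ref[1] * v[0]    # assumed definition of sign
    sign = -1.0 if cross < 0 else 1.0
    return sign * math.acos(cos_value)

def trajectory_to_angles(positions, v_ref=(1.0, 0.0)):
    """Map a list of (x, y) positions to angles of consecutive movement vectors."""
    return [
        signed_angle((x2 - x1, y2 - y1), v_ref)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    ]

track = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
angles = trajectory_to_angles(track)
# The same track, scaled by 2 and shifted by (5, 5).
moved = [(2 * x + 5, 2 * y + 5) for x, y in track]
angles_moved = trajectory_to_angles(moved)
```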

  28. Dynamic Time Warping, Rabiner & Juang (1993) ๏ Movements should be independent of player speed ๏ Dynamic time warping compensates phase shifts ๏ Distance measure function $\mathrm{dist} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ (e.g., …) ๏ DTW for sequences $s$ and $q$ defined recursively:

      $$g(\emptyset, \emptyset) = 0, \qquad g(s, \emptyset) = g(\emptyset, q) = \infty$$

      $$g(s, q) = \mathrm{dist}(s_1, q_1) + \min \left\{ \begin{array}{l} g(s, \langle q_2, \dots, q_m \rangle) \\ g(\langle s_2, \dots, s_m \rangle, q) \\ g(\langle s_2, \dots, s_m \rangle, \langle q_2, \dots, q_m \rangle) \end{array} \right\}$$

  29. Dynamic Time Warping, Rabiner & Juang (1993) ๏ Movements should be independent of player speed ๏ Dynamic time warping compensates phase shifts ๏ Distance measure function $\mathrm{dist} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ (e.g., …) ๏ DTW for sequences $s$ and $q$ defined recursively, computable in $O(|s||q|)$:

      $$g(\emptyset, \emptyset) = 0, \qquad g(s, \emptyset) = g(\emptyset, q) = \infty$$

      $$g(s, q) = \mathrm{dist}(s_1, q_1) + \min \left\{ \begin{array}{l} g(s, \langle q_2, \dots, q_m \rangle) \\ g(\langle s_2, \dots, s_m \rangle, q) \\ g(\langle s_2, \dots, s_m \rangle, \langle q_2, \dots, q_m \rangle) \end{array} \right\}$$
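
The recursion on slide 29 can be evaluated bottom-up with a table, which gives the stated $O(|s||q|)$ running time. The sketch below assumes the absolute difference as `dist` (the slide leaves the concrete choice open) and fills the table from the ends of both sequences, so that each cell corresponds to one call of $g$ on a pair of suffixes.

```python
import math

def dtw(s, q, dist=lambda a, b: abs(a - b)):
    """DTW cost g(s, q) following the suffix recursion; dist is an assumed default."""
    n, m = len(s), len(q)
    # g[i][j] holds g(s[i:], q[j:]); the infinite border encodes the base cases.
    g = [[math.inf] * (m + 1) for _ in range(n + 1)]
    g[n][m] = 0.0  # g(empty, empty) = 0
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            g[i][j] = dist(s[i], q[j]) + min(
                g[i][j + 1],      # drop the head of q only
                g[i + 1][j],      # drop the head of s only
                g[i + 1][j + 1],  # drop both heads
            )
    return g[0][0]

# A slower version of the same movement warps onto the faster one at no cost.
same_shape = dtw([1, 2, 3], [1, 2, 2, 3])
shifted = dtw([1, 2, 3], [2, 3, 4])
```

The table needs $O(|s||q|)$ memory as written; keeping only two rows would reduce that, at the price of a slightly less direct mapping to the recursion.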

  30. Approximate DTW ๏ Approximate DTW by lower bounds $f$, i.e., $f(s, q) \le g(s, q)$, that can be computed more efficiently ๏ Focus on characteristic values ๏ Kim et al. (ICDE, 2001): first, last, greatest, smallest value ๏ Keogh (VLDB, 2002): minimum/maximum values of subsequences ๏ Complexity in $O(|s|)$
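
A lower bound in the spirit of Kim et al., built from the four characteristic values named on slide 30, could look like the sketch below. It assumes the absolute difference as ground distance and a DTW cost that sums the distances along the warping path; under those assumptions the first and last elements are always aligned, and the greatest and smallest elements must be aligned with something, so the largest of the four feature differences cannot exceed the DTW cost. The function names are hypothetical.

```python
def characteristic_values(x):
    """First, last, greatest, and smallest value of a non-empty sequence."""
    return (x[0], x[-1], max(x), min(x))

def lb_kim(s, q):
    """Lower bound on DTW (absolute-difference ground distance assumed)."""
    return max(
        abs(a - b)
        for a, b in zip(characteristic_values(s), characteristic_values(q))
    )

# Each bound needs a single pass over each sequence.
bound_shifted = lb_kim([1, 2, 3], [2, 3, 4])
bound_peak = lb_kim([0, 5, 0], [0, 1, 0])
```

In a search, a candidate whose bound already exceeds the similarity threshold can be discarded without running the quadratic DTW computation.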

  31. Locality Sensitive Hashing, Athitsos et al. (2008), Gionis et al. (1999) ๏ Distance-based hash function $h_{s_1,s_2} : D \to \mathbb{R}$, with $s_1, s_2 \in D$ randomly drawn from the database and Kim et al. (ICDE, 2001) used as distance function:

      $$h_{s_1,s_2}(s) = \frac{\mathrm{dist}(s, s_1)^2 + \mathrm{dist}(s_1, s_2)^2 - \mathrm{dist}(s, s_2)^2}{2 \, \mathrm{dist}(s_1, s_2)}$$

      ๏ Bucket determined by

      $$h^{[t_1,t_2]}_{s_1,s_2}(s) = \begin{cases} 1 & : h_{s_1,s_2}(s) \in [t_1, t_2] \\ 0 & : \text{otherwise} \end{cases}$$

      ๏ Set of admissible intervals

      $$T(s_1, s_2) = \left\{ [t_1, t_2] : \Pr_D\!\left( h^{[t_1,t_2]}_{s_1,s_2}(s) = 0 \right) = \Pr_D\!\left( h^{[t_1,t_2]}_{s_1,s_2}(s) = 1 \right) \right\}$$
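
The distance-based hash on slide 31 can be sketched as follows. The projection follows the formula on the slide; how an admissible interval is picked is not specified there, so `balanced_interval` simply takes the middle half of the projected database values, which is one assumed way to make both bucket outcomes roughly equally likely. All function names, and the optional `pivots` argument, are hypothetical.

```python
import random

def projection(s, s1, s2, dist):
    """h_{s1,s2}(s): position of s along the 'line' through pivots s1 and s2."""
    d12 = dist(s1, s2)  # assumption: the pivots are distinct, so d12 > 0
    return (dist(s, s1) ** 2 + d12 ** 2 - dist(s, s2) ** 2) / (2 * d12)

def balanced_interval(values):
    """Interval [t1, t2] covering about half of the values (assumed: middle half)."""
    ordered = sorted(values)
    lo = len(ordered) // 4
    hi = lo + len(ordered) // 2 - 1
    return ordered[lo], ordered[hi]

def make_binary_hash(database, dist, pivots=None, rng=random):
    """Build one binary hash function h^{[t1,t2]}_{s1,s2} for the database."""
    s1, s2 = pivots if pivots is not None else rng.sample(database, 2)
    t1, t2 = balanced_interval([projection(s, s1, s2, dist) for s in database])

    def h(s):
        return 1 if t1 <= projection(s, s1, s2, dist) <= t2 else 0

    return h

# Toy database of scalars with the absolute difference as distance.
toy_db = [float(v) for v in range(8)]
toy_dist = lambda a, b: abs(a - b)
toy_hash = make_binary_hash(toy_db, toy_dist, pivots=(0.0, 7.0))
bits = [toy_hash(s) for s in toy_db]
```

Several such functions with different random pivots would be concatenated into a bucket key; for trajectories, `dist` would be a sequence distance rather than the scalar toy distance used here.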

  32. Computing Similarities ๏ Remainder needs test for identity ๏ Use outcomes of dynamic time warping, approximate DTW, locality sensitive hashing (buckets), … together with a similarity threshold

  33. Episode Discovery ๏ Apriori-based algorithms ๏ Approach based on Achar et al. (2012) ๏ Distributed implementation scheme (Hadoop) ๏ Two phases: candidate generation (Mapper) and counting (Reducer)
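
The two phases on slide 33 can be illustrated with a single-machine, level-wise sketch. This is not the algorithm of Achar et al. and involves no Hadoop: it assumes that movements have already been mapped to discrete symbols, treats an episode as a contiguous run of symbols, and counts overlapping occurrences. All function names are hypothetical.

```python
from collections import Counter

def generate_candidates(frequent):
    """Candidate generation: join length-k episodes that overlap in k-1 symbols."""
    return {a + (b[-1],) for a in frequent for b in frequent if a[1:] == b[:-1]}

def count_support(candidates, sequence):
    """Counting: occurrences of each candidate as a contiguous window."""
    counts = Counter()
    if not candidates:
        return counts
    k = len(next(iter(candidates)))  # all candidates of a level share one length
    for i in range(len(sequence) - k + 1):
        window = tuple(sequence[i:i + k])
        if window in candidates:
            counts[window] += 1
    return counts

def frequent_episodes(sequence, min_support):
    """Level-wise search: only extend episodes that are themselves frequent."""
    level = {(symbol,) for symbol in set(sequence)}
    result = {}
    while level:
        counts = count_support(level, sequence)
        level = {e for e in level if counts[e] >= min_support}
        result.update({e: counts[e] for e in level})
        level = generate_candidates(level)
    return result

episodes = frequent_episodes(list("ABCABCABD"), min_support=2)
```

The Apriori property does the pruning: an episode can only be frequent if its prefix and suffix of one symbol less are frequent, so infrequent episodes are never extended.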

  34. Empirical Evaluation ๏ DEBS Grand Challenge (http://www.orgs.ttu.edu/debs2013/index.php?goto=cfchallengedetails) ๏ 8 vs. 8 soccer game recorded by Fraunhofer IIS ๏ In total 33 sensors ๏ 1 sensor per shoe (200 Hz) ๏ 1 sensor in the ball (2000 Hz) ๏ 15,000 positions per second (3-dimensional)
