
Forecasting the FIFA World Cup: Combining goal- and result-based team ability parameters



  1. Forecasting the FIFA World Cup: Combining goal- and result-based team ability parameters
     Pieter Robberechts, Jesse Davis
     http://people.cs.kuleuven.be/pieter.robberechts

  2. Introduction
     A popular research topic since the 1960s. Two popular approaches:
     1. Goal-based models: model the number of goals scored by both teams
     2. Result-based models: model win-draw-loss outcomes directly

  3. Match outcome prediction
     Typical approach: Data → Team ratings → Predictions
     1. Estimate team abilities based on historical match data
     2. Use them to predict future match outcomes

  4. Match outcome prediction
     Typical approach: Data → Team ratings → Predictions
     1. Estimate team abilities based on historical match data
     2. Use them to predict future match outcomes
     Data scraped from:
     - post-WWII international games from http://eloratings.net
     - betting odds from http://betexplorer.com/

  5. Match outcome prediction
     Typical approach: Data → Team ratings → Predictions
     1. Estimate team abilities based on historical match data
        (Table on the slide: Team / Strength, with strengths 2320, 2237, 2220, 2207, ...)
     2. Use them to predict future match outcomes
     Two rating systems were explored:
     - Elo ratings (result-based)
     - ODM ratings (goal-based)

  6. The Elo rating system
     A result-based rating system
     Given:
     - $R_H$, $R_A$: current home and away team ratings
     - $E_H = \frac{1}{1 + 10^{-(R_H - R_A)/400}}$: expected score for the home team
     - $S_H$: actual score of the home team, with $S_H = 1$ if the home team won, $S_H = 0.5$ for a draw, and $S_H = 0$ if the home team lost
     Then:
     - $R'_H = R_H + k (S_H - E_H)$: updated rating of the home team
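The Elo update on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the default `k = 20` are assumptions made for the example.

```python
def expected_home_score(r_home: float, r_away: float) -> float:
    """Expected score E_H of the home team under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-(r_home - r_away) / 400.0))


def update_home_rating(r_home: float, r_away: float, s_home: float,
                       k: float = 20.0) -> float:
    """Return R'_H = R_H + k * (S_H - E_H).

    s_home is 1 for a home win, 0.5 for a draw, 0 for a home loss.
    k = 20 is an illustrative default, not a value from the slides.
    """
    return r_home + k * (s_home - expected_home_score(r_home, r_away))


# Two equally rated teams: E_H = 0.5, so a home win adds k / 2 points.
print(update_home_rating(2000.0, 2000.0, 1.0))  # 2010.0
```

A higher-rated home team has an expected score above 0.5, so it gains less from a win and loses more from a defeat.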

  7. The Elo rating system
     A result-based rating system
     Problems:
     - Not all games are handled with the same seriousness ‣ Competitiveness factor
     - Most games are played against weak opponents ‣ Margin of victory
     $R'_H = R_H + k (S_H - E_H)$ with $k = k_0 w_i (1 + \delta)^{\gamma}$
     Annotations on $k$: recentness factor, margin of victory weight
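A small sketch of the match-dependent update factor $k = k_0 w_i (1 + \delta)^{\gamma}$. The slide does not define the symbols, so the reading used here is an assumption: `goal_diff` stands for $\delta$ and is taken as the absolute goal difference, `w` is a per-match weight, and all numeric values are made up.

```python
def k_factor(k0: float, w: float, goal_diff: int, gamma: float) -> float:
    """k = k0 * w * (1 + delta) ** gamma.

    Assumption: delta is the absolute goal difference of the match.
    """
    return k0 * w * (1.0 + abs(goal_diff)) ** gamma


# Illustrative values only: a 3-goal margin with gamma = 0.5 doubles k.
print(k_factor(20.0, 1.0, 0, 0.5))  # 20.0
print(k_factor(20.0, 1.0, 3, 0.5))  # 40.0
```

With `gamma = 0` the margin of victory is ignored and the factor reduces to `k0 * w`.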

  8. Offense-Defense ratings
     A goal-based rating system
     Given: $A_{ij}$ = score team $j$ generated against team $i$ if the two teams played; otherwise $A_{ij} = 0$
     Then:
     - Offensive rating of team $j$: $o_j = \sum_{i=1}^{n} \frac{A_{ij}}{d_i}$
     - Defensive rating of team $i$: $d_i = \sum_{j=1}^{n} \frac{A_{ij}}{o_j}$
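Because the offensive and defensive ratings are defined in terms of each other, they are typically computed by alternating the two sums. The sketch below assumes a fixed number of iterations, all-ones starting values, and that every team has both scored and conceded at least once (otherwise a division by zero occurs); the score matrix is made up.

```python
def odm_ratings(A, iterations=100):
    """Alternate the offense and defense updates of the ODM scheme.

    A[i][j] is the score team j generated against team i (0 if they
    never played). Returns (offense, defense) lists. The iteration
    count and the all-ones start are assumptions of this sketch.
    """
    n = len(A)
    o = [1.0] * n
    d = [1.0] * n
    for _ in range(iterations):
        # o_j = sum_i A_ij / d_i
        o = [sum(A[i][j] / d[i] for i in range(n)) for j in range(n)]
        # d_i = sum_j A_ij / o_j
        d = [sum(A[i][j] / o[j] for j in range(n)) for i in range(n)]
    return o, d


# Made-up scores for three hypothetical teams.
A = [[0, 1, 2],
     [3, 0, 1],
     [1, 2, 0]]
offense, defense = odm_ratings(A)
print(offense, defense)
```

A high offensive rating is good, while for the defensive rating lower is better, since it accumulates goals conceded weighted by the inverse of the scorers' offensive strength.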

  9. Offense-Defense ratings
     A goal-based rating system
     Problems:
     - Large disparities between the number of games played and the strength of the opponents
     - Teams in different confederations rarely play each other
     Solution: update ratings sequentially. For each team:
     - Pre-game ratings = weighted sum of a team's post-game ratings
     - Post-game ratings = ODM procedure with pre-game ratings as initial ratings

  10. Match outcome prediction
      Via team rating systems
      Typical approach: Data → Team ratings → Predictions
      1. Estimate team abilities based on historical match data
      2. Use them to predict future match outcomes
      (Diagram: the Elo and attack/defense ratings of both teams feed a predictor that outputs a probability vector such as [0.43 0.33 0.24] over the outcomes labelled "England wins", "Belgium wins" and "It's a tie". Home advantage?)
      Two prediction models were explored:
      - Ordered logit regression (result-based)
      - Bivariate Poisson regression (goal-based)
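To make the result-based predictor concrete, here is a minimal ordered-logit sketch that maps a rating difference to win/draw/loss probabilities. The slope `beta` and the two cutpoints are hypothetical values chosen for illustration; the slides do not report fitted parameters, and any home-advantage term is left out.

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(rating_diff, beta, cut_low, cut_high):
    """Return (P(home win), P(draw), P(away win)).

    Ordered logit with outcomes ordered away win < draw < home win:
    P(outcome <= j) = logistic(cut_j - beta * rating_diff),
    which requires cut_low < cut_high.
    """
    eta = beta * rating_diff
    p_away = logistic(cut_low - eta)
    p_away_or_draw = logistic(cut_high - eta)
    return (1.0 - p_away_or_draw, p_away_or_draw - p_away, p_away)


# Hypothetical parameters: home side rated 100 points higher.
print(ordered_logit_probs(100.0, beta=0.005, cut_low=-0.6, cut_high=0.6))
```

With symmetric cutpoints and a rating difference of zero, the home-win and away-win probabilities coincide; shifting both cutpoints is one way such a model could encode home advantage.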

  11. Tuning the predictive power
      How accurate are our predictions? Three possible interpretations:
      1. How many games are predicted correctly? → Accuracy
      2. How certain was the model about the true outcome? → Logarithmic loss
      3. How certain was the model about the true ordered outcome? → Ranked Probability Score (RPS)
      $\mathrm{RPS} = \frac{1}{r-1} \sum_{k=1}^{r-1} \left( \sum_{l=1}^{k} (\hat{p}_l - y_l) \right)^2$
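The RPS compares cumulative predicted probabilities with the cumulative observed outcome, so it needs the outcomes in a fixed order. The sketch below assumes the order (home win, draw, away win) and reuses the example probabilities 0.43, 0.33, 0.24 purely as illustrative numbers.

```python
def rps(probs, outcome_index):
    """Ranked probability score for one match.

    probs: predicted probabilities over r ordered outcomes.
    outcome_index: index of the outcome that actually occurred.
    """
    r = len(probs)
    cum_p = 0.0
    cum_y = 0.0
    total = 0.0
    for k in range(r - 1):
        cum_p += probs[k]
        cum_y += 1.0 if k == outcome_index else 0.0
        total += (cum_p - cum_y) ** 2
    return total / (r - 1)


# Assumed order: (home win, draw, away win); the home team won.
print(rps([0.43, 0.33, 0.24], 0))  # about 0.191
# A perfect forecast scores 0.
print(rps([1.0, 0.0, 0.0], 0))     # 0.0
```

Lower is better. Unlike log loss, the RPS penalises a forecast that put its weight on "away win" more heavily than one that put it on "draw" when the home team wins.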

  12. Tuning the predictive power
      Dataset: training set, validation set, test set
      Minimise RPS with the L-BFGS algorithm:
        Until convergence:
          For each game ∈ Training set:
            update_rating(game)
            If game ∈ Validation set:
              make_prediction(game)
            End if
          End for
          Compute average RPS
          Update rating and prediction model parameters
      Apply the best model to the test set
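The tuning loop can be illustrated end to end on toy data. To keep the sketch dependency-free, a plain grid search over the Elo factor `k` stands in for L-BFGS, and a fixed-draw-probability predictor stands in for the ordered logit model. The team names, results, starting rating of 1500, symmetric away-team update, and candidate values are all made up for the example, and predictions here use pre-game ratings.

```python
def expected(r_home, r_away):
    """Elo expected score of the home team."""
    return 1.0 / (1.0 + 10.0 ** (-(r_home - r_away) / 400.0))


def predict(r_home, r_away, p_draw=0.25):
    """Toy predictor (assumption): fixed draw probability, rest split by Elo."""
    e = expected(r_home, r_away)
    return [(1.0 - p_draw) * e, p_draw, (1.0 - p_draw) * (1.0 - e)]


def rps(probs, outcome):
    """Ranked probability score; outcome 0 = home win, 1 = draw, 2 = away win."""
    cum, total = 0.0, 0.0
    for k in range(len(probs) - 1):
        cum += probs[k] - (1.0 if k == outcome else 0.0)
        total += cum ** 2
    return total / (len(probs) - 1)


def average_validation_rps(games, k, n_train):
    """Replay games in order with factor k; score games from index n_train on."""
    ratings, scores = {}, []
    for idx, (home, away, outcome) in enumerate(games):
        r_h, r_a = ratings.get(home, 1500.0), ratings.get(away, 1500.0)
        if idx >= n_train:  # validation game: predict from pre-game ratings
            scores.append(rps(predict(r_h, r_a), outcome))
        # Symmetric Elo update for both teams (assumption of this sketch).
        delta = k * ((1.0, 0.5, 0.0)[outcome] - expected(r_h, r_a))
        ratings[home], ratings[away] = r_h + delta, r_a - delta
    return sum(scores) / len(scores)


# Made-up results for three hypothetical teams: (home, away, outcome).
games = [("A", "B", 0), ("B", "C", 0), ("A", "C", 0), ("C", "B", 2),
         ("B", "A", 2), ("C", "A", 1), ("A", "B", 0), ("B", "C", 1)]
candidates = [0.0, 10.0, 20.0, 40.0]
best_k = min(candidates,
             key=lambda k: average_validation_rps(games, k, n_train=4))
print(best_k, average_validation_rps(games, best_k, n_train=4))
```

A gradient-based optimiser such as L-BFGS would replace the `min` over `candidates` and could tune the rating and prediction parameters jointly, which a grid search handles poorly once there are more than a few parameters.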

  13. Challenge I: Match outcome prediction
      The models were validated on the 2002, 2006, 2010 and 2014 World Cups.
      (Chart: Accuracy, LogLoss and RPS for 2002, 2006, 2010, 2014 and all tournaments combined. Models: Elo ordered logit, Elo bivariate Poisson, Random forest, Bookmakers, Elo+ODM ordered logit, Elo+ODM bivariate Poisson, ODM ordered logit, ODM bivariate Poisson.)

  14. Challenge I: Match outcome prediction
      And compared with the 2017 Soccer Prediction Challenge submissions.
      (Chart: Accuracy and RPS for Bookmakers, Elo ordered logit, Elo+ODM ordered logit, Berrar et al., Hubáček et al., Constantinou, Tsokos et al.)

  15. Challenge II: Tournament elimination
      How accurately can we predict the round of elimination of each team in previous World Cups?
      (Chart: Accuracy, LogLoss and RPS. 2014: Elo, Elo+ODM, FiveThirtyEight. 2010: Elo, Elo+ODM. 2006: Elo, Elo+ODM. 2002: Elo, Elo+ODM.)

  16. Our predictions

  17. Others' predictions
      Tournament elimination Accuracy LogLoss RPS FiveThirtyEight 0,124 0,531 0,182 0,127 Zeileis et al. 0,563 0,185 0,594 0,186 0,126 Groll et al. 0,563 0,224 0,132 Our model 0,5 0,201 0,192 UBS

  18. Online interactive: https://dtai.cs.kuleuven.be/sports/worldcup18/

  19. Thanks! Any questions?
      Interactive at: https://dtai.cs.kuleuven.be/sports/worldcup18/
