Who Will Win It? An In-game Win Probability Model for Football
Pieter Robberechts, Jan Van Haaren and Jesse Davis
"Belgium stuns Japan with exceptional comeback" (NYTimes)
How exceptional was this comeback? We need an in-game win probability model!
A win probability model provides the likelihood that a particular team will win, draw, or lose a game based upon a specific game state.

                 Japan              Belgium
Time remaining   38 min (+ stoppage time)
Score            2                  0
Team strength    1699 Elo points    2009 Elo points
Red cards        0                  0
Shots on goal    2                  3
Possession       42%                58%
...
Popular in other sports ⚾ Baseball 🏉 Football 🏁 Basketball
Popular in other sports: a number of relevant use cases
1. Win Probability Added (WPA) metric
   ‣ Rate a player's contribution to his team's performance
   ‣ Measure the risk-reward balance of coaching decisions
   ‣ Evaluate in-game decision making
2. Improve the fan experience
3. In-game betting
4. Identify "clutch" players
... but not (yet) in football Why? Football is a lot harder to model!
Challenges
1. Dealing with stoppage time: you do not know how much time is left
2. ...
3. ...
4. ...
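Dealing with stoppage time: the game length in minutes is not known in advance (T = 93? 95?), so use a fixed number of frames per game (T = 100) and capture the game state in each frame.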
Challenges
1. Dealing with stoppage time: you do not know how much time is left
2. Describing the game state: find a minimal set of features that describe the current state and have the most impact on the game outcome
3. ...
4. ...
Describing the game state (from event-stream data)
1. Base features
   • Game time
   • Score differential
2. Team strength features
   • Elo rating differential
3. Contextual features
   • Number of goals scored so far
   • Number of yellow cards received
   • The difference in number of red cards received
   • Attacking passes
   • Duel strength
Idea 1: Directly model these probabilities
Given: $x_t$, the game state at time $t$
Do: Estimate the probabilities
   • that the home team will win the game: $P(Y = \text{win} \mid x_t)$
   • that the game will end in a tie: $P(Y = \text{tie} \mid x_t)$
   • that the away team will win the game: $P(Y = \text{loss} \mid x_t)$
Idea 1: Directly model these probabilities. Logistic regression classifier. Well calibrated = of all game states in the test set where the model predicts a win probability of 60%, about 60% should actually end in a win.
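The calibration criterion above can be checked by binning predictions and comparing each bin's mean predicted probability with the observed win rate. The following is a minimal sketch, not the authors' evaluation code; the function name `calibration_table`, the equal-width binning, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def calibration_table(predicted, outcomes, n_bins=10):
    """Compare mean predicted win probability with observed win rate per bin.

    predicted: win probabilities in [0, 1], one per game state.
    outcomes:  1 if that game ended in a win, else 0.
    Returns a list of (bin_index, count, mean_predicted, observed_win_rate).
    """
    bins = defaultdict(list)
    for p, won in zip(predicted, outcomes):
        # Clamp so that p == 1.0 lands in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, won))

    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_predicted = sum(p for p, _ in pairs) / len(pairs)
        win_rate = sum(won for _, won in pairs) / len(pairs)
        table.append((index, len(pairs), mean_predicted, win_rate))
    return table


# Toy data (made up): ten game states predicted at 60%, six of which were wins.
predicted = [0.6] * 10
outcomes = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
table = calibration_table(predicted, outcomes)
```

For a well-calibrated model, the last two columns of each row should be close to each other; large gaps indicate over- or under-confident predictions in that probability range.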
Idea 1: Directly model these probabilities. Logistic regression classifier. [Figure panels: Win, Tie, Loss] Struggles to predict ties; poorly calibrated win / loss probabilities.
Idea 1: Directly model these probabilities. Multiple logistic regression classifiers, one per time frame. [Figure panels: Win, Tie, Loss] Struggles to predict ties.
Idea 1: Directly model these probabilities. Random forest classifier. [Figure panels: Win, Tie, Loss] Breaks down in late-game situations.
Challenges
1. Describing the game state: find a minimal set of features that describe the current state and have the most impact on the game outcome
2. Dealing with stoppage time: you do not know how much time is left
3. The frequent occurrence of ties: football games are often very close, with a margin ≤ 1 goal
4. Changes in momentum: goals often shift the tone of a game
Idea 2: Modelling the number of goals scored between now and the end of the game
Expected number of goals scored after time $t$:
$y_{>t,h} \sim \mathrm{Bin}(T - t, \theta_{t,h})$
$y_{>t,a} \sim \mathrm{Bin}(T - t, \theta_{t,a})$
Scoring intensities estimated from the game state
[Figure panels: Time remaining, Reds, Duel Strength; x-axis: Time]
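Once the two goal-count distributions are available, win / tie / loss probabilities follow by summing over all possible future scorelines. The sketch below is one way to do that under two assumptions that the slide does not state: home and away goal counts are treated as independent, and each team's scoring intensity is held fixed for the remaining frames. The function names and the numbers in the example are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def outcome_probabilities(score_home, score_away, frames_left,
                          theta_home, theta_away):
    """Return (P(win), P(tie), P(loss)) from the home team's perspective.

    Assumes independent Binomial(frames_left, theta) goal counts per team.
    """
    p_win = p_tie = p_loss = 0.0
    for goals_home in range(frames_left + 1):
        p_home = binomial_pmf(goals_home, frames_left, theta_home)
        for goals_away in range(frames_left + 1):
            p = p_home * binomial_pmf(goals_away, frames_left, theta_away)
            diff = (score_home + goals_home) - (score_away + goals_away)
            if diff > 0:
                p_win += p
            elif diff == 0:
                p_tie += p
            else:
                p_loss += p
    return p_win, p_tie, p_loss


# Made-up example: home team leads 2-0 with 40 frames left.
p_win, p_tie, p_loss = outcome_probabilities(2, 0, 40, 0.015, 0.020)
```

Because ties arise naturally whenever the final scorelines are equal, this formulation does not need a separate classifier for the tie outcome.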
Idea 2: Modelling the number of goals scored between now and the end of the game
Scoring intensities estimated from the game state:
$\theta_{t,\text{home}} = \mathrm{invlogit}(\alpha_t x_{t,\text{home}} + \beta + \mathrm{Ha})$
$\theta_{t,\text{away}} = \mathrm{invlogit}(\alpha_t x_{t,\text{away}} + \beta)$
Regression coefficients change over time:
$\alpha_t \sim \mathcal{N}(\alpha_{t-1}, \sigma^2)$
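The two pieces of this slide, the inverse-logit link and the random walk on the coefficients, can be sketched as follows. This is an illustration only: it treats the product of the coefficients and the features as a dot product, uses made-up values, and draws a single random-walk step rather than fitting anything. All function names are hypothetical.

```python
import math
import random


def invlogit(z):
    """Inverse logit (sigmoid): maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def scoring_intensity(alpha_t, x_t, beta, home_advantage=0.0):
    """theta_t = invlogit(alpha_t . x_t + beta [+ Ha for the home team])."""
    linear = sum(a * x for a, x in zip(alpha_t, x_t)) + beta + home_advantage
    return invlogit(linear)


def random_walk_step(alpha_prev, sigma, rng):
    """Draw alpha_t ~ Normal(alpha_{t-1}, sigma^2), one coefficient at a time."""
    return [rng.gauss(a, sigma) for a in alpha_prev]


# Made-up coefficients and features, for illustration only.
rng = random.Random(0)
alpha_prev = [0.10, -0.30]
alpha_t = random_walk_step(alpha_prev, sigma=0.05, rng=rng)
x_home = [1.0, 0.0]
theta_home = scoring_intensity(alpha_t, x_home, beta=-4.0, home_advantage=0.2)
theta_no_ha = scoring_intensity(alpha_t, x_home, beta=-4.0)
```

A small sigma keeps the coefficients of neighbouring time frames close together, so the effect of a feature can drift over the course of a game without jumping erratically.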
Results: [Figure panels: Win, Tie, Loss] Proper probability calibration! 😄
Results: How accurately can the model predict the final match outcome? [Figure: comparison with the proposed model; arrow indicates "Better"]
Fan Engagement A great "story stat"
Quantifying performance under mental pressure: an Added Goal Value metric
Hypothesis: high mental pressure = scoring / conceding a goal has a large impact on win probability
Example: Identifying "clutch" goal scorers
The total added value that occurred from each of player i's goals, averaged over the number of games played:
$\mathrm{AGVp90}_i = \frac{\sum_{k=1}^{K_i} \left( 3 \cdot \Delta P(\text{win} \mid x_{t_k}) + \Delta P(\text{tie} \mid x_{t_k}) \right)}{M_i} \cdot 90$
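A worked example may make the metric concrete. The sketch below assumes that M_i is the number of minutes played by player i (so that multiplying by 90 gives a per-game rate) and that the change in win and tie probability for each goal has already been computed by a win probability model. The function name and the numbers are made up.

```python
def agv_p90(goal_deltas, minutes_played):
    """Added Goal Value per 90 minutes for one player.

    goal_deltas:    one (delta_p_win, delta_p_tie) pair per goal scored.
    minutes_played: assumed meaning of M_i; must be positive.
    """
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    # Weight a win by 3 and a tie by 1, as in the formula.
    total = sum(3 * d_win + d_tie for d_win, d_tie in goal_deltas)
    return total / minutes_played * 90


# Made-up player: two goals in 900 minutes.
# Goal 1 raised P(win) by 0.30 and lowered P(tie) by 0.10.
# Goal 2 raised P(win) by 0.05 and left P(tie) unchanged.
value = agv_p90([(0.30, -0.10), (0.05, 0.00)], minutes_played=900)
```

Here the total added value is 3 × 0.30 − 0.10 + 3 × 0.05 = 0.95, which over 900 minutes gives 0.095 per 90 minutes. A goal that breaks a tie late in a game contributes far more than one scored when the outcome is already settled.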
Quantifying performance under mental pressure An Added Goal Value metric
Conclusion In-game win probability models for football ... • ...are not straightforward to implement • ...have useful applications ‣ fan engagement ‣ football analytics • ...will appear everywhere soon