Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
A simple model for tennisSummary
Will Andy Murray win Wimbledon 2015?
Robert Johnson, Sporting Advantage Limited
PyDataLondon, 7th July 2015
Robert Johnson Sporting Advantage
2013
2015?
A simple model for tennisSummary
Wimbledon Questions
Murray won in 2013. Will he win again this year?Is Murray on better form this year than in 2013?How do the top players compare in the mens’ tournament?
Robert Johnson Sporting Advantage
A simple model for tennisSummary
Outline
1 A simple model for tennisDataMathematical derivationsImplementation
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Data sources.
Tennis results are freely available:www.tennis-data.co.uk.I used all ATP matches from 2010 until now.Start estimating player abilities from January 2012.
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Deriving the modelMaximum likelihood estimation
Model inspired by Dixon and Coles (1997) and McHale andMorton (2011).Assume each player has an (unknown) ability parameter θ.Model P(Player i defeats player j) using a logistic functionof θi − θj
P(i defeats j) = 11+exp (−(θi−θj ))
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Likelihood function
L(θ, k = 1, . . . ,n) =n∏
k=1
11+exp (−(θi−θj ))
Player indices i and j in the equation above depend onmatch k .
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Dynamic model
Let’s follow the method of Dixon and Coles (1997) andexponentially downweight older matchesGives us the pseudo-likelihood for each time point t :
Lt(θ;φ) =∏k∈At
{1
1+exp (−(θi−θj ))
}φ(t−tk )
where tk is the time match k is played, At = {k : tk < t}and φ is an exponential downweighting function.
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Implementation
Coded that all up in Python last SundayUsed numpy, pandas, scipyNumerically maximised the pseudo-likelihood usingminimize function in scipy.optimizeFitted on a day by day rolling basisPlotted using matplotlib
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Implementation
Care needs to be taken to use the numpy vectorised operationsfor fitting the model
≈ 4500ms to evaluate a single likelihood using simpleloopsIterating over rows of a pandas DataFrame is a bad idea...≈ 2ms to calculate using numpy vectorised operations
Robert Johnson Sporting Advantage
2012 w 2013 w 2014 w 2015 wTime
1
2
3
4
5
6
Est
imate
d a
bili
ty v
s avera
ge A
TP p
layer
(logit
sca
le)
Estimated tennis ability over time (2012-present)
FedererDjokovicMurrayWawrinka
A simple model for tennisSummary
DataMathematical derivationsImplementation
If you trust these estimates...
There appears to be value going long Federer, shortMurray.Screenshot taken from Betfair yesterday evening (6th July).
Robert Johnson Sporting Advantage
A simple model for tennisSummary
DataMathematical derivationsImplementation
Wimbledon this year...
Don’t bet on this!Betting markets are extremely efficient
This discrepancy is due to factors not included in my model
However model suggests Murray and Federer have ≈equal probabilities of winning WimbledonDjokovic is the clear favourite to win
Robert Johnson Sporting Advantage
A simple model for tennisSummary
Summary
Many interesting applications of Data Science to the worldof sportWe provide consultancy on sports modelling: seewww.sporting-advantage.co.uk
Robert Johnson Sporting Advantage
Appendix Further Reading
Bibliography
M. Dixon and S. Coles.Modelling Association Football Scores and Inefficiencies inthe Football Betting Market.Applied Statistics 46, No. 2, pp. 265–280, 1997.
I. McHale and A. Morton.A Bradley-Terry type model for forecasting tennis matchresults.International Journal of Forecasting, 27:619–630, 2011.
Robert Johnson Sporting Advantage