17
The Market for English Premier League (EPL) Odds Guanhao Feng, Nicholas G. Polson, Jianeng Xu Booth School of Business, University of Chicago * April 15, 2019 Abstract We develop a probabilistic model to track and forecast real-time odds for English Pre- mier League (EPL) football games. We show how a difference in Poisson processes (a.k.a. Skellam process) provides a dynamic probabilistic model for the evolution of scores. Ex ante, we show how to calibrate expected goal scoring rates using market- based information on win, lose, draw odds. As the game evolves, we use the current score and our Skellam process to calculate the matrix of final score odds. This enables us to identify real time online betting opportunities relative to our model’s predictions. We illustrate our methodology the EPL game between Everton and West Ham and Sun- derland and Leicester City. Both games illustrate the flexibility of our model and how odds can change quickly as the score progresses. Finally, we conclude with directions for future research. Key words: Market Implied Prediction, Odds, Skellam Process, Soccer Betting, English Premier League. * Nicholas G. Polson is Professor of Econometrics and Statistics at the Booth School of Business, University of Chicago. Guanhao Feng is at Booth School of Business, University of Chicago. Jianeng Xu is at Department of Statistics, University of Chicago. 1 arXiv:1604.03614v1 [stat.AP] 12 Apr 2016

The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

The Market for English Premier League (EPL) Odds

Guanhao Feng, Nicholas G. Polson, Jianeng Xu

Booth School of Business, University of Chicago∗

April 15, 2019

Abstract

We develop a probabilistic model to track and forecast real-time odds for English Pre-

mier League (EPL) football games. We show how a difference in Poisson processes

(a.k.a. Skellam process) provides a dynamic probabilistic model for the evolution of

scores. Ex ante, we show how to calibrate expected goal scoring rates using market-

based information on win, lose, draw odds. As the game evolves, we use the current

score and our Skellam process to calculate the matrix of final score odds. This enables

us to identify real time online betting opportunities relative to our model’s predictions.

We illustrate our methodology the EPL game between Everton and West Ham and Sun-

derland and Leicester City. Both games illustrate the flexibility of our model and how

odds can change quickly as the score progresses. Finally, we conclude with directions

for future research.

Key words: Market Implied Prediction, Odds, Skellam Process, Soccer Betting, English

Premier League.

∗Nicholas G. Polson is Professor of Econometrics and Statistics at the Booth School of Business, Universityof Chicago. Guanhao Feng is at Booth School of Business, University of Chicago. Jianeng Xu is at Departmentof Statistics, University of Chicago.

1

arX

iv:1

604.

0361

4v1

[st

at.A

P] 1

2 A

pr 2

016

Page 2: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

1 Introduction

1.1 The betting market for the EPL

Gambling on football is a global industry worth anywhere between $700 billion and

$1 trillion a year1. Spread betting, particularly fixed-odds betting, on the outcome of soccer

matches is rapidly growing in popularity and odds are set via online real time betting

market (Betfair, Bet365 etc.).2 Traditional bookmakers, such as Ladbrokes, also offer odds

on various outcomes of a match. For example, bets can be placed on the final outcomes

(win, lose, draw) as well as goals scored at half-time or full-time. A key feature is that the

odds are updated in real time and as such there is great interest in developing probability

models for the evolution of the games score. Stern (1994) and Polson and Stern (2015)

propose a Brownian motion model for the difference in teams scores and also show how

the market’s based information can be used to calculate the implied volatility of a game. We

build on this approach and develop a model that is tailor-made for the discrete evolution

of the scores of a EPL game. Specifically, we develop a probability model based on the

difference of Poisson processes (a.k.a. Skellam process3).

Various probability models have been proposed to predict the outcome of soccer

matches motivated by the demand for assessing betting opportunities. For example, Dixon

and Coles (1997) use a bivariate Poisson process which has been extended in a number of

ways; see, for example, Karlis and Ntzoufras (2009). Another line of research, asks whether

betting markets are efficient and, if not, how to exploit potential inefficiencies in the bet-

ting market. ? examine the hypothesis that sentimental bettors act like noise traders and

can affect the path of prices in football betting markets. Futhermore, Fitt (2009) applies

the efficient portfolio theory to analyzes the mis-pricing of cross-sectional odds. Online

1See ”Football Betting - the Global Gambling Industry worth Billions.” BBC Sport.2Fractional odds are used in UK, while money-line odds are favored by American bookmakers. Fractional

odds of 2:1 (”two-to-one”) would imply that the bettor stands to make a $200 profit on a $100 stake.3See Barndorff-Nielsen and Shephard (2012) for an introduction to Skellam processes.

2

Page 3: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

betting of soccer spread bets requires bookmakers to dynamically alter the market odds to

prevent arbitrage during the course of a match and Fitt et al. (2005) models the value of

online soccer spread by modeling goals and corners as Poisson processes.

1.2 Connections with Existing Work

Early models of the number of goals scored by each team (see, e.g. Lee (1997)) use

independent Poisson processes. Later models incorporate a correlation between the two

scores and model the number of goals scored by each team using bivariate Poisson models

(see Maher (1982), Dixon and Coles (1997), and Karlis and Ntzoufras (2009)). Our approach

follows Stern (1994), and instead of modeling the number of goals and the correlation be-

tween sores directly, we will model the score difference (a.k.a. margin of victory).

Soccer scores by their very nature are discrete and not that frequent. Hence they

are not adequately modeled by a continuous-time stochastic process. Instead we adapt

the Poisson process and model the difference in scores (a.k.a. Skellam distribution). In

addition, we define a similar implied volatility measure (see Polson and Stern (2015)) from

our discrete dynamic process.

The rest of the paper proceeds as follows. Section 2 presents our Skellam process

model for tracking the difference in goals scored. We then show how an odds matrix of all

the combinations of scores can be computed using Skellam’s cumulative distribution func-

tion. We also calculate a dynamic, time-varying, implied prediction using real-time online

trading market data for the odds of any score and hence win, lose and draw outcomes. Sec-

tion 3 illustrates our methodology using two EPL game in 2015-2016 between Sunderland

and Leicester and Everton and West Ham. Finally, Section 4 discusses extensions of our

basic model and concludes with directions for future research.

3

Page 4: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

2 Skellam Process for EPL scores

Let the outcome between the two teams A and B be modeled as a difference, N(t) =

NA(t) − NB(t). Here we can interpret N(t) as the lead of home team A over away team

B and NA(t) and NB(t) denotes the scores of two teams at time point t (0 ≤ t ≤ 1) .

Negative values of N(t) therefore indicate that team A is behind. We assume that the

game begins at time zero with N(0) = 0 and ends at time one with N(1) representing

the final score difference (positive if A wins and negative if B wins). For our analysis we

develop a probabilistic specification of the distribution of N(1) and, more generally, given

N(t) = l where l is the current lead, as the game evolves. Given this probabilistic model,

we can determine an implied prediction of the outcome for the whole game. For example,

ex ante P(N(1) > 0) will provide the odds of team A winning. Half-time scores will

be available from the distribution of N(12) and as the game progresses we can calculate

P(N(1) > 0|N(t) = l) where l is the current goal difference.

2.1 Implied Score Prediction from EPL Odds

The Skellam distribution is derived as the difference between two independent Pois-

son variables (see Skellam (1946), Sellers (2012), and Alzaid and Omair (2010)). We now

show how it can be used to model the point spread distribution in those sports with equal

scored points. Lemma 1 of Karlis and Ntzoufras (2009) shows that Skellam distribution is

not restricted to Poisson difference, and can be extended to the difference of distributions

which have a specific trivariate latent variable structure.

We begin with a model which specifies the lead of home team A over away team B at

time t, N(t) = NA(t)− NB(t) where NA(t) ∼ Poisson(λAt) and NB(t) ∼ Poisson(λBt), as

a Skellam random variable,

(N(t)|λA, λB) ∼ Skellam(λAt, λBt).

4

Page 5: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

The parameters λA and λB denote the strength of two teams and repeat the expected

scoring rates of two teams, respectively. If we view scores of team A and team B at time

t as two independent Poisson process, λA and λB are expected values of two processes at

t = 1. We can use this model to calculate the probability of any specific score difference

P(N(1) = x) at the end of the game.

To derive the winning odds, we use the law of total probability, the probability mass

function of a Skellam random variable is the convolution of two Poisson distributions:

P(N(t) = x|λA, λB) =∞

∑k=0

P(NB(t) = NA(t)− x|NA(t) = k, λB) ∗P(NA(t) = k|λA)

=∞

∑k=max{0,x}

{e−λBt (λBt)k−x

(k− x)!}{e−λAt (λAt)k

k!}

= e−(λA+λB)t∞

∑k=max{0,x}

(λBt)k−x(λAt)k

(k− x)!k!

= e−(λA+λB)t(

λA

λB

)x/2

I|x|(2√

λAλBt)

where Ir(x) is the modified Bessel function of the first kind (For full details, see Alzaid and

Omair (2010)), thus has series representation

Ir(x) =(x

2

)r ∞

∑k=0

(x2/4)k

k!Γ(r + k + 1).

Besides the probability of home team A winning can be easily calculated using the cumu-

lative distribution function,

P(N(1) > 0|λA, λB) =∞

∑x=1

P(N(t) = x|λA, λB).

In practice, we use an upper bound on the number of possible goals since the probability

of an extreme score difference is always negligible. Unlike the Brownian motion model for

the evolution of the outcome in a sports game (Stern (1994), Polson and Stern (2015)), the

5

Page 6: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

probability of a draw in our setting is not zero. Instead, P(N(1) = 0) is always a positive

number which depends on the sum and product of two parameters λA and λB and thus

the odds of a draw are non-zero. Hence, two evenly matched teams with large λ’s are more

likely to achieve a draw, compared with two evenly matched teams with small λ’s.

We are also interested in the conditional probability of winning as the game pro-

gresses. At time t (0 ≤ t ≤ 1), let NA(t) and NB(t) denote the numbers of goals already

scored by two teams. In other words, the current lead at time t is l = NA(t)− NB(t) and

so N(t) = l = NA(t)− NB(t). With the property of Poisson process, the model updates

the conditional distribution of the final score difference (N(1)|N(t) = l) by noting that

N(1) = N(t) + (N(1)−N(t)). By properties of a Poisson process, N(t) and (N(1)−N(t))

are independent.

Specifically, conditioning on N(t) = l, we have the identity

N(1) = l + Skellam(λA(1− t), λB(1− t)).

From the above expression, we are now in a position to find the conditional distribution

(N(1) = x|N(t) = l) for every time point t of the game given the current socre. Simply

put, we have the time homogeneously condition

P(N(1) = x|λA, λB, N(t) = l) = P(N(1− t) = x− l|λA, λB),

where λA, λB, l are either given by market expectations or are known at time t . This

equation is easy to verify noting that N(1)− N(t) d= N(1− t) and N(1)− N(t) = x− l.

There are two conditional probabilities of interest. First, the chances of home team A

6

Page 7: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

winning is

P(Home team A wins|λA, λB, N(t) = l) = P(l + Skellam(λB(1− t), λB(1− t)) > 0)

= P(N(1− t) > −l|λA, λB)

= ∑x>−l

e−(λA+λB)(1−t)(

λA

λB

)x/2

I|x|(2√

λAλBλA(1− t)).

Second, the conditional probability of a draw at time t is

P(Draw|λA, λB, N(t) = l) = P(l + Skellam(λA(1− t), λB(1− t)) = 0)

= P(N(1− t) = −l|λA, λB)

= e−(λA+λB)(1−t)(

λA

λB

)−l/2

I|l|(2√

λAλB(1− t)).

By symmetry, the conditional probability at time t of home team A losing is simply

1−P(Win|N(t) = l)−P(Draw|N(t) = l).

Figure 1 is a discrete version of Figure 1 in Polson and Stern (2015), and illustrates how

the different aspects of the model that we have discussed can be visualized with an simula-

tion example for an EPL game between Everton and West Ham (March 5th, 2016). Section

3 provides with more data details. The outcome probability of first half and updated sec-

ond half are given in the left two panels. The top right panel illustrates a simulation-based

approach to visualizing how the model works in the dynamic evolution of score difference.

In the bottom left panel, from half-time onwards, we also simulate a set of possible Monte

Carlo paths to the end of the game. This illustrates the discrete nature of our Skellam

process.

7

Page 8: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

−5 −4 −3 −2 −1 0 1 2 3 4 5

Probability of Score Difference − Before 1st Half

Score Difference

Pro

babi

lity

(%)

05

1015

2025

West Ham Wins = 23.03% Everton Wins = 57.47%

Draw = 19.50%

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

−6

−5

−4

−3

−2

−1

01

23

45

67

8

Game Simulations − Before 1st Half

Time

Sco

re D

iffer

ence

−5 −4 −3 −2 −1 0 1 2 3 4 5

Probability of Score Difference − Before 2nd Half

Score Difference

Pro

babi

lity

(%)

05

1015

2025

3035

West Ham Wins = 15.74% Everton Wins = 61.08%

Draw = 23.18%

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

−5

−4

−3

−2

−1

01

23

45

Game Simulations − Before 2nd Half

Time

Sco

re D

iffer

ence

Figure 1: The Skellam process model illustrated by winning margin and game simulations.The top left panel shows the outcome distribution using odds data before the game starts.Each bar represents the probability of a specific final score difference, with it’s color corre-sponding to the result of win/lose/draw. Score differences larger than 5 or smaller than -5are not shown. The top right panel shows a set of simulated Skellam process paths for thegame outcome. The bottom row has the two figures updated using odds data available athalf-time.

2.2 Model Calibration

The goal of our analysis is to show that you can use initial market-based odds of

(win, lose, draw) to calibrate our parameters λA and λB. Suppose that the odds ratios for

8

Page 9: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

all possible final score outcomes are given by a bookmaker. For example, suppose that we

observe that the odds of final score ending with 2-1 is 3/1. In this case, the bookmaker pays

out 3 times the amount staked by the bettor if the final outcome is indeed 2-1. We convert

this odds ratio to the implied probability of final score being 2-1, using the identity

P(N(1) > 0) =1

1 + odds(team A winning).

The implied probability makes the expected winning amount of a bet equals to 0. In this

case, p = 1/(1 + 3) = 1/4 and the expected winning amount is µ = −1 ∗ (1− 1/4) + 3 ∗

1/4 = 0. We denote the odds ratio as odds(2, 1) = 3 and finally get an odds matrix (O)ij. Its

element oij = odds(i− 1, j− 1), i, j = 1, 2, 3... for all possible combinations of final scores.

The sum of resulting probabilities is larger than 1. This phenomenon is standard in

betting markets (see Dixon and Coles (1997) and Polson and Stern (2015)). The ”excess”

probability corresponds to a quantity known as the ”market vig”. For example, if the sum

of all the implied probabilities is 1.1, then the expected profit of the bookmaker is 10%.

To account for this phenomenon, we scale down the probabilities to make sure that the

resulting sum equals to 1 before estimation.

To determine the parameters λA and λB for the remaining game N(1) − N(t), the

odds ratios from a bookmaker should be calibrated by NA(t) and NB(t). For example, if

NA(0.5) = 1, NB(0.5) = 0 and odds(2, 1) = 3 at half time, these observations actually says

that the odds for the second half score being 1-1 is 3. The calibrated odds∗ for N(1)− N(t)

is calculated using the original odds as well as the current scores and given by

odds∗(x, y) = odds(x + NA(t), y + NB(t)).

At time t (0 ≤ t ≤ 1), we calculate the implied conditional probabilities of score

9

Page 10: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

differences using odds information

P(N(1) = k|N(t) = l) = P(N(1)− N(t) = k− l) = c ∑i−j=k−l

11 + odds∗(i, j)

where c = ∑k P(N(1) = k|N(t) = l) is a scale factor, l = NA(t) − NB(t), i, j ≥ 0 and

k = 0,±1,±2 . . ..

Again the property of Poisson distribution makes it easy to derive the moments of a

Skellam random variable with parameters λA and λB. The unconditional mean is given by

E(X) = λA − λB and the variance is V(X) = λA + λB. Therefore the conditional moments

in our case is given by

E(N(1)|N(t) = l) = l + (λA − λB)(1− t)

V(N(1)|N(t) = l) = (λA + λB)(1− t)

The above implied probabilities don’t necessarily ensure that E(N(1)|N(t) = l) − l ≤

V(N(1)|N(t) = l), which may lead to negative MM estimates of λ’s. To address this issue,

we calibrate parameters by solving the following optimization problem

(λA, λB)t = arg minλA,λB

{D2V + D2

E}+ γ{(λA)2− + (λB)

2−, }

where we define

DV =V(N(1)|N(t) = l)

1− t− (λA + λB),

DE =E(N(1)|N(t) = l)− l

1− t− (λA − λB).

In addition, E and V are calculated using implied conditional probabilities. We pick γ to

be a large penalty in order to stabilize our estimates across many goals.

10

Page 11: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

2.3 Implied Volatility

Following Polson and Stern (2015), we can define a discrete version of the implied

volatility of the games outcome as simply

σIV,t =√(λA + λB)(1− t)

The market produces information about λA and λB and therefore for σIV . Any goal scored

is a discrete Poisson shock to the expected score difference (Skellam process) between the

teams. An equivalent model used in option pricing is the jump model of Merton (1976).

In our application, we illustrate the path of implied volatility throughout the course of the

game. We now turn to an empirical illustration of our methodology.

3 Applications

3.1 Everton vs West Ham

We collect the real time online betting odds data from ladbrokes.com for an EPL game

between Everton and West Ham. Table 1 shows how the raw data of odds right before the

game. We need to transform odds to probability. For example, for the outcome 0-0, 11/1

is equivalent to a probability of 1/12. Then we can calculate the marginal probability of

every score difference from -4 to 5 of the game.

Table 2 shows our model implied probability for outcome of score differences before

the game. Different from independent Poisson modeling in Dixon and Coles (1997), our

model is more flexible with the correlation between two teams. The trade-off of flexibility

is that we only know the probability of score difference instead of the exact scores.

Finally we can plot these probability paths in Figure 2 to examine the behavior of

the two teams and predict the final result. By recording real time online betting data

11

Page 12: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

Home \Away 0 1 2 3 4 5

0 11/1 12/1 28/1 66/1 200/1 450/11 13/2 6/1 14/1 40/1 100/1 350/12 7/1 7/1 14/1 40/1 125/1 225/13 11/1 11/1 20/1 50/1 125/1 275/14 22/1 22/1 40/1 100/1 250/1 500/15 50/0 50/1 90/1 150/1 400/16 100/1 100/1 200/1 250/17 250/1 275/1 375/18 325/1 475/1

Table 1: Original Odds Data from Ladbrokes

Score difference -4 -3 -2 -1 0 1 2 3 4 5

Probability(%) estimate 0.78 2.50 6.47 13.02 19.50 21.08 16.96 10.61 5.37 2.27

Table 2: Empirical estimates of score difference probability (Everton - West Ham)

for every 10 minutes, we can show the evolution of outcome prediction.4 Probability of

win/draw/loss jump for important events in the game: goals scoring and red card penalty.

In such a dramatic game between Everton and West Ham, the winning probability of Ev-

erton arrives almost 90% until the goal in 78th minutes of West Ham. The last-gasp goal

of West Ham officially kills the game and reverses the probability. Moreover, the implied

volatility path is the visualization of the conditional variation of score difference. There

was a jump of implied volatility when Everton lost a player by a red card penalty.

3.2 Sunderland vs Leicester City

Leicester City started the 2015-2016 season as a 5000-1 underdog to win the premier

league. Hence, we take a game from April 9th, 2016 near the end of the season in Figure 3.

4In the example, we ignore the overtime of both 1st half and 2nd half of the game.

12

Page 13: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

010

2030

4050

6070

8090

100

Everton vs. West Ham (SAT, 05 MAR 2016)

Time

Pro

babi

lity

(%)

Impl

ied

Vol

atili

ty

00.

20.

40.

60.

81

1.2

1.4

1.6

1.8

2

Implied Volatility Everton Wins West Ham Wins Draw

Goal:W

Goal:E Goal:E

Goal:W

Goal:W

Red Card:E

Half Time

Figure 2: The betting data of EPL game between Everton and West Ham is fromladbrokes.com. Market implied probabilities (expressed as percentages) for 3 differentresults (Everton wins, West Ham wins and draw) are marked by 3 different colors, whichvary dynamically as the game proceeds. The black solid line shows the evolution of theimplied volatility. The dashed line shows important events in the game, such as goals andred cards. 5 goals in this game are: 13’ Everton, 56’ Everton, 78’ West Ham, 81’ West Hamand 90’ West Ham.

4 Discussion

In this paper, we develop a model that allows for discreteness of goals scored in a

football game. Our Skellam model is also valid for low-scoring sports such as baseball,

hockey or soccer which are categorized by a series of discrete scoring events. Our model

has the advantage of not considering correlation between goals scored by both teams and

the disadvantage of ignoring the sum of goals. On the other hand, for high-scoring sports

13

Page 14: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

010

2030

4050

6070

8090

100

Sunderland vs Leicester City (SUN, 10 APR 2016)

Time

Pro

babi

lity

(%)

Impl

ied

Vol

atili

ty

00.

20.

40.

60.

81

1.2

1.4

1.6

1.8

2

Implied Volatility Sunderland Wins Leicester City Wins Draw

Goal:L Goal:LHalf Time

Figure 3: 2 goals in this game are: 66’ Leicester City and 95’ Leicester City.

such as basketball, the Brownian motion adopted by Stern (1994) is more applicable. In

addition, Rosenfeld (2012) provides an extension of the model to address concerns of non-

normality and uses a logistic distribution to estimate the relative contribution of the lead

and the remaining advantage. One avenue for future research, is to extend the Skellam

assumption to allow for the jumpiness for NFL football which is somewhere in between

these two extremes (see Glickman and Stern (2012), Polson and Stern (2015) and Rosenfeld

(2012) for examples.)

Another area of future research, is studying the index betting. For example, a soccer

games includes total goals scored in match and margin of superiority (see Jackson (1994)).

The latter is the score difference in our model and so the Skellam process directly applies.

14

Page 15: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

Based on our forecasting model, we can test the inefficiency of EPL sports betting

from a statistical arbitrage viewpoint. For example, Dixon and Pope (2004) presents a de-

tailed comparison of odds set by different bookmakers in relation to the Poisson model

predictions. An early study of “hot hand of market belief” by Camerer (1989) finds that

extreme underdog teams that during a long losing streak are under-priced by the market.

Also, Golec and Tamarkin (1991) test the NFL and college betting markets and find bets on

underdogs or home teams win more often than bets on favorites or visiting teams. Gray

and Gray (1997) examine the in-sample and out-of-sample performance of different NFL

betting strategies by the probit model. They also find the strategy of betting on home-team

underdogs averages returns of over 4 percent, in excess of commissions. In summary, a

Skellam process appears to fit the dynamics of EPL football betting very well and produces

a natural lens to view these market efficiency questions.

15

Page 16: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

References

Alzaid, A. A. and Omair, M. A. (2010). On the Poisson Difference Distribution Inference

and Applications. Bulletin of the Malaysian Mathematical Sciences Society.

Barndorff-Nielsen, O. E. and Shephard, N. (2012). Basics of Levy processes. Draft Chapter

from a book by the authors on Levy Driven Volatility Models.

Camerer, C. F. (1989). Does the Basketball Market Believe in theHot Hand,’? The American

Economic Review.

Dixon, M. J. and Coles, S. G. (1997). Modelling Association Football Scores and Inefficien-

cies in the Football Betting Market. Journal of the Royal Statistical Society. Series C (Applied

Statistics), 46(2):265–280.

Dixon, M. J. and Pope, P. F. (2004). The Value of Statistical Forecasts in the UK Association

Football Betting Market. International Journal of Forecasting, 20(4):697–711.

Fitt, A. D. (2009). Markowitz Portfolio Theory for Soccer Spread Betting. IMA Journal of

Management Mathematics, 20(2):167–184.

Fitt, A. D., Howls, C. J., and Kabelka, M. (2005). Valuation of Soccer Spread Bets. Journal of

the Operational Research Society, 57(8):975–985.

Glickman, M. E. and Stern, H. S. (2012). A State-Space Model for National Football League

Scores. Journal of the American Statistical Association, 93(441):25–35.

Golec, J. and Tamarkin, M. (1991). The Degree of Inefficiency in the Football Betting Market:

Statistical Tests. Journal of financial economics, 30(2):311–323.

Gray, P. K. and Gray, S. F. (1997). Testing Market Efficiency: Evidence From The NFL Sports

Betting Market. The Journal of Finance, 52(4):1725–1737.

Jackson, D. A. (1994). Index Betting on Sports. The Statistician, 43(2):309.

16

Page 17: The Market for English Premier League (EPL) Odds · betting of soccer spread bets requires bookmakers to dynamically alter the market odds to prevent arbitrage during the course of

Karlis, D. and Ntzoufras, I. (2009). Bayesian Modelling of Football Outcomes: Using the

Skellam’s Distribution for the Goal Difference. IMA Journal of Management Mathematics,

20(2):133–145.

Lee, A. J. (1997). Modeling Scores in the Premier League: Is Manchester United Reallythe

Best? Chance, 10(1):15–19.

Maher, M. J. (1982). Modelling Association Football Scores. Statistica Neerlandica, 36(3):109–

118.

Merton, R. C. (1976). Option Pricing when Underlying Stock Returns are Discontinuous.

Journal of financial economics, 3(1-2):125–144.

Polson, N. G. and Stern, H. S. (2015). The Implied Volatility of a Sports Game. Journal of

Quantitative Analysis in Sports, 11(2):145–153.

Rosenfeld, J. W. (2012). An In-game Win Probability Model of the NBA.

Sellers, K. F. (2012). A Distribution Describing Differences in Count Data Containing Com-

mon Dispersion Levels. 7(3):35–46.

Skellam, J. G. (1946). The Frequency Distribution of the Difference Between Two Poisson

Variates Belonging to Different Populations. Journal of the Royal Statistical Society, 109(3).

Stern, H. S. (1994). A Brownian Motion Model for the Progress of Sports Scores. Journal of

the American Statistical Association, 89(427):1128.

17