An Intelligence Approach to Evaluation of Sports Teams by Edward Kambour, Ph.D.


Agenda

I. College Football
II. Linear Model
III. Generalized Linear Model
IV. Intelligence (Bayesian) Approach
V. Results
VI. Other Sports
VII. Future Work

General Background

Goals
- Forecast winners of future games
  - Beat the Bookie!
- Estimate the outcome of unscheduled games
  - What’s the probability that Iowa would have beaten Ohio St?
- Generate reasonable rankings

Major College Football

- No playoff system
- “Computer rankings” are an element of the BCS
- 114 teams
- 12 games for each in a season

Linear Model

- Rothman (1970s), Harville (1977), Stefani (1977), …, Kambour (1991), …, Sagarin???
- Response, Y, is the net result (point-spread)
- Parameter, $\beta$, is the vector of ratings
- For a game involving teams i and j, $E[Y] = \beta_i - \beta_j$

Linear Model (cont.)

Let X be a row vector with

\[ X_k = \begin{cases} 1 & \text{if } k = i \\ -1 & \text{if } k = j \\ 0 & \text{otherwise} \end{cases} \]

so that $E[Y] = X\beta$.
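A minimal sketch of the design row and a least-squares fit, assuming NumPy is available. The three games and their margins are hypothetical toy data, and `design_row` is an illustrative helper name. Because each row sums to zero, the design matrix is not full rank; `numpy.linalg.lstsq` returns the minimum-norm solution, so the fitted ratings sum to zero.

```python
import numpy as np

def design_row(i, j, n_teams):
    """Row of X for a game between team i and team j: +1 at i, -1 at j."""
    x = np.zeros(n_teams)
    x[i] = 1.0
    x[j] = -1.0
    return x

# Hypothetical toy results: (team i, team j, margin for team i)
games = [(0, 1, 7.0), (1, 2, 3.0), (0, 2, 10.0)]
n_teams = 3

X = np.array([design_row(i, j, n_teams) for i, j, _ in games])
y = np.array([margin for _, _, margin in games])

# Least squares on a rank-deficient design: lstsq gives the minimum-norm ratings
beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

print("rank of X:", rank)
print("ratings:", beta)
print("fitted margins:", X @ beta)
```

Only rating differences are identified here; adding any constant to every rating leaves the fitted margins unchanged.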

Regression Model Notes

- Least Squares
  - Normality, Homogeneity
- College Football
  - Estimate 100 parameters
  - Sample size for a full season is about 600
  - Design Matrix is sparse and not full rank

Home-field Advantage

- Generic Advantage (Stefani, 1980)
  - Force i to be home team and j the visiting team
  - Add an intercept term to X
  - Adds one more parameter to estimate
  - Every team gets the same advantage: UAB = Alabama, Rice = Texas A&M
- Team Specific Advantage
  - Doubles the number of parameters to estimate

Linear Model Issues

- Normality
- Homogeneity
- Lots of parameters, with relatively small sample size
- Overfitting
- The bookie takes you to the cleaners!

Linear Model Issues (cont.)

- Should we model point differential?
  - A and B play twice
    - A by 34 in first, B by 14 in the second
    - A by 10 each time
- Running up the score (or lack thereof)
- BCS: Thou shalt not use margin of victory in thy ratings!

Logistic Regression

- Rothman (1970s)
- Linear Model
  - Use binary variable
  - Winning is all that matters
  - Avoid margin of victory
- Coin Flips

Logistic Regression Issues

- Still have sample size issues
- Throw away a lot of information
- Undefeated teams

Transformations

- Transform the differentials to normality
  - Power transformations
- Rothman logistic transform
  - Transforms points to probabilities for logistic regression
- “Diminishing returns” transforms
  - Downweights runaway scores

Power Transforms

- Transform the point-spread: $Y = \operatorname{sign}(Z)\,|Z|^a$
  - a = 1 straight margin of victory
  - a = 0 just win baby
  - a = 0 Poisson or Gamma “ish”
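The signed power transform can be sketched in a few lines. The function name `power_transform` is illustrative, and treating a margin of zero as 0 for every power is an assumption made here so that a = 0 behaves like a pure sign.

```python
import math

def power_transform(z, a):
    """Signed power transform: sign(z) * |z|**a (a tie, z == 0, maps to 0)."""
    if z == 0:
        return 0.0
    return math.copysign(abs(z) ** a, z)

# a = 1 keeps the straight margin; a = 0 keeps only who won
print(power_transform(34, 1))     # straight margin of victory
print(power_transform(-14, 0))    # just the sign of the result
print(power_transform(-16, 0.5))  # square-root compression of a 16-point loss
```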

Maximum Likelihood Transform

1995-2002 seasons; MLE = 0.98

Power   -2ln(likelihood)
0.1     52487
0.3     41213
0.5     35128
0.67    32597
0.8     31418
1       31193

Predicting the Score

- Model point differential: $Y_1 = S_i - S_j$
- Additionally model the sum of the points scored: $Y_2 = S_i + S_j$
  - Fit a similar linear model (different parameter estimates)
- Forecast home and visitors score: $H = (Y_1 + Y_2)/2$, $V = (Y_2 - Y_1)/2$
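As a small worked example, the two forecasts can be turned back into scores as follows. The function name and the input values (a 14-point differential and 48 total points) are illustrative assumptions, not fitted output.

```python
def forecast_scores(y1, y2):
    """Recover home and visitor scores from the forecast difference y1 and sum y2."""
    home = (y1 + y2) / 2
    visitor = (y2 - y1) / 2
    return home, visitor

# Illustrative forecasts: home team by 14, 48 total points
home, visitor = forecast_scores(14, 48)
print(home, visitor)
```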

Another Transformation Idea

- Scores (touchdowns or field goals) are arrivals, maybe Poisson
  - Final score = 7 times a Poisson + 3 times a Poisson + …
- Transform the scores to homogeneity and normality first
- The differences (and sums) should follow suit

Square Root Transform

- Since the score is “similar” to a linear combination of Poissons, square root should work
- Transformation: $T = \sqrt{S + k}$
- Why k?
  - For small Poisson arrival rates, get better performance (Anscombe, 1948)

Likelihood Test

- LRT: No transformation vs. square root with fitted k
- Used College Football results from 1995-2002
- k = 21
- Transformation was significantly better
  - p-value = 0.0023, chi-square = 9.26
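The reported p-value can be checked from the chi-square statistic. The sketch below assumes the test has one degree of freedom; the slides do not state this, but it reproduces the reported figure. For one degree of freedom the upper-tail probability reduces to a complementary error function, so only the standard library is needed.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Statistic reported on the slide; 1 degree of freedom is an assumption
p_value = chi2_sf_1df(9.26)
print(p_value)
```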

Predicting the Score with Transform

- Model point differential: $Y_1 = \sqrt{S_i + 21} - \sqrt{S_j + 21}$
- Additionally model the sum of the points scored: $Y_2 = \sqrt{S_i + 21} + \sqrt{S_j + 21}$
- Forecast home and visitors score: $H = ((Y_1 + Y_2)/2)^2 - 21$, $V = ((Y_2 - Y_1)/2)^2 - 21$
- Note the point differential is the product: $H - V = Y_1 Y_2$
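A short round-trip sketch of the transformed forecasts, using k = 21 from the likelihood test. Subtracting k after squaring undoes the transform, and the constant cancels in the difference, which is why the point differential is the product of the two forecasts. The scores 31 and 17 and the function names are illustrative assumptions.

```python
import math

K = 21  # fitted constant reported on the slides

def transform(score, k=K):
    """Square-root transform of a raw score."""
    return math.sqrt(score + k)

def forecast_scores(y1, y2, k=K):
    """Untransform the forecast difference y1 and sum y2 back to raw scores."""
    home = ((y1 + y2) / 2) ** 2 - k
    visitor = ((y2 - y1) / 2) ** 2 - k
    return home, visitor

# Illustrative scores, pushed through the transform and back
s_home, s_visitor = 31, 17
y1 = transform(s_home) - transform(s_visitor)
y2 = transform(s_home) + transform(s_visitor)
home, visitor = forecast_scores(y1, y2)

print(home, visitor)          # recovers the raw scores up to rounding error
print(home - visitor, y1 * y2)  # the differential equals the product
```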

Unresolved Linear Model Issues

- Overfitting
- History
  - Going into the season, we have a good idea as to how teams will do
  - The best teams tend to stay the best
  - The worst teams tend to stay the worst
  - Changes happen
    - Kansas State

Intelligence Model

- Concept
  - The ratings and home-field advantages for year t are similar to those of year t-1.
  - There is some drift from one year to the next.
- Model: $\beta_t = \beta_{t-1} + \delta_t$, where $\delta_t \sim N(\mathbf{0}, \sigma^2 \Sigma_\delta)$

Intelligence Model (Details)

Notation
- L teams
- M seasons of data
- $N_i$ games in the ith season
- $X_i$: the $N_i$ by 2L “X” matrix for season i
- $Y_i$: the $N_i$ vector of results for season i
- $\beta_i$: the 2L vector of ratings and home advantages for season i

Details (cont.)

Data Distribution: For all i = 1, 2, …, M

\[ Y_i \sim N(X_i \beta_i,\ \sigma^2 I) \quad \text{(independent)} \]

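Details (cont.)

Prior Distribution

\[ \beta_1 \mid \sigma^2 \sim N\!\left(\mathbf{0},\ \sigma^2 \begin{bmatrix} I & 0 \\ 0 & 0.05\,I \end{bmatrix}\right) \]

\[ \beta_i \mid \sigma^2 \sim N\!\left(\beta_{i-1},\ \sigma^2 \begin{bmatrix} 0.25\,I & 0 \\ 0 & 0.01\,I \end{bmatrix}\right) \quad \text{for } i = 2, \ldots, M \]

\[ \sigma^{-2} \sim \text{Gamma}(2,\ 0.5) \]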

Details (finally, the end)

- The Posterior Distribution of $\beta_M$ and $\sigma^{-2}$ is closed form and can be calculated by an iterative method
- The Predictive Distribution for future results (transformed sum or difference) is straightforward correlated normal (given the variance)
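One way to picture the iterative calculation is the generic conjugate-normal recursion below. It is a sketch under stated assumptions, not necessarily the author's procedure: everything is conditional on the variance, covariances are expressed in units of that variance, the update of the variance itself is left out, and the two-team data are hypothetical. NumPy is assumed to be available.

```python
import numpy as np

def season_update(mean, cov, X, y):
    """Conjugate normal update for one season.

    Prior: b ~ N(mean, s2 * cov); data: y ~ N(X b, s2 * I).
    Returns the posterior mean and covariance (in units of s2).
    """
    prior_prec = np.linalg.inv(cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X)
    post_mean = post_cov @ (prior_prec @ mean + X.T @ y)
    return post_mean, post_cov

def add_drift(mean, cov, drift_cov):
    """Random-walk step to the next season: same mean, larger covariance."""
    return mean, cov + drift_cov

# Hypothetical toy data: 2 teams, ratings only (no home-advantage block)
mean0 = np.zeros(2)
cov0 = np.eye(2)
X1 = np.array([[1.0, -1.0]])   # one game, team 0 against team 1
y1 = np.array([1.5])           # transformed result for that game

m1, c1 = season_update(mean0, cov0, X1, y1)
m1_next, c1_next = add_drift(m1, c1, 0.25 * np.eye(2))

print(m1)       # posterior ratings after the season
print(c1_next)  # prior covariance carried into the next season
```

Repeating `season_update` and `add_drift` season by season carries the history forward, which is the sense in which last year's ratings inform this year's.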

Forecasts

- For Scores
  - Simply untransform
  - $E[Z^2] = \operatorname{Var}[Z] + E[Z]^2$
- For the point-spread
  - Product of two normals
  - Simulate 10000 results
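Both forecasts can be sketched with the standard library. The predictive means, standard deviations and correlation below are hypothetical placeholders, the function names are illustrative, and k = 21 is the constant from the square root transform. `expected_score` applies the second-moment identity to a transformed score; `simulate_spread` draws the point-spread as the product of the correlated difference and sum.

```python
import math
import random
import statistics

def expected_score(mean_t, var_t, k=21):
    """Expected raw score from the mean and variance of the transformed score."""
    return var_t + mean_t ** 2 - k

def simulate_spread(mu1, mu2, sd1, sd2, rho, n=10000, seed=0):
    """Draw n point-spreads as the product Y1 * Y2 of correlated normals."""
    rng = random.Random(seed)
    spreads = []
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        y1 = mu1 + sd1 * e1
        # Mix e1 into the second draw so that corr(Y1, Y2) = rho
        y2 = mu2 + sd2 * (rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        spreads.append(y1 * y2)
    return spreads

# Hypothetical predictive distribution for one game
spreads = simulate_spread(mu1=1.0, mu2=13.5, sd1=1.2, sd2=1.2, rho=0.1)
p_home_win = sum(s > 0 for s in spreads) / len(spreads)
expected_spread = statistics.fmean(spreads)

print(expected_score(7.0, 1.0))
print(p_home_win, expected_spread)
```

The same simulated spreads can be compared with a betting line to estimate the chance of covering it.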

Enhanced Model

- Fit the prior parameters
  - Hierarchical models
  - Drifts and initial variances
- No closed form for posterior and predictive distributions (at least as far as I know)
- The complete conditionals are straightforward, so Gibbs sampling will work (eventually)

Results (www.geocities.com/kambour/football.html)

2002 Final Rankings

Team            Rating          Home
Miami           72.23 (1.03)    0.21 (0.04)
Kansas St       72.04 (1.04)    0.44 (0.03)
USC             71.95 (1.03)    0.04 (0.03)
Oklahoma        71.85 (1.02)    0.18 (0.03)
Texas           71.57 (1.03)    0.36 (0.03)
Georgia         71.49 (1.03)    0.02 (0.03)
Alabama         71.45 (1.03)    -0.09 (0.03)
Iowa            71.30 (1.03)    0.21 (0.04)
Florida St      71.29 (1.02)    0.43 (0.03)
Virginia Tech   71.25 (1.03)    0.12 (0.03)
Ohio St         71.18 (1.03)    0.27 (0.03)


Bowl Predictions

Ohio St 17           Miami Fl (-13) 31     0.8255   0.5228
Washington St 21     Oklahoma (-6.5) 31    0.7347   0.5797
Iowa 21              USC (-6) 30           0.7174   0.5721
NC State (E) 20      Notre Dame 17         0.5639   0.5639
Florida St (+4) 24   Georgia 27            0.5719   0.5320

2002 Final Record

Picking Winners           522 – 157       0.769
Against the Vegas lines   367 – 307 – 5   0.544
Best Bets                 9 – 7           0.563 (in 2001, 11 – 4)
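The percentages can be reproduced with a few lines of arithmetic. The reported 0.544 against the lines matches when a push counts as half a win; that convention is inferred from the numbers rather than stated on the slides, and the function name is illustrative.

```python
def win_pct(wins, losses, ties=0):
    """Winning percentage with each tie (push) counted as half a win."""
    return (wins + 0.5 * ties) / (wins + losses + ties)

print(round(win_pct(522, 157), 3))     # picking winners
print(round(win_pct(367, 307, 5), 3))  # against the Vegas lines
```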

ESPN College Pick’em (http://games.espn.go.com/cpickem/leader)

1. Barry Schultz 5830
2. Jim Dobbs 5687
3. Michael Reeves 5651
4. Fup Biz 5594
5. Joe * 5587
6. Rising Cream 5562
7. Intelligence Ratings 5559

Ratings System Comparison (http://tbeck.freeshell.org/fb/awards2002.html)

- Todd Beck, Ph.D., Statistician, Rush Institute
- Intelligence Ratings – Best Predictors

College Football Conclusions

- Can forecast the outcome of games
- Capture the random nature
  - High variability
  - Sparse design
- Scientists should avoid BCS
  - Statistical significance is impossible
  - Problem Complexity
  - Other issues

NFL

- Similar to College Football
  - Square root transform is applicable
- Drift is a little higher than College Football
- Better design matrix
- Small sample size
- Playoff

NFL Results (www.geocities.com/kambour/NFL.html)

2002 Final Rankings (after the Super Bowl)

Team            Rating    Home
Tampa Bay       70.72     0.29
Oakland         70.57     0.28
Philadelphia    70.55     0.10
New England     70.16     0.12
Atlanta         70.13     0.20
NY Jets         70.10     -0.01
Pittsburgh      69.95     0.28
Green Bay       69.92     0.28
Kansas City     69.90     0.51
Denver          69.89     0.50
Miami           69.89     0.49

2002 Final NFL Record

Picking Winners           162 – 104 – 1   0.609
Against the Vegas lines   135 – 128 – 4   0.513
Best Bets                 9 – 8           0.529

NFL Europe

- Similar to College and NFL
  - Square root transform
- Dramatic drift
  - Teams change dramatically in mid-season
- Few teams
  - Better design matrix

College Basketball

- Transform?
  - Much more normal (Central Limit Theorem)
- A lot more games
  - Intersectional games
- Less emphasis on programs than in College Football
  - More drift
- NCAA tournament

NCAA Basketball Pre-tournament Ratings

Team          Rating    Home
Arizona       100.06    3.97
Kentucky      99.33     4.32
Kansas        95.89     3.85
Texas         93.42     4.44
Duke          92.90     4.66
Oklahoma      90.19     4.31
Florida       90.65     3.99
Wake Forest   88.70     3.65
Syracuse      88.50     3.49
Xavier        87.89     3.37
Louisville    87.88     4.16

NBA

- Similar to College Basketball
  - Normal – No transformation
- A lot more games – fewer teams
- Playoffs are completely different from regular season
  - Regular season – very balanced, strong home court
  - Post season – less balanced, home court lessened

Hockey

- Transform
  - Rare events = “Poissonish”
  - Square root with k around 1
- A lot more games
- History matters
- Playoffs seem similar to regular season
- Balance

Soccer

- Similar to hockey
- Transform
  - Square root with low k
- Not a lot of games
- Friendlies versus cup play
- Home pitch is pronounced
  - Varies widely

Soccer Results

- Correctly forecasted 2002 World Cup final
  - Brazil over Germany
- Correctly forecasted US run to quarter-finals
- Won the PROS World Cup Soccer Pool

Future Enhancements

- Hierarchical Approaches
  - Conferences
- More complicated drift models
  - Correlations
  - Individual drifts
  - Drift during the season
  - Mean correcting drift
- More informative priors
