The evolution of hockey statistics – an ongoing story Bruce McCurdy Analytics, Big Data, and the Cloud 2012 April 25


The evolution of hockey statistics – an ongoing story

Bruce McCurdy
Analytics, Big Data, and the Cloud
2012 April 25


Traditional game summaries


1967-68 Plus/minus formally introduced, as well as individual shots on goal / Shooting %


1983-84 Goaltender save percentage added

Grant Fuhr



1998-99 Time on ice published, opening the door for rate stats
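To illustrate why published time on ice opens the door for rate stats, here is a minimal sketch that scales a raw count to a per-60-minutes rate. The player names and totals are made up for illustration, and `per_60` is a hypothetical helper, not an official NHL calculation.

```python
def per_60(count: float, toi_seconds: float) -> float:
    """Scale a raw count (goals, points, shots...) to a per-60-minutes rate."""
    if toi_seconds <= 0:
        raise ValueError("time on ice must be positive")
    return count * 3600.0 / toi_seconds


# Hypothetical players: identical point totals, very different ice time.
players = {
    "Player A": {"points": 40, "toi_seconds": 1200 * 60},  # 1200 minutes
    "Player B": {"points": 40, "toi_seconds": 800 * 60},   # 800 minutes
}

rates = {
    name: per_60(p["points"], p["toi_seconds"]) for name, p in players.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} points per 60")
```

The raw totals are equal, but the rate separates the two players once ice time is taken into account.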

Chris Pronger


1998: NHL introduces Zone Time

… but turfs it in 2002. Why?!


1998: NHL starts to (sporadically) maintain Real Time Scoring System (RTSS)


…but there remain huge problems due to lack of standardization & rink bias

Oilers have twice as many giveaways as Florida … or do they?


• Ranking of teams’ RTSS home and away yields results that might as well be randomized for giveaways and takeaways, and very nearly so for hits and blocked shots.

• Whereas the same exercise for Goals For yields a crudely similar ordering home to away.

• Significant home scorer bias in turnover stats. 45% more giveaways and 33% more takeaways by home teams league-wide!

• As a result RTSS is highly unreliable, serving to rank players within a given team but almost useless for comparing players from different clubs.
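The home/away comparison described above can be sketched roughly as follows. This is one possible way to run the check, assuming per-team home and away totals are available. The team totals are invented for illustration, and the helper names `ranks`, `spearman`, and `home_bias` are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (largest) downward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data (1 = same order, 0 = unrelated)."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def home_bias(home_total, away_total):
    """Fractional excess of events recorded at home versus on the road."""
    return home_total / away_total - 1.0


# Invented giveaway totals for five hypothetical teams.
home_giveaways = [400, 250, 600, 300, 500]
away_giveaways = [310, 330, 300, 320, 340]

rho = spearman(home_giveaways, away_giveaways)
bias = home_bias(sum(home_giveaways), sum(away_giveaways))
print(f"home/away rank correlation: {rho:.2f}, league-wide home bias: {bias:.0%}")
```

A rank correlation near zero between home and away orderings is what "might as well be randomized" would look like, while a stat recorded consistently across rinks should give a clearly positive value.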


2002-03: NHL introduces play-by-play reports

… though problems remain with accuracy of some data, e.g. shot distance


“Stripping” of PxP data allows detailed on-ice analysis of individual players

Even-strength shots / Fenwick / Corsi from timeonice.com


Head-to-head match-ups (timeonice.com)


Customizable, sortable stats from behindthenet.ca

Available stats:

• Even strength / powerplay / shorthanded
• Scoring per 60 minutes
• On/off ice plus/minus per 60
• On/off ice shots / Fenwick / Corsi per 60
• On-ice Sh% / Sv% / PDO
• QualComp / QualTeam
• Penalties drawn / taken
• ZoneStart / ZoneFinish
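For readers new to these terms, here is a minimal sketch of three of the listed stats using their commonly cited definitions: Corsi as the differential of all shot attempts (on goal, missed, and blocked), Fenwick as the same differential with blocked shots excluded, and PDO as on-ice shooting percentage plus on-ice save percentage. The `OnIceTotals` class and the sample numbers are assumptions for illustration; the sites named above may compute details differently.

```python
from dataclasses import dataclass


@dataclass
class OnIceTotals:
    """Event counts while one player is on the ice (hypothetical container)."""
    goals_for: int
    shots_for: int        # shots on goal, including goals
    missed_for: int
    blocked_for: int      # own attempts blocked by the opponent
    goals_against: int
    shots_against: int
    missed_against: int
    blocked_against: int

    def corsi(self) -> int:
        attempts_for = self.shots_for + self.missed_for + self.blocked_for
        attempts_against = (
            self.shots_against + self.missed_against + self.blocked_against
        )
        return attempts_for - attempts_against

    def fenwick(self) -> int:
        # Same as Corsi, but blocked shots are left out on both sides.
        return (self.shots_for + self.missed_for) - (
            self.shots_against + self.missed_against
        )

    def pdo(self) -> float:
        on_ice_sh = self.goals_for / self.shots_for
        on_ice_sv = 1 - self.goals_against / self.shots_against
        return on_ice_sh + on_ice_sv


# Invented totals for a single hypothetical player.
totals = OnIceTotals(8, 100, 40, 30, 10, 90, 35, 25)
print(totals.corsi(), totals.fenwick(), round(totals.pdo(), 3))
```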


• Many stats need to be parsed in terms of positive / negative /neutral game states, e.g.:

• Leading / trailing / tied (score effects are HUGELY important)

• PP / PK / EV

• O-zone / D-zone / neutral zone

• Taken in isolation without context, modern stats will be distorted; e.g. “soft minutes” players used in offensive situations should be expected to have positive numbers in things like Relative Corsi
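One way to respect score effects is to split shot attempts by game state before comparing anything. The sketch below assumes a simplified event format, `(shooting_team, score_diff)`, where `score_diff` is the score margin of the team of interest at the moment of the attempt. The format, the team labels, and the function names are all hypothetical.

```python
from collections import defaultdict


def score_state(diff: int) -> str:
    """Classify a score margin as leading, trailing, or tied."""
    if diff > 0:
        return "leading"
    if diff < 0:
        return "trailing"
    return "tied"


def attempts_by_state(events, team):
    """Count shot attempts for and against `team`, split by score state."""
    totals = defaultdict(lambda: {"for": 0, "against": 0})
    for shooter, diff in events:
        side = "for" if shooter == team else "against"
        totals[score_state(diff)][side] += 1
    return dict(totals)


# Invented sequence of shot attempts; score_diff is from team "A"'s view.
events = [
    ("A", 0), ("B", 0),
    ("A", 1), ("B", 1), ("B", 1),
    ("A", -1), ("A", -1),
]

split = attempts_by_state(events, "A")
for state, counts in split.items():
    print(state, counts)
```

In this toy sample team A is outshot while leading and does all the shooting while trailing, so a single pooled total would hide the pattern that the split exposes.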


Scoring chances

"A chance is counted any time a team directs a shot cleanly on-net from within home-plate. Shots on goal and misses are counted, but blocked shots are not (unless the player who blocks the shot is “acting like a goaltender”). Generally speaking, we are more generous with the boundaries of home-plate if there is dangerous puck movement immediately preceding the scoring chance, or if the scoring chance is screened. If you want to get a visual handle on home-plate, check this image."


One weakness of the current method is that “home plate” isn’t the best template for the scoring area

Another is that scoring chances are just 1’s and 0’s, with no extra weight for first-class chances as suggested by the heat map colour coding


Actually, scoring areas …which vary for different types of shots and manpower situations.

Scoring chance model is greatly simplified from this reality.


Common SC errors and outcomes

• NHL data doesn’t properly record on-ice players: +1 or -1 for selected players
• Scoring chance improperly credited (or missed): +1 or -1 for 10 players
• Scoring chance recorded at wrong game time: +1 or -1 for up to 20 players
• Scoring chance recorded but for wrong team: +2 or -2 for 10 players


Neilson Numbers

• Based on ideas of Roger Neilson
• Assignment of individual responsibility on scoring chances for and against
• Requires an extra degree of qualitative judgement over and above deciding whether a scoring chance has occurred
• Eliminates false positives/negatives, however individual numbers don’t reconcile to team totals
• Fewer recording errors than on-ice scoring chances as players are identified as part of the process
• Same system can be used to assign unofficial assists on GF or errors on GA
• Reliant on a knowledgeable scorer, but as with other scoring chance systems, would work better if 3 or 5 scorers worked independently, then pooled results.
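The pooling idea in the last bullet could be sketched as a simple majority vote. This is only one possible pooling rule. It assumes each scorer's log has already been aligned so that the same chance carries the same identifier, here a hypothetical `(period, game_second, team)` tuple, and the sample logs are invented.

```python
from collections import Counter


def pool_chances(scorer_logs, min_votes=None):
    """Keep the chances recorded by at least `min_votes` scorers.

    By default a strict majority of scorers is required.
    """
    if min_votes is None:
        min_votes = len(scorer_logs) // 2 + 1
    votes = Counter()
    for log in scorer_logs:
        votes.update(set(log))  # each scorer counts at most once per chance
    return {chance for chance, n in votes.items() if n >= min_votes}


# Invented logs from three independent scorers.
logs = [
    {(1, 305, "A"), (1, 610, "B"), (2, 45, "A")},
    {(1, 305, "A"), (2, 45, "A")},
    {(1, 305, "A"), (1, 610, "B"), (3, 900, "B")},
]

pooled = pool_chances(logs)
print(sorted(pooled))
```

An odd number of scorers, such as the 3 or 5 suggested above, means a two-way vote on any single chance can never tie.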


Sample box:


Zone Start: fad or trend?


Possession

• “Hockey is a transition game: offense to defense, defense to offense, one team to another. Hundreds of tiny fragments of action, some leading somewhere, most going nowhere. Only one thing is clear. A fragmented game must be played in fragments. Grand designs do not work. … Before offense turns to defense, or defense to offense, there is a moment of disequilibrium when a defense is vulnerable, when a game’s sudden, unexpected swings can be turned to advantage. It is what you do at this moment, when possession changes, that makes the difference.”

• – Ken Dryden, The Game


• “It is noteworthy that in general … our teamwork was considerably above our main contenders. In the game against the Canadian team, the players of the USSR squad made 110 passes, while the Canadians made 60 passes; in the game against Czechoslovakia we made 106 passes, they made 70; in the game against Sweden we made 49 more passes than they did. … This is an indication of quite stable habits and a high culture of playing, a correct understanding of the game by the Soviet players.”

• – Anatoli Tarasov, Road to Olympus


Good pass: plus. Bad pass: minus.

Good clearance: plus. Bad clearance: minus.

Good rush: plus. Bad rush: minus.

Good shoot in: plus. Bad shoot in: minus.

Tarasov Numbers
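The plus/minus tally above could be kept with something as simple as the following sketch. The four play types come from the slide. The `(player, play_type, good)` record format, the player labels, and the equal weighting of every play are assumptions for illustration.

```python
from collections import defaultdict

# Play types named on the slide; identifiers are hypothetical.
PLAY_TYPES = {"pass", "clearance", "rush", "shoot_in"}


def tarasov_numbers(plays):
    """Sum +1 for each good play and -1 for each bad play, per player."""
    totals = defaultdict(int)
    for player, play_type, good in plays:
        if play_type not in PLAY_TYPES:
            raise ValueError(f"unknown play type: {play_type}")
        totals[player] += 1 if good else -1
    return dict(totals)


# Invented observations from a single shift.
plays = [
    ("Player A", "pass", True),
    ("Player A", "rush", True),
    ("Player A", "shoot_in", False),
    ("Player B", "clearance", False),
    ("Player B", "pass", False),
]

tally = tarasov_numbers(plays)
print(tally)
```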


…and many more advanced ideas

• Goals Versus Threshold (GVT)
• Defence Independent Goalie Rating (DIGR)
• Shot Quality (SQF / SQA)
• Predicted Goals Scored (PGS)
• Zone Start Adjusted Corsi (ZSAC)
• Etc. …
• No time to do them all justice here
• Thanks for listening!