A Look at Draft By Numbers: Using Data and Analytics to Improve NHL Player Selection

Michael Schuckers

St. Lawrence Univ., Canton, NY

Statistical Sports Consulting, LLC

[email protected]

@SchuckersM

@EmpiricalSports

May 2, 2020

© M. Schuckers (SLU) HANIC-Schuckers May 2, 2020 1 / 35

Hockey Analytics Night in Where?

Motivation & Goals

NHL Player Selection

If you look at statistics and point to a column and say, ‘We’re drafting this guy’ - have fun. I hope you’re in my division.

- Brian Burke, MIT Sloan Sports Analytics Conference, 2013

Source: Article by Dave Feschuk, The Star, March 1, 2013

Primary Goals

Paper: Take historical data publicly available to teams at the time of the draft and build a model to compare with how teams did. Can we draft better than teams with just analytics?

Talk: Walk through some of the methods used in the paper, particularly Generalized Additive Models (GAMs) and interactions.

Version of the Paper

NHL Entry Draft

Annually at the end of June over two days

Method by which newly eligible players are allocated to NHL teams

Eligible players: 18 years old on or before September 15 and not older than 20 years old before December 31 (Worldwide)

Each team (of 31) starts with a pick in each of 7 rounds; picks are tradeable

Team with the worst record has best chance at first pick in first round (lottery among non-playoff teams)

After 1st round, picks are made in order of reg. season finish

Motivation

Motivation: the 2011 NHL Draft

The 2011 NHL Draft, first six picks

1 EDM: Ryan Nugent-Hopkins (F)

2 COL: Gabriel Landeskog (F)

3 FLA: Jonathan Huberdeau (F)

4 NJD: Adam Larsson (D)

5 NYI: Ryan Strome (F)

6 OTT: Mika Zibanejad (F)

7 WPG: ????

Who should WPG select? Consensus available (in order):

Sean Couturier (F), Dougie Hamilton (D), Ryan Murphy (D), Sven Baertschi (F), Nathan Beaulieu (D), Duncan Siemens (D), Mark Scheifele (F)

7 WPG: Mark Scheifele (F)

8 PHL: Sean Couturier (F)

9 BOS: Dougie Hamilton (D)

10 MIN: Jonas Brodin (D)

11 COL: Duncan Siemens (D)

Consensus was that the Scheifele pick was a moderate surprise

Some analytics rated Couturier well above Scheifele (Couturier >> Scheifele)

Post-draft: Couturier played 77 and 46 GP (Games Played) in his first 2 seasons

Post-draft: Scheifele played 7 and 4 GP in his first 2 seasons

Today, after 8+ seasons: Couturier has 647 GP, 402 Pts, 156 Goals

Today, after 8+ seasons: Scheifele has 519 GP, 444 Pts, 180 Goals

The 2011 draft illustrates some themes:

Evaluation of players is hard

and it takes time (players develop at different rates from age 18)

Metrics for assessment are not simple

nor is there a single agreed-upon response metric

In short: noisy and sparse data with slow feedback, and results that are incredibly important to teams

NHL Player Selection

Desjardins (2004-), NHL League Equivalencies (NHLe)

Estimate $\rho_\ell$ where $Y_{\mathrm{NHL},t} = \rho_\ell \, Y_{\ell,t-1}$

ratio estimator

for each league $\ell$, e.g. SEL, OHL, QMJHL, NCAA

where $Y$ is generally Points

$t$ is the year, $t-1$ the previous year

can be adjusted further by TOI

allows a measure of league quality

needs a sufficient amount of data

Extended by Rob Vollman, who added age: $\rho_{\ell,\mathrm{age}}$
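The ratio estimator above can be sketched in a few lines. This is only an illustration, not the NHLe implementation: the `Transition` record, the use of points per game for $Y$, and every number in `sample` are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    """One player who moved from league l (season t-1) to the NHL (season t)."""
    league: str
    ppg_prev: float  # assumed Y_{l,t-1}: points per game in the prior league
    ppg_nhl: float   # assumed Y_{NHL,t}: points per game in the NHL


def ratio_estimate(transitions, league):
    """Ratio estimator of rho_l: sum of NHL output over sum of prior-league output."""
    rows = [r for r in transitions if r.league == league]
    denominator = sum(r.ppg_prev for r in rows)
    if denominator == 0:
        raise ValueError(f"no usable data for league {league!r}")
    return sum(r.ppg_nhl for r in rows) / denominator


# Hypothetical data, invented purely to exercise the function.
sample = [
    Transition("OHL", 1.50, 0.45),
    Transition("OHL", 1.00, 0.30),
    Transition("SEL", 0.60, 0.35),
    Transition("SEL", 0.40, 0.25),
]

rho_ohl = ratio_estimate(sample, "OHL")
rho_sel = ratio_estimate(sample, "SEL")
print(f"rho_OHL = {rho_ohl:.2f}, rho_SEL = {rho_sel:.2f}")
```

With real data each league needs enough players making the jump for the ratio to be stable, which is the data requirement noted above.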

NHLe from 2014-15

Source: http://www.hockeyabstract.com/thoughts/updatedtranslationfactors

NHL Player Selection

Lawrence and Weissbock (2015), Prospect Cohort Success

Nearest neighbors approach

Inputs: League, Age, Points, Height

Generates comparable players (a cohort)

Success rate: percentage of the cohort with > 200 GP

In out-of-sample testing, the method outperformed teams
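A cohort-based success rate of this kind might be sketched as below. This is not the Lawrence and Weissbock implementation: restricting matches to the same league, scaling each feature by its range, using points per game for Points, the value of `k`, and all player records are assumptions for illustration only.

```python
import math

FEATURES = ("age", "ppg", "height_cm")  # assumed numeric inputs


def cohort_success_rate(prospect, history, k=3, gp_threshold=200):
    """Share of the k most similar same-league players with more than gp_threshold GP."""
    same_league = [h for h in history if h["league"] == prospect["league"]]
    if not same_league:
        raise ValueError("no historical players from this league")

    # Scale each feature by its observed range so no single unit dominates.
    scales = {}
    for f in FEATURES:
        values = [h[f] for h in same_league]
        spread = max(values) - min(values)
        scales[f] = spread if spread > 0 else 1.0

    def distance(h):
        return math.sqrt(
            sum(((h[f] - prospect[f]) / scales[f]) ** 2 for f in FEATURES)
        )

    cohort = sorted(same_league, key=distance)[:k]
    successes = sum(1 for h in cohort if h["nhl_gp"] > gp_threshold)
    return successes / len(cohort)


# Hypothetical historical players, invented for the example.
history = [
    {"league": "OHL", "age": 17.8, "ppg": 1.40, "height_cm": 185, "nhl_gp": 450},
    {"league": "OHL", "age": 17.9, "ppg": 1.30, "height_cm": 183, "nhl_gp": 310},
    {"league": "OHL", "age": 18.2, "ppg": 0.60, "height_cm": 178, "nhl_gp": 12},
    {"league": "OHL", "age": 18.0, "ppg": 1.20, "height_cm": 188, "nhl_gp": 150},
    {"league": "OHL", "age": 18.4, "ppg": 0.50, "height_cm": 175, "nhl_gp": 0},
    {"league": "WHL", "age": 17.9, "ppg": 1.35, "height_cm": 184, "nhl_gp": 600},
]
prospect = {"league": "OHL", "age": 17.9, "ppg": 1.35, "height_cm": 184}

rate = cohort_success_rate(prospect, history, k=3)
print(f"cohort success rate: {rate:.2f}")
```

The choice of `k` trades noise against relevance: a small cohort is more similar to the prospect but gives a coarser percentage.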

Data

Sources of Information

Response/Outcome metrics (e.g., TOI in NHL)

Demographics (e.g., Height and Weight)

Pre-Draft Performance (e.g., PPG, SVPct)

Scouting (via Central Scouting Service)

Evaluation of draft picks takes time, especially in hockey

Response metrics

Goals:

Available for all players in corpus

Relevant to performance

Comparable across positions

Length of time? Career? First 5 Years?

Choices:

Time on Ice (TOI)

Games Played (GP)

Cumulative First Seven Years per CBA (Schuckers and Argeris, 2015)

Demographics

Choices:

Height

Weight

Position (C, F, D, G)

and functions of these.

As much as possible, taken from the draft-eligible window.

For this, NHL.com is pretty good because they don’t update their site.

Pre-Draft Performance

Needs:

Data has to be available for players in corpus

Available for nearly every league in the year before the draft (draft − 1)

Need to know league drafted from

Choices:

PPG

GAA

GP

Leagues (Liiga, NCAA, OHL, Other, QMJHL, Russia, Russia2, USHL, WHL)

Scouting Data

Via Central Scouting Player Rankings

NHL Internal Rankings from Central Scouting Service

Four Groups: (Skaters, Goalies) × (North Amer., Europe)

Convert to single list via Iain Fyffe’s Cescin (2011)

Central Scouting Integrator (Cescin)

NA Skaters = 1.35, NA Goalies = 13.25

EU Skaters = 6.27, EU Goalies = 38.18

To get Cescin, multiply the value above by the CSS rank

J. Quinn, the #9-ranked NA skater in 2020, would have Cescin = 9 × 1.35 = 12.15

Link to Fyffe Article, Link to 2020 CSS Rankings
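The Cescin conversion can be written out as a short sketch. The four multipliers are the ones listed above; the function names, the tuple layout, and the players "A" to "D" are hypothetical and exist only to show how the four CSS lists merge into one.

```python
# Multipliers as listed above, keyed by (region, role).
CESCIN_MULTIPLIERS = {
    ("NA", "skater"): 1.35,
    ("NA", "goalie"): 13.25,
    ("EU", "skater"): 6.27,
    ("EU", "goalie"): 38.18,
}


def cescin(region, role, css_rank):
    """Cescin value: group multiplier times the player's rank within that CSS list."""
    if css_rank < 1:
        raise ValueError("CSS rank must be 1 or greater")
    return CESCIN_MULTIPLIERS[(region, role)] * css_rank


def merge_rankings(players):
    """Merge players from the four CSS lists into one list, lowest Cescin first."""
    return sorted(players, key=lambda p: cescin(p[1], p[2], p[3]))


# The worked example above: the #9-ranked North American skater.
quinn_value = cescin("NA", "skater", 9)
print(f"Cescin = {quinn_value:.2f}")

# Hypothetical players as (name, region, role, CSS rank).
players = [
    ("A", "NA", "skater", 9),
    ("B", "EU", "skater", 2),
    ("C", "NA", "goalie", 1),
    ("D", "EU", "skater", 1),
]
merged = [name for name, *_ in merge_rankings(players)]
print(merged)
```

Lower Cescin means a higher position on the combined list, so the top-ranked European skater (6.27) lands ahead of the ninth-ranked North American skater (12.15).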

Data

Join the Data

Large Data Wrangling Effort

Combine Data by Player & Draft Year

Some players ranked by CSS but not drafted

Data in two cohorts

1998-00 (Training), 2001-02 (Test), 2007-08 (Validate)

2004-06 (Training), 2007-08 (Test)

Spellings, accents, multiple players same name

I was inefficient at this task at best.
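A minimal sketch of the join step, assuming each source is a list of records with a name and a draft year. All names, field names and values below are hypothetical. The idea is to match on a normalized name plus draft year, keep CSS-ranked players who were never drafted, and set aside duplicate names for manual review.

```python
import unicodedata


def name_key(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace for matching."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


# Hypothetical records; the field names are illustrative only.
css_rows = [
    {"name": "Émile  Exemple", "draft_year": 2005, "cescin": 12.15},
    {"name": "Sample Player", "draft_year": 2008, "cescin": 250.8},
    {"name": "Undrafted Player", "draft_year": 2005, "cescin": 300.0},
]
draft_rows = [
    {"name": "Emile Exemple", "draft_year": 2005, "pick": 14},
    {"name": "Sample Player", "draft_year": 2008, "pick": 120},
]

# Index draft selections by (normalized name, draft year).
draft_index = {}
for row in draft_rows:
    key = (name_key(row["name"]), row["draft_year"])
    draft_index.setdefault(key, []).append(row)

joined = []
ambiguous = []
for row in css_rows:
    key = (name_key(row["name"]), row["draft_year"])
    matches = draft_index.get(key, [])
    if len(matches) > 1:
        # Same name and draft year more than once: needs manual review.
        ambiguous.append(row["name"])
        continue
    # Left join: players ranked by CSS but not drafted keep pick=None.
    pick = matches[0]["pick"] if matches else None
    joined.append({**row, "pick": pick})

for row in joined:
    print(row["name"], row["draft_year"], row["pick"])
```

Normalization handles spellings and accents. Two different players with the same name in the same draft year still need another field, such as birth year, to be told apart.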

Data

Robin Olssons (Born in ’89 or ’90)

Model

Model

Warning: Notation ahead

$g^{-1}(Y_i) = \beta_0 + \beta_1 f_1(X_{1i}) + \beta_2 f_2(X_{2i}) + \ldots + \beta_k f_k(X_{ki})$

Predict Response $Y_i$ (either 1st 7 GP or 1st 7 TOI) for player $i$

$g()$ is link function, we will use $g() = \log()$ via Poisson family

Generalized Additive Model

$f_j()$’s variety of functional forms and fits (Splines, Loess, quadratic)

More flexible relative to regular multiple linear regression

Still linear on the Right Hand Side

And interactions
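To show the shape of such a model, here is a small Python sketch that evaluates a GAM-style prediction. It reads the log link in the usual GLM sense, where the log of the mean response equals the right-hand side; that reading, the two smooth functions and every coefficient are assumptions made for illustration. A fitted model would estimate the $f_j$ and $\beta_j$ from data.

```python
import math


# Hypothetical smooth terms f_j. In a fitted GAM these would be estimated
# (e.g. splines or loess), not written down by hand.
def f_cescin(cescin: float) -> float:
    return math.log(cescin)       # assumed shape: diminishing effect of rank


def f_ppg(ppg: float) -> float:
    return ppg - 0.5 * ppg ** 2   # assumed quadratic shape


# Hypothetical coefficients beta_0, beta_1, beta_2 (not from the talk).
BETA = {"intercept": 5.0, "cescin": -0.8, "ppg": 1.2}


def linear_predictor(cescin: float, ppg: float) -> float:
    """Right-hand side: beta_0 + beta_1 f_1(X_1) + beta_2 f_2(X_2)."""
    return (BETA["intercept"]
            + BETA["cescin"] * f_cescin(cescin)
            + BETA["ppg"] * f_ppg(ppg))


def predicted_mean(cescin: float, ppg: float) -> float:
    """Log link: the mean response is exp(linear predictor)."""
    return math.exp(linear_predictor(cescin, ppg))


for c in (12.15, 60.0):
    print(f"Cescin {c}: predicted mean {predicted_mean(c, ppg=1.0):.1f}")
```

The right-hand side stays a linear combination of the transformed predictors. The flexibility comes entirely from the choice of each $f_j$.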

Model

GAM’s

Relationship between GP and Cescin

Model

GAM’s

Multiple Regression

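$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki}$

GLM

$g^{-1}(Y_i) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki}$

GAM

$g^{-1}(Y_i) = \beta_0 + \beta_1 f_1(X_{1i}) + \beta_2 f_2(X_{2i}) + \ldots + \beta_k f_k(X_{ki})$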

NB: There are other differences in estimation for non-Gaussian families

Model

Additivity

Typical Linear Multiple Regression

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki}$

Additivity means each effect adds to the others.

Transformation (via $g()$ or $g^{-1}()$) such as log or logistic means that the impacts are different. In particular, log transformation means a multiplicative effect.
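A short worked derivation of the multiplicative effect, for two predictors under a log link. Here $\mu_i$ denotes the mean response, and the link is read in the usual GLM sense; both are assumptions about notation rather than something stated on the slide.

```latex
\begin{align*}
\log(\mu_i) &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} \\
\mu_i &= e^{\beta_0} \cdot e^{\beta_1 X_{1i}} \cdot e^{\beta_2 X_{2i}} \\
\frac{\mu_i \text{ at } X_{1i} + 1}{\mu_i \text{ at } X_{1i}} &= e^{\beta_1}
\end{align*}
```

So a one-unit increase in $X_{1i}$ multiplies the mean by $e^{\beta_1}$ rather than adding $\beta_1$ to it.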

Model

Interactions and Indicator Variables

Interactions and Indicators allow for nuanced models.

Example: Indicator Variables

$x_i = \begin{cases} 1 & \text{if player } i \text{ is a Goalie,} \\ 0 & \text{if player } i \text{ is not a Goalie} \end{cases} \qquad (1)$

Interactions are variables created by multiplying two variables together.

E.g. $X_7 = X_3 X_4$ but allows for great model flexibility
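As a small illustration, the sketch below builds a goalie indicator and a Height × Goalie interaction for two made-up players. The record layout and field names are hypothetical.

```python
# Hypothetical player records; field names are illustrative.
players = [
    {"name": "Player A", "position": "G", "height": 188.0},
    {"name": "Player B", "position": "F", "height": 180.0},
]


def add_features(player: dict) -> dict:
    """Add a goalie indicator and a Height x Goalie interaction."""
    is_goalie = 1 if player["position"] == "G" else 0
    return {
        **player,
        "is_goalie": is_goalie,
        # The interaction is the product of the two variables, so it
        # equals height for goalies and 0 for everyone else.
        "height_x_goalie": player["height"] * is_goalie,
    }


rows = [add_features(p) for p in players]
for row in rows:
    print(row["name"], row["is_goalie"], row["height_x_goalie"])
```

Because the interaction column is zero for non-goalies, its coefficient only changes the height effect for goalies.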

Model

Interactions: Example

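$I_G$, $I_F$, $I_D$ Indicator (Positions), $I_{QMJHL}$ Indicator (QMJHL)

Full (Grossly Hypothetical) Model:

$\mathrm{TOI} \sim 400 + 60 I_G + 0.01\,\mathrm{Height} + 0.2\,\mathrm{PPG} \times I_F + 0.25\,\mathrm{Height} \times I_G - 0.005 \times f(\mathrm{Cescin}) - 0.03\, I_D \times I_{QMJHL}$

For G: $\mathrm{TOI} \sim 400 + 60 + (0.01 + 0.25)\,\mathrm{Height} - 0.005 f(\mathrm{Cescin})$

For F: $\mathrm{TOI} \sim 400 + 0.01\,\mathrm{Height} + 0.2\,\mathrm{PPG} - 0.005 f(\mathrm{Cescin})$

For D: $\mathrm{TOI} \sim 400 + 0.01\,\mathrm{Height} - 0.005 f(\mathrm{Cescin}) - 0.03\, I_D I_{QMJHL}$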

One regression, but three different relationships
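The hypothetical model above can be checked numerically. The sketch below codes the full equation once and evaluates it per position. The coefficients are the slide's illustrative ones; the log shape for $f$ and the input values are assumptions added here.

```python
import math


def f(cescin: float) -> float:
    """Stand-in for the smooth f(Cescin); the log shape is an assumption."""
    return math.log(cescin)


def toi_hat(position: str, height: float, ppg: float,
            cescin: float, qmjhl: bool) -> float:
    """Evaluate the full hypothetical model for one player."""
    i_g = 1 if position == "G" else 0
    i_f = 1 if position == "F" else 0
    i_d = 1 if position == "D" else 0
    i_qmjhl = 1 if qmjhl else 0
    return (400 + 60 * i_g + 0.01 * height + 0.2 * ppg * i_f
            + 0.25 * height * i_g - 0.005 * f(cescin)
            - 0.03 * i_d * i_qmjhl)


# Hypothetical inputs; units for Height are not given in the source.
height, ppg, cescin = 185.0, 1.1, 12.15
for pos in ("G", "F", "D"):
    print(pos, round(toi_hat(pos, height, ppg, cescin, qmjhl=True), 3))
```

Setting the indicators to 0 or 1 switches terms on and off, which is how a single fitted equation yields the three position-specific relationships.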

Results

Model Evaluation

Use model to predict TOI (or GP) for each out of sample player in corpus. Rank order players from those.

Calculate Spearman’s Rank Correlation for:

NHL Draft Order vs Actual TOI Order

Predicted TOI Order vs Actual TOI Order
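The comparison above can be sketched with a plain-Python Spearman correlation, computed as the Pearson correlation of average ranks so that ties are handled. The six players and all their numbers are made up for illustration and say nothing about the actual results.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        avg = (start + end) / 2 + 1  # average of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical out-of-sample players (not real data).
draft_pick = [1, 2, 3, 4, 5, 6]
predicted_toi = [800, 500, 1000, 100, 400, 50]
actual_toi = [900, 300, 1200, 0, 450, 0]

# Negate the pick so that "higher is better" holds for both orderings.
rho_draft = spearman([-p for p in draft_pick], actual_toi)
rho_model = spearman(predicted_toi, actual_toi)
print(f"draft order vs actual: {rho_draft:.3f}")
print(f"model order vs actual: {rho_model:.3f}")
```

Whichever ordering has the larger correlation tracks realized TOI more closely on that set of players.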

Results

Results: Corpus Drafted Players

Results

Results: Corpus Drafted or CSS Ranked Players

Results

Prediction 2016 NHL Draft

Source: Shinzawa 2018 Article

Results

Comments

Corpus (which players are included) is important (Cf Table 4 & 5)

This model is not great, lots of ways to improve

Add age and birthdate into model

Psychological metrics

Additional Years of Data, draft -1, draft -2, SVPct

Add Combine Data, Internal Scouting Evaluation

Team quality metrics

Different methods CART, BART, etc.

Gamma link in GAM

More Leagues, drop Other (Bayesian)

Results

Return to Themes

Evaluation of players is hard

and it takes time (players develop at different rates from 18yo)

Metrics for assessment are not simple

nor single agreed response metric

In short, noisy and sparse data with slow feedback and

(to teams) results incredibly important

...and it is possible to improve drafting in the NHL via Statistics Models/Analytics.

“. . . [I]t is a numbers game.”

Updates

CESCIN Update

Work with Stat major, Amanda Butterfield (D), St. Lawrence University ’20

Join CSS data & Draft Selection data

Larger Data set (2003-2019)

Tweaks to CESCIN values

Ongoing project to build more complete data, eg. 2003

Link to Writeup, Link to Data

Updates

The End

[email protected]

Talk on 2019 Review of Analytics-based NHL Draft Work

My Other Papers in Hockey
