MONEY BALL DATA MINING IN BASKETBALL PRESENTER: YUGUAN LI PROFESSOR: CAROLINA RUIZ


Page 1: Money ball data mining in Basketball

MONEY BALL
DATA MINING IN BASKETBALL

PRESENTER: YUGUAN LI
PROFESSOR: CAROLINA RUIZ

Page 2: Money ball data mining in Basketball

REFERENCES:

• HTTP://WWW.KELVINJIANG.COM/2011/06/DATA-MINING-NBA-PLAYERS-MOST-SIMILAR-TO.HTML
  • DATA MINING THE NBA, PLAYERS MOST SIMILAR TO JORDAN, JUNE 2011

• HTTP://WWW.RESEARCHGATE.NET/PUBLICATION/261501109_THE_USE_OF_DATA_MINING_FOR_BASKETBALL_MATCHES_OUTCOMES_PREDICTION
  • THE USE OF DATA MINING FOR BASKETBALL MATCHES OUTCOMES PREDICTION, D. MILJKOVIĆ, L. GAJIĆ, A. KOVACEVIC, Z. KONJOVIĆ, 2010

• HTTP://CITESEERX.IST.PSU.EDU/VIEWDOC/SUMMARY?DOI=10.1.1.26.6422
  • BRIEF APPLICATION DESCRIPTION. ADVANCED SCOUT: DATA MINING AND KNOWLEDGE DISCOVERY IN NBA DATA, IBM T.J. WATSON RESEARCH CENTER, 1997

• HTTP://VIDEO.MIT.EDU/WATCH/A-STEP-BY-STEP-INTRODUCTION-TO-DATA-MINING-FOR-SPORTS-ANALYSIS-MIKHAIL-GOLOVNYA-SALFORD-SYSTEMS-7207/
  • VIDEO FROM YOUTUBE, 2012

Page 3: Money ball data mining in Basketball

MONEY BALL

• A BASEBALL MOVIE, A DATA MINING MOVIE

• LEADING QUESTIONS FROM THE MOVIE:
  • WHAT DATA DO WE HAVE?
  • WHAT RESULT DO WE WANT TO PREDICT?

Salary comparison (figure from Google Images)

Page 4: Money ball data mining in Basketball

DATA IN BASKETBALL

• LARGE NUMBER OF VARIOUS GAME STATISTICS AVAILABLE
  • WWW.NBA.COM
  • WWW.HOOPDATA.COM

• PLAYER LEVEL: REGULAR SEASON, PLAYOFFS AND ENTIRE CAREER

• TEAM LEVEL: WINS AND LOSSES

• GAME LEVEL: MOST DETAILED

Page 5: Money ball data mining in Basketball

DATA IN BASKETBALL

All three figures are analyses of Jeremy Lin from hoopdata

Page 6: Money ball data mining in Basketball

DATA MINING IN BASKETBALL DRAFTING

• BACKGROUND KNOWLEDGE: EVERY YEAR, EACH NBA TEAM CAN DRAFT A YOUNG PLAYER FROM UNIVERSITY; LOWER-RANKED TEAMS HAVE A HIGHER CHANCE TO PICK FIRST.

• IN THE ERA OF EXPLODING DATA, WE HAVE ACCESS TO ALL THE STATS OF A PLAYER IN THE NCAA.

• LEADING QUESTION: HOW SHOULD A TEAM PICK?

Screenshot from ncaa.com

Page 7: Money ball data mining in Basketball

METHOD ONE: EUCLIDEAN DISTANCE

• SCENARIO ONE: THE ONLY GAP BETWEEN US AND THE CHAMPION IS MICHAEL JORDAN, SO WE NEED TO TRADE FOR A PLAYER LIKE HIM.

• SOLUTION: WE HAVE THE CAREER STATS OF ALL PLAYERS AND OF MJ AT OUR DISPOSAL, SO WE CAN CALCULATE THE EUCLIDEAN DISTANCE BETWEEN TWO VECTORS OF PLAYER STATS.

• WE CAN COMPARE ALL CATEGORIES: POINTS, REBOUNDS, ASSISTS, STEALS, BLOCKS, ETC. THE SMALLER THE EUCLIDEAN DISTANCE, THE MORE SIMILAR THE PLAYER IS TO MJ.

Screenshot from Reference 1
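
The distance ranking on this slide can be sketched in a few lines of Python. This is only an illustration: the stat vectors and player names below are made-up placeholders, not real career numbers, and the function names are hypothetical.

```python
import math

# Hypothetical per-game averages (illustrative only, not real data):
# points, rebounds, assists, steals, blocks
TARGET = [30.0, 6.0, 5.0, 2.0, 1.0]
CANDIDATES = {
    "PLAYER A": [28.0, 5.0, 6.0, 2.0, 1.0],
    "PLAYER B": [12.0, 11.0, 2.0, 1.0, 2.0],
}

def euclidean_distance(u, v):
    """Straight-line distance between two stat vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_by_distance(target, candidates):
    """Return candidate names ordered from most to least similar to target."""
    return sorted(
        candidates,
        key=lambda name: euclidean_distance(target, candidates[name]),
    )

print(rank_by_distance(TARGET, CANDIDATES))
```

In practice the categories would probably need to be put on a comparable scale first, since raw points would otherwise dominate categories such as steals or blocks; that normalization step is an assumption and is not described on the slide.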

Page 8: Money ball data mining in Basketball

METHOD TWO: COSINE SIMILARITY

• SCENARIO TWO: I WANT TO PICK A REPLACEMENT FOR ONE OF MY AGING PLAYERS.

• COSINE SIMILARITY: MEASURES HOW MUCH THE RATIOS BETWEEN ONE PLAYER’S STATS DIFFER FROM ANOTHER PLAYER’S.

• SOLUTION: THIS WORKS LIKE CLUSTERING PLAYERS. TAKE POINTS AND REBOUNDS AS AN EXAMPLE: PLAYER A GETS 20 PTS AND 10 REB PER GAME, B GETS 10 PTS AND 5 REB PER GAME. THEY SHOULD BE CONSIDERED THE SAME BECAUSE THE RATIO IS EXACTLY THE SAME: THE TWO VECTORS POINT IN THE SAME DIRECTION, AND HENCE THE ANGULAR DIFFERENCE IS ZERO.

Screenshot from Reference 1
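
The points-and-rebounds example from this slide can be checked with a short Python sketch. The function name is hypothetical, and the third vector is an invented contrast case that does not come from the slide.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two stat vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

player_a = [20, 10]  # 20 PTS, 10 REB per game (from the slide)
player_b = [10, 5]   # 10 PTS, 5 REB per game (from the slide)
player_c = [5, 15]   # hypothetical rebounder, added for contrast

print(cosine_similarity(player_a, player_b))
print(cosine_similarity(player_a, player_c))
```

For players A and B the result is 1 up to floating-point rounding, because both vectors point in the same direction, even though the Euclidean distance between them is far from zero. The contrast player scores lower because the points-to-rebounds ratio is different.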

Page 9: Money ball data mining in Basketball

METHOD THREE: PEARSON CORRELATION

• PLOT ONE PLAYER’S STATS AGAINST ANOTHER’S, ONE DATA POINT PER CATEGORY. TWO PLAYERS WITH IDENTICAL STATISTICS WOULD HAVE A BEST FIT LINE WHERE ALL DATA POINTS LIE PERFECTLY ON THE LINE. AS PLAYERS DIFFER MORE AND MORE, THEIR STATISTICAL DATA POINTS WILL DRIFT FARTHER AWAY FROM THE BEST FIT REGRESSION LINE.

Screenshot from Reference 1
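
A minimal sketch of the Pearson correlation between two stat vectors, written out by hand so that every step is visible. The function name and the sample vectors are illustrative assumptions, not data from the references.

```python
import math

def pearson_correlation(u, v):
    """Pearson correlation coefficient between two equal-length stat vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    # Covariance term and the two spread terms (the common 1/n factor cancels).
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    spread_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    spread_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if spread_u == 0 or spread_v == 0:
        raise ValueError("correlation is undefined when a vector is constant")
    return cov / (spread_u * spread_v)

# Hypothetical stat lines: points, rebounds, assists, steals, blocks
player_x = [25.0, 6.0, 5.0, 2.0, 1.0]
player_y = [25.0, 6.0, 5.0, 2.0, 1.0]   # identical to player_x
player_z = [8.0, 12.0, 1.0, 0.5, 3.0]   # a different profile

print(pearson_correlation(player_x, player_y))
print(pearson_correlation(player_x, player_z))
```

Identical players give a coefficient of 1 (up to rounding), which corresponds to all points lying on the best fit line. The coefficient moves away from 1 as the points scatter.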

Page 10: Money ball data mining in Basketball

NBA DATA MINING APPLICATION: ADVANCED SCOUT

• ADVANCED SCOUT (AS) SEEKS OUT AND DISCOVERS INTERESTING PATTERNS IN GAME DATA. WITH THIS INFORMATION, A COACH CAN ASSESS THE EFFECTIVENESS OF CERTAIN COACHING DECISIONS AND FORMULATE GAME STRATEGIES FOR SUBSEQUENT GAMES.

• EARLY IN THE 95-96 SEASON, 16 TEAMS HAD ALREADY STARTED TO USE AS AND PROVIDED VERY POSITIVE FEEDBACK: “IT’S LIKE HAVING ANOTHER COACH IN THE TEAM” (BOB SALMI).

Page 11: Money ball data mining in Basketball

DATA PRE-PROCESSING

• 1. CONSISTENCY CHECK: DETECT ERRORS (MISSING ACTIONS / IMPOSSIBLE EVENTS) MADE DURING DATA COLLECTION.

• 2. TRANSFORMATION: REFORMAT THE DATA AS A PLAY-SHEET, A FORM VERY FAMILIAR TO COACHES.

• 3. ENRICHMENT: USE ADDITIONAL INFORMATION TO ADD VALUE TO THE ANALYSIS.
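
As a rough illustration of step 1, the sketch below flags one kind of impossible event: an action credited to a player who is not on the court. The event format, event names and rule are assumptions made for this example; the slide does not describe how Advanced Scout actually implements its checks.

```python
# Hypothetical event log: (seconds elapsed, event type, player).
EVENTS = [
    (0, "SUB_IN", "PLAYER A"),
    (30, "SHOT", "PLAYER A"),
    (60, "SUB_OUT", "PLAYER A"),
    (75, "REBOUND", "PLAYER A"),  # impossible: PLAYER A is on the bench
    (90, "SHOT", "PLAYER B"),     # impossible: PLAYER B never checked in
]

def find_impossible_events(events):
    """Return events credited to players who are not currently on the court."""
    on_court = set()
    problems = []
    for event in sorted(events, key=lambda e: e[0]):
        _, kind, player = event
        if kind == "SUB_IN":
            on_court.add(player)
        elif kind == "SUB_OUT":
            if player not in on_court:
                problems.append(event)  # leaving without having entered
            on_court.discard(player)
        elif player not in on_court:
            problems.append(event)
    return problems

for problem in find_impossible_events(EVENTS):
    print("CHECK:", problem)
```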

Page 12: Money ball data mining in Basketball

DATA MINING

• AS USES ATTRIBUTE FOCUSING, WHICH IS VERY SIMILAR TO ASSOCIATION RULES: AN EVENT E HAS A SERIES OF VALUES {X1, X2, X3, X4, X5}; E IS INTERESTING TO THE EXTENT THAT THE OCCURRENCE OF XI DEPENDS ON XJ.

• IT PRODUCES INTERESTING RULES LIKE THIS: WHEN STEVE NASH WAS POINT GUARD, SHAWN MARION MISSED 0% (0) OF HIS JUMP FIELD-GOAL ATTEMPTS AND MADE 100% (4) OF HIS JUMP FIELD-GOAL ATTEMPTS.
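
The idea behind such a rule can be illustrated with a simplified stand-in: compare a shooter's make rate within a subset of shots (here, one point guard on the court) against the overall make rate, and report the subset when the gap is large. This is not the Advanced Scout algorithm itself. The shot log, player labels and the min_gap threshold are hypothetical.

```python
# Hypothetical shot log for one shooter: (point guard on court, shot made?).
SHOTS = [
    ("GUARD Y", True), ("GUARD Y", True), ("GUARD Y", True), ("GUARD Y", True),
    ("GUARD Z", False), ("GUARD Z", True), ("GUARD Z", False), ("GUARD Z", False),
]

def make_rate(shots):
    """Fraction of shots in the list that were made."""
    return sum(1 for _, made in shots if made) / len(shots)

def interesting_subsets(shots, min_gap=0.25):
    """Map each guard to (subset rate, overall rate, attempts) when the
    subset rate differs from the overall rate by at least min_gap."""
    overall = make_rate(shots)
    findings = {}
    for guard in sorted({g for g, _ in shots}):
        subset = [s for s in shots if s[0] == guard]
        rate = make_rate(subset)
        if abs(rate - overall) >= min_gap:
            findings[guard] = (rate, overall, len(subset))
    return findings

for guard, (rate, overall, attempts) in interesting_subsets(SHOTS).items():
    print(f"WITH {guard}: MADE {rate:.0%} OF {attempts} ATTEMPTS "
          f"(OVERALL {overall:.0%})")
```

With only four attempts in a subset, as in the rule quoted on the slide, a 100% rate is weak evidence, so a real system would likely also weigh the number of attempts.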

Page 13: Money ball data mining in Basketball

• THANK YOU!