|
Machine Learning In Healthcare
Mohinder Dick
Senior Software Architect
UPMC Enterprises
November 20, 2015
|
Goal
|
©2015 UPMC Enterprises: PROPRIETARY
|
Agenda
• Machine Learning in 6 Slides
• Hacker's Guide to Data Science
• How do I get started
• Healthcare use case
• Q & A
|
Machine Learning in 6 Slides – Key Terms
Term                 Description
Machine Learning     Improving performance with data
ML Type              Broad classification of algorithms
Supervised Learning  Algorithms that learn from labeled examples
Linear Algorithm     Predicts based on a linear model
Target Variable      The value you are trying to predict
Feature              Data points on which the prediction is based
Regression           Predicting a numeric target variable
Classification       Predicting a binary or categorical target variable
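To make the key terms concrete, here is a minimal sketch of supervised learning with a linear model in plain Python. The labeled examples (age as the feature, length of stay as the target) are made up for illustration, and the helper names `fit_line`, `predict_los`, and `predict_long_stay` are hypothetical. The same fitted line is used once for regression (a numeric target) and once for classification (a binary target obtained by thresholding).

```python
# Hypothetical labeled examples: (age in years, length of stay in days).
# The numbers are made up for illustration only.
examples = [(30, 2.0), (40, 3.0), (50, 4.0), (60, 5.0)]


def fit_line(points):
    """Ordinary least squares for one feature: target ~ slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_los(age, model):
    """Regression: predict a numeric target (days)."""
    slope, intercept = model
    return slope * age + intercept


def predict_long_stay(age, model, threshold_days=3.5):
    """Classification: predict a binary target by thresholding the linear score."""
    return predict_los(age, model) > threshold_days


model = fit_line(examples)
print(predict_los(70, model))        # numeric prediction (regression)
print(predict_long_stay(70, model))  # yes/no prediction (classification)
```

In practice a library such as scikit-learn would replace the hand-written fit; the point here is only how features, targets, regression, and classification relate.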
|
Machine Learning in 6 Slides
|
Machine Learning in 6 Slides – Targets and Features
Clinical Example
Data               Value
Age                40
Gender             Female
Race               Asian
Hospital           Shadyside
Medical Diagnosis  V15.82
Predict – Length-of-stay (LOS)
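One way to turn the clinical record into features is one-hot encoding, sketched below in plain Python. The category vocabularies in `CATEGORIES` are hypothetical placeholders (a real vocabulary would be built from the training data), and `encode` and `feature_names` are illustrative helper names, not part of any library.

```python
# The record from the slide; the target (length of stay) is what we want to predict.
record = {
    "Age": 40,
    "Gender": "Female",
    "Race": "Asian",
    "Hospital": "Shadyside",
    "Medical Diagnosis": "V15.82",
}

# Hypothetical category vocabularies, for illustration only.
CATEGORIES = {
    "Gender": ["Female", "Male"],
    "Race": ["Asian", "Other"],
    "Hospital": ["Shadyside", "Other"],
    "Medical Diagnosis": ["V15.82", "Other"],
}


def feature_names():
    """Column names for the encoded vector, in order."""
    names = ["Age"]
    for field, values in CATEGORIES.items():
        names.extend(f"{field}={value}" for value in values)
    return names


def encode(rec):
    """Turn one record into a flat numeric vector: age plus one-hot columns."""
    features = [float(rec["Age"])]
    for field, values in CATEGORIES.items():
        features.extend(1.0 if rec[field] == value else 0.0 for value in values)
    return features


for name, value in zip(feature_names(), encode(record)):
    print(name, value)
```

Numeric fields such as age pass through unchanged, while each categorical field becomes a group of 0/1 columns that a linear algorithm can consume.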
|
Machine Learning in 6 Slides - Features
Facial Recognition
Encode the position of the “Haars”
Predict – Is it a face, whose face is it?
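Assuming "Haars" refers to Haar-like features of the kind used in Viola-Jones style face detection, the sketch below shows how one such feature can be computed with an integral image. The 4x4 patch and the function names are made up for illustration; a real detector evaluates many such features at many positions and scales.

```python
def integral_image(img):
    """ii[r][c] holds the sum of img rows 0..r-1 and columns 0..c-1.

    The result has one extra leading row and column of zeros.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half (even width)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy grayscale patch: bright left half, dark right half.
patch = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(patch)
print(two_rect_feature(ii, 0, 0, 4, 4))
```

A large absolute value signals a strong vertical edge inside the window; the feature's position and value then become inputs to the classifier.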
|
Machine Learning in 6 Slides
|
Machine Learning in 6 Slides - Summary
Term                 Description
Machine Learning     Improving performance with data
ML Type              Broad classification of algorithms
Supervised Learning  Algorithms that learn from labeled examples
Linear Algorithm     Predicts based on a linear model
Target Variable      The value you are trying to predict
Feature              Data points on which the prediction is based
Regression           Predicting a numeric target variable
Classification       Predicting a binary or categorical target variable
|
Getting Started
|
How do I get started
Training
• Books
• MOOCs
• Instructor-led
Data
• Government agencies
• Vendors
Tools & Techniques
• Data pipeline
• Spark
• Python
• R
|
How do I get started - Tools
Tool          Comments
Apache Spark  Cluster-aware execution; MLlib for machine learning
Python        Dynamically typed language; libraries include scikit-learn, pandas, and numpy
R             Statistical language with an IDE ecosystem
|
How do I get started - Tools
|
How do I get started – Data Driven Approach
Problem
• Hospital bed capacity
Data
• Admitting Hospital
• Patient diagnosis and demographics
Analysis
• Data representation
• Data analysis
• Evaluation
Policy
• Increase capacity at hospitals with average patient age under 40
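The analysis step behind this policy can be sketched in a few lines of Python: group admissions by hospital, average the patient ages, and flag hospitals below the cutoff. The admission records, the name "Hospital B", and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical admission records (hospital, patient age); values are made up.
admissions = [
    ("Shadyside", 40),
    ("Shadyside", 30),
    ("Hospital B", 70),
    ("Hospital B", 60),
]


def average_age_by_hospital(records):
    """Data representation and analysis: mean patient age per hospital."""
    ages = defaultdict(list)
    for hospital, age in records:
        ages[hospital].append(age)
    return {hospital: sum(vals) / len(vals) for hospital, vals in ages.items()}


def hospitals_to_expand(records, age_cutoff=40):
    """Policy: hospitals whose average patient age is under the cutoff."""
    averages = average_age_by_hospital(records)
    return sorted(h for h, avg in averages.items() if avg < age_cutoff)


print(average_age_by_hospital(admissions))
print(hospitals_to_expand(admissions))
```

With pandas, the same grouping would typically be a `groupby` on the hospital column followed by a mean of the age column.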
|
How do I get started – Machine Learning Pipeline
|
Healthcare Use Case
|
Healthcare use case - Actual
Clinical Example
Data               Value
Age                40
Gender             Female
Race               Asian
Hospital           Shadyside
Medical Diagnosis  V15.82
Predict – Re-order drugs?
|
Healthcare use case - Simplified
Clinical Example
Data               Value
Age                40
Gender             Female
Race               Asian
Hospital           Shadyside
Medical Diagnosis  V15.82
Predict – Length-of-stay (LOS)
|
Healthcare Use Case – Feature Extraction
|
Healthcare Use Case – Evaluation
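Measure      Description
R2           Fractional decrease in squared model error relative to always guessing the mean
RMSE         Root Mean Squared Error
Specificity  Share of actual negatives correctly identified; a high value means few false positives
Sensitivity  Share of actual positives correctly identified; a high value means few false negatives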
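The four measures can be computed directly from their standard definitions, as in the plain-Python sketch below. The sample values are made up for illustration. R2 and RMSE apply to regression output (such as a predicted length of stay), while sensitivity and specificity apply to binary classification output.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error of numeric predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """1 - (model squared error / squared error of always guessing the mean)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def sensitivity(actual, predicted):
    """True positives / all actual positives (labels are 0 or 1)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp / (tp + fn)


def specificity(actual, predicted):
    """True negatives / all actual negatives (labels are 0 or 1)."""
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return tn / (tn + fp)


# Made-up regression example: actual vs. predicted length of stay in days.
los_actual = [2, 4, 6, 8]
los_predicted = [3, 4, 6, 7]
print(rmse(los_actual, los_predicted), r_squared(los_actual, los_predicted))

# Made-up classification example: 1 = positive, 0 = negative.
label_actual = [1, 1, 1, 0, 0]
label_predicted = [1, 1, 0, 0, 1]
print(sensitivity(label_actual, label_predicted),
      specificity(label_actual, label_predicted))
```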
|
Reference
Resource                     Location
edX course list (see data)   https://www.edx.org/course
Coursera course list         https://www.coursera.org/courses/?domains=data-science
Spark download page          http://spark.apache.org/downloads.html
Anaconda Python install page http://docs.continuum.io/anaconda/install
R install page               https://cran.r-project.org/
RStudio install page         https://www.rstudio.com/products/rstudio/download/
eBay Tech Blog on Spark      http://www.ebaytechblog.com/2014/05/28/using-spark-to-ignite-data-analytics/
|
Questions?