5/25/2017
Transforming Healthcare, Finance, Energy, and Commerce with Machine Learning, Case
Study Included
Sam Putnam, CEO/Founder, Deep Learning Consultant, Enterprise Deep Learning
Andrew Holson, COO, Snyder Donegan Real Estate Group
Intro to Applied ML
Introduction to Applied Machine Learning & Case Study Sam Putnam Andrew Holson
What is machine learning?
What kinds of machine learning are there?
What kinds of techniques are used?
Intro to Real Estate
Why is machine learning important to a real estate agent?
How good does a machine learning model have to be to be helpful?
The Model
How good is the model?
What was the model that performed the best on the data?
What is the test accuracy?
Part 1
Intro to Applied ML
Machine Learning is for X
Classification: “Ok Google - Will my house sell for $10m - Yes or no?”
Regression: “Hey Siri - What can I sell my house for?”
Machines learn by example
Supervised Classification
You show Google a ton of pictures of houses and condos. You point at every single one and say “Ok Google. This is a house” or “Ok Google. This is a condo.”
Now, you take a picture of a house. “Ok Google, what is this?”
Google - “That’s a house!”
Supervised Regression
“Hey Siri - the house down the street sold for $15m and it has 3 bedrooms and 2 baths on 15 acres.” You tell Siri about a lot of other houses, too. Now, you tell Siri that your house has 4 bedrooms and 2 baths on 15 acres. “Hey Siri - What will my house sell for?” Siri - “$15.5m!”
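The Siri exchange above can be sketched as a one-feature least-squares fit. This is a minimal sketch, not the presenters' model: the example sales are hypothetical, chosen so that each extra bedroom adds $0.5m, and only the bedroom count is used as a feature.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical examples "told to Siri": bedrooms and sale price in $ millions.
bedrooms = [2, 3, 5, 6]
prices_m = [14.5, 15.0, 16.0, 16.5]

slope, intercept = fit_line(bedrooms, prices_m)

# Ask about a house that was not in the examples.
predicted = slope * 4 + intercept
print(f"Predicted price for 4 bedrooms: ${predicted:.1f}m")
```

A real model would use many more examples and more features (baths, acres, and so on); the idea of learning from labeled examples stays the same.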
Machines also learn by recognizing patterns
Unsupervised Classification
You show Google a ton of pictures of houses and condos. This time you do not point at each one and label it. You say nothing.
You notice that Google has grouped the pictures into two groups. You label one group “house” and the other “condo”.
Now, you take a picture of a house. “Ok Google, what is this?”
Google - “That’s a house!”
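The grouping step above can be sketched with a tiny clustering routine. This is a minimal sketch under stated assumptions: a single hypothetical feature (finished square feet) stands in for the pictures, and a two-group k-means stands in for the grouping method, which the slides do not name.

```python
def two_means(values, iterations=20):
    """Split 1-D values into two groups by repeated re-centering (k-means, k=2)."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not group_high:  # all values identical: nothing to split
            break
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return low, high


# Hypothetical, unlabeled listings described only by finished square feet.
square_feet = [650, 700, 800, 900, 2400, 2600, 3100, 3500]
center_a, center_b = two_means(square_feet)

# A person looks at the two groups afterwards and names them.
labels = {center_a: "condo", center_b: "house"}


def classify(value):
    """Assign a new listing to the nearest group center."""
    nearest = min((center_a, center_b), key=lambda c: abs(value - c))
    return labels[nearest]


print(classify(2800))
```

The algorithm never sees the words "house" or "condo"; the names are attached only after the groups have formed.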
Unsupervised Regression
“Hey Siri - the house down the street sold for $15m and it has 3 bedrooms and 2 baths on 15 acres.” You tell Siri about a lot of other houses, too. Siri puts the houses on a map. You split the map up by price. Now, you tell Siri that your house has 4 bedrooms and 2 baths on 15 acres. “Hey Siri - What will my house sell for?” Siri - “$15.5m!”
ML is like Forestry
Linear Regression: “transplant one root & let it find the water”
Neural Network: “transplant a lot of roots and let them grow”
Decision Tree: “plant a tree and let the branches develop”
Random Forest: “try out different seeds and pick best tree”
Gradient Boosting Machine: “find a good seed & try more like that”
ML for Healthcare
Is this Cancer in this X-Ray?
How many medical records do we need to be able to provide insurance to people in this region in Africa?
ML for Finance
How much smoke is coming out of this factory in China today compared to a month ago today?
Was this a code word that someone sent through this Bloomberg terminal just now?
ML for Energy
How much power is the Western Interconnection going to be using tomorrow at noon?
How long will it take for me to get paid back for hooking my house up to solar?
ML for Commerce
What is the best news article to put under this news article on our online newspaper?
How much will my house sell for?
Best practices by Google
http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
Part 2
Intro to Real Estate
Pricing real estate in a rural area
The agent can be pretty good
Agents derive a price from their subjective interpretation of prior comparable sales and personal experience
But sellers can have their own idea
I want a model that
Solidifies my conclusion of what the house will sell for
Successfully convinces the client to ‘shoot’ for that price.
Part 3
The Model
Find the most important attributes to use, like # of acres
So the top 10 variables are:
1. acres - # of acres
2. tax_gross_amount
3. assessment_value_town
4. sq_ft_tot_fn (finished)
5. bedrooms_total
6. address - street address
7. baths_total - # of baths
8. year_built - originally built
9. rooms_total - # of rooms
10. garage_capacity - # of garage bays
Feature Selection
https://www.kaggle.com/samdeeplearning/xgb-top-10-features/
We chose 10 features
Best Fit Line gives OK performance
Error The 10
Mean 26.2%
Median 18.1%
Multiple Linear Regression
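The slides report a mean and a median error but do not define the error itself. A common choice, assumed here, is the absolute percentage error between predicted and actual sale price. The sketch below uses hypothetical prices to show how the two summary numbers are computed.

```python
from statistics import mean, median


def percentage_errors(actual, predicted):
    """Absolute percentage error per listing (assumed definition of 'error')."""
    return [abs(p - a) * 100 / a for a, p in zip(actual, predicted)]


# Hypothetical actual and predicted sale prices in dollars.
actual_prices = [200_000, 350_000, 500_000, 1_000_000]
predicted_prices = [220_000, 315_000, 500_000, 1_500_000]

errors = percentage_errors(actual_prices, predicted_prices)
print(f"Mean error:   {mean(errors):.1f}%")    # prints "Mean error:   17.5%"
print(f"Median error: {median(errors):.1f}%")  # prints "Median error: 10.0%"
```

One large miss pulls the mean well above the median, which is one possible reason the mean error sits above the median error on the slides.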
Gets Better Adding City
Error    The 10   The 10 + City
Mean     26.2%    23.9%
Median   18.1%    15.6%
Multiple Linear Regression
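A linear regression needs numeric inputs, so a categorical feature such as city has to be encoded before it can be added. The slides do not say how city was encoded; one-hot encoding is a common option and is assumed here. The city names and feature values are hypothetical placeholders.

```python
def one_hot(value, categories):
    """Return a 0/1 indicator list with a 1 in the position of `value`."""
    return [1 if value == category else 0 for category in categories]


# Hypothetical listings with two numeric features and a city name.
listings = [
    {"acres": 15.0, "bedrooms_total": 3, "city": "Town A"},
    {"acres": 2.5, "bedrooms_total": 4, "city": "Town B"},
    {"acres": 40.0, "bedrooms_total": 5, "city": "Town A"},
]

# Fix the column order so every row is encoded the same way.
cities = sorted({row["city"] for row in listings})

feature_rows = [
    [row["acres"], row["bedrooms_total"]] + one_hot(row["city"], cities)
    for row in listings
]
for features in feature_rows:
    print(features)
```

Each city becomes its own 0/1 column, so the regression can learn a separate price adjustment per city.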
City is a Better Feature than Covenants
Error    The 10   The 10 + City   The 10 + Covenants*
Mean     26.2%    23.9%           27.6%
Median   18.1%    15.6%           16.0%
Multiple Linear Regression
Forest gives better performance
Error    Whole XGB   Best Linear Regression
Mean     21.4%       23.9%
Median   14.6%       15.6%
Gradient Boosting Machine (XGBoost)
https://www.kaggle.com/samdeeplearning/naive-subsample-1-0-xgb
Gets better when each boosting round trains on a random half of the rows (subsample 0.5)
Error    Whole XGB   Half XGB   Best Linear Regression
Mean     21.4%       20.5%      23.9%
Median   14.6%       14.0%      15.6%
Gradient Boosting Machine (XGBoost)
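The Kaggle kernel names on these slides (naive-subsample-1-0-xgb, naive-subsample-0-25-xgb) suggest the "Whole", "Half" and "One Quarter" variants differ in XGBoost's `subsample` parameter, which sets the fraction of training rows drawn at random for each boosting round. That reading is an assumption. The sketch below is not XGBoost: it is a hand-rolled gradient boosting loop with one-split trees on hypothetical data, written only to show where row subsampling enters.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares one-split tree. Assumes at least two distinct x values."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict(model, x):
    base, learning_rate, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total


def boost(xs, ys, rounds=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Gradient boosting for squared error with per-round row subsampling."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    sample_size = max(2, int(subsample * len(xs)))
    for _ in range(rounds):
        # Each round sees only a random fraction of the rows.
        rows = rng.sample(range(len(xs)), sample_size)
        sample_x = [xs[i] for i in rows]
        sample_residuals = [ys[i] - predictions[i] for i in rows]
        stump = fit_stump(sample_x, sample_residuals)
        stumps.append(stump)
        threshold, left_value, right_value = stump
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for p, x in zip(predictions, xs)
        ]
    return base, learning_rate, stumps


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: acres and sale price in $ thousands.
acres = [1, 2, 3, 5, 8, 13, 21, 34]
prices_k = [200, 220, 260, 300, 380, 500, 650, 900]

full_model = boost(acres, prices_k, subsample=1.0)
half_model = boost(acres, prices_k, subsample=0.5)
print("training MSE, all rows per round: ", mse(full_model, acres, prices_k))
print("training MSE, half rows per round:", mse(half_model, acres, prices_k))
```

Training error on eight rows says nothing about which setting generalizes better; the slides compare the settings on held-out error, which is the comparison that matters.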
Starts to degrade when each round uses one quarter of the rows
Error    Whole XGB   Half XGB   One Quarter XGB   Best Linear Regression
Mean     21.4%       20.5%      20.8%             23.9%
Median   14.6%       14.0%      12.9%             15.6%
Gradient Boosting Machine (XGBoost)
https://www.kaggle.com/samdeeplearning/naive-subsample-0-25-xgb/output
Model Selection - XGBoost
Accuracy   0.5 & 10 Features + City   Best Linear Regression
Mean       18.3%                      23.9%
Median     13.4%                      15.6%
Gradient Boosting Machine (XGBoost) is Chosen
Test Performance
Accuracy 0.5 & 10 Features + City
Mean 19.9%
Median 16.7%
Gradient Boosting Machine (XGBoost)
https://www.kaggle.com/samdeeplearning/naive-subsample-5-10-city/
Sam Putnam
1. Search ‘Deep Learning’
2. Listen
What are we doing tomorrow
Bigger data set
Land Quality, House Quality, Proximity
Confidence Range
Thank you!
Thank you to Google & others who have published diagrams and photos. Slides are for today only.
Always looking for new members, speakers & locations to host Machine Learning presentations in New Hampshire, Vermont, Boston, and New York.