Transforming Healthcare, Finance, Energy, and Commerce with Machine Learning, Case Study Included

5/25/2017

Sam Putnam, CEO/Founder, Deep Learning Consultant, Enterprise Deep Learning

Andrew Holson, COO, Snyder Donegan Real Estate Group

Intro to Applied ML

Introduction to Applied Machine Learning & Case Study Sam Putnam Andrew Holson

What is machine learning?

What kinds of machine learning are there?

What kinds of techniques are used?

Intro to Real Estate

Why is machine learning important to a real estate agent?

How good does a machine learning model have to be to be helpful?

The Model

How good is the model?

What was the model that performed the best on the data?

What is the test accuracy?

Part 1

Intro to Applied ML

Machine Learning is for X

Classification: “Ok Google - Will my house sell for $10m - Yes or no?”

Regression: “Hey Siri - What can I sell my house for?”

Machines learn by example

Supervised Classification

You show Google a ton of pictures of houses and condos. You point at every single one and say “Ok Google. This is a house” or “Ok Google. This is a condo.”

Now you take a picture of a house. “Ok Google, what is this?”

Google - “That’s a house!”
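As a rough sketch of learning from labeled examples, the snippet below uses a 1-nearest-neighbor rule. It stands in numeric features (square feet, number of floors) for the photos in the slide, and every number and name in it is invented for illustration. Real image classifiers work very differently.

```python
import math

# Hypothetical labeled examples: (square feet, floors) -> label.
# All values are invented for illustration; they are not from the talk.
EXAMPLES = [
    ((3200.0, 2.0), "house"),
    ((2800.0, 2.0), "house"),
    ((4100.0, 3.0), "house"),
    ((900.0, 1.0), "condo"),
    ((1100.0, 1.0), "condo"),
    ((750.0, 1.0), "condo"),
]


def classify(features, examples=EXAMPLES):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    nearest = min(examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


print(classify((3000.0, 2.0)))  # nearest labeled examples are houses
print(classify((1000.0, 1.0)))  # nearest labeled examples are condos
```

The features are left unscaled here, so square footage dominates the distance; a practical version would normalize each feature first.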

Supervised Regression

“Hey Siri - the house down the street sold for $15m and it has 3 bedrooms and 2 baths on 15 acres.” You tell Siri about a lot of other houses, too. Now, you tell Siri that your house has 4 bedrooms and 2 baths on 15 acres. “Hey Siri - What will my house sell for?” Siri - “$15.5m!”
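The regression story can be sketched with a one-feature least-squares line. The bedroom counts and prices below are hypothetical, chosen only so that the fitted line reproduces the slide's "$15.5m" answer for a 4-bedroom house; a real model would use many more features and sales.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past sales: bedrooms -> sale price in $m (invented data).
bedrooms = [2, 3, 5, 6]
price_musd = [14.5, 15.0, 16.0, 16.5]

slope, intercept = fit_line(bedrooms, price_musd)
estimate = intercept + slope * 4  # "my house has 4 bedrooms"
print(f"Estimated price: ${estimate:.1f}m")
```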

Machines also learn by recognizing patterns

Unsupervised Classification

You show Google a ton of pictures of houses and condos. You say nothing about which is which. You notice that Google has grouped the pictures into two groups. You label one “house” and one “condo”. Now you take a picture of a house. “Ok Google, what is this?” Google - “That’s a house!”
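Grouping without labels is usually called clustering. A minimal sketch, assuming two groups and the same invented numeric features (square feet, floors) in place of photos, is a small k-means loop. The initial centers are simply the smallest and largest points, an arbitrary choice made so the example is deterministic.

```python
import math


def kmeans_two_groups(points, iterations=10):
    """Split points into two groups by repeatedly re-centering (k-means, k=2)."""
    centers = [min(points), max(points)]  # deterministic starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to whichever center is closer.
            closer_to_first = math.dist(p, centers[0]) <= math.dist(p, centers[1])
            groups[0 if closer_to_first else 1].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, groups


# Unlabeled, invented data: (square feet, floors).
points = [(3200.0, 2.0), (900.0, 1.0), (2800.0, 2.0),
          (1100.0, 1.0), (4100.0, 3.0), (750.0, 1.0)]
centers, groups = kmeans_two_groups(points)

# A person looks at the two groups and names them afterwards.
names = ["condo", "house"]
new_home = (3000.0, 2.0)
nearest = min(range(2), key=lambda i: math.dist(new_home, centers[i]))
print(names[nearest])
```

The algorithm only produces the two groups; the names "condo" and "house" are attached by a person after inspecting them, as in the slide.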

Unsupervised Regression

“Hey Siri - the house down the street sold for $15m and it has 3 bedrooms and 2 baths on 15 acres.” You tell Siri about a lot of other houses, too. Siri puts the houses on a map. You split the map up by price. Now, you tell Siri that your house has 4 bedrooms and 2 baths on 15 acres. “Hey Siri - What will my house sell for?” Siri - “$15.5m!”

ML is like Forestry

Linear Regression: “transplant one root & let it find the water”

Decision Tree: “plant a tree and let the branches develop”

Random Forest: “try out different seeds and pick best tree”

Gradient Boosting Machine: “find a good seed & try more like that”

Neural Network: “transplant a lot of roots and let them grow”

ML for Healthcare

Is this Cancer in this X-Ray?

How many medical records do we need to be able to provide insurance to people in this region in Africa?

ML for Finance
How much smoke is coming out of this factory in China today compared to a month ago today?

Was this a code word that someone sent through this Bloomberg terminal just now?

ML for Energy
How much power is the Western Interconnection going to be using tomorrow at noon?

How long will it take for me to get paid back for hooking my house up to solar?

ML for Commerce
What is the best news article to put under this news article on our online newspaper?

How much will my house sell for?

Best practices by Google

http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf

Part 2

Intro to Real Estate

Pricing real estate in a rural area

The agent can be pretty good

Agents derive a price from a subjective interpretation of prior comparable sales and personal experience

But sellers can have their own idea

I want a model that

Solidifies my conclusion of what the house will sell for

Successfully convinces the client to ‘shoot’ for that price.

Part 3

The Model

Find the most important attributes to use, like # of acres

Feature Selection

So the top 10 variables are:

1. acres - # of acres
2. tax_gross_amount
3. assessment_value_town
4. sq_ft_tot_fn (finished)
5. bedrooms_total
6. address - street address
7. baths_total - # of baths
8. year_built - originally built
9. rooms_total - # of rooms
10. garage_capacity - # of garage bays

https://www.kaggle.com/samdeeplearning/xgb-top-10-features/
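The linked kernel's name suggests the ranking came from XGBoost. As a much simpler stand-in, and not the method used in the talk, the sketch below ranks a few of the listed columns by the absolute Pearson correlation between each column and the sale price. All rows are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented sample: column names follow the slide, values are hypothetical.
features = {
    "acres": [1, 3, 4, 9, 12],
    "bedrooms_total": [3, 2, 4, 3, 5],
    "garage_capacity": [2, 1, 2, 0, 1],
}
sale_price = [300, 450, 500, 800, 950]  # in $ thousands

# Rank features from strongest to weakest linear relationship with price.
ranked = sorted(
    ((name, abs(pearson(values, sale_price))) for name, values in features.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Correlation only captures linear, one-at-a-time relationships, so a tree-based importance score can rank the same features differently.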

We chose 10 features

Best Fit Line gives OK performance

Multiple Linear Regression

Error     The 10
Mean      26.2%
Median    18.1%
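The slides do not say how "Error" is computed. One plausible reading, assumed here, is the absolute percentage difference between the predicted and actual sale price, summarized by its mean and median. The prices below are invented.

```python
from statistics import mean, median


def percent_errors(actual, predicted):
    """Absolute percentage error of each prediction, in percent."""
    return [100.0 * abs(p - a) / a for a, p in zip(actual, predicted)]


# Hypothetical sale prices and model predictions (not from the talk).
actual = [400_000, 250_000, 1_000_000, 100_000]
predicted = [440_000, 200_000, 1_000_000, 190_000]

errors = percent_errors(actual, predicted)
print(f"Mean error:   {mean(errors):.1f}%")
print(f"Median error: {median(errors):.1f}%")
```

A single badly mispriced house pulls the mean up far more than the median, which is one reason the mean can sit well above the median, as it does in the tables.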

Gets Better Adding City

Multiple Linear Regression

Error     The 10    The 10 + City
Mean      26.2%     23.9%
Median    18.1%     15.6%
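A linear regression needs numbers, so a text column such as city has to be encoded first. The talk does not say how this was done; one common approach, assumed here, is one-hot encoding, where each city becomes its own 0/1 column. The town names and the `one_hot_city` helper are hypothetical.

```python
def one_hot_city(rows, city_key="city"):
    """Replace the city column with one 0/1 indicator column per city."""
    cities = sorted({row[city_key] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != city_key}
        for city in cities:
            new_row[f"city_{city}"] = 1 if row[city_key] == city else 0
        encoded.append(new_row)
    return encoded


# Invented listings; "Town A" and "Town B" are placeholders.
listings = [
    {"acres": 15.0, "bedrooms_total": 3, "city": "Town A"},
    {"acres": 2.5, "bedrooms_total": 4, "city": "Town B"},
]
for row in one_hot_city(listings):
    print(row)
```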

City is a Better Feature than Covenants

Multiple Linear Regression

Error     The 10    The 10 + City    The 10 + Covenants*
Mean      26.2%     23.9%            27.6%
Median    18.1%     15.6%            16.0%

Forest gives better performance

Gradient Boosting Machine (XGBoost)

Error     Whole XGB    Best Linear Regression
Mean      21.4%        23.9%
Median    14.6%        15.6%

https://www.kaggle.com/samdeeplearning/naive-subsample-1-0-xgb

Gets better when you randomly sample half of the training rows at each boosting round

Gradient Boosting Machine (XGBoost)

Error     Whole XGB    Half XGB    Best Linear Regression
Mean      21.4%        20.5%       23.9%
Median    14.6%        14.0%       15.6%
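To illustrate what row subsampling means in boosting, here is a toy gradient boosting loop for squared error, written from scratch with one-split trees (stumps) on a single feature. It is not XGBoost and not the model from the talk. The `subsample` argument mirrors the idea of XGBoost's parameter of the same name: each round fits its tree on a random fraction of the training rows. The data is invented.

```python
import random


def fit_stump(xs, residuals):
    """Best single split (threshold, left value, right value) by squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # no valid split: predict the mean residual everywhere
        m = sum(residuals) / len(residuals)
        return (float("inf"), m, m)
    return best[1:]


def predict(base, stumps, learning_rate, x):
    """Base prediction plus the scaled contribution of every stump."""
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total


def boost(xs, ys, rounds=50, learning_rate=0.3, subsample=0.5, seed=0):
    """Each round fits a stump to the residuals of a random subset of rows."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    stumps = []
    n_rows = max(1, int(len(xs) * subsample))
    for _ in range(rounds):
        rows = rng.sample(range(len(xs)), n_rows)  # the row subsample
        sample_x = [xs[i] for i in rows]
        sample_res = [ys[i] - predict(base, stumps, learning_rate, xs[i]) for i in rows]
        stumps.append(fit_stump(sample_x, sample_res))
    return base, stumps


# Invented data: acres -> price in $ thousands.
acres = [1, 2, 3, 4, 5, 6, 7, 8]
price = [100, 120, 150, 160, 200, 230, 240, 300]

base, stumps = boost(acres, price, subsample=0.5)
base_mse = sum((y - base) ** 2 for y in price) / len(price)
boosted_mse = sum(
    (y - predict(base, stumps, 0.3, x)) ** 2 for x, y in zip(acres, price)
) / len(price)
print(f"MSE with mean only: {base_mse:.0f}, after boosting: {boosted_mse:.0f}")
```

Setting `subsample=1.0` uses every row in every round, which corresponds to the "Whole XGB" column; smaller values add randomness that can reduce overfitting but also give each tree less data to learn from.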

Starts to degrade at one quarter of the training rows

Gradient Boosting Machine (XGBoost)

Error     Whole XGB    Half XGB    One Quarter XGB    Best Linear Regression
Mean      21.4%        20.5%       20.8%              23.9%
Median    14.6%        14.0%       12.9%              15.6%

https://www.kaggle.com/samdeeplearning/naive-subsample-0-25-xgb/output

Model Selection - XGBoost

Gradient Boosting Machine (XGBoost) is Chosen

Error     0.5 & 10 Features + City    Best Linear Regression
Mean      18.3%                       23.9%
Median    13.4%                       15.6%

Test Performance

Gradient Boosting Machine (XGBoost)

Error     0.5 & 10 Features + City
Mean      19.9%
Median    16.7%

https://www.kaggle.com/samdeeplearning/naive-subsample-5-10-city/

Sam Putnam

1. Search ‘Deep Learning’

2. Listen

What are we doing tomorrow

Bigger data set

Land Quality, House Quality, Proximity

Confidence Range

Thank you!

[email protected]

Thank you to Google & others who have published diagrams and photos. Slides are for today only.

Always looking for new members, speakers & locations to host Machine Learning presentations in New Hampshire, Vermont, Boston, and New York.

[email protected]
