42
Learn. Connect. Explore. Learn. Connect. Explore.

Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Learn. Connect. Explore.Learn. Connect. Explore.

Page 2: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

How to Become a Data Scientistwith Azure Machine Learning

Saket Suman & Vikas GoyalMicrosoft Consulting Services

Page 3: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Data is the New Oil..

..And fortunately we have a plenty of it

Page 4: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Data Scientists unearth the power of Data…

..and we need many more of them

Page 5: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Agenda (L200)

• Data Science & Data Scientists

• Machine Learning

• Azure Machine Learning (AML)

• Demos

• Q&A

Out of Scope

• R Language

• Model Details

• Model comparisons

• Boring Talk & Demos

Page 6: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

What is Data Science

Machine Learning

Big Data R EDW

Data Mining KDD

Statistics Data Lake F#

IoT Data Hacking

Power BI

Predictive Analysis

Classification

Sentiment Analysis

A/B Testing Significance

Hypothesis Testing

Recommender

Data Exploration

Data Science

HackingSkills

Math & Statistics

Machine Learning

Substantive Expertise

Traditional Research

Danger Zone

Page 7: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Typical Backgrounds of Data Scientists

Mathematicians/Statisticians

Programmers/Hackers

Data Practitioners

Page 8: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

What is Machine Learning

E TP

https://www.coursera.org

Page 9: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Industry Verticals using ML

GamesBanking

Engineering &

Telemetry

Healthcare

Spatial

IoT

Retail

Media

Services

Security

Pattern

Mining Social &

Mobile

ML Education

Page 10: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Machine Learning in a box

Page 11: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

How do we enable Machine Learning

Scoring &

OptmizationEvaluationRetraining

Model

Training

Data

Modeling

Feature &

Algorithm

Engineering

Exploratory

Data

Analysis

Data

Cleansing

Problem

Definition

Yes! Machines do learn…with Statistics & Probability

A Learned Machine auto-programs itself

ML is NOT a 100% precise science

LEARNING = REPRESENTATION + EVALUATION + OPTIMIZATION

Page 12: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Imagine what machine learning could do for your business.

Churn

analysis

Equipment

Monitoring

Spam

filtering

Market Basket

Analysis

Recommendation

Engines

Disease

Outbreak

Prediction

Image

detection

Forecasting with

Trending

Anomaly

detection

Applications of Machine Learning

Page 13: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

SQL Server

enables

data mining

Computers

work on users

behalf, filtering

junk email

Microsoft Kinect

can watch users

gestures

Microsoft

launches

Azure Machine

Learning

Microsoft

search engine

built with

machine

learning

Bing Maps ships

with ML traffic-

prediction

service

Successful, real-

time, speech-

to-speech

translation

Microsoft & Machine Learning15 years of realizing innovation

John Platt, Distinguished scientist at

Microsoft Research

1999 201220082004 201420102005

Machine learning is pervasive throughout

Microsoft products.“

Page 14: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Microsoft’s ML Investments

Azure Machine

Learning Studio

SaaS on Azure

Project Sage &

Basket

SaaS on Azure

Microsoft Social

Listening

Sentiment Analysis

Microsoft Bing

Predictions

Search Engine

Cortana

Phone Assistant

Power BI

Forecasting

Prescriptive Analytics

XBOX

Intelligent Gaming

Page 15: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

What is Azure Machine Learning

Designed for Budding & Experienced

Data Scientists

Industry Standard Algorithms & Support

for R Language

Connectivity to HDI for Hadoop

Pay-per-use Predictive Analytics in

Microsoft Azure

Data Scientist Toolkit

Deploy Predictive models to Production

in minutes

Page 16: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

How does Azure ML make Data Science easy

Define

Problem

Extract &

Explore

Data

Feature

Engineering

Create

Model

Score

Model

Create

Apps with

Web

Service

Predict the

Future

Page 17: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

How it all works

The Environments The Team

Azure Portal

ML Studio

ML API service

Azure Ops Team

Data Scientists

Developers

Visual Studio BI Users & Roles

Page 18: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Microsoft AML Architecture for Analytics

PowerBI/Dashboards

Mobile Apps

Web Apps

HDInsight

Azure Storage

Desktop Data

ML Training ML Scoring

OData Data Consumption

Text Mining

Algorithms &

Scoring

Data Manipulation

& Transformation

Data Exploration

& Statistical

Analysis

Saving Trained

Model & Re-work

Trained Model

Consumption

Scoring

Experiment

Creation

AML Web Service

Creation

Model Testing in

ML Studio

App Creation &

Testing

Page 19: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

What is Predictive Analytics

Supervised Learning

History -> Prediction

Examples -> Labels

Probability

Statistics

Discrete/Continuous

Predict N-class

Algorithm Engineering

Data Engineering

Page 20: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Important parts of Prediction Datasets

Features

Target

Risks

Page 21: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Model Training Methods

Page 22: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Demo

Classification & Regression

Page 23: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with
Page 24: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Activation Fraud detection

Page 25: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Breast Cancer Prediction

Page 26: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Automobile Price Prediction

Page 27: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Demo

Time Series Prediction

Dataset Source: https://github.com/cmrivers/ebola

Page 28: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Ebola Prediction

Page 29: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

USD to INR conversion rate ARIMA/STL

Page 30: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Demo

Text Analytics

Page 31: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with
Page 32: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Tips for budding Data Scientists

Page 33: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Data Scientist Jobs

https://data-scientists-count.silk.co/explore/donutchart/collection/data-scientist-sector-breakdownv2/numeric/data-scientists/order/desc/data-scientists

Page 34: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

What Next for ML @ Microsoft

Page 35: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Q&A

Page 36: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

ReferencesRelated references for you to expand your knowledge on the subjecthttp://datamarket.azure.com/dataset/amla/recommendations

http://datamarket.azure.com/dataset/amla/mba

http://azure.microsoft.com/en-us/services/machine-learning/

http://en.wikipedia.org/wiki/Predictive_analytics

http://blogs.technet.com/b/machinelearning/archive/2014/09/17/extensibility-and-r-support-in-the-azure-ml-platform.aspx

http://blogs.technet.com/b/saketbi/archive/2014/08/20/microsoft-azure-ml-amp-r-language-extensibility.aspx

Predictive Analytics: The power to predict who will click, buy, lie, or Die

by Eric Siegel

technet.microsoft.com/en-in

aka.ms/mva

msdn.microsoft.com/

Page 37: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Follow us online

Facebookfacebook.com/MicrosoftDeveloper.India

twitter.com/msdevindia

Twitter

Twitter: #AMLTechEd2014

Email:[email protected]@microsoft.com

Page 38: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Your Feedback is Important

OPTION 3: Feedback stations outside the hall

Fill out evaluation of this session and help shape future events.

OPTION 1 OPTION 2

Replace this space with the

actual QR Code

Page 39: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with
Page 40: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with
Page 41: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with
Page 42: Learn. Connect. Explore. - Microsoftteched2013.blob.core.windows.net/techedpresentation2014/Azure Track B Day 1/How to...Learn. Connect. Explore. How to Become a Data Scientist with

Active learning

OLS

Matrix factorization

Neural net

Latent Dirichlet Allocation

loss functions

Stochastic gradient descent

BFGS

Conjugate gradient

L1 norm L2 norm elastic net regularization

Binary

Numerical

Categorical (via flexible feature-naming and the hash trick)

Can deal with missing values/sparse-features

On the fly generation of feature interactions (quadratic and cubic)

On the fly generation of N-grams with optional skips (useful for word/language data-sets)

Automatic test-set holdout and early termination on multiple passes

bootstrapping

User settable online learning progress report + auditing of the model

http://en.wikipedia.org/wiki/Vowpal_Wabbit