20
Copyright 2014, Quantia Analytics, LLC. All rights reserved Quantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC. 1 Copyright 2014, Quantia Analytics LLC. All rights reserved Quantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC. 1 QUANTIA ANALYTICS Using R on the Azure Platform Seattle UseR Group November 13, 2014 Stephen F Elston, Managing Director Quantia Analytics

Settle UseR 2014-10-13

Embed Size (px)

DESCRIPTION

The slides I used for a presentation to the Seattle UseR group, on using R with Microsoft Azure Machine Learning. The presentation covers and in-depth example of creating and evaluating predictive analytics model mixing native Azure ML modules and R code.

Citation preview

Page 1: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

1 Copyright 2014, Quantia Analytics LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

1

QUANTIA ANALYTICS

Using R on the Azure PlatformSeattle UseR Group

November 13, 2014

Stephen F Elston, Managing Director Quantia Analytics

Page 2: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

2

Bicycle rental data

Overview of Azure ML Studio

Running R in Azure ML

• Execute R Script module

• I/O

• Graphics and transformation example

Building and evaluating models in Azure ML

Building models in R for Azure ML

• Model and prediction example

• Serialization of R objects

- Evaluating and comparing models

Publishing models as a web service

Agenda

R code and data available at:

https://github.com/Quantia-Analytics/AzureML-Regression-Example/

Page 3: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

3

Capital Bicycle Rental Data

2 years of hourly data

• 17,379 rows

Forecast hourly demand for rental bikes

• Overestimate is preferred to underestimate

9 predictor variables

• 5 date-time

• 4 weather

• Compute 5 additional predictors

Short time series limits modeling

Citation: Fanaee-T, Hadi, and Gama, Joao, 'Event labeling

combining ensemble detectors and background knowledge',

Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin

Heidelberg

Page 4: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

4

Why Azure ML?

Quickly deploy solutions as web service

Run models in a highly scalable cloud environment

Code and data in a secure environment

Variety of efficient built-in algorithms

Azure ML workflow extensible with R!

Page 5: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

5

Overview of Azure ML Studio

Overview of Azure ML Studio

• Modules

• Properties and documentation

• Canvas and workflow

Starting an Experiment

Loading data sets

Reading and writing data

Page 6: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

6

Execute R Script Module in Azure ML

R in the Execute R Script

• Execute pure R code

• Each module is an R environment

• > 350 packages available + install script

• Parallel execution using snow and doSNOW packages

Working between R Studio and Azure ML

• Edit, debug and test in R Studio

• Run R and publish as a web service in Azure ML

Azure ML debugging tips

• Error messages and printed output available in the output.log file

Page 7: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

7

5 I/O Ports of Execute R Script Module (input)

Dataset1, Dataset2

• Input rectangular Azure data tables

• Produce R data frames in Execute R Script module

• Use maml.mapInputPort() function for ports 1 and 2

Script bundle

• Input zipped R files

• Useful for reference data, utility functions, etc.

• Use familiar source("src/yourfile.R") and load("src/yourData.rdata")

functions

Page 8: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

8

5 I/O Ports of Execute R Script Module (output)

• Results dataset

• Output an R data frame

• Produces Azure ML rectangular data table

• Use the maml.mapOutputPort() function

• R Device

• R printed and graphics output

Page 9: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

9

Workflow using Execute R Script

Workflow to visualize bike rental data

Page 10: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

10

Create and Evaluate an Azure ML Model

Workflow for machine learningData Input

Data Transformations

Model Definition

Train

Split

Score

Evaluate

See tutorials and examples at:

http://azure.microsoft.com/en-us/documentation/services/machine-learning/

Page 11: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

11

Passing R Objects Between Execute R Script Modules

Azure ML requires output and input to be rectangular tables

• Use R serialize() and unserialize() functions

• Serialize list of R objects as column of a data frame

• Coerce type to integer

• I provide serList() and unserList() functions to get you started

Provides flexibility

• List can contain most any R object

• E.g. separate model calculation from prediction

• WARNING: limit on size of R vector may be a problem

Page 12: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

12

Workflow integrating R scripts plus

Workflow for decision

forest model

Page 13: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

13

Workflow integrating R scripts plus

Workflow for decision

forest regression

model with outlier

trimming

Page 14: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

14

Workflow integrating R scripts plus

Workflow for

experiment with

neural network

regression model

added

Page 15: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

15

R models in Azure ML

Use serialization between model computation and prediction

Data Input

Data Transformations

Split

Evaluate

Compute Model

Serialize Object

Prediction

Unserialize Object

Execute R Script Modules

Page 16: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

16

Workflow integrating R scripts plus

Workflow for

experiment with R

support vector

machine model

added

Page 17: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

17

Publishing Azure ML Model as a Web Service

Publish input and output as web service

• Input to data transformation

• Output from prediction

Web services run in Azure cloud

• Azure platform for native and R models

• Include data cleaning and transformation modules in flow

Page 18: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

18

Publishing an Azure ML Model as a Web Service

Trained Model

Results Transformation

Data Transformations

PublishedInput

PublishedOutput

Page 19: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

19

Workflow integrating R scripts plus

Azure ML web

services workflow

ready for promotion to

production

Page 20: Settle UseR 2014-10-13

Copyright 2014, Quantia Analytics, LLC. All rights reservedQuantia and the Quantia logo are registered trademarks of Quantia Analytics, LLC.

20

Execute R Script module runs most any pure R code

R extends Azure ML capabilities

Clean and transform data with R

Use R or native Azure ML models

Integrate R and native modules in Azure ML

• Execute R Script modules in Azure ML work-flow

• Edit, debug and test R code in RStudio

• Execute in Azure ML

• Publish R model in Azure as web service

Summary