Machine learning services with SQL Server 2017


Machine Learning Services with SQL Server 2017

MARK TABLADILLO PH.D.

LEAD DATA SCIENTIST, MICROSOFT

JULY 31, 2017

Microsoft

https://marketrealist.imgix.net/uploads/2017/07/Microsoft-Shares-Are-at-an-All-Time-High-2017-07-24.jpg?w=660&fit=max&auto=format

Microsoft and Open Source

SQL Server 2017 on Linux

Nearly 1/3 of virtual machines (IaaS) on Azure are Linux: https://news.microsoft.com/bythenumbers/azure-virtual

Purchase of Revolution Analytics (Revolution R)

R Distribution Microsoft R Client

R inside Azure Machine Learning, Power BI, SQL Server, Jupyter

Python inside Azure Machine Learning, SQL Server, Jupyter

Cloud Shell In Azure (preview): yes, we mean Bash

https://azure.microsoft.com/en-us/features/cloud-shell/

Microsoft now the leading contributor on GitHub

Focus

1) to describe major features of this technology for technology managers;

2) to outline use cases for architects; and

3) to provide demos for developers and data scientists.

SQL Server 2017: MAJOR FEATURES

Gartner Review October 2016

SQL Server on Linux

Possible with Drawbridge

Over 1M Docker Downloads

Whitepaper on Linux

https://info.microsoft.com/SQL-Server-on-Linux-Open-source-enterprise-environment.html

Video – Overview of SQL Server on Linux

https://channel9.msdn.com/events/connect/2016/101

Microsoft Release Acronyms

CTP: Community Technology Preview

RC: Release Candidate

Versions of Microsoft SQL Server

https://docs.microsoft.com/en-us/sql/sql-server/editions-and-components-of-sql-server-2017

Enterprise

Many data scientists will use the free Developer edition (not intended for production)

Since we are still at RC (Release Candidate):

Free 180 day evaluation version (Enterprise equivalent)

Windows Docker image

Linux Docker image

https://www.microsoft.com/en-us/evalcenter/evaluate-sql-server-2017-ctp

Data Science & AI Certifications

https://borntolearn.mslearn.net/b/weblog/posts/microsoft-introduces-several-new-data-management-amp-analytics-certifications

Team Data Science Process https://github.com/Azure/Microsoft-TDSP

What is R?

#1 language for advanced analytics

2.5M+ users

Open, with the biggest ecosystem

• A statistics programming language

• Data analysis & visualization capabilities

• Majority of data scientists use R

• Thriving user groups worldwide

• Vibrant open source community

• 10,000+ free algorithms in CRAN

• New and recent grads use it

• Strong ties to academia feed ever-growing machine learning capabilities

• Constantly innovating

but, Open Source R is not Enterprise Class

76% of analytic professionals report using R

36% select R as their primary tool

R Usage Growth: Rexer Data Miner Survey, 2007-2015

• Inadequate modeling performance

• Lack of commercial support

• Complex deployment processes

• Limited data scale

Our data science tool that allows you to do high performance analytics on production data, running locally on your computer.

https://microsoft.github.io/r-server-loan-chargeoff/index.html

https://docs.microsoft.com/en-us/sql/advanced-analytics/getting-started-with-machine-learning-services

OPERATIONALIZATION

Classified as Microsoft Confidential

• Turn R analytics into web services in one line of code

• Swagger-based REST APIs, easy to consume with any programming language, including R

• Deploy the web service server to any platform: Windows, SQL, Linux/Hadoop

• On-prem or in cloud

• Fast scoring, real time and batch

• Scaling to a grid for powerful computing with load balancing

• Diagnostic and capacity evaluation tools

• Enterprise authentication: AD/LDAP or AAD

• Secure connection: HTTPS with SSL/TLS 1.2

• Enterprise grade high availability


Data Scientist

Developer

Easy Integration

Easy Deployment

Easy Setup

▪ In-cloud or on-prem

▪ Adding nodes to scale

▪ High availability & load balancing

▪ Remote execution server

Microsoft R Server configured for operationalizing R analytics

Microsoft R Client (mrsdeploy package)

Data Scientist

Easy Consumption

publishService

Microsoft R Client (mrsdeploy package)

Build the model first

Deploy as a web service instantly

Function        Description
publishService  Publish a predictive function as a web service
deleteService   Delete a web service
getService      Get a web service
listServices    List the different published web services
serviceOption   Retrieve, set, and list the different service options
updateService   Update a web service

Data Scientist

# Run the following code in R
# `api` is the service object returned by publishService() or getService()
swagger <- api$swagger()
cat(swagger, file = "swagger.json", append = FALSE)

Generate Swagger Docs for Web Services

Developer

Popular Swagger Tools: AutoRest or Code Generator

AutoRest.exe -CodeGenerator CSharp -Modeler Swagger -Input swagger.json -Namespace Mynamespace

Run Swagger tools to generate code

Developer

Write a few lines of code to consume the service
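As a rough illustration of what those few lines might look like for a client that does not use generated code, the Python sketch below builds (but does not send) the two HTTP requests a consumer would typically need: one to log in and one to call a published service. The host name, the `/login` and `/api/<name>/<version>` paths, the bearer-token header, the service name `loanPredictService`, and the `loan_id` input are all assumptions made for illustration; the swagger.json generated for the actual service is the authority on routes and payloads.

```python
import json
import urllib.request

BASE_URL = "https://rserver.example.com:12800"  # hypothetical host and port


def build_login_request(username, password):
    """Build (but do not send) a login request; the /login path is an assumption."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_score_request(token, service, version, inputs):
    """Build (but do not send) a scoring request for a published service."""
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/{service}/{version}",  # assumed route shape
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical service name, version, and input field.
req = build_score_request("TOKEN", "loanPredictService", "v1.0.0", {"loan_id": 42})
print(req.get_method(), req.full_url)
```

Passing either request object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.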


Share / Reuse R code / functions

• Not just models, a data scientist can share any functional code as a service.

• Other data scientists can explore the repository to re-use those functions.

Enable Model Management capabilities

• A Predictive Web Service = “Model” + “Prediction Script”

• R Server hosts all those services → Central repo of models

• Each service has a version tag → Model version control

• All versions are active → Model roll back (to any version)

• A service can be accessed by any authorized user

• Model reuse

• Model validation and monitoring by QA team
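The model-management idea above (a service is a model plus a prediction script, every version stays active, and rolling back means calling an older version) can be pictured with a toy in-memory registry. This is only a conceptual sketch in Python, not how R Server stores services; the class `ServiceRegistry`, the service name `riskScore`, and the linear "model" are hypothetical.

```python
class ServiceRegistry:
    """Toy in-memory stand-in for a central repository of versioned services."""

    def __init__(self):
        self._services = {}  # name -> {version: (model, predict_fn)}

    def publish(self, name, version, model, predict_fn):
        """Store a model together with its prediction script under a version tag."""
        versions = self._services.setdefault(name, {})
        if version in versions:
            raise ValueError(f"{name} {version} is already published")
        versions[version] = (model, predict_fn)

    def versions(self, name):
        """List every version tag that is still active for a service."""
        return sorted(self._services[name])

    def score(self, name, version, inputs):
        """Call one specific version of a service."""
        model, predict_fn = self._services[name][version]
        return predict_fn(model, inputs)


def linear_predict(model, x):
    # Hypothetical prediction script: a one-variable linear model.
    return model["slope"] * x + model["intercept"]


registry = ServiceRegistry()
registry.publish("riskScore", "v1", {"slope": 2.0, "intercept": 1.0}, linear_predict)
registry.publish("riskScore", "v2", {"slope": 3.0, "intercept": 0.0}, linear_predict)

latest = registry.score("riskScore", "v2", 10)       # newest version
rolled_back = registry.score("riskScore", "v1", 10)  # earlier version is still callable
```

Because nothing is overwritten on publish, "roll back" needs no special operation: a consumer simply requests the earlier version tag.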

After the service is published, I can test right away whether it works as expected.

▪ Built-in remote execute functions in R Client/R Server

▪ Generate Diff report to reconcile local and remote

▪ Execute .R script or interactive R commands

▪ Results come back to local

▪ Generate working snapshots for resume and reuse

▪ IDE agnostic

▪ IDE agnostic

R Client (mrsdeploy package)

R Server configured to remotely execute R scripts (supports Windows Server, Linux Server, Hadoop)

▪ Execute R scripts

▪ Snapshot remote environment

▪ Log out of remote server

▪ Log in to remote server

▪ Generate Diff report

▪ Reconcile environment

Snapshot Functions

createSnapshot      Create a snapshot of the remote session (workspace and working directory)
loadSnapshot        Load a snapshot from the server into the remote session (workspace and working directory)
listSnapshots       Get a list of snapshots for the current user
downloadSnapshot    Download a snapshot from the server
deleteSnapshot      Delete a snapshot from the server

Remote Objects Management

listRemoteFiles     Get a list of files in the working directory of the remote session
deleteRemoteFile    Delete a file from the working directory of the remote R session
getRemoteFile       Copy a file from the working directory of the remote R session
putLocalFile        Copy a file from the local machine to the working directory of the remote R session
getRemoteObject     Get an object from the remote R session
putLocalObject      Put an object from the local R session and load it into the remote R session
getRemoteWorkspace  Take all objects from the remote R session and load them into the local R session
putLocalWorkspace   Take all objects from the local R session and load them into the remote R session

Remote Connection

remoteLogin         Remote login to the R Server with AD or admin credentials
remoteLoginAAD      Remote login to R Server using Azure AD
remoteLogout        Log out of the remote session on the DeployR Server

Remote Execution

remoteExecute       Remote execution of either R code or an R script
remoteScript        Wrapper function for remote script execution
diffLocalRemote     Generate a 'diff' report between local and remote
pause               Pause the remote connection and go back to local
resume              Return the user to the 'REMOTE >' command prompt


[Diagram: operationalization options across the Prepare, Model, and Operationalize stages.
Cloud: R & ScaleR models, CRAN R models; AzureML web services; R Server VMs.
On-prem (data in SQL, HDFS; R & ScaleR models): deploy to SQL Server 2016 and operationalize with T-SQL/stored procedures, or deploy to Hadoop / Linux Server / Windows Server and operationalize with R Server.]

Product               Platforms (modeling)                                         Platforms (operationalization)
R Server for Windows  Windows Server 2012 - 2016                                   Same as modeling
R Server for Linux    Red Hat Enterprise Linux 6.x and 7.x                         7.x
R Server for Linux    SUSE Enterprise SLES 11                                      Will be supported in a future release
R Server for Linux    Ubuntu 14.04 LTS, 16.04 LTS                                  Same as modeling
R Server for Linux    CentOS 6.x and 7.x                                           7.x
R Server for Hadoop   Red Hat and SUSE Enterprise: RHEL 6.x and 7.x, SUSE SLES 11  RHEL 7.x


• Easily scale up a single server to a grid to handle more concurrent requests

• Load balancing across compute nodes

• A shared pool of warmed-up R shells to improve scoring performance

R Client
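To make the "shared pool of warmed-up shells" idea concrete, here is a minimal Python sketch of the general pattern: workers are started once, before any traffic arrives, and each request borrows an idle worker and returns it afterwards. It illustrates the pooling technique only, under the assumption that a shell can be modelled as a reusable object; `WarmShell`, `ShellPool`, and the doubling "score" are hypothetical stand-ins, not R Server internals.

```python
import queue


class WarmShell:
    """Stand-in for a pre-started R shell; start-up cost is paid once, up front."""

    def __init__(self, shell_id):
        self.shell_id = shell_id
        self.calls = 0

    def score(self, x):
        self.calls += 1
        return x * 2  # placeholder for running the prediction script


class ShellPool:
    """Fixed-size pool that hands out idle shells and takes them back."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for i in range(size):
            self._idle.put(WarmShell(i))  # warm every shell before traffic arrives

    def score(self, x):
        shell = self._idle.get()  # blocks if every shell is busy
        try:
            return shell.score(x)
        finally:
            self._idle.put(shell)  # return the shell for the next request


pool = ShellPool(size=3)
results = [pool.score(x) for x in range(5)]
```

Since `queue.Queue` is thread-safe, the same pool could be shared by several request-handling threads; the pool size then caps how many requests are scored at once.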


• Health check node configuration

• Get system status

• Trace R code execution

• Trace service execution

• Evaluate grid capacity

• Simulate traffic per service

• Configure with # of concurrent threads or latency thresholds


• Seamless integration with authentication solutions: LDAP/AD/AAD

• Secure connection: HTTPS encrypted by TLS 1.2/SSL

• Compliance with Microsoft Security Development Lifecycle

R Client

Load Balancer

• Server-level HA: introduce multiple web nodes for active-active backup / recovery, via load balancer

• Data store HA: leverage the HA capabilities of an enterprise-grade database (SQL Server or Postgres)

Connect

LinkedIn

SlideShare

Twitter @marktabnet

Abstract

SQL Server 2017 introduces Machine Learning Services with two independent technologies: R and Python. The purpose of this presentation is 1) to describe major features of this technology for technology managers; 2) to outline use cases for architects; and 3) to provide demos for developers and data scientists.
