Joseph E. Gonzalez, [email protected]; Assistant Professor @ UC Berkeley
[email protected]; Co-Founder @ Dato Inc.
Intelligent Services: Serving Machine Learning
Contemporary Learning Systems
[Diagram: Big Data → Training → Big Models]
Examples: MLlib, GraphLab Create, MLC, LIBSVM, VW, Oryx 2, BIDMach
What happens after we train a model?
• Dashboards and reports
• Conference papers
• Drive actions
Suggesting items at checkout, fraud detection, cognitive assistance, the Internet of Things: low-latency, personalized, rapidly changing.
[Diagram: the loop Data → Train → Model → Serve → Actions → Adapt; Machine Learning becomes Intelligent Services]
The Life of a Query in an Intelligent Service
[Diagram: a user sends a request ("items like x") to the web serving tier, which issues a content request (new page, images, …) to the intelligent service. The service performs a model lookup (model info), feature lookups (user data, product info), and a top-K query, then returns the top items. User feedback (preferred item) flows back into the service.]
Essential Attributes of Intelligent Services
• Responsive: intelligent applications are interactive
• Adaptive: ML models are out-of-date the moment learning is done
• Manageable: many models created by multiple people
Responsive: Now and Always
Compute predictions in < 20 ms for complex models and queries, under heavy query load, with system failures.
[Diagram: models, queries, top-K, features; e.g. SELECT * FROM users JOIN items, click_logs, pages WHERE …]
Experiment: End-to-end Latency in Spark MLlib
[Bar chart, latency in milliseconds (axis 0–400): pipeline stages include HTTP request, feature transformation, model evaluation, prediction encoding, conversion to JSON, and HTTP response; measured values 4.3, 21.8, 22.4, 50.5, 137.7, 172.6, and 418.7 ms.]
Adaptive to Change at All Scales
• Rate of change: months ↔ minutes
• Granularity of data: population ↔ session (e.g., "shopping for Mom" vs. "shopping for me")
At the population level, the law of large numbers makes change slow:
• rely on efficient offline retraining
• high-throughput systems (timescale: months)
At the session level, data is small and rapidly changing:
• low-latency online learning
• sensitive to feedback bias
The Feedback Loop
I once looked at cameras on Amazon… and my Amazon homepage still shows similar cameras and accessories.
Opportunity for bandit algorithms. But bandits present new challenges:
• computational overhead
• complicates caching + indexing
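As a hypothetical illustration (not from the talk), one of the simplest bandit strategies, epsilon-greedy, shows where those challenges come from: each query may randomly explore, so the served result is no longer a deterministic function of the model and cannot be cached or indexed naively.

```python
import random

def epsilon_greedy(counts, rewards, epsilon=0.1):
    """Choose an arm (e.g., an item to show): with probability epsilon
    explore a random arm, otherwise exploit the best empirical mean reward."""
    if random.random() < epsilon:
        return random.randrange(len(counts))              # explore
    means = [r / c if c > 0 else 0.0 for r, c in zip(rewards, counts)]
    return max(range(len(means)), key=means.__getitem__)  # exploit

def record_feedback(counts, rewards, arm, reward):
    # Fold user feedback back into the statistics: the feedback loop.
    counts[arm] += 1
    rewards[arm] += reward
```

Even this toy version adds per-query sampling and per-arm statistics, i.e., the computational overhead and cache-unfriendliness the slide mentions.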
Management: Collaborative Development
• Teams of data scientists working on similar tasks
• "Competing" features and models
• Complex model dependencies
[Example: a cat photo feeds both a Cat Classifier (isCat) and an Animal Classifier (isAnimal), whose outputs feed a Cuteness Predictor (Cute!).]
Predictive Services
UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan
Active Research Project
Velox Model Serving System
Focuses on the multi-task learning (MTL) domain
[CIDR’15, LearningSys’15]
Example tasks: spam classification, content recommendation scoring, localized anomaly detection (modeled per session: Session 1, Session 2, …)
Velox Model Serving System [CIDR’15, LearningSys’15]
Personalized Models (Multi-task Learning)
• A "separate" model for each user/context maps input to output.
• Split each model into a shared feature model and a per-user personalization model.
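A minimal sketch of that split, with illustrative names (`phi` for the shared feature model with parameters `theta`, `w_u` for the per-user weights); the actual Velox model form may differ:

```python
import numpy as np

def phi(x, theta):
    # Shared feature model: one nonlinear transformation used by all users.
    return np.tanh(theta @ x)

def predict(x, w_u, theta):
    # The "separate" per-user model is just the user's personalization
    # weights w_u applied to the shared features.
    return float(w_u @ phi(x, theta))
```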
Velox Model Serving System [CIDR’15, LearningSys’15]
Hybrid Offline + Online Learning
Update the user weights online:
• simple to train + more robust model
• addresses rapidly changing user statistics
Update the feature functions offline using batch solvers:
• leverage high-throughput systems (Apache Spark)
• exploit slow change in population statistics
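A minimal sketch of the online half, assuming a linear personalization model over fixed shared features (illustrative names, not Velox's API):

```python
import numpy as np

def online_user_update(w_u, feat, y, lr=0.1):
    """One online SGD step (squared loss) on a user's personalization
    weights only; the shared features feat = phi(x) stay fixed online.
    The feature model itself is periodically re-fit offline with batch
    solvers (e.g., on Apache Spark)."""
    err = float(w_u @ feat) - y        # prediction error on this query
    return w_u - lr * err * feat       # gradient step on w_u alone
```

Updating only the small per-user weight vector per query is what keeps the online path cheap and robust, while the expensive feature model rides the slow population timescale.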
Hybrid Online + Offline Learning Results
[Plots: hybrid vs. full offline retraining: similar test error, substantially faster training; the hybrid approach also tracks user preference changes.]
Evaluating the Model
[Diagram: input → split → feature evaluation, with a cache in front of the feature evaluation stage.]
• Feature caching across users
• Anytime feature evaluation
• Approximate feature hashing
Feature Caching
• Feature hash table: hash the input, compute the feature, and cache the result under that hash.
• A new (slightly different) input hashes to a different key, so the cache misses; reusing an existing entry would mean using the wrong value.
LSH Cache Coarsening
• Replace the exact hash with a locality-sensitive hash (LSH) function, so similar inputs land in the same bucket of the feature hash table.
• Hash the new input: it hits the cached entry for a nearby input, and we use that value anyways.
• Locality-sensitive hashing → locality-sensitive caching: trade a small approximation error for a cache hit.
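A minimal sketch of LSH cache coarsening, using random-hyperplane (SimHash-style) hashing as the locality-sensitive hash; names and structure are illustrative, not Velox's implementation:

```python
import numpy as np

class LSHFeatureCache:
    """Cache expensive feature evaluations under a locality-sensitive hash,
    so nearby inputs reuse each other's cached values."""

    def __init__(self, feature_fn, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        # Fewer bits = coarser hash = bigger buckets = more (approximate) hits.
        self.planes = rng.standard_normal((n_bits, dim))
        self.feature_fn = feature_fn
        self.table = {}

    def _bucket(self, x):
        # Sign pattern against random hyperplanes: similar x -> same bucket.
        return tuple((self.planes @ x > 0).astype(int))

    def get(self, x):
        key = self._bucket(x)
        if key not in self.table:                # miss: compute and cache
            self.table[key] = self.feature_fn(x)
        return self.table[key]                   # hit: use the value anyways
```

The `n_bits` knob is the coarsening trade-off: too many bits behaves like exact hashing (no reuse), too few returns values computed for inputs that are not actually close.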
Anytime Predictions
Compute features asynchronously; if a particular element does not arrive in time, use an estimator instead.
Always able to render a prediction by the latency deadline.
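A minimal sketch of the anytime idea, assuming precomputed per-feature fallback estimators (e.g., each feature's expectation); illustrative, not Velox's implementation:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def anytime_predict(feature_fns, x, fallbacks, weights, deadline_s=0.02):
    """Evaluate features asynchronously; any feature that misses the latency
    deadline is replaced by its fallback estimator, so a prediction is always
    returned on time. (A real system would track one shared wall-clock
    deadline rather than a per-feature timeout.)"""
    pool = ThreadPoolExecutor(max_workers=max(1, len(feature_fns)))
    futures = [pool.submit(f, x) for f in feature_fns]
    feats = []
    for fut, fallback in zip(futures, fallbacks):
        try:
            feats.append(fut.result(timeout=deadline_s))
        except TimeoutError:
            feats.append(fallback)    # element did not arrive: use estimator
    pool.shutdown(wait=False)         # do not block on stragglers
    return sum(w * f for w, f in zip(weights, feats))
```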
[Plot: prediction quality ("better" on the y-axis) vs. hash coarseness (x-axis, with more features approximated by their expectation): no coarsening and overly coarsened hashing both underperform; coarsening + anytime predictions is best.]
Check out our poster!
Part of the Berkeley Data Analytics Stack
[Stack diagram: Spark Streaming, Spark SQL, GraphX, ML library, BlinkDB, and MLbase (training) alongside Velox (management + serving: Model Manager and Prediction Service), all on Spark; storage: HDFS, S3, …, Tachyon; resource management: Mesos.]
Dato Predictive Services
A production-ready model serving and management system:
• Elastic scaling and load balancing of docker.io containers
• AWS CloudWatch metrics and reporting
• Serves Dato Create models, scikit-learn, and custom Python
• Distributed shared caching: scale out to address latency
• REST management API
Key Insights:
• Responsive: latency vs. accuracy
• Adaptive: online/offline learning
• Manageable: caching, bandits, & management
[Diagram: the loop Data → Train → Model → Serve → Actions → Adapt]
The Future of Learning Systems
Thank You
Joseph E. Gonzalez, [email protected], Assistant Professor @ UC Berkeley
[email protected], Co-Founder @ Dato