Executive Briefing: What it takes to use machine learning ......Ka9a Streams Akka Streams …...

@deanwampler

Executive Briefing: What it takes to use machine learning in fast data pipelines

@deanwampler

@deanwampler dean@deanwampler.com polyglotprogramming.com/talks

@deanwampler@deanwampler

lbnd.io/fast-data-ebook

Data Streaming, in General

What We’ll Discuss• Batch vs. streaming... and why• Data science vs. data engineering• Serving models in production• CI/CD Systems for ML• Example architecture• Updating Models in Production

@deanwampler

Batch vs. streaming… and why

@deanwampler

EnergyMedical

Finance

… and IoT

Telecom

Mobile@deanwampler

State of the art phone!

EnergyMedical

Finance

… and IoT

Telecom

Mobile

Information value has a half life;

it decays with time

@deanwampler

Data Science vs. Data Engineering

@deanwampler

Software Engineering toolbox

Data Science toolbox

@deanwampler

• Comfortable with uncertainty • Less process oriented• Iterative, experimental

• Uncomfortable with uncertainty• Process oriented• Agile Manifesto• … which does not

mention data!https://derwen.ai/s/6fqt

@deanwampler

Data Scientists Data Engineers

Streaming Imposes New Requirements

@deanwampler

If you run something long enough, all rare problems

eventually happen!

@deanwampler

• Reliability - fault and “surprise” tolerant• Availability - “always on”• Low latency - for some definition of “low”• Scalability - up and down• Adaptability - ideally without restarts

@deanwampler

In other words: Microservices

Browse

AccountOrders

ShoppingCart

API Gateway

Inventory

Serving Models in Production

@deanwampler

A Recent Kubeflow User Survey

@deanwamplerFrom this Kubeflow Overview

• ~60% worry about missed opportunities• ~50% worry about loss of data team

productivity• ~45% worry about slow time-to-market• ~40% worry about customer dissatisfaction

Lack of Tool/Process Integration

@deanwamplerFrom a recent Lightbend survey

•Why did the model reject that loan application?

Can You Answer this Question?

@deanwampler

(After you’ve been sued for discrimination…)

•Which version of the model was used?• How was it trained? •When was this model deployed?•… and other questions you’ll need to answer to

understand what happened…

Which model was it?

@deanwampler

CI/CD for ML?

@deanwamplera.k.a “MLops”

• Version control - for models and code• Automation - builds, tests, quality checks,

artifact management & delivery• Necessary for reproducibility

CI/CD Process Required (1/4)

@deanwampler

• Supports different launch configurations: • “dark” launches• A/B, Canary, and other testing scenarios

CI/CD Process Required (2/4)

@deanwampler

• Auditing• Which model used to score this record?• Which records used to train this model?• Who accessed this model and when?

CI/CD Processes Required (3/4)

@deanwampler

Models Are

@deanwampler

GDPR - What if a customer asks you to delete their data? Do you also delete the

models trained with that data?

•Monitoring• Resource utilization changes?• Quality metrics:•Match performance during training?• Concept drift?

@deanwampler

• AutoML• Data safety and lineage•Model fairness and reproducibility•Model and feature artifact management

https://www.oreilly.com/ideas/9-ai-trends-on-our-radar@deanwampler

What’s Different from Microservice CI/CD?

@deanwampler

• CI/CD prefers deterministic measures of quality. How should you support the extra statistical indeterminacy data science introduces?

What’s Different from Microservice CI/CD?

•Kubeflow - for Kubernetes• SageMaker - for AWS users•MLFlow - from the Spark community•… plus emerging vendors

CI/CD Suites for ML

@deanwampler

Systems - Kubeflow

Example Architectures

@deanwampler

Example Architectures

@deanwampler

Timely Information Integrated with

Your Apps

@deanwamplerDataCenter

Mini-batch,Batch

LowLatencyMicroservices

Ka;aStreamsAkkaStreams

Sessions

Streams

Storage

Device

2. → Model Training3. ← New models

4. ← Model Storage5. → Boot up, historical data

6. → Data Pipeline 7. → Model Serving 8. ← Anomalies

6, 7, 89

MicroserviceMicroserviceMicroservice

DeviceSessionMicroservices

Persistence

Ka+a Cluster

BrokerTelemetry

Ingest Scores

CorrectiveAction

Mini-batch,Batch

Sessions

Streams

Storage

Device

6, 7, 89

Persistence

Ka+a Cluster

BrokerTelemetry

Ingest Scores

CorrectiveAction

Session management, REST microservices

Model Scoring

Model Training

Three groups of functionality

Mini-batch,Batch

Sessions

Streams

Storage

Device

6, 7, 89

Persistence

Ka+a Cluster

BrokerTelemetry

Ingest Scores

CorrectiveAction

Embedded model serving

REST,gRPC,…

LowLatency

Ka9aStreamsAkkaStreams

Sessions

Streams

Storage

Device

Telemetry1

2. → Model Training

5. → Data Pipeline 7. ← Anomalies

5, 7Ingest Scores8

CorrectiveAction

TensorFlow,…

Persistence

Ka+a Cluster

Broker 6Model Serving

Model serving as a service

@deanwampler

TensorFlow Serving

@deanwampler

DataCenter

REST,gRPC,…

LowLatency

Sessions

Streams

Storage

Device

Telemetry1

5, 7Ingest Scores8

CorrectiveAction

TensorFlow,…

Persistence

Ka+a Cluster

Model Serving as a Service• Pros:• A familiar integration

pattern • Decouples “concerns”: AI

tools, scaling, upgrading, …• One system for training

and scoring

@deanwampler

• Cons:• Overhead of invocation,

e.g., REST• ML Pipeline becomes a

unique production work flow

Model Serving as a Service

DataCenter

REST,gRPC,…

LowLatency

Sessions

Streams

Storage

Device

Telemetry1

5, 7Ingest Scores8

CorrectiveAction

TensorFlow,…

Persistence

Ka+a Cluster

@deanwampler

DataCenter

Mini-batch,Batch

Sessions

Streams

Storage

Device

6, 7, 89

Persistence

Ka+a Cluster

BrokerTelemetry

Ingest Scores

CorrectiveAction

Embedded Model Serving• Pros:• Lowest scoring

overhead - interprocess communication only used for model updates• Performance tuning

focuses on one system, the data pipeline

@deanwampler

Embedded Model Serving• Cons:• Model parameters must

be serialized• More complexity• Model serving library

must be “compatible” with training system

DataCenter

Mini-batch,Batch

Sessions

Streams

Storage

Device

6, 7, 89

Persistence

Ka+a Cluster

BrokerTelemetry

Ingest Scores

CorrectiveAction

Updating Models in Production @deanwampler

@deanwampler

• Concept Drift - models grow stale• They have a half life, too• So, periodically retrain, then serve the new

model, ideally without downtime

Model Updates

@deanwampler

• How do you measure model quality?• What’s the trade-off between model

performance vs. retraining cost?• How far back in the data set do you go

when training?

Retraining Considerations

@deanwampler

DataCenter

Mini-batch,Batch

Sessions

Streams

Storage

Device

6, 7, 89

Persistence

Ka+a Cluster

Broker1. Telemetry

9. Ingest Scores

10. Corrective Action

Complex to update embedded models!

DataCenter

REST,gRPC,…

LowLatency

Sessions

Streams

Storage

Device

1. Telemetry1

5, 78. Ingest Scores8

9. Corrective Action9

TensorFlow,…

Persistence

Ka+a Cluster

Broker 66. Model Serving

Model updates can be straightforward

@deanwampler

• Kind of Model• Parameters and hyperparameters•When trained• Data used for training•When deployed, undeployed, etc.•…

Auditing

@deanwampler

• Quality metrics• Serving metrics (how many records, scoring

times…)• Provenance of decision to retrain• The metrics gathered above that were used to

decide when to retrain

Auditing

@deanwamplerMars last Summer

Dusty Milky Way

References• Ideas:• Ben Lorica on 9 AI Trends• Paco Nathan’s Data Governance Talk

@deanwampler

You can get these slides with the links here: polyglotprogramming.com/talks

References• Ideas:• O’Reilly Radar: Data, AI, others• distill.pub• The Algorithm• The Gradient

@deanwamplerpolyglotprogramming.com/talks

References• A few research papers, etc.• Incremental training • an example• Continual learning• Explainability

References• Kubeflow•MLFlow• DVC • AWS SageMaker• Fiddler (explainable AI)

References• General Information about Stream Processing• My O'Reilly Report on Architectures• Streaming Systems Book• Stream Processing with Apache Spark• Designing Data-Intensive Apps book

References• Other Talks• Strata Talk on ML in a Streaming Context• Stream All the Things! (video)• Streaming Microservices with Akka Streams

and Kafka Streams (video)

References• Tutorials• Model serving in streams• Stream processing with Kafka and

microservices

@deanwampler dean@deanwampler.com

polyglotprogramming.com/talks

Questions?

Executive Briefing: What it takes to use machine learning ......Ka9a Streams Akka Streams …...

Documents

Reactive Summit, Montreal QC, 2018 October 24 Making Music … · 2018-10-30 · A Quick Review Of Akka Streams Streams support continuous processing of data in a pipeline Akka Streams

Kiki, principal enterprise architect @Lightbend · Akka Streams Akka Cluster Akka Http. Scalability Cube: 3 dimensions of scalability Y-axis: scale via functional compositi on x-axis:

A reactive platform for real-time asset trading case study ...reactive+platform...scala, akka persistence, akka cluster, akka streams, cassandra (dse), apache kafka, gatling, prometheus,

Architecting Fast Data Applications - TechArch Day · Kafka Storage Akka Streams Kafka Streams... Low Latency Microservices Spark Data Center Streaming, Batch Processing Device Session

Reactive Kafka with Akka Streams

Akka streams kafka kinesis

Akka Streams - From Zero to Kafka

Distributed Data Analytics Thorsten Papenbrock Akka Actor ... · Akka Streams Asynchronous, non-blocking, backpressured, reactive stream classes Akka Http Asynchronous, streaming-first

Journey into Reactive Streams and Akka Streams

Akka streams

Akka HTTP: The Reactive Web Toolkitdownloads.typesafe.com/.../ScalaDaysSF2015/T1_Kuhn_Akka_Strea… · • reactive-streams 1.0.0-RC3 • Akka Streams & HTTP 1.0-M4 • still missing:

Delivering Transformative Reactive systems on OpenShift ... · Why Akka Streams? Akka Streams is a DSL on top of Akka. Akka uses the actor model to create stateful services. Actors

Understanding Akka Streams, Back Pressure, and Asynchronous Architectures

Reactive Stream Processing with Akka Streams

Trends Aﬀecting System Architectures: Mobile …...Ka;a Streams Akka Streams … Sessions Streams Storage Device 1 Telemetry 2. → Model Training 3. ← New models 2, 3 4. ← Model

Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka

Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka

Fresh from the Oven (04.2015): Experimental Akka Typed and Akka Streams

Building Kafka-based Microservices with Akka Streams and ... · Akka Streams, Kafka Streams - libraries for “data-centric micro services”. Smaller scale, but great flexibility

Reactive Streams / Akka Streams - GeeCON Prague 2014