DEEP LEARNING w/ Apache SystemML Mike Dusenberry Engineer, Machine Learning & SystemML IBM Spark Technology Center, SF @dusenberrymw Berkeley - 08.24.16

Deep Learning - UC Berkeley - 08.24.16 MWD


Agenda

1. Background
    a. Apache Spark
    b. Machine Learning
    c. Declarative Machine Learning
2. Apache SystemML
3. Deep Learning
    a. Overview
    b. Plans
    c. SystemML-NN
4. Demo
5. Questions

Apache Spark

● System for large-scale data processing on clusters.
● Combines ML, SQL, streaming, and other complex analytics.
● Extends Scala idioms, as well as R/Python DataFrame idioms, to cluster computing.
● APIs for Scala, Java, Python, R.
● Simple to use!
● Much more information at https://spark.apache.org/.

Machine Learning

● Data
    ○ Multiple “examples”
    ○ Multiple “features” per “example”
    ○ “Label(s)” for each “example” (supervised)
● Model
    ○ Construct/select a model that fits the problem.
    ○ Examples:
        ■ Linear/Logistic Regression
        ■ SVM
        ■ Neural Networks
● Loss
    ○ An “evaluation” of how well the model fits the data.
● Optimizer
    ○ Minimize “loss” by adjusting model to better fit the data.

- A Neural Algorithm of Artistic Style, L.A. Gatys, A.S. Ecker, M. Bethge
- https://github.com/jcjohnson/neural-style
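The four ingredients above (data, model, loss, optimizer) can be illustrated with a small sketch. This is not code from the talk: the toy dataset, the function names, and the step count and learning rate are all assumptions chosen for illustration, and plain batch gradient descent stands in for whichever optimizer one would actually use.

```python
import math

# Data: four examples with two features each and one binary label
# (toy values assumed for illustration: the logical AND function).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 0.0, 0.0, 1.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Model: logistic regression with weights w and bias b.
def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Loss: mean log-loss over the dataset (eps guards against log(0)).
def loss(w, b):
    eps = 1e-12
    total = 0.0
    for x, t in zip(X, y):
        p = predict(w, b, x)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(X)


# Optimizer: batch gradient descent on the mean log-loss.
def train(steps=2000, lr=0.5):
    w, b = [0.0, 0.0], 0.0
    for _ in range(steps):
        gw, gb = [0.0, 0.0], 0.0
        for x, t in zip(X, y):
            err = predict(w, b, x) - t  # d(log-loss)/d(logit)
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


w, b = train()
```

After training, `loss(w, b)` should be lower than the loss at the zero initialization, and rounding `predict(w, b, x)` for each example should recover the labels.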


Declarative Machine Learning

Exploratory Data Analysis Today

[Figure: diagram. Labels: Laptop; Data Scientist; Data; R, Python, Others.]


Current Best Practice for Big Data Analysis

[Figure: diagram. Labels: Data Scientist (×3); Hadoop Engineer, Spark Engineer, MPI Engineer; R, Python, Others.]

Vision: Declarative Machine Learning

[Figure: diagram. Labels: Data Scientist; R, Python, Others; Query Optimization; Laptop; Scale-up; Cluster.]

Apache SystemML

● High-level language
    ○ DML -> R-like
    ○ PyDML -> Python-like
    ○ Focus is on matrices and linear algebra.
● Engine
    ○ Compiler/Optimizer
    ○ Lots of optimizations, such as rewrites.
● Runtime
    ○ Laptop
    ○ Spark
    ○ (also Hadoop)

[Figure: diagram. Labels: (DML), (PyDML), Engine.]


SystemML - Example: Logistic Regression (DML)


SystemML - Example: Sigmoid Function (DML)


SystemML - Compilation Chain


SystemML - Architecture (APIs and runtime)

[Figure: architecture diagram. Labels:
    APIs: Command Line, JMLC, Spark MLContext, Spark ML
    Compiler: Parser/Language, High-Level Operators (HOPs), Low-Level Operators (LOPs), Cost-based optimizations
    Runtime: Control Program, Runtime Prog, Buffer Pool, ParFor Optimizer/Runtime, Recompiler, CP Inst, Spark Inst, MR Inst, Mem/FS IO, DFS IO, Generic MR, MatrixBlock Library (single/multi-threaded)]


SystemML - Spark API (Python)

Deep Learning

● Subfield of machine learning.
● Essentially focused on creating large, complex, nonlinear functions to map from inputs to predictions, and in the process learn complex representations of the data.
● Key: These complex functions are built through a deep composition of simple, modular units.
● = Neural Networks

- A Neural Algorithm of Artistic Style, L.A. Gatys, A.S. Ecker, M. Bethge
- https://github.com/jcjohnson/neural-style

Deep Learning - Neural Networks

● Class of models
● Composition of simple, modular units, including nonlinear units.
● Example units:
    ○ Core:
        ■ Affine (fully-connected)
        ■ Convolution (1D, 2D, 3D)
        ■ Pooling (max, average)
    ○ Nonlinearity/Transfer:
        ■ Sigmoid, Tanh, Softmax, ReLU
    ○ Regularization:
        ■ Dropout, L1, L2
    ○ Loss:
        ■ Log-loss, Cross-entropy, L1, L2

http://cs231n.github.io/
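To make "composition of simple, modular units" concrete, here is a minimal sketch of a forward pass that chains an affine unit, a ReLU, a second affine unit, and a softmax, followed by a cross-entropy loss. The function names and the toy parameter values are hypothetical, chosen only for illustration; each function handles a single example as a plain Python list.

```python
import math


def affine(x, W, b):
    """Fully-connected unit: computes x @ W + b for one example x."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]


def relu(x):
    """Nonlinearity: element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Nonlinearity: turns scores into probabilities (shifted for stability)."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Loss: negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


def network(x, params):
    """The model is just a composition of the units above."""
    W1, b1, W2, b2 = params
    return softmax(affine(relu(affine(x, W1, b1)), W2, b2))


# Hypothetical toy parameters: 2 inputs -> 3 hidden units -> 2 classes.
params = (
    [[1.0, -1.0, 0.5], [0.5, 1.0, -1.0]],     # W1: 2 x 3
    [0.0, 0.0, 0.0],                          # b1
    [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]],   # W2: 3 x 2
    [0.0, 0.0],                               # b2
)
probs = network([1.0, 2.0], params)
loss_value = cross_entropy(probs, 0)
```

Swapping a unit (for example Tanh in place of ReLU) changes only one call in `network`, which is the practical benefit of keeping the units modular.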

Deep Learning - Convolutional Neural Networks

● State-of-the-art model class for computer vision tasks.
    ○ Classification
    ○ Retrieval
    ○ Detection
    ○ Segmentation
    ○ Playing Go
● The architecture assumes that the inputs are images.

http://cs231n.github.io/


More Fun...

https://github.com/google/deepdream

Deep Learning - SystemML-NN Library

● Deep learning library written in DML.
● Multiple layers:
    ○ Core:
        ■ Affine, 2D Convolution, Max Pooling, RNN, LSTM
    ○ Nonlinearity/Transfer:
        ■ Sigmoid, Tanh, Softmax, ReLU
    ○ Regularization:
        ■ Dropout, L1, L2
    ○ Loss:
        ■ Log-loss, Cross-entropy, L1, L2
● Multiple optimizers:
    ○ SGD, SGD w/ momentum, SGD w/ Nesterov momentum, Adagrad, RMSprop, Adam

https://github.com/dusenberrymw/systemml-nn
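As an illustration of what two of the listed optimizers do, the sketch below writes out the standard update rules for SGD and SGD with momentum in Python. It is not the library's DML code: the function names, argument order, and the toy objective f(w) = (w - 3)^2 are assumptions made for this example.

```python
def sgd_update(w, dw, lr):
    """Plain SGD: step against the gradient."""
    return w - lr * dw


def sgd_momentum_update(w, dw, lr, mu, v):
    """SGD with momentum: v accumulates a decaying sum of past steps."""
    v = mu * v - lr * dw
    return w + v, v


def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2 (assumed for illustration)."""
    return 2.0 * (w - 3.0)


w_sgd = 0.0
w_mom, v = 0.0, 0.0
for _ in range(500):
    w_sgd = sgd_update(w_sgd, grad(w_sgd), lr=0.1)
    w_mom, v = sgd_momentum_update(w_mom, grad(w_mom), lr=0.1, mu=0.9, v=v)
```

Both runs should approach the minimizer w = 3. The momentum variant carries extra state (`v`) between calls, which is why its update returns two values.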

Deep Learning - SystemML-NN Library (cont.)

https://github.com/dusenberrymw/systemml-nn

● Each layer type has a simple `forward(...)` and `backward(...)` API.
    ○ `forward(...)` computes the output of the function based on the inputs.
    ○ `backward(...)` computes the partial derivatives (gradient) of some function deeper in the network (usually the loss function at the end) w.r.t. the inputs to the function.
● Each optimizer has a simple `update(...)` API.
    ○ `update(...)` adjusts the given parameters based on their partial derivatives.
● Includes test code in DML.
    ○ Gradient checks, unit tests
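The `forward(...)`/`backward(...)` pattern and the idea of a gradient check can be sketched in Python for an affine layer followed by an L2 loss. This is an illustrative sketch and not the library's DML code: the function names, signatures, and toy values are assumptions, and each function handles a single example as a plain list.

```python
def affine_forward(x, W, b):
    """Output of the layer: x @ W + b."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def affine_backward(dout, x, W, b):
    """Given dLoss/dout, return dLoss/dx, dLoss/dW, dLoss/db."""
    dx = [sum(dout[j] * W[i][j] for j in range(len(dout)))
          for i in range(len(x))]
    dW = [[x[i] * dout[j] for j in range(len(dout))] for i in range(len(x))]
    db = list(dout)
    return dx, dW, db


def l2_loss_forward(pred, y):
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, y))


def l2_loss_backward(pred, y):
    """dLoss/dpred for the L2 loss above."""
    return [p - t for p, t in zip(pred, y)]


def numeric_grad_W(x, W, b, y, i, j, h=1e-5):
    """Central-difference estimate of dLoss/dW[i][j], used as a gradient check."""
    W[i][j] += h
    loss_plus = l2_loss_forward(affine_forward(x, W, b), y)
    W[i][j] -= 2 * h
    loss_minus = l2_loss_forward(affine_forward(x, W, b), y)
    W[i][j] += h  # restore the original value
    return (loss_plus - loss_minus) / (2 * h)


# Toy values, assumed for illustration.
x = [1.0, -2.0]
W = [[0.5, -0.3], [0.2, 0.8]]
b = [0.1, -0.1]
y = [1.0, 0.0]

out = affine_forward(x, W, b)               # forward through the layer
dout = l2_loss_backward(out, y)             # gradient arriving from the loss
dx, dW, db = affine_backward(dout, x, W, b)  # backward through the layer
```

A gradient check then compares each analytic entry `dW[i][j]` with `numeric_grad_W(x, W, b, y, i, j)`; the two should agree to within a small tolerance.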


Deep Learning - SystemML-NN Library (cont.)

[Figure: diagram. Labels: SystemML-NN; SystemML Engine; Keras, Torch, TensorFlow, Caffe.]

Agenda Revisited

1. Background
    a. Apache Spark
    b. Machine Learning
    c. Declarative Machine Learning
2. Apache SystemML
3. Deep Learning
    a. Overview
    b. Plans
    c. SystemML-NN
4. Demo
5. Questions


Questions?

Links

● SystemML:
    ○ Website: systemml.apache.org
    ○ Code: github.com/apache/incubator-systemml
    ○ Deep Learning Library: github.com/dusenberrymw/systemml-nn
    ○ Email: [email protected]
● Contact:
    ○ Twitter: @dusenberrymw
    ○ GitHub: github.com/dusenberrymw
    ○ Email: [email protected]


Thanks!