Deep Learning - UC Berkeley - 08.24.16 MWD

DEEP LEARNING w/ Apache SystemML

Mike Dusenberry
Engineer, Machine Learning & SystemML

IBM Spark Technology Center, SF
@dusenberrymw

Berkeley - 08.24.16

Agenda

1. Background
  a. Apache Spark
  b. Machine Learning
  c. Declarative Machine Learning
2. Apache SystemML
3. Deep Learning
  a. Overview
  b. Plans
  c. SystemML-NN
4. Demo
5. Questions

Apache Spark

● System for large-scale data processing on clusters.
● Combines ML, SQL, streaming, and other complex analytics.
● Extends Scala idioms, as well as R/Python DataFrame idioms to cluster computing.
● APIs for Scala, Java, Python, R.
● Simple to use!
● Much more information at https://spark.apache.org/.

Machine Learning

● Data
  ○ Multiple “examples”
  ○ Multiple “features” per “example”
  ○ “Label(s)” for each “example” (supervised)
● Model
  ○ Construct/select a model that fits the problem.
  ○ Examples:
    ■ Linear/Logistic Regression
    ■ SVM
    ■ Neural Networks
● Loss
  ○ An “evaluation” of how well the model fits the data.
● Optimizer
  ○ Minimize “loss” by adjusting model to better fit the data.

- A Neural Algorithm of Artistic Style, L.A. Gatys, A.S. Ecker, M. Bethge
- https://github.com/jcjohnson/neural-style
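The four ingredients above (data, model, loss, optimizer) can be sketched in a few lines of plain Python. This is only an illustration under assumed choices: the toy data set, the linear model, the mean-squared-error loss, the learning rate, and all function names are hypothetical and do not come from the slides.

```python
# Data: four examples, one feature each, with a numeric label (supervised).
X = [[1.0], [2.0], [3.0], [4.0]]
y = [3.0, 5.0, 7.0, 9.0]  # toy labels generated from y = 2*x + 1

def predict(w, b, x):
    """Model: linear regression, one weight per feature plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(w, b):
    """Loss: mean squared error of the model over the data."""
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(X, y)) / len(X)

def gradient_step(w, b, lr):
    """Optimizer: one step of batch gradient descent on the loss."""
    n = len(X)
    errors = [predict(w, b, x) - t for x, t in zip(X, y)]
    grad_w = [2.0 / n * sum(e * x[j] for e, x in zip(errors, X))
              for j in range(len(w))]
    grad_b = 2.0 / n * sum(errors)
    new_w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    return new_w, b - lr * grad_b

w, b = [0.0], 0.0
initial_loss = loss(w, b)
for _ in range(2000):
    w, b = gradient_step(w, b, lr=0.05)
final_loss = loss(w, b)
```

After the loop, `w` and `b` should sit close to the slope and intercept used to generate the toy labels, and `final_loss` should be far below `initial_loss`.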

Declarative Machine Learning

Exploratory Data Analysis Today

[Figure labels: Data Scientist; R, Python, Others; Data; Laptop.]

Current Best Practice for Big Data Analysis

[Figure labels: Data Scientist (x3); Hadoop Engineer, Spark Engineer, MPI Engineer.]

Vision: Declarative Machine Learning

[Figure labels: Data Scientist; R, Python, Others; Laptop; Scale-up; Cluster; Query Optimization.]

Apache SystemML

● High-level language
  ○ DML -> R-like
  ○ PyDML -> Python-like
  ○ Focus is on matrices and linear algebra.
● Engine
  ○ Compiler/Optimizer
  ○ Lots of optimizations, such as rewrites.
● Runtime
  ○ Laptop
  ○ Spark
  ○ (also Hadoop)

[Figure labels: (DML), (PyDML), Engine.]
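To give a feel for what a compiler "rewrite" is, here is a toy Python sketch that simplifies linear-algebra expressions before they would be executed. It is not SystemML's implementation: the tuple encoding of expressions, the `rewrite` function, and the two rules chosen are all assumptions made for illustration. Both rules are standard linear-algebra identities.

```python
# Expressions are nested tuples (hypothetical encoding):
#   ("t", e)          transpose of e
#   ("matmul", a, b)  matrix product of a and b
#   "X"               a bare string names a matrix

def rewrite(expr):
    """Recursively apply algebraic simplifications, bottom-up."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [rewrite(a) for a in args]
    # Rule 1: t(t(X)) -> X, since a double transpose is a no-op.
    if op == "t" and isinstance(args[0], tuple) and args[0][0] == "t":
        return args[0][1]
    # Rule 2: t(A) %*% t(B) -> t(B %*% A), one transpose instead of two.
    if (op == "matmul"
            and all(isinstance(a, tuple) and a[0] == "t" for a in args)):
        return ("t", ("matmul", args[1][1], args[0][1]))
    return (op, *args)

simplified = rewrite(("matmul", ("t", ("t", "X")), "w"))
```

Here `simplified` is `("matmul", "X", "w")`: the redundant double transpose is removed before any matrix work is done, so the user can write the expression in whichever form is most natural.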






SystemML - Example: Logistic Regression (DML)

SystemML - Example: Sigmoid Function (DML)
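The DML listing for this slide is not part of the text. As a rough Python analogue (not the original DML), the sigmoid function used by logistic regression can be sketched as follows. The function names and the overflow-safe formulation are assumptions chosen for this sketch.

```python
import math

def sigmoid(z):
    """Logistic sigmoid 1 / (1 + exp(-z)), computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) lies in [0, 1)
    return ez / (1.0 + ez)

def sigmoid_all(zs):
    """Apply the sigmoid element-wise to a list of numbers."""
    return [sigmoid(z) for z in zs]

probabilities = sigmoid_all([-2.0, 0.0, 2.0])
```

Splitting on the sign of `z` keeps the argument of `math.exp` non-positive, which avoids an `OverflowError` for large negative inputs.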






SystemML - Compilation Chain






SystemML - Architecture (APIs and runtime)

● APIs
  ○ Command Line
  ○ JMLC
  ○ Spark MLContext
  ○ Spark ML
● Compiler
  ○ Parser/Language
  ○ High-Level Operators (HOPs)
  ○ Low-Level Operators (LOPs)
  ○ Cost-based optimizations
● Runtime
  ○ Control Program
    ■ Runtime Prog
    ■ Buffer Pool
    ■ ParFor Optimizer/Runtime
    ■ Recompiler
  ○ Instructions: CP Inst, Spark Inst, MR Inst
  ○ MatrixBlock Library (single/multi-threaded)
  ○ Generic MR
  ○ IO: Mem/FS IO, DFS IO


SystemML - Spark API (Python)

Deep Learning

● Subfield of machine learning.
● Essentially focused on creating large, complex, nonlinear functions to map from inputs to predictions, and in the process learn complex representations of the data.
● Key: These complex functions are built through a deep composition of simple, modular units.
● = Neural Networks

- A Neural Algorithm of Artistic Style, L.A. Gatys, A.S. Ecker, M. Bethge
- https://github.com/jcjohnson/neural-style
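The "deep composition of simple, modular units" idea can be made concrete with a small Python sketch. Everything here is hypothetical: the `affine`, `relu`, and `compose` helpers and the weights are made up for illustration, and no training is performed.

```python
def affine(W, b):
    """Return a unit computing W·x + b for a weight matrix W (list of rows)."""
    def unit(x):
        return [sum(wij * xj for wij, xj in zip(row, x)) + bi
                for row, bi in zip(W, b)]
    return unit

def relu(x):
    """Nonlinear unit: element-wise max(0, x)."""
    return [max(0.0, xi) for xi in x]

def compose(*units):
    """Chain units left to right into a single function."""
    def network(x):
        for unit in units:
            x = unit(x)
        return x
    return network

# Two affine units with a ReLU between them; the weights are arbitrary.
net = compose(
    affine([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    relu,
    affine([[2.0, 1.0]], [0.5]),
)
output = net([3.0, 1.0])
```

Each unit is trivial on its own; the nonlinearity between the two affine units is what keeps the composition from collapsing into a single affine map.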

Deep Learning - Neural Networks

● Class of models
● Composition of simple, modular units, including nonlinear units.
● Example units:
  ○ Core:
    ■ Affine (fully-connected)
    ■ Convolution (1D, 2D, 3D)
    ■ Pooling (max, average)
  ○ Nonlinearity/Transfer:
    ■ Sigmoid, Tanh, Softmax, ReLU
  ○ Regularization:
    ■ Dropout, L1, L2
  ○ Loss:
    ■ Log-loss, Cross-entropy, L1, L2

http://cs231n.github.io/
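Two of the units listed above, softmax and cross-entropy loss, are often used together at the end of a classifier. The following Python sketch shows one common formulation; the function names and the example scores are assumptions for illustration.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Loss for one example: negative log-probability of the true class."""
    return -math.log(probs[label])

probs = softmax([2.0, 1.0, 0.1])
loss_if_class_0 = cross_entropy(probs, 0)
loss_if_class_2 = cross_entropy(probs, 2)
```

The loss is small when the model puts high probability on the true class (`loss_if_class_0`) and large when it does not (`loss_if_class_2`).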

Deep Learning - Convolutional Neural Networks

● State of the art model class for computer vision tasks.
  ○ Classification
  ○ Retrieval
  ○ Detection
  ○ Segmentation
  ○ Playing Go
● Architecture makes an assumption of images as inputs.

http://cs231n.github.io/
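Because the architecture assumes image inputs, its core units slide small filters over a 2D grid. The sketch below shows a single-channel 2D convolution (in the cross-correlation form commonly used in neural networks, stride 1, no padding) followed by max pooling. The helper names, the toy image, and the filter are hypothetical.

```python
def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation form), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2d(fmap, size):
    """Non-overlapping max pooling with a square window."""
    return [[max(fmap[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# A 5x5 image with a vertical edge, and a 2x2 horizontal-difference filter.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
features = conv2d(image, kernel)   # 4x4 feature map
pooled = max_pool2d(features, 2)   # 2x2 map after pooling
```

The filter responds only where the pixel values change from left to right, so the feature map is nonzero only in the column next to the edge, and pooling keeps that response while halving the resolution.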

More Fun...

https://github.com/google/deepdream

Deep Learning - SystemML-NN Library

● Deep learning library written in DML.
● Multiple layers:
  ○ Core:
    ■ Affine, 2D Convolution, Max Pooling, RNN, LSTM
  ○ Nonlinearity/Transfer:
    ■ Sigmoid, Tanh, Softmax, ReLU
  ○ Regularization:
    ■ Dropout, L1, L2
  ○ Loss:
    ■ Log-loss, Cross-entropy, L1, L2
● Multiple optimizers:
  ○ SGD, SGD w/ momentum, SGD w/ Nesterov momentum, Adagrad, RMSprop, Adam

https://github.com/dusenberrymw/systemml-nn
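To illustrate how two of the listed optimizers differ, here is a Python sketch of plain SGD and SGD with momentum applied to a small quadratic. This is not the library's DML code: the function names, argument order, objective, and hyperparameters are assumptions made for this example.

```python
def sgd_update(x, dx, lr):
    """Plain SGD: step against the gradient."""
    return [xi - lr * dxi for xi, dxi in zip(x, dx)]

def sgd_momentum_update(x, dx, lr, mu, v):
    """SGD with momentum: accumulate a velocity, then step along it."""
    v = [mu * vi - lr * dxi for vi, dxi in zip(v, dx)]
    x = [xi + vi for xi, vi in zip(x, v)]
    return x, v

def grad(x):
    """Gradient of the toy objective f(x) = x0^2 + 10 * x1^2."""
    return [2.0 * x[0], 20.0 * x[1]]

x_sgd = [1.0, 1.0]
x_mom, v = [1.0, 1.0], [0.0, 0.0]
for _ in range(100):
    x_sgd = sgd_update(x_sgd, grad(x_sgd), lr=0.01)
    x_mom, v = sgd_momentum_update(x_mom, grad(x_mom), lr=0.01, mu=0.9, v=v)
```

The momentum variant carries extra state (`v`) between calls, which is why its update takes and returns the velocity alongside the parameters. On this objective it makes faster progress than plain SGD along the shallow `x0` direction.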



Deep Learning - SystemML-NN Library (cont.)


● Each layer type has a simple `forward(...)` and `backward(...)` API.
  ○ `forward(...)` computes the output of the function based on the inputs.
  ○ `backward(...)` computes the partial derivatives (gradients) of some function deeper in the network (usually the loss function at the end) w.r.t. the inputs to this function.
● Each optimizer has a simple `update(...)` API.
  ○ `update(...)` adjusts the given parameters based on their partial derivatives.
● Includes test code in DML.
  ○ Gradient checks, unit tests
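The forward/backward pattern and the idea of a gradient check can be sketched in Python for an affine layer. This is an illustration only, not the library's DML code: the function names, argument order, the toy L2 loss, and the data are all assumptions.

```python
def affine_forward(X, W, b):
    """Compute out = X·W + b for X (N x D), W (D x M), b (length M)."""
    return [[sum(x[d] * W[d][m] for d in range(len(W))) + b[m]
             for m in range(len(b))]
            for x in X]

def affine_backward(dout, X, W, b):
    """Given dLoss/dout (N x M), return dLoss/dX, dLoss/dW, dLoss/db."""
    N, D, M = len(X), len(W), len(b)
    dX = [[sum(dout[n][m] * W[d][m] for m in range(M)) for d in range(D)]
          for n in range(N)]
    dW = [[sum(X[n][d] * dout[n][m] for n in range(N)) for m in range(M)]
          for d in range(D)]
    db = [sum(dout[n][m] for n in range(N)) for m in range(M)]
    return dX, dW, db

def l2_loss(out, y):
    """Toy loss: half the sum of squared differences."""
    return 0.5 * sum((o - t) ** 2
                     for ro, rt in zip(out, y) for o, t in zip(ro, rt))

X = [[1.0, 2.0], [3.0, -1.0]]
W = [[0.5, -0.3, 0.1], [0.2, 0.4, -0.6]]
b = [0.1, 0.0, -0.2]
y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

out = affine_forward(X, W, b)
# Gradient of the toy loss w.r.t. the layer output.
dout = [[o - t for o, t in zip(ro, rt)] for ro, rt in zip(out, y)]
dX, dW, db = affine_backward(dout, X, W, b)

# Gradient check: compare dW against centered finite differences.
h = 1e-5
max_error = 0.0
for d in range(len(W)):
    for m in range(len(W[0])):
        old = W[d][m]
        W[d][m] = old + h
        loss_plus = l2_loss(affine_forward(X, W, b), y)
        W[d][m] = old - h
        loss_minus = l2_loss(affine_forward(X, W, b), y)
        W[d][m] = old
        numeric = (loss_plus - loss_minus) / (2 * h)
        max_error = max(max_error, abs(numeric - dW[d][m]))
```

If `backward` is implemented correctly, `max_error` stays tiny. A large discrepancy between the analytic and numeric gradients usually points to a bug in the backward pass.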



Demo

https://github.com/dusenberrymw/systemml-nn/tree/master/examples


[Figure labels: Keras, Torch, TensorFlow, Caffe; SystemML-NN; SystemML Engine.]

Agenda Revisited

1. Background
  a. Apache Spark
  b. Machine Learning
  c. Declarative Machine Learning
2. Apache SystemML
3. Deep Learning
  a. Overview
  b. Plans
  c. SystemML-NN
4. Demo
5. Questions

Questions?

Links

● SystemML:
  ○ Website: systemml.apache.org
  ○ Code: github.com/apache/incubator-systemml
  ○ Deep Learning Library: github.com/dusenberrymw/systemml-nn
  ○ Email: [email protected]
● Contact:
  ○ Twitter: @dusenberrymw
  ○ GitHub: github.com/dusenberrymw
  ○ Email: [email protected]


Thanks!
