DeepLearning4J and Spark: Successes and Challenges - François Garillot

Page 1

Deeplearning4J

François Garillot, @huitseeker

Page 2

Neural Networks & Deep Learning

• graphical models w/ inputs and outputs

• represents composition of differentiable functions

• deep learning : expressivity grows exponentially w.r.t. depth
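The "composition of differentiable functions" view can be sketched in a few lines. This is only an illustration, not DeepLearning4J code: the helper names (`affine`, `tanh_layer`, `forward_backward`) and the scalar layers are assumptions made for the example. Each layer carries its own derivative, and the gradient of the whole network is the product of the layer derivatives (chain rule), evaluated from the output back to the input.

```python
import math

# Hypothetical scalar "layers": each is a pair (f, df) of a function
# and its derivative. Real networks use tensors; scalars keep this short.
def affine(w, b):
    return (lambda x: w * x + b, lambda x: w)

tanh_layer = (math.tanh, lambda x: 1.0 - math.tanh(x) ** 2)

def forward_backward(layers, x):
    """Return (y, dy/dx) for y = f_n(...f_1(x)), using the chain rule."""
    inputs = []
    for f, _ in layers:
        inputs.append(x)      # remember the input each layer saw
        x = f(x)
    grad = 1.0
    # Walk the layers in reverse, multiplying local derivatives.
    for (_, df), x_in in zip(reversed(layers), reversed(inputs)):
        grad *= df(x_in)
    return x, grad

net = [affine(2.0, 0.5), tanh_layer, affine(-1.5, 0.0)]
y, dy_dx = forward_backward(net, 0.3)
```

Adding depth here just means appending more `(f, df)` pairs to `net`; the backward pass does not change.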

Page 3

Interesting results

• cat paper by Andrew Ng & Google

• AlexNet by Toronto

• last week : CNTK at human parity in speech recognition

Page 4

Industrial results

• Autonomous Driving : Drive.ai, Comma.ai + the usual suspects

• Drug discovery : Deep Genomics (Frey) & Bayer

• Predictive Maintenance : Thales, Bosch

• optimistic pessimism (Moghimi, Manulife Financial Corp.)

Page 5

DeepLearning in two steps : training, applying

• training tends to require lots of data, (R)

• but applying does not (embedded, etc).

So applying pre-trained models (Tensorframes) is not the technical/business challenge.

Enterprise : you have lots of data yourself, but what do you apply ?

Page 6

Benchmarks aren't distributed

Page 7

Training, but how ?

New Amazon GPU instances ?

Page 8
Page 9

Deep Learning Training

• Facebook, Amazon, Google, Baidu, Microsoft have this distributed

• But what if you’re not one of them ?

Page 10

Training, but how ?

Page 11

Distributing training

• basically distributing SGD (R)

• challenge is AllReduce Communication

• Sparse updates, async communications
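A minimal, single-process sketch of what "distributing SGD" with an AllReduce step looks like. It is not how DeepLearning4J or Spark implement it: the workers are simulated as list entries, `all_reduce_mean` is a stand-in for a real collective, and the model (one weight, squared error) and all function names are assumptions chosen to keep the example small. Each worker computes a gradient on its own shard, the gradients are averaged so every replica sees the same value, and every replica applies the same update.

```python
def local_gradient(w, shard):
    """Gradient of the mean of 0.5 * (w*x - y)^2 over one worker's shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for AllReduce: every worker receives the same average."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def train(shards, lr=0.05, steps=200):
    weights = [0.0] * len(shards)          # one model replica per worker
    for _ in range(steps):
        # 1. each worker computes a gradient on its local data
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # 2. communication step: this is the expensive part in practice
        grads = all_reduce_mean(grads)
        # 3. identical update everywhere keeps the replicas in sync
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Toy data generated from y = 3x, split evenly across two simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[0::2], data[1::2]]
weights = train(shards)
```

Step 2 runs once per update and moves a full gradient per worker, which is why the slide singles out communication as the challenge and why sparse or asynchronous updates are attractive.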

Page 12

Deeplearning4J

• the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala

• Skymind is its commercial support arm

Page 13

Scientific computing on the JVM

• libnd4j : Vectorization, 32-bit addressing, linalg (BLAS!)

• JavaCPP: generates JNI bindings to your CPP libs

• ND4J : numpy for the JVM, native superfast arrays

• Datavec : one-stop interface to an NDArray

• DeepLearning4J: orchestration, backprop, layer definition

• ScalNet: gateway drug, inspired by (and closely following) Keras

Page 14

Reinforcement learning

Page 15

Killing the bottlenecks : generic

• swappable net backend : netty -> aeron (Hi Lightbend !)

• better support for binary data : big indexed tables (binary, columnar, off-heap)

• and more (Tamiya Onodera's group @ IBM Japan): http://www.slideshare.net/ishizaki/exploiting-gpus-in-spark

Page 16

And if you don't care about Deep Learning ?

• SPARK-6442 : better linear algebra than Breeze, please (sparse, performant, Java-compatible, and an OK license)

• SystemML got a best paper at VLDB'16, how about helping out on nd4j ?

• ND4J only lacks sparse, but not for long ...

Page 17

Questions ?