Dell EMC Machine Learning Strategy
Jay Boisseau
HPC Technical Strategist
June 22, 2017
Restricted - Confidential
Tech Leaders Are Proclaiming Disruption, Revolution…
“Just as electricity 100 years ago transformed industry after industry after industry, I think AI
powered by deep learning will now do the same… It’s hard to think of an industry that will
not be transformed by AI in the next decade.” – Andrew Ng, former Baidu Chief Scientist
“AI is the most far-reaching technological advancement in our lifetime. It changes every
industry, every company, every thing.” – Jen-Hsun Huang, Nvidia CEO
“Smart machine technologies will be the most disruptive class of innovations over the next
10 years due to their computational power, scalability in analyzing large-scale data sets,
and rapid advances in neural networks.” – Gartner Report
Cognitive computing will become “the largest consumer of computing cycles by 2020” –
Rob High, IBM Watson CTO
“Machine learning is HPC’s 1st consumer killer app” – Jen-Hsun Huang
Tech Leaders Are Proclaiming Disruption, Revolution…
“When you get all this data coming in from hundreds of billions of connected
devices and apply to that artificial intelligence, it’s almost like a fourth
industrial revolution and an incredible opportunity for companies to re-
imagine themselves in this digital age….The last 30 years have been
incredible in IT, but the next 30 years will make it look like child’s play.”
– Michael Dell
SAP Sapphire Conference, May 16, 2017
The Hype Is High…
But Analysts & Experts Agree on AI Importance
Cognitive/AI
AI Has Already Proven Superior to Experts, Other Methods in Many Areas
• In chess and Go and poker
• In sequence analysis and tumor detection
• In retail suggestions and fraud detection
• In intrusion attempt analysis and terrorist threat detection
• In AI agents, autonomous driving, robots, and more
AI Solves Business Problems Across Many Verticals
IDC says in 2018, 75% of enterprise and ISV development will include AI/ML/cognitive in at least one application
Machine Learning – Disruptive, Transformative
(Nested diagram: Artificial Intelligence ⊃ Machine Learning (Statistical) ⊃ Deep Learning (Neural Networks))
Artificial Intelligence is the broader concept of
machines being able to carry out tasks in a way
that we would consider “smart”.
Machine Learning is a current application of
AI based around the idea that we should really
just be able to give machines access to data
and let them learn for themselves.
Deep Learning is an area of Machine Learning
research, which has been introduced with the
objective of moving Machine Learning closer to
one of its original goals: Artificial Intelligence.
A Neural Network is a computer system designed to work
by classifying information in the same way a human brain
does. It can be taught to recognize, for example, images,
and classify them according to elements they contain.
Background: Intelligence from Processing
Machine Learning – condensing data
into a high-dimensional probability
model to be used for:
CLASSIFICATION – using the model
to label or tag the data
INFERENCE – using the model to
deduce probable inputs given some
outputs
JUDGEMENT – summarizing the
content of the probability model
PREDICTION – using the model to
deduce probable outputs given some
inputs
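As a concrete illustration of "condensing data into a probability model" and its use for CLASSIFICATION, here is a minimal sketch. The spam/ham word data, the smoothing scheme, and helper names (`prob`, `classify`) are invented for this example, not part of any Dell EMC tooling.

```python
from collections import Counter, defaultdict

# toy training data: (label, text) pairs, invented for illustration
training = [("spam", "win prize now"), ("spam", "win money"),
            ("ham", "meeting now"), ("ham", "project meeting")]

# condense the data into per-class word counts (the "probability model")
counts = defaultdict(Counter)
for label, text in training:
    counts[label].update(text.split())

def prob(label, word):
    """Smoothed per-class word frequency from the condensed model."""
    c = counts[label]
    return (c[word] + 1) / (sum(c.values()) + 1)

def classify(text):
    """CLASSIFICATION: tag new data with the most probable class."""
    scores = {lbl: 1.0 for lbl in counts}
    for lbl in scores:
        for w in text.split():
            scores[lbl] *= prob(lbl, w)
    return max(scores, key=scores.get)

print(classify("win prize money"))   # -> 'spam'
print(classify("project meeting"))   # -> 'ham'
```

PREDICTION and INFERENCE would use the same model in the forward and reverse directions (probable outputs given inputs, and probable inputs given outputs).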
Background: The Machine Learning Process
Training Phase: training data → ML Algorithm → scoring → iterate until satisfied → trained parameters or weights (Ŵ)
Use Phase: real-world data → Classification Engine → useful intelligence
(The classification engine implements the ML model.)
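The two-phase process above can be sketched in a few lines: a training loop that iterates until it produces trained weights, and a use phase that applies them. The toy linear model, learning rate, and function names (`train`, `classify`) are illustrative assumptions only.

```python
# Training phase: iterate over the data, adjusting the trained
# parameters (weights) until the error is driven down.
def train(data, lr=0.01, epochs=500):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y   # scoring: compare prediction to label
            w -= lr * err * x       # adjust weights toward lower error
            b -= lr * err
    return w, b                     # the trained parameters, W-hat

# Use phase: the "classification engine" applies the trained model
# to real-world data to produce useful intelligence.
def classify(w, b, x, threshold=0.5):
    return 1 if w * x + b >= threshold else 0

# toy training data: small x -> class 0, large x -> class 1
data = [(1.0, 0.0), (2.0, 0.0), (4.0, 1.0), (5.0, 1.0)]
w, b = train(data)
print(classify(w, b, 1.5), classify(w, b, 4.5))
```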
Background: Machine Learning Requires Matrix Math
Matrix form of RSS* (minimize the error):

y = Hw + ε
RSS(w) = (y − Hw)ᵀ(y − Hw)

y = true values
H = training data
w = parameters or coefficients
Hw = predicted values
ε = error

*residual sum of squares
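The matrix form of RSS can be checked numerically. This sketch assumes NumPy and a made-up design matrix H; it minimizes RSS with `numpy.linalg.lstsq` and verifies the residual vanishes on noiseless data.

```python
import numpy as np

H = np.array([[1.0, 1.0],   # each row: [bias term, feature value]
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
true_w = np.array([0.5, 2.0])
y = H @ true_w              # noiseless targets, so the fit is exact

# lstsq finds the w that minimizes RSS(w) = (y - Hw)^T (y - Hw)
w_hat, *_ = np.linalg.lstsq(H, y, rcond=None)
residual = y - H @ w_hat
rss = residual @ residual
print(w_hat, rss)
```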
http://scikit-learn.org/
Rearchitecting the Artificial Brain: Deep Neural Networks Learn Features in Layers
Deep Learning – Train Model, Then Run Inference Against It
Training
Computationally
Intensive: massive
data, massive
computations in
neural net
Billions of Tflops per
training run (train a
model)
Can sacrifice precision
(e.g. FP32, FP16) for
more performance
Inference
Less computationally
intensive, but still must be
fast (and often low
power)
Doesn’t require high
precision math, so can
use accelerators like
GPU & FPGA with INT8
support.
Can also be run in Xeon
& Xeon-Phi based
systems.
(Flow: Scalable Training → Neural Model → Scalable Inference → Edge/Users)
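To illustrate why inference can drop to INT8, here is a hedged sketch of symmetric linear quantization — a common scheme used for illustration here, not the specific one implemented by any particular GPU or FPGA.

```python
import numpy as np

# FP32 weights produced by training (values invented for illustration)
w_fp32 = np.array([0.82, -1.30, 0.05, 2.41], dtype=np.float32)

# symmetric quantization: map the largest magnitude onto 127
scale = np.abs(w_fp32).max() / 127.0
w_int8 = np.round(w_fp32 / scale).astype(np.int8)

# dequantize to see how much precision was sacrificed
w_deq = w_int8.astype(np.float32) * scale
max_err = np.abs(w_fp32 - w_deq).max()
print(w_int8, max_err)
```

The worst-case error is half a quantization step — small enough that classification decisions usually survive, which is why INT8-capable accelerators work for inference.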
Background: Math Performance is Key
• Most of the recent performance gains by GPUs and KNM are due to precision optimizations
– Precision evolution: 64-bit DP → 32-bit SP → 16-bit HP → some 8-bit
• But there is one more optimization step: specialized silicon
– Special 16-bit precision enhancements
– Better internal network, i.e. graph support and more connectivity
– Better use of memory
POV Point #1: Deep Learning is HPC
• Deep learning training certainly requires HPC techniques (and so
do other ML, and HPDA, techniques)
– HPC is not an acronym for 'parallel 3D PDE-solving time-dependent
applications written using MPI'
– HPC means high performance computing: computing in which the purpose and
design is for greater performance than any mainstream computing (mobile,
desktop, laptop, enterprise server)
• Deep learning is very data-driven, so it is also HPDA
• Note: Dell Ready Solutions & Alliances group includes both HPC and
HPDA solutions, so we're internally aligned
POV Point #2: Optimized DL Servers Are Needed
• DL workloads are different than traditional (simulation-based)
HPC workloads, and thus require different optimized servers
– DL is a two-phase process:
› training (which includes scoring, a kind of inferencing)
› inferencing
– Neither phase depends on 64-bit computations: can sacrifice precision for performance
› Inferencing can sometimes be accomplished in 8-bit!
– Training requires very highly parallel processors
– Inferencing can be conducted by very efficient, lightweight processors
POV Point #3: We Must Evaluate Multiple, Diverse Optimized Solutions
• DL training solutions:
– GPU-based: great at matrix math; V100 (Volta) now optimized for tensor operations(!)
– KNM: extra focus on scaling, variable precision
– Dedicated silicon:
› Nervana
› Graphcore
› Knupath
› Wave
• DL inferencing solutions:
– Xeon
– GPUs
– FPGAs
– Dedicated silicon (as Google is doing with TPU, Apple with forthcoming neural chip, etc.)
Background: What are Frameworks?
Deep Learning Frameworks
TensorFlow, MXNet, CNTK, Chainer, Neon, Theano…
Neural Network Libraries
cuDNN, cuBLAS, MKL, NCCL…
Hardware
All these frameworks allow deep learning
researchers to build models. They include basic
building blocks like layers which can be connected in
different ways to create a model.
In order to train the deep learning models, the
frameworks work with underlying neural network
libraries such as NVIDIA's cuDNN and Intel's MKL.
These libraries implement operations such as matrix
multiply that are important to deep learning models.
Finally, the models are trained on hardware like
NVIDIA GPUs, Intel's Xeon Phi processor or other
specialized processors.
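The layering described above can be sketched as a framework-level "layer" whose heavy lifting is a matrix multiply — the operation that libraries like cuDNN or MKL implement on the underlying hardware. The `Dense` class below is a generic illustration, not any framework's actual API.

```python
import numpy as np

class Dense:
    """A framework-style building block: one fully-connected layer."""
    def __init__(self, in_dim, out_dim, rng):
        self.W = rng.standard_normal((in_dim, out_dim)) * 0.1
        self.b = np.zeros(out_dim)

    def __call__(self, x):
        # this matmul is the op a neural-network library optimizes
        return np.maximum(x @ self.W + self.b, 0.0)   # ReLU activation

# layers connected in sequence to form a model, as the frameworks allow
rng = np.random.default_rng(0)
model = [Dense(4, 8, rng), Dense(8, 2, rng)]
x = rng.standard_normal((3, 4))    # a batch of 3 inputs
out = x
for layer in model:
    out = layer(out)
print(out.shape)
```

In a real framework the same structure exists, but the matmul dispatches to cuDNN on NVIDIA GPUs or MKL on Intel processors instead of NumPy.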
Background: Most AI Frameworks Are Open Source
Key points: all the frameworks are open source, but some of the frameworks
supported by major players are:
• TensorFlow: Google
• MXNet: Amazon
• CNTK: Microsoft
• Neon: Intel
• Turi: Apple
We don’t have to develop the frameworks, but we need to develop
the servers that can give us the maximum performance.
POV Point #4: Customer Success Requires Solutions, Expertise – Not Just Optimized Servers
Machine Learning is changing and evolving weekly. Dell EMC needs to continue to evaluate new solutions—both HW and SW—and gain experience with applications and verticals. So how should we measure and benchmark potential products?
CUSTOMER SUCCESS – to be a broad-based supplier we need all Dell customers to be successful. Therefore, in addition to high performance ML/DL servers, we need solutions that provide:
– Ease of use
– Accuracy
– Hyper-tuning
This will require solutions with software & services to help customers use frameworks effectively.
This is what we need to “benchmark” – not just performance of frameworks.
Dell Activities Leading To Roadmap, New Solutions
• Conducting many, many software and hardware technology assessments & evaluations (e.g. GPUs, KNM, FPGAs, Nervana, Graphcore), and POCs (started 2016), e.g. BitFusion, others
• Benchmarking major frameworks (e.g. TensorFlow, CNTK, MXNet, Caffe2) to study performance & scaling characteristics, to create optimal solutions
• Evaluating potential software & services partners to offer best-in-class solutions
• Collecting customer solutions requirements, experiences, successes to develop complete solutions that provide max ROI
• Finalizing strategy and solutions roadmap – target summer 2017.
POV Point #5: It is NOT all about the hyperscalers!
• Same arguments for on-prem infrastructure apply here as for HPC
• No, they do not already have all your enterprise/science data—most corporate data remains on-prem
• Data origination, movement, security policies, etc. all remain challenges for public cloud usage
• Biggest advantage of hyperscalers currently is APIs (not cost)--but that's software and thus manageable
POV Point #6: You will want/need DL even if your current workload is mostly simulation
• You will want/need DL to complement your current HPC
efforts when
– there is no simulation option
› no physical laws for your problem
› uncertain/incomplete physics
– simulation accuracy is limited
› e.g. the limits of non-linear dynamics simulations
– for advanced data analysis of simulation output data
Summary & Closing Thoughts
• We get that it's hard--and we are evaluating & benching everything we can,
assessing software partners, and mapping solutions for customers and
workloads, and preparing a comprehensive roadmap (Summer 2017)
• DL is another part of a comprehensive IT solutions stack--but a very
important part for data rich problems (perhaps the most important)
• Some things that just weren't feasible before—autonomous driving, (good)
AR, robotics, etc.—will become realities!
• It's early, but we intend to win--and our vast enterprise presence will be
invaluable. So will your expertise and our partners (up next)
• Questions (both ways)?