Building High Performance Solutions for Machine Learning and Data Analytics with Hewlett Packard Enterprise

Volodymyr Saviak, October 2019

Introduction

Explosion of data and evolving customer needs for data-intensive workloads are driving AI & HPC growth

Big data analytics
Data-intensive processing
Artificial intelligence
Modeling & simulation

Industry: Example Workloads
Government: Surveillance, Encryption, Communications Intelligence
National Labs: Scientific & Industrial Research
Weather: Weather & Climate Modeling
Financial Services: Portfolio Optimization, HF Trading, Global Risk Management
Media & Content Delivery: Digital Content Creation & Distribution
Manufacturing: Computer Aided Engineering, Product Design
Oil & Gas: Geoengineering, Chemical Engineering, Seismic Exploration, Reservoir Simulation
Life Sciences: Genomics, Drug Discovery, Bioinformatics, Predictive Medicine
Academic: Scientific & Industrial Research

HPC, Big Data Analytics and AI are distinct approaches with some overlaps

Artificial Intelligence: Simulation of intelligent behavior
Machine Learning: Learn from data and make predictions
Deep Learning: Model high-level abstractions in data using artificial neural networks
Predictive Analytics: Advanced analytics to forecast future activity, behavior and trends
Big Data Analytics: Uncover hidden patterns and unknown correlations in large data sets
HPC: Simulation & modeling with highly complex mathematical models to gain insights

Why Today? Reemergence of Machine Learning.

Human: filter vs accumulate

Big Data: intensity of human-in-loop analyst

Access (data): enables training

Low Cost HPC: Moore's law & half precision
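The half-precision point can be illustrated with a minimal sketch. It uses only Python's standard `struct` module, whose `e` format code is IEEE 754 half precision. The sample value and the helper name `roundtrip` are illustrative assumptions and do not come from the slides.

```python
import struct


def roundtrip(value: float, fmt: str) -> float:
    """Store a Python float in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Bytes needed per number: 'e' is half precision, 'f' is single precision.
half_bytes = struct.calcsize("<e")
single_bytes = struct.calcsize("<f")

sample = 0.1  # illustrative value, not from the slides
half_value = roundtrip(sample, "<e")
single_value = roundtrip(sample, "<f")

print(f"half:   {half_bytes} bytes, {sample} is stored as {half_value!r}")
print(f"single: {single_bytes} bytes, {sample} is stored as {single_value!r}")
```

Half precision halves the memory and bandwidth needed per value at the price of a larger rounding error, which is the trade-off the slide alludes to.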

Traditional Approach for Intelligent System Creation

Figure: INPUT -> formula/model* -> OUTPUT

* Formula/model created by humans

Artificial Neural Network Is a Popular Approach to Build AI

Figure: "Data for training &" (label truncated) feeding a network* that maps INPUT -> OUTPUT

* The machine learns to be intelligent through training, based on the data it receives
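The contrast between a human-written formula and a model learned from data can be sketched in a few lines. This is only an illustration under stated assumptions: the temperature-conversion task, the function names and the training values are hypothetical, and a least-squares line fit stands in for a neural network to keep the example short.

```python
def human_formula(celsius: float) -> float:
    """Traditional approach: a person writes the rule down."""
    return celsius * 9.0 / 5.0 + 32.0


def fit_line(xs, ys):
    """Learned approach: estimate slope and intercept from examples
    with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training data: inputs paired with known outputs (generated for illustration).
train_x = [-10.0, 0.0, 15.0, 25.0, 40.0]
train_y = [human_formula(x) for x in train_x]

slope, intercept = fit_line(train_x, train_y)


def learned_model(celsius: float) -> float:
    """Apply the parameters recovered from the training data."""
    return slope * celsius + intercept


print(human_formula(100.0), learned_model(100.0))
```

In the first function the knowledge sits in the code; in the second it is recovered from the examples, so different training data would yield a different model without any code change.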

A quick introduction to Neural Networks: The (artificial) neuron

Artificial Neural Networks (ANNs) are inspired by biological systems similar to our brain

NNs are made up of neurons, which are a mathematical approximation to biological neurons

Figure: a biological neuron (dendrites, soma, nucleus, axon, axon terminals) next to an artificial neuron with inputs (labeled x_0 to x_4, plus a constant input 1), weights (labeled w^1_{1,0} to w^1_{5,0}), bias b^1_0 (bias = threshold), activation f(z) and output y_1.

$$z^1_0 = \sum_k w^1_{k,0}\, x_k + b^1_0, \qquad a^1_0 = f(z^1_0)$$
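As a minimal sketch of the artificial neuron described on this slide: a weighted sum of the inputs plus a bias, passed through an activation function f. The slide does not say which f is used, so the sigmoid here is an assumption, and the input, weight and bias values are made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """One common choice for f(z); assumed here, not named on the slide."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Compute a = f(z) with z = sum_k w_k * x_k + b."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Five inputs and five weights, as in the figure (values are illustrative).
x = [1.0, 0.5, -1.0, 2.0, 0.0]
w = [0.2, -0.4, 0.1, 0.3, 0.5]
b = -0.5

a = neuron(x, w, b)
print(f"activation a = {a:.4f}")


def step(z: float) -> float:
    """Hard threshold: the neuron fires only when z is non-negative."""
    return 1.0 if z >= 0.0 else 0.0


# With a step activation the bias acts as a threshold on the weighted sum.
fired = neuron(x, w, 0.0, activation=step)
```

Swapping the `activation` argument is enough to try other functions, which is why f is kept as a parameter rather than hard-coded.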

MegaFace Challenge

Detailed information is available at http://megaface.cs.washington.edu

HPE Accelerates AI with Intel

The next section is presented by Intel (slides)

Breaking barriers between AI theory and reality
Partner HPE with Intel® to accelerate your AI journey

SIMPLIFY AI using community solutions
CHOOSE ANY APPROACH from machine to deep learning
DEPLOY AI ANYWHERE with unprecedented HW choice
TAME YOUR DATA with a robust data layer
SPEED UP DEVELOPMENT using open AI software
SCALE WITH CONFIDENCE on the platform for IT & cloud

Intel GPU
Intel AI Builders
Intel AI Developer Program
Intel AI DevCloud
Intel AI Solutions
Intel® MKL-DNN
Intel® DAAL
Intel® Distribution for Python®
on Apache* Spark*

Intel® AI Case study

Foundation: Identify, Prioritize, Consider, Organize

Identify prospects internally. You can use 70+ AI solutions in Intel's portfolio then assess business value of each one

Prioritize projects based on business value & cost to solve with Intel guidance; choose industrial defect detection via DL

Consider ethical, social, legal, security & other risks and mitigation plans with Intel advisors prior to kickoff

Organize internally to get buy-in, support new development philosophy & grow developer talent via Intel AI Developer Program

Figure: value vs. cost chart (axes from L to H); "Corrosion" is labeled Value (H), Cost (L).

Intel® AI Case study

Data: Ingest, Store, Process, Manage

Ingest streaming data from drones using a popular software tool among the many that run on the CPU

Store data in block storage (for high-performance) in a data lake with guidance from an Intel storage partner

Process data by performing cleanup & integration using popular software tools that run on the CPU

Manage data via a popular framework for distributed computation on your infrastructure

Source Data: 10%
Transmit Data: 5%
Ingest Data: 5%
Cleanup Data: 60%
Integrate Data: 10%
Stage Data: 10%

12 weeks

Intel® AI Case study

Develop: Setup, Model, Test, Document

Set up the compute environment; DL training (~7% of journey) acceleration is NOT worthwhile due to high setup time & cost

Model development through training a deep neural network using an Intel-optimized DL framework

Test the deep learning model using a control data set to determine if accuracy meets requirements

Document the code, process, and key learnings for future reference

Steps: Train (Topology Experiments), Train (Tune Hyper-parameters), Test Inference, Document Results
Time shares: 30%, 30%, 20%, 10%

Figure: DL demand over time, with a "CPU zone", an "Acceleration zone" and a "YOU ARE HERE" marker.

12 weeks
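One way to see why accelerating a step that takes ~7% of the journey pays off so little is Amdahl's law. This framing is an added illustration, not the slide's own argument (the slide cites setup time and cost). The sketch assumes the 12 weeks shown on the slide is the length of the whole journey; the function name and the speedup factors are hypothetical.

```python
def overall_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Amdahl's law: speedup of the whole job when only a fraction of it
    runs `local_speedup` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / local_speedup)


training_share = 0.07   # "~7% of journey" from the slide
journey_weeks = 12.0    # assumed to be the total journey length

for factor in (2.0, 10.0, 100.0):  # hypothetical accelerator speedups
    total = overall_speedup(training_share, factor)
    weeks_saved = journey_weeks * (1.0 - 1.0 / total)
    print(f"{factor:>5.0f}x faster training -> {total:.3f}x overall, "
          f"{weeks_saved:.2f} weeks saved")

# Even an infinitely fast accelerator cannot beat this bound.
upper_bound = 1.0 / (1.0 - training_share)
print(f"upper bound on overall speedup: {upper_bound:.3f}x")
```

Under these assumptions the best case saves less than one week out of twelve, before any setup time or hardware cost is counted against it.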

Intel® AI Case study

Deploy: Architect, Implement, Scale, Iterate

Architect AI deployment with Intel AI Builders

Implement AI in production environment

Scale to more sites & users as demand grows

Iterate on the models with new data over time

Figure: deployment architecture. Components shown: Drones (Remote Devices), Data Ingest, Media Store, Prepare Data, Training, Model Store, Inference, Label Store, Media Server, Service Layer, Adv. Analytics, Data Store, Multi-Use Cluster.

Sizing notes, in slide order:
- 4 nodes: one ingestion per day, one-day retention
- 110 nodes: 8 TB/day per cameras, 10 cameras, 3x replication, 1-year retention, 4 mgmt nodes
- 4 nodes: 20M frames per day
- 2 nodes: infrequent op
- 3 nodes: simultaneous users
- 3 nodes: 10k clips stored
- 16 nodes
- 4 nodes: 1 year of history
- 4 nodes: labels for 20M frames/day

Per-node hardware, in slide order:
- Data Store: 1x 2S 61xx, 20x 4TB SSD
- Per node: 1x 2S 81xx, 5x 4TB SSD
- Per node: 1x 2S 81xx, 1x 4TB SSD

Training: TensorFlow*, Intel® MKL-DNN; intermittent use, 1 training/month for <10 hours

Drones: 10 drones, real-time object detection and data collection
- Per drone: 1x Intel® Core™ processor, 1x Intel® Movidius™ VPU
- Software: OpenVINO™ Toolkit, Intel® Movidius™ SDK

Solutions & Market

Turnkey Container-Based Platform for AI / ML / DL & Big Data Analytics

BlueData EPIC® Software Platform (EPIC = Elastic Private Instant Clusters)

Users: Data Scientists, Developers, Data Engineers, Data Analysts

Tools: Big Data Tools, ML / DL Tools, Data Science Tools, BI / Analytics Tools, Bring Your Own

ElasticPlane™: Self-service, multi-tenant clusters
IOBoost™: Extreme performance and scalability
DataTap™: In-place access to data on-prem or in the cloud

Deployment: On-Premises, Public Cloud
Compute: CPUs, GPUs
Storage: NFS, HDFS

What are the benefits of an elastic platform?

- Scale nodes / resources independently
- Add compute nodes without repartitioning data
- Shift node purpose on-the-fly
- In-memory, Interactive, Batch
- Containers enable rapid deployment and movement of workloads and models
- HPE Flexible Capacity for consumption-based IT

An elastic architecture quickly adapts to shifting workload requirements

Figure: cluster growth from Day 1 to Day n, Traditional vs. Elastic, in five steps:

1. Initial cluster deployment (Spark and Batch workloads)
2. Need more CPU and RAM (Impala workloads)
3. Need more storage capacity (archival tier)
4. Adding low-latency, high-IOPS NoSQL and Kudu workloads
5. Adding AI model training and deep learning workloads

Why an elastic architectural approach?
Purpose-built nodes and multi-generational clusters

Different requirements along the data pipeline stages demand different node geometries

"IoT": Edge processing of data in motion
"Fast Data": Core processing of data in motion
"Big Data": Analysis of data at rest
"AI": Deep Learning/Machine Learning

Building blocks shown: Analytic Services, NoSQL, Parallel Data Flow Mgmt, "Data Lake", Distributed Data Flow Mgmt, Data Acquisition, HPC Storage, Deep Learning, Local Data Mgmt, Container Management, Model Serving, Edge Infrastructure Mgmt, Parallel Analytic Framework

Speech analysis is becoming mainstream

70% of all finance and insurance organizations will be using speech analysis to reduce risk by 2022. (Gartner)

By 2020, customers will manage 85% of their relationship with the enterprise without interacting with a human. (Gartner)

About 30% of searches will be done without a screen by 2020. (Gartner)

25% of 16- to 24-year-olds use voice search on mobile. (Global Web Index)

19% of people use Siri at least daily. (HubSpot)

Case in point: Customer service support center

- 3,000 employees for average major bank call centers in the U.S.
- 30 calls per agent, per day
- $1 average cost per minute
- 4 minutes spent per call
- 360,000 minutes spent on calls each day
- Only 60% satisfaction rate
- $86 million annual spend
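The figures above can be cross-checked with a short calculation. This sketch assumes every one of the 3,000 employees is an agent taking calls. The number of call days per year is not stated on the slide, so it is derived here from the annual spend rather than taken as given.

```python
# Figures quoted on the slide
employees = 3000
calls_per_agent_per_day = 30
minutes_per_call = 4
cost_per_minute_usd = 1.0
annual_spend_usd = 86_000_000

# Assumption: every employee is an agent handling calls.
minutes_per_day = employees * calls_per_agent_per_day * minutes_per_call
cost_per_day_usd = minutes_per_day * cost_per_minute_usd

# Derived, not from the slide: call days per year implied by the annual spend.
implied_call_days = annual_spend_usd / cost_per_day_usd

print(f"minutes on calls per day: {minutes_per_day:,}")
print(f"cost per day: ${cost_per_day_usd:,.0f}")
print(f"implied call days per year: {implied_call_days:.0f}")
```

The daily minutes reproduce the 360,000 quoted on the slide, and the annual spend is consistent with roughly 240 call days per year under the stated assumption.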

Why HPE

One-day interactive workshop to help customers get started
HPE Artificial Intelligence Transformation Workshop

Customer benefits

One-day visual and interactive session allows HPE experts to focus on customers' needs for next generation AI and help embark on a transformational journey to achieve business goals.

Turn data into action and revenue

- Select and analyze AI, advanced analytics, data use cases

- Identify desired outcomes – automated or manual

- Assess data characteristics – data readiness, requirements

Scope

- Get started on an AI project fast

- Align business, data and IT operations teams

- Explore opportunities, priorities and select relevant use cases

- Identify dependencies, data sources, level of readiness

- Define a high-level roadmap for intelligent data strategy

Why HPE for AI, HPC, Big Data Analytics?

• Best people, best teams worldwide
• Best technologies on the market
• High-quality products designed and made in the US
• Best customer support

Thank you!

Your AI & HPC Solutions Contact: [email protected]