Data Analytics and High Performance Computing - a Convergence?
Slide 2
Data Analytics and High Performance Computing - a convergence? Wolfgang E. Nagel
— Motivation: Big Data, Data Analytics, and Machine Learning
— Introduction ScaDS Dresden/Leipzig
— Infrastructure for Data Analytics and Machine Learning
Services at HPC
Performance Aspects
— Outlook – Future Perspectives
Outline
Slide 3
Nice Example ...
Slide 4
Nice Example: Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance (Todd W. Schneider)
— The New York City Taxi & Limousine Commission has released a staggeringly detailed historical dataset covering over 1.1 billion individual taxi trips in the city from January 2009 through June 2015
— http://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance/
— Maps show every taxi pickup in New York City from 2009–2015
— Brighter regions indicate more taxi activity.
— Green tinted regions represent activity by green boro taxis, which can only pick up passengers in upper Manhattan and the outer boroughs
Taxi pickups
Slide 5
Taxi Pickups
Slide 6
Taxi Dropoffs
Slide 7
Uber vs. Taxi Pickups in Brooklyn
— Between June 2014 and June 2015, the number of Uber pickups in Brooklyn grew by 525%
— As of June 2015, Uber accounts for more than twice as many pickups in Brooklyn as yellow taxis
— Uber is rapidly approaching the popularity of green taxis
Slide 8
Brooklyn Monthly Taxi Pickups
— Introduction of the green boro taxi program in August 2013 dramatically increased the amount of taxi activity in the outer boroughs
— From 2009–2013, a period during which migration from Manhattan to Brooklyn generally increased, yellow taxis nearly doubled the number of pickups they made in Brooklyn
— Green taxis quickly overtook yellow taxis, so that as of June 2015 green taxis accounted for 70% of Brooklyn’s 850,000 monthly taxi pickups
Slide 9
Manhattan Monthly Taxi Pickups
— Manhattan, not surprisingly, accounts for by far the largest number of taxi pickups of any borough
— In any given month, around 85% of all NYC taxi pickups occur in Manhattan, and most of those are made by yellow taxis
— Even though green taxis are allowed to operate in upper Manhattan, they account for barely a fraction of yellow taxi activity
— Uber has grown dramatically in Manhattan, notching a 275% increase in pickups from June 2014 to June 2015, while taxi pickups declined by 9% over the same period
— Uber made 1.4 million Manhattan pickups in June 2015
Slide 10
Travel Time Midtown to JFK / La Guardia
Slide 11
Snowfall vs. Rain (Based on NYC)
Slide 12
NYC Late Night Taxi Index
— We can use the taxi data to draw some inferences about what parts of the city are popular for going out late at night by looking at the percentage of each census tract’s taxi pickups that occur between 10 PM and 5 AM—the time period I’ve deemed “late night.”
— According to the late night taxi index, if you’re looking for a neighborhood with vibrant nightlife, try Williamsburg, Greenpoint, or Bushwick in Brooklyn
— The census tract with the highest late night taxi index is in East Williamsburg, where 76% of taxi pickups occur between 10 PM and 5 AM
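The late night taxi index described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the original analysis code: the function names, the record layout (tract id plus pickup timestamp), and the sample records are all made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def is_late_night(ts: datetime) -> bool:
    """True if the pickup falls in the 10 PM to 5 AM window."""
    return ts.hour >= 22 or ts.hour < 5


def late_night_index(pickups):
    """Map each census tract to the share of its pickups that are late-night.

    `pickups` is an iterable of (tract_id, pickup_datetime) pairs
    (a hypothetical record layout chosen for this sketch).
    """
    total = defaultdict(int)
    late = defaultdict(int)
    for tract, ts in pickups:
        total[tract] += 1
        if is_late_night(ts):
            late[tract] += 1
    return {tract: late[tract] / total[tract] for tract in total}


# Made-up sample records, for illustration only.
sample = [
    ("tract_A", datetime(2015, 6, 5, 23, 30)),
    ("tract_A", datetime(2015, 6, 6, 2, 15)),
    ("tract_A", datetime(2015, 6, 6, 14, 0)),
    ("tract_A", datetime(2015, 6, 6, 4, 59)),
    ("tract_B", datetime(2015, 6, 6, 9, 0)),
    ("tract_B", datetime(2015, 6, 6, 22, 0)),
]
index = late_night_index(sample)
```

On 1.1 billion trips the same grouping would be done in a database or a distributed framework; the logic per tract stays the same.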
Slide 13
Investment Bankers
— We can isolate all taxi trips that dropped off in that driveway to get a sense of where Goldman Sachs employees—at least the ones who take taxis—come from in the mornings, and when they arrive. Here’s a histogram of weekday drop off times at 200 West Street
— The cabs start dropping off around 5 AM, then peak hours are 7–9 AM, before tapering off in the afternoon
— Presumably most of the post-morning drop offs are visitors as opposed to employees
— If we restrict to drop offs before 10 AM, the median drop off time is 7:59 AM, and 25% of drop offs happen before 7:08 AM
Slide 14
Cash or Credit
Slide 15
Update September 2016 (Brooklyn)
Slide 16
Update March 2018
Slide 17
What else could you compare … ???
— Compare Lyft’s market share gain in each neighborhood of the city with that neighborhood’s voting pattern in the 2016 presidential election
— The data shows that, on average, Lyft gained more market share from Uber in neighborhoods that voted more heavily for Hillary Clinton
— Todd W. Schneider’s guess is that liberal voters were in fact more likely to switch from Uber to Lyft
Slide 18
A View on Some Different Data
Slide 19
Published: 1896
Arbeiten aus dem Kaiserlichen Gesundheitsamte (10. Band)
Slide 21
Number of Cholera Dead 1892/1893 in Hamburg and Altona
Slide 23
Information
Decision support
Suggest conclusions
Digitization and Data Analytics
digitization data analytics
Open questions:
What sort of data?
Which methods from data analytics and machine learning are appropriate?
How can humans be supported in coping with these amounts of data?
What are the storage requirements?
Which architecture is adequate?
How much processing power is needed?
Slide 24
— Logistics
— Traffic
— Science
— Industrial environments
— Weather
— Finance
— Text
— Business
— Social networks
— ...
Lots of data in many different forms! Big Data?
Sorts of Data
Slide 25
How large is the amount of data?
Source: IDC’s Digital Universe study, sponsored by EMC, 2014
“Big” is not a fixed scale!
What is “large”? 1 ZB = 10^21 B
B → kB → MB → GB → TB → PB → EB → ZB (each step ×1000)
1 ZB = 10^9 TB
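The unit ladder on this slide can be checked with a short calculation. The sketch below assumes decimal (SI) prefixes, i.e. a factor of 1000 per step as shown on the slide; the names `UNITS` and `bytes_in` are illustrative only.

```python
# Decimal (SI) storage units, each 1000 times larger than the previous one.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def bytes_in(unit: str) -> int:
    """Number of bytes in one `unit`, using a factor of 1000 per step."""
    return 1000 ** UNITS.index(unit)


zb_in_bytes = bytes_in("ZB")                  # 10**21
zb_in_tb = bytes_in("ZB") // bytes_in("TB")   # 10**9
```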
Slide 26
Big Data Definition(s)
Volume (Data at Rest): terabytes to exabytes of existing data to process
Velocity (Data in Motion): streaming data, milliseconds to seconds to respond
Variety (Data in Many Forms): structured, unstructured, text, multimedia
Veracity (Data in Doubt): uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations
More important: extract new content from the database
Slide 27
Where is Big Data coming from?
Event Analysis
Sensor Data
Mobile Revolution
Slide 28
User experience in the sciences
Support of complete workflows
Storage
Computing
User interface
Slide 29
Science point of view: Data life cycle management
Data(flow) perspective Systems perspective
Slide 30
Data analytics processing pipeline
Data Collection → Extraction/Cleaning/Annotation → Data Integration/Aggregation → Analysis/Modeling → Interpretation
Challenges across the pipeline: Volume, Veracity, Velocity, Variety, …, Privacy, Human Interaction
Value
Often ¾ of the total effort goes into getting the pipeline running
Get the human in the loop!
Slide 31
The success of machine learning methods depends heavily on the quality and quantity of data
Many machine learning methods were known before “Big Data”
Some machine learning methods can only be applied successfully with a certain amount of training data
Today, training data are available due to
digitization
powerful hardware (artificial training data)
Example: Deep Learning
Success rests on large amounts of training data
Data only available due to thorough digitization
Processing of large amounts of data only possible due to hardware evolution
Machine Learning and Big Data
Slide 32
Where is Big Data coming from?
Scientific Data
Simulations and scientific applications produce large amounts of data
Climate models: combine many external data (geoinformation, measurements, detailed models, ...)
High-energy physics: many measured values in a short time
Source: CERN
Slide 33
Requirements from the User’s Perspective
— Data must be managed, annotated and curated to extract their potential
— Many research communities do not have the necessary tools to transform ever-growing data into scientific knowledge
In science: Not just “big players” – Long Tail of Science
Large Collaborations (e.g. at CERN)
DNA sequencing
And many more!!!
Engineering
Transportation
Slide 34
Goal:
— Catalog the unique genetic endowment and diversity present in all living bats
— In order to:
understand the molecular basis of their unique adaptations
link genotype with phenotype
uncover their evolutionary history
better understand, promote, and conserve bats.
Real world example: Platinum genome assemblies – The Bat1K project
Collaboration with MPI-CBG: Gene Myers, Martin Pippel
Slide 35
Multiple DNA sequencing technologies
Avg. Length Application
20 - 40kb
PacBio long reads
Full Chromosomes
Hi-C read pairs
1. Genome Assembly based on noisy long reads
2. Scaffolding: order and orient contigs by using multiple sequencing technologies with increasing long-range information
Contig
150 - 400kb
Bionano Optical Maps
50 - 200kb
10x Genomics read clouds
CMAP Multi Mb
Slide 36
Assembly pipeline: runtime
Read Patching: Detect and correct sequencing artifacts within PacBio reads, e.g. chimeras, missed adapters, low-quality read segments.
Genome Assembly: Calculate local alignments between patched reads, followed by several overlap scrubbing phases and generation of an overlap graph. Contigs are generated by touring the overlap graph.
Error Correction: Correct base errors and perform haplotype phasing using PacBio reads and 10x read clouds.
Scaffolding: Order and orient contigs into chromosomes using Bionano optical maps and long-range Hi-C read pairs.
Mapping complex workflows while keeping overall performance is not trivial: the workflow contains some very long and some very short tasks.
Slide 37
Disruptive Changes due to Hardware Innovation (1994-2018)
Phone of 1994 (Nokia 2110)
Monochrome display (4x13 chars)
20 physical keys
Micro controller for user interface
125 phone book entries
(SMS)
---
---
---
→ Smartphone of 2018 (Galaxy S9)
→ Super AMOLED display, 2960 x 1440 pixels
→ 5.8-inch touchscreen
→ Octa-core, 2.7 GHz and 1.7 GHz
→ 4 GB RAM, 64 GB storage
→ Permanent Internet connectivity
→ 8.0 MP front, 12.0 MP rear camera
→ Wi-Fi, Bluetooth, NFC, location (GPS, Glonass, Beidou, Galileo)
→ compass, barometer, accelerometer, gyroscope
Slide 38
Today’s phones
see (multiple cameras, ambient light sensor)
hear (multiple microphones)
feel (touchscreen, accelerometer)
are aware of their position, orientation, and 6-axis movement in 3D space (GPS, compass, barometer, gyroscope, accelerometer)
are permanently connected to the Internet (including Cloud services and other devices)
cannot taste or smell (yet)
Past Hardware Innovation
Slide 39
What else can we expect?
— faster CPUs and GPUs
— faster network connectivity
— better auto-connection with more devices
— better cameras
— 3D displays and cameras
— wireless charging
— lower power consumption
More evolution, less revolution!
Past Hardware Innovation
Slide 40
— How to turn transistors into performance?
Processor Power
Slide 41
Processor Power
— Power consumption per processor (socket) has reached a manageable limit
— Hardly any frequency increase in recent years
⇒ Almost no more IPC improvement
— Transistor count still increasing
⇒ More parallelism
Slide 42
HPC Hardware
HPC systems
— High complexity of large scale systems
— Multiple hardware options
‘heavy’ nodes – large RAM and fast CPUs
Accelerators (hybrid systems)
Fast interconnect
Commodity systems
— “Isolated” systems, but interconnected (internet/network)
— Already some level of parallelism (will continue in future)
Supercomputer: 100,000+ cores
Cluster: 1000+ cores
Server: ~12-24+ cores
Notebook: 2-8 cores
Mobile: 2-8 cores
Spectrum: shared memory → distributed memory + network
Slide 43
— Exponential growth
— Increase in parallelism
— Different architectures
— Next level: 1 ExaFlop/s (10^18 floating-point operations per second)
— Currently, energy consumption is the next hurdle
HPC – Top 500
Source: http://www.top500.org/
Architecture classes shown: Vector; SMP; Accelerator; Cluster, Commodity
Slide 44
The “Big Data paradigm”: Why is not everything in HPC – or parallel computing …
Architecture
— Data-driven applications are not easily mapped onto HPC architectures
Data Sources
— HPC applications are also a source of Big Data, e.g. large-scale simulations
— Streaming applications were not well covered in the past by HPC architectures
— Limiting factors are not always of a technical nature
Storage
— Intermediate and temporary storage needs to be organized by users
— Lack of sophisticated middleware for data management and organization
HPC and Data Analytics
ScaDS Dresden/Leipzig
Slide 46
Motivation
Domain perspective:
— Specifics of data/information: formats, content, error handling
— Combine theory-driven models with experimental data (e.g. simulation vs. experiment)
— Knowledge often not well formalized (“in the expert’s head”)
— Little or no HPC background
HPC perspective
— Adaptation of workloads to larger infrastructures
— Optimization of workloads / (parallel) applications for the provided infrastructure
— Support for use of the hardware and software layers (parallel programming, filesystems, communication), but not for content
— Little or no domain knowledge
Domain Scientist HPC Expertise
Slide 47
Motivation
Slide 48
— Most important: bring experts together to investigate the requirements of data-intensive applications and derive solutions
— Connect experts and application domain scientists
Motivation
Domain Scientist HPC Expertise Service Center
Slide 49
— Competence Center for collaborative
Big Data driven research
— Established 2014 in Saxony
(TU Dresden, U. Leipzig,
MPI-CBG, IÖR, HZDR, UZF)
ScaDS Dresden/Leipzig
Motivation
Domain Scientist HPC Expertise Service Center
Slide 50
National Big Data Competence Center and Associated Partners
Focal point for new research activities
Specialists from computer & domain sciences
Collaborative big data research
Slide 51
Still growing network of interested parties: Contacts to industry and academia
Slide 52
Success Story ScaDS Dresden/Leipzig
National & international outreach & visibility: 200 keynotes/talks worldwide, 4 successful summer schools, 30 proven experts in guest program, 3 successful Big Data in Industry workshops
Many project acquisitions: > 11 million euros (Exploids, BIGGR, TIQ-Graph, KOBRA, MASI, GERDIE, EMUDIG4.0, ...)
Strong scientific output and competence (>200 publications): among others Big Graph Analytics, Sierra Platinum, CTS, data-intensive workflows for HPC, settlement recognition in historic maps, Interactive Multi-Scale Visualization, ...
Service Center for Big Data with high impact: numerous interdisciplinary big data application projects, industry collaborations, and transfer to industry
Successful training and education program “Big-Data-Schwerpunkt”: lectures, seminars, trainings, PhD seminars; hundreds of graduates with Big Data expertise (Master); >10 PhDs in Big Data close to finishing
Slide 53
Phase 2 of the center is currently running with an evolved research program
— Connect to many application areas
— Service Center as integrative component
— Strong network of internationally recognized experts (PI)
Slide 54
Scalable Architectures (Prof. Dr. Wolfgang E. Nagel)
— Integration of hardware features into application layer
— Agile provisioning of Big Data environments for analytics
— Performance investigations of community and general frameworks
Hardware-based Data Security (Prof. Dr. Martin Bogdan)
— Architecture and implementation of a verification system for efficient key exchange in secure communication (e.g. IoT applications)
Scalable and secure Data Platforms
www.scads.de
Slide 55
— Information extraction from partially structured data (Prof. Dr. Wolfgang Lehner)
— Graph-based and privacy-preserving data integration (Prof. Dr. Erhard Rahm)
— Analytics of dynamic graph data (Prof. Dr. Erhard Rahm)
— Intelligent text analysis (Dr. Martin Potthast)
— Information extraction on super genomes (Prof. Dr. Peter Stadler)
— Data analytics for process models (Prof. Dr. Bogdan Franczyk)
Big Data Integration & Analytics
Slide 56
Scalable Visual Analytics (Prof. Dr. Gerik Scheuermann)
— Integration of annotated data in super genome browser and connection to visual analysis
— Modular extension for new data types
Immersive Visual Interaction (Prof. Dr. Stefan Gumhold, Prof. Dr. Raimund Dachselt)
— Methods for cross-scale and ensemble visualization
— Method development for data interaction on immersive large-scale displays and in virtual reality
— Interactive visual analysis using large scale
Scalable and secure Data Platforms
Visual Analytics
ZIH and HPC / Data Analytics Infrastructure at TU Dresden
Slide 58
— Central Scientific Unit at TU Dresden
— Running computing and communication infrastructure for the university
— Development of algorithms and methods: cooperation with users from all departments
— Providing infrastructure and qualified service for scientists all over Saxony
— Dresden CUDA Center of Excellence
— Dresden Intel® Parallel Computing Center (IPCC)
— Competence center for “Parallel Computing and Software Tools”
— Competence center for Big Data – ScaDS Dresden/Leipzig
Center for Information Services and HPC (ZIH)
Slide 59
— Research topics
Scalable software tools to support the optimization of applications for HPC systems
Data intensive computing and data life cycle
Performance and energy efficiency analysis for innovative computer architectures
Distributed computing and cloud computing
Data analysis, methods and modelling in life sciences
Parallel programming, algorithms and methods
— Adoption and preparation of new concepts, methods, and techniques
— Teaching and Education
Areas of Expertise
Slide 60
Data Center and High Performance Computing for Saxony
Data center design innovation
— Innovative cooling
— Energy efficiency
— Reliability
Open for research collaborations
New hardware (10 million €) for machine learning and Big Data
Slide 61
— HPC system design expertise
— Focussing on data intensive tasks for more than 15 years
— HRSK-I
Two machines, one HPC, one Throughput, one Capability
Lots of tape drives to move data in and out (SGI CXFS), almost 2 GB/s to tape in 2006
— HRSK-II
Island concept with HPC and Throughput
High I/O bandwidth
HDD+SSD file system, 100 GB/s to disk and lots of IOPS
— And now: HPC-DA – new hardware (10 million €) for machine learning and Big Data
Data Intensive Computing at ZIH
Slide 62
Data Center and High Performance Computing for Saxony
Bull Cluster (Taurus)
Petaflop cluster Bull, bullx DLC B720/R400
— ~ 44,000 cores Intel
— 256 GPUs Nvidia Tesla K80 +
— 44 GPUs: Nvidia Tesla K20
— 136 TB RAM, >5 PB scratch file system
Extensions for Machine Learning (10 million € extension)
— 22 nodes IBM Power9 CPU (44 cores), 6 Nvidia V100 per node
— NVLink between GPUs and CPUs with 100 GB/s bi-directional bandwidth
— 612 nodes (x86-64) within Taurus (Data Analytics Island) have a high-bandwidth connection to the NVMe-based storage component with up to 1.5 TB/s bandwidth
Slide 63
HPC-DA extension towards extremely fast I/O
— Redesigned one compute island of HRSK II
— Strong focus on highest bandwidth and low latency
— 612 CPU compute nodes
— 22 new Machine Learning Nodes IBM AC922
Each: 2x Power9 CPUs, 6x NVIDIA V100 GPUs, NVLink
Being extended to 32 nodes (192 V100), acceptance in preparation
— 90 NVMe storage nodes (2 PB PCIe, 2 TB/s)
Each node with eight 3.2 TB PCIe x4 NVMe cards
Dual-link EDR IB, NVMe over fabrics
— 10 PB Object Storage with 50 GB/s bandwidth
HPC-DA Extensions 2018/19
[Figure: the new Data Analytics Island. An island switch connects 612 CPU nodes (24-core Haswell), 90 NVMe storage nodes (2 PB PCIe NVMe), and 22 IBM AC922 ML nodes (2 Power9 CPUs, 6 NVIDIA V100 each); link labels: 2 TB/s, 1.5 TB/s, 0.4 TB/s. Core switches (500 GB/s) connect the island to the other compute nodes and to the 10 PB object storage (50 GB/s).]
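The headline numbers of the HPC-DA extension can be cross-checked with simple arithmetic. The sketch below uses only figures from this slide, plus one labeled assumption: that an EDR InfiniBand link carries a nominal 100 Gbit/s. All variable names are illustrative.

```python
# Figures taken from the slide.
STORAGE_NODES = 90
CARDS_PER_NODE = 8
CARD_CAPACITY_GB = 3200          # 3.2 TB per NVMe card
ML_NODES_AFTER_EXTENSION = 32
GPUS_PER_ML_NODE = 6

# Assumption (not stated on the slide): one EDR InfiniBand link carries
# a nominal 100 Gbit/s, so a dual-link node has 200 Gbit/s.
EDR_LINK_GBIT_S = 100
LINKS_PER_NODE = 2

# Raw flash capacity of all storage nodes, in TB.
total_capacity_tb = STORAGE_NODES * CARDS_PER_NODE * CARD_CAPACITY_GB // 1000

# GPU count once the ML partition has been extended.
total_gpus = ML_NODES_AFTER_EXTENSION * GPUS_PER_ML_NODE

# Nominal aggregate network bandwidth of all storage nodes, in GB/s.
aggregate_net_gb_s = STORAGE_NODES * LINKS_PER_NODE * EDR_LINK_GBIT_S // 8

print(total_capacity_tb, total_gpus, aggregate_net_gb_s)
```

The results (2304 TB, 192 GPUs, 2250 GB/s nominal) are consistent with the quoted "2 PB", "192 V100", and "2 TB/s"; real sustained rates depend on protocol overhead and the storage software.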
Slide 64
— Installation is open to scientists from all over Germany whose HPC and Big Data use cases can benefit from HPC-DA
— Overall more than 35,000 cores, additionally:
2 petabytes of flash memory (bandwidth of about 2 terabytes/s)
Object storage of 10 petabytes
IBM Power9 nodes (22), each with six Nvidia V100 GPUs, closely connected to the fast storage systems
— Scalable virtual research environments tailored to user requirements
— Project proposals can be submitted:
https://tu-dresden.de/zih/hochleistungsrechnen/zugang/hpc-da
HPC and Data Analytics (HPC-DA)
Slide 65
Reading through 1 PB of data: 500 seconds
Move data to Dresden via DFN: 36 TB/hour
Archiving 1 PB: 6 hours
Data Center and High Performance Computing for Saxony
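The three figures on this slide translate into sustained data rates as follows. This is a back-of-the-envelope sketch assuming decimal units (1 PB = 10^15 bytes); the variable names are made up for the example.

```python
PB = 10**15
TB = 10**12
GB = 10**9

# Reading 1 PB in 500 seconds.
read_bw = PB / 500                      # bytes per second

# Moving data via DFN at 36 TB per hour.
dfn_bw = 36 * TB / 3600                 # bytes per second
dfn_hours_per_pb = PB / (36 * TB)       # hours needed to move 1 PB

# Archiving 1 PB in 6 hours.
archive_bw = PB / (6 * 3600)            # bytes per second

print(f"read:    {read_bw / TB:.1f} TB/s")
print(f"DFN:     {dfn_bw / GB:.1f} GB/s, {dfn_hours_per_pb:.1f} h per PB")
print(f"archive: {archive_bw / GB:.1f} GB/s")
```

Reading at 2 TB/s matches the quoted bandwidth of the NVMe storage; the wide-area link at 10 GB/s is the slowest of the three paths, so moving 1 PB to Dresden takes more than a day.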
Slide 66
What do Users do on our Machines?
Slide 67
Data and HPC – Convergence Patterns at ZIH
[Diagram: convergence of data analytics and HPC at ZIH. Elements shown: hardware resources (HPC, HTC, NVRAM, ML) providing compute and memory; memory virtualization and compute virtualization; classical HPC; Lustre; Flink, YARN, …; federation, abstraction, and services; virtual research environments for simulation, analysis, throughput, and streams/data. Legend: HPC-DA hardware, HPC-DA software, HRSK-II.]
Slide 68
Data and HPC – Convergence Patterns at ZIH
[Same diagram as on Slide 67, highlighting: Classic HPC]
Slide 69
Data and HPC – Convergence Patterns at ZIH
[Same diagram as on Slide 67, highlighting: Add system features via virtualization layer]
Slide 70
Data and HPC – Convergence Patterns at ZIH
[Same diagram as on Slide 67, highlighting: Enhance software stack up to complete unique software settings]
Slide 71
— Provisioning of required environments (Hadoop, Spark, Flink, ML frameworks, …)
— Big Data session created on demand
— Run directly as analytics service at the HPC site
— Adaptable to other frameworks/applications
Provision of data analytics@HPC
Slide 72
Beyond usual HPC job scheduling
— Running jobs and workflows on this infrastructure is complex
— Provide software environments and templates for primary use cases
Complex scheduling @ heterogeneous hardware
Slide 73
NVMe leases
— Provision NVMe devices according to the optimal access model (parallel FS, read-only FS, HDFS, database, raw block devices, …), with several templates to start from
— Either on the NVMe host or mounted to compute nodes via NVMe-over-Fabrics
— Flexible NVMe device assignment, not tied to one compute node as in burst buffers
— Users allocate NVMe devices exclusively for their working data set as an “NVMe lease” over medium-term periods (days to weeks)
Provisioning of NVMe nodes
Evacuate and restore
— NVMe leases may be evacuated to the object storage and restored later
Performance Investigations
Slide 75
— Stage-in and Stage-out of complete research environments
— Analysis Tools
BYO
— Our own projects
ADA-FS, HP-DLF, NextGenIO
Vampir, ProPE, Score-P
ScaDS Dresden/Leipzig
Versatile Support
Slide 76
cuBLAS on V100 (First shot on Taurus) – V100 GPU performance at 6.4 TFlops sustained (fixed clock frequency)
Linear Algebra Performance Insights on Power9 and NVIDIA V100
Slide 77
cuBLAS on V100 (First shot on Taurus) – Copying data and its impact on performance
Linear Algebra Performance Insights on Power9 and NVIDIA V100
Slide 78
cuBLAS on V100 (Best results on Taurus) – Sustained performance @ 7.0 TF and ~5.7 TF (incl. copy)
Linear Algebra Performance Insights on Power9 and NVIDIA V100
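Sustained TFlops figures for matrix multiplication are usually derived from the operation count of the GEMM and the measured runtime. The sketch below assumes the common convention of counting 2·m·n·k floating-point operations; the slides do not state how their numbers were computed, and the function names are illustrative.

```python
import math


def gemm_flops(m: int, n: int, k: int) -> int:
    """Operations for C = A (m x k) times B (k x n): one multiply and one
    add per inner-product term (the usual 2*m*n*k convention)."""
    return 2 * m * n * k


def sustained_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in TFlops for a GEMM that took `seconds` to run."""
    return gemm_flops(m, n, k) / seconds / 1e12


def seconds_at_rate(m: int, n: int, k: int, tflops: float) -> float:
    """Runtime implied by a given sustained rate."""
    return gemm_flops(m, n, k) / (tflops * 1e12)


# Runtime of an 8192 x 8192 x 8192 multiplication at the 7.0 TFlops
# quoted on the slide (kernel only, without data copies).
t_kernel = seconds_at_rate(8192, 8192, 8192, 7.0)
```

At 7.0 TFlops such a multiplication takes roughly 0.16 seconds; including host-device copies in the timed region lowers the effective rate, which is the effect the "incl. copy" figure captures.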
Slide 79
cuBLAS on V100 (Best results on Taurus) – Close-up reveals performance pattern (every 64 operands)
Linear Algebra Performance Insights on Power9 and NVIDIA V100
Slide 80
cuBLAS on V100 (Best results on Taurus) – Repetitive pattern also visible for the data copy case
Linear Algebra Performance Insights on Power9 and NVIDIA V100
Slide 81
cuBLAS on V100 with varying precision – Tensor cores: matrix dimension should be multiple of 8
Linear Algebra Performance Insights on Power9 and NVIDIA V100
Slide 82
cuBLAS on V100 with varying precision – Fall-back to non-tensor math if dim. not multiple of 8
Linear Algebra Performance Insights on Power9 and NVIDIA V100
Slide 83
cuBLAS on V100 with varying precision – Fall-back to non-tensor math if dim. not multiple of 8
Linear Algebra Performance Insights on Power9 and NVIDIA V100
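The multiple-of-8 rule can be checked, and worked around by padding, before a matrix is handed to the GPU. The following is a minimal sketch with hypothetical helper names; zero-padding up to the next multiple of 8 is a common workaround and an assumption here, not something the slides prescribe.

```python
def round_up(n: int, multiple: int = 8) -> int:
    """Smallest multiple of `multiple` that is >= n."""
    return -(-n // multiple) * multiple


def tensor_core_friendly(m: int, n: int, k: int, multiple: int = 8) -> bool:
    """True if all three GEMM dimensions are multiples of `multiple`."""
    return all(d % multiple == 0 for d in (m, n, k))


def padded_dims(m: int, n: int, k: int, multiple: int = 8):
    """Dimensions after zero-padding each one to the next multiple."""
    return tuple(round_up(d, multiple) for d in (m, n, k))


dims = (1000, 1001, 1000)
ok_before = tensor_core_friendly(*dims)
dims_padded = padded_dims(*dims)
ok_after = tensor_core_friendly(*dims_padded)
```

Padding costs a small amount of extra memory and arithmetic, which may be acceptable when it avoids the fall-back to non-tensor math shown on the slide.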
Early results on Power9 and V100
Slide 85
Taurus vs. Power9 + V100 (no TensorCores)
Performance Insights on Power9 and NVIDIA V100
Slide 86
Performance differences
V100 + Power9
— ~ 7.8 TF sustained per GPU (double)
— ~ 6.5 TF sustained with copy
V100 + Taurus
— 7.0 TF
— ~5.7 TF with copy
Similarities
— Performance jumps for problem sizes > 5,500 (?)
Performance Insights on Power9 and NVIDIA V100
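The cost of including data copies can be expressed as the fraction of kernel-only throughput that remains. The sketch below uses only the four numbers from this slide; the dictionary layout and names are illustrative.

```python
# Sustained double-precision TFlops per GPU, as quoted on the slide.
results_tf = {
    "V100 + Power9": {"kernel": 7.8, "with_copy": 6.5},
    "V100 + Taurus": {"kernel": 7.0, "with_copy": 5.7},
}


def retained_fraction(kernel: float, with_copy: float) -> float:
    """Share of kernel-only throughput left once copies are included."""
    return with_copy / kernel


retained = {
    name: round(retained_fraction(**r), 2) for name, r in results_tf.items()
}

# Kernel-only advantage of the Power9 host system over Taurus.
host_speedup = round(
    results_tf["V100 + Power9"]["kernel"] / results_tf["V100 + Taurus"]["kernel"], 2
)
```

Both systems keep a bit more than 80% of their kernel-only rate when copies are included, and the kernel-only rate on the Power9 nodes is about 11% higher.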
Slide 87
hgemm on Power9 (+TensorCores) and CUDA 9.2
Peak: ~90 TFlops, with memory transfer: ~60 TFlops
Performance Insights on Power9 and NVIDIA V100
Slide 88
hgemm on Power9 with TensorCores and CUDA 10.0
Performance Insights on Power9 and NVIDIA V100
CUDA 10 required to exceed 100 TFlops. Jump @28000?
Slide 89
hgemm on Power9 with TensorCores and CUDA 10.0
Performance Insights on Power9 and NVIDIA V100
CUDA 10 required to exceed 100 TFlops. Jump @28000?
Outlook – Future Perspective
Slide 91
Machine learning at scale is only successful
— if there is enough data to learn from
— if the data is understood
— if the data is of high quality
— Our topics are essential for ML/AI
Outreach to support Big Data and Machine Learning communities in Germany
Germany-wide offer: HPC-DA infrastructure and expertise
ScaDS Dresden/Leipzig is an important part of the Big Data / AI community in Germany
Big Data and Machine Learning in Germany
Slide 92
Long-term continuation of ScaDS Dresden/Leipzig as a Germany-wide center for data analytics and artificial intelligence
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
Data centric research
(Big Data) AI algorithms
Knowledge representation
ScaDS Dresden/Leipzig: Big Data Competence & Service Center
ScaDS.AI Dresden/Leipzig
Slide 93
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
ScaDS Dresden/Leipzig: Big Data Competence & Service Center
Knowledge
AI Foundations
Applied AI
Basic methodical research for AI
Formal methods for content description and semantics
Application into domain fields
Slide 94
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
Knowledge
AI Foundations
Applied AI
Machine Learning for Graph Data
Neuro-inspired AI Methods
Privacy-Preserving Machine Learning
AI-Driven 3D Reconstruction
Explanations for Trusted AI
ScaDS Dresden/Leipzig: Big Data Competence & Service Center
Slide 95
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
ScaDS Dresden/Leipzig: Big Data Competence & Service Center
Knowledge
AI Foundations
Applied AI
Knowledge-aware computing
Knowledge Graphs for AI
Scalable Training Data Acquisition
Conversational AI
Slide 96
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
ScaDS Dresden/Leipzig: Big Data Competence & Service Center
Knowledge
AI Foundations
Applied AI
AI for Security
Data Science for Biomedical Applications
Solving social problems by AI
Hyperspectral Imaging
Slide 97
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
ScaDS Dresden/Leipzig: Big Data Competence & Service Center
Knowledge
AI Foundations
Applied AI
Slide 98
Extensions towards center for AI – ScaDS.AI Dresden/Leipzig
The proposed extension has been approved by external reviewers, and the center will be extended into a Big Data / AI competence center starting in Q4/2019!
Slide 99
Thank You!
Acknowledgements: Holger Brunst, Robert Dietrich, René Jäkel, Michael Kluge, Andreas Knüpfer, Ulf Markwardt, Hartmut Mix, Eric Peukert, Erhard Rahm, Robert Schöne, Sunna Torge … and many more