INTEL® HPC DEVELOPER CONFERENCE FUEL YOUR INSIGHT

JUPYTER: PYTHON, JULIA, C, AND MKL HPC BATTERIES INCLUDED

Oleg Mikulchenko

Intel Corporation

November 2016

Agenda

§  Motivation (“Insight, not numbers”)

§  Use case 1: Python, Julia, and C in Jupyter

§  Use case 2: MKL in Jupyter (on Knights Landing)

§  Use case 3: Knights Landing acceleration and SW infrastructure

§  Call to action

MOTIVATION

“The purpose of computing is insight, not numbers.” Richard Hamming

We need a strategy for getting from numbers to insight

Numbers to Insight Strategy

Get the most insight

Make application development productive

Make software development productive

Use optimized high-performance software

Use optimized high-performance hardware

C/C++, Fortran libraries/blocks: Intel® MKL, IPP, TBB, DAAL, …

Productivity languages: Python, Julia, MATLAB, …

Frameworks/toolboxes: Caffe, Theano/Keras, TensorFlow, …

Live books: Jupyter

API

Kernels

Jupyter: Intro in 2 minutes

The Jupyter Notebook is an interactive computing environment that enables users to author notebook documents (web applications) that include:

§  Live code (Julia, Python, R, and more: 50+ languages, language agnostic)

§  Interactive widgets

§  Plots

§  Narrative text

§  Equations

§  Images

§  Videos

Jupyter: Intro in 2 minutes

JupyterHub is a multiuser version of the notebook, designed for centralized deployments:

§  Pluggable authentication

§  Collaboration with others through the Linux

§  Deployments for all users on centralized servers (on-site or off-site)

§  Container (Docker) friendly – facilitates scaling and process isolation

§  Code meets data – locate notebooks where the data is

§  Very popular with many deep learning frameworks

§  Likely HPC ready

USE CASE: JULIA, PYTHON, AND C IN JUPYTER

Jupyter is a web

[Annotated notebook screenshot. Labeled regions: header and GUI, text headers, code, outputs, text]

High level preparation

[Code screenshot. Labeled steps: prepare arguments, run in parallel, put content]

C call in one line

[Code screenshot. Labels: C call, conversion to Julia array, C function, library, format, arguments]

Fast, small overhead
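The slide shows the one-line C call in Julia, and that code is not preserved in the source. For comparison, here is a minimal Python sketch of the same idea using the standard `ctypes` module. It assumes a POSIX system where the C math library can be located, and the choice of `cos` as the C function is purely illustrative, not taken from the slides.

```python
import ctypes
import ctypes.util

# Locate the C math library. If it cannot be found, CDLL(None) falls back
# to the symbols already loaded into the Python process (POSIX assumption).
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Describe the C signature: double cos(double)
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

# The C call itself is a single line
result = libm.cos(0.0)
print(result)
```

As in the Julia version, the pieces to specify are the function, the library, the return and argument formats, and the arguments themselves.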

High level processing in Julia

[Code screenshot. Labeled steps: define function, call function]

Fast, almost as fast as C (~0.5x)

High level plotting/processing in Python

Get insights on run length distribution
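The slide's own plotting code is not preserved in the source. As an illustration of the kind of processing behind a run length distribution, here is a minimal Python sketch using only the standard library. The helper name `run_length_distribution` and the sample bit sequence are hypothetical.

```python
from collections import Counter
from itertools import groupby


def run_length_distribution(bits):
    """Count how often each run length occurs in a sequence of symbols.

    A "run" is a maximal stretch of identical consecutive symbols.
    Hypothetical helper for illustration; not taken from the slides.
    """
    return Counter(sum(1 for _ in run) for _, run in groupby(bits))


# Made-up sample data; its runs are 11, 0, 111, 00
sample_bits = [1, 1, 0, 1, 1, 1, 0, 0]
distribution = run_length_distribution(sample_bits)
for length in sorted(distribution):
    print(f"run length {length}: {distribution[length]} run(s)")
```

The resulting counts can be passed directly to a plotting library as a bar chart of run length against frequency.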

Jupyter+Julia+C+Python Example: Takeaways

§  Highly interactive work inside Jupyter to explore model features

§  Julia for glue and for productive, fast custom processing

§  C library for low-level, heavy-duty, fastest computation

§  Python for productive processing (excellent libraries) and plotting

§  Choice and mix of languages is up to the user; Jupyter is language agnostic

§  All together: clean, productive work, fast computing, and insights

USE CASE: MKL IN JUPYTER (ON KNIGHTS LANDING)

MKL Usage Scenarios

§  MKL (Math Kernel Library) – high-performance library for basic math functions

–  BLAS, LAPACK, FFT, vector functions (VML), and vector random number generators (VSL)

–  C/C++ and Fortran APIs

§  (Tentatively) most productive usage – Intel Distribution for Python, with Continuum (Anaconda), with hooks to MKL, IPP, DAAL

–  Seamlessly installs the Intel Python Jupyter toolbox (use as above)

–  Near-native (C, Fortran) performance

§  Direct API calls to MKL functions from Jupyter can be beneficial

–  Getting true native performance

–  Controlling more low-level details
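To make the direct API call scenario concrete, here is a minimal Python sketch that calls the standard CBLAS routine `cblas_ddot` through `ctypes`. It is not the presenter's code. It assumes MKL's single dynamic library can be found under the name `mkl_rt` and uses the default LP64 interface (32-bit integers); when the library is absent, it falls back to pure Python so the sketch still runs. The helper names `load_mkl` and `ddot` are hypothetical.

```python
import ctypes
import ctypes.util


def load_mkl():
    """Try to load the MKL runtime library; return None if it is absent."""
    path = ctypes.util.find_library("mkl_rt")  # assumed library name
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


def ddot(x, y):
    """Dot product via cblas_ddot when MKL is available, else pure Python."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    mkl = load_mkl()
    if mkl is None:
        # Fallback so the sketch runs on machines without MKL
        return float(sum(a * b for a, b in zip(x, y)))
    n = len(x)
    double_array = ctypes.c_double * n
    # Assumption: LP64 interface, so MKL_INT maps to a 32-bit C int
    mkl.cblas_ddot.argtypes = [
        ctypes.c_int,
        ctypes.POINTER(ctypes.c_double), ctypes.c_int,
        ctypes.POINTER(ctypes.c_double), ctypes.c_int,
    ]
    mkl.cblas_ddot.restype = ctypes.c_double
    return mkl.cblas_ddot(n, double_array(*x), 1, double_array(*y), 1)


print(ddot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Declaring `argtypes` and `restype` explicitly is the low-level control the slide refers to: the caller decides the data layout and strides rather than a wrapper library.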

MKL Function Call From Jupyter Example: C API

§  MKL parallel vector random number generators for Monte Carlo simulators

§  MKL VSL functions come with examples, so an MKL function is easy to wrap in C
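The pattern behind a parallel Monte Carlo simulator is one private random stream per worker. The sketch below illustrates that pattern with Python's standard `random` module as a stand-in; it does not use MKL VSL, and the pi estimate, the seeds, and the function names are all hypothetical. Seeding each stream differently is a simplification, and threads in CPython do not give a real speedup here, so this shows structure only.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed, n_samples):
    """Count random points of the unit square inside the quarter circle."""
    rng = random.Random(seed)  # private generator, one per worker
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_workers=4, n_samples=50_000):
    """Combine per-worker hit counts into a single Monte Carlo estimate."""
    seeds = range(1, n_workers + 1)  # hypothetical seeds, one per stream
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        hits = sum(pool.map(count_hits, seeds, [n_samples] * n_workers))
    return 4.0 * hits / (n_workers * n_samples)


estimate = estimate_pi()
print(f"estimate: {estimate:.4f}, error: {abs(estimate - math.pi):.4f}")
```

Because every worker owns its generator, the result is reproducible for a fixed set of seeds regardless of how the workers are scheduled.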

MKL Function Call From Jupyter Example: Data

[Plots: yield plot, correlation plot]

Quick sanity check: checked

Final plot: insights

Parameter loop: 10 simulations, 1e12 bits each, < 1 hour on KNL

USE CASE: KNIGHTS LANDING: ACCELERATION AND SOFTWARE INFRASTRUCTURE

KNL Xeon Phi CPU Experience (Real, Hands On)

§  KNL is a general-purpose CPU – any software development you can run on Xeon/i7 you can also run on KNL

–  Web browsing, Eclipse, Intel Parallel Studio, …, you name it

§  Serial tasks run slower on KNL than on Xeon/i7, but not terribly slow

§  Highly parallel tasks run several times faster than on Xeon/i7 (same optimized SW)

§  Binary compatible with Intel 64

–  Download for Xeon/i7, run on KNL

–  Compile on/for Xeon/i7, run on KNL

§  For typical OpenMP tasks (as above, MKL), ~3x acceleration is observed on KNL vs. Xeon/i7 Haswell out of the box; more can be gained by tuning (AVX-512, etc.)

§  The full Intel Python distribution, Julia, and Jupyter install seamlessly on KNL

KNL Xeon Phi CPU Experience – Simple Tuning

§  Explicitly use only MCDRAM memory for an application

–  Use NUMA control: “numactl --membind 1 app_name”

–  That gives ~7x acceleration with just 3 more words (for the parallel Monte Carlo example above with MKL; results may vary for different tasks, as shown in a presentation from NERSC)

§  Use SIMD-friendly functions from MKL

–  Combined with NUMA control, for the examples above that provides ~9x acceleration
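The NUMA control step can also be driven from a notebook cell. The following minimal Python sketch builds the same `numactl --membind 1 app_name` command shown on the slide and launches it only when both programs exist. The helper name `mcdram_command` is hypothetical, `app_name` is the slide's placeholder, and treating NUMA node 1 as MCDRAM follows the slide; the node number depends on how the machine is configured.

```python
import shutil
import subprocess


def mcdram_command(app_args, node=1):
    """Prefix a command so that its memory is bound to one NUMA node."""
    return ["numactl", "--membind", str(node)] + list(app_args)


# "app_name" is the placeholder used on the slide, not a real program
command = mcdram_command(["app_name"])
print(" ".join(command))

# Launch only when both numactl and the application are on the PATH
if shutil.which("numactl") and shutil.which(command[-1]):
    subprocess.run(command, check=False)
```

Running `numactl --hardware` on the target machine lists the available nodes and their sizes, which helps confirm which node is the MCDRAM before binding to it.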

Call to Action

§  Jupyter is evolving – keep an eye on, and help with, its next incarnation: JupyterLab

–  More interactive widgets for interactive work and applications

–  More integration with OS/shell

–  IDE for code debugging – removing the need for other IDEs?

§  Somewhat better support for Xeon Phi is needed

§  What else is needed from HPC?

–  HPC community feedback is welcome

THANK YOU FOR YOUR TIME

Oleg Mikulchenko

oleg.mikulchenko@intel.com

www.intel.com/hpcdevcon
