49
HPC and Big data convergence Serge G. Petiton, Université Lille 1 and CNRS, France, Nahid Emad, University of Versailles, France, Laurent Bobelin, Universite François Rabelais, France,

HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

HPCandBigdataconvergence

Serge G. Petiton, Université Lille 1 and CNRS, France,Nahid Emad, University of Versailles, France, Laurent Bobelin, Universite François Rabelais, France,

Page 2: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 3: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 4: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Introduction• HPC

– simulation:sizeofdataincreasing,newdomains,stability,….

– globalreductionproblemsandcommunicationproblems

– andallthewell-knownproblemsdiscussedthisweekatCSE2017…..

• BigData:

– enoughdatatoavoidstatisticalmodels?

– dataranking

– map-reduceoperations(indexing,….)

• HPCandBigdata:convergence?Cooperation?Mutualexchanges?

– Jointedproblems:energyconsumption,I/Ominisation,resilience,.......

– Nevertheless,theintrinsicbehaviorsarequitedifferent

”convergence”HPC-Big Data:Whatlanguages,programmingparadigms,systems… forwhatmethods,sciencesandfuturedevelopments

April5th 1stFJGworkshop

Page 5: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 6: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Aggregation of Google search data to estimate currentflu activity in near real-time

Statistic model free - Big Data: flu epidemic

Page 7: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshop

AYutong Lu’sslide

StackoftheTianhe supercomputers:

April5th

MPI/OPenMP +hadoop/Sparkonsupercomputersoftwarestack

Page 8: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshop

HaadoopandSparkOnTianhe 2

WithMPIandOpenMP

April5th

Askingalsoforsmartermiddlewareonsuchmachine)

Page 9: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

April5th 1stFJGworkshop

BUT Nowadays HPC and Big Data platforms are different:• HPC Storage = expensive and interconnected, Big Data = local, cheap

disks• HPC and Big Data platforms are not of the same scale:

billion of cores vs. thousands of machines*

HPC tries to make full use of CPU/GPU, Big Data isfocused on I/O.

Nowadays HPC and Big Data relies on different theoreticalground:• HPC performs most of the time a mix of task and data parallelism• Inter-tasks communications are really different.• Big Data have a tendency to treat all nodes as equals.• MapReduce for example has good scaling property when all nodes perform a

Map simultaneously.• Big Data relies on statistics and good probabilistic properties of data

distribution, while HPC code explicitly distribute data

Things are fast changing and some Big Data environment (Hadoop for example) now include more support for task parallelism, but there's still an enormous gap between the two.

Page 10: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshop

WhatHPC/BigDataApplications?Andwhatprogrammingparadigms?AndwhataboutDataRankingandHPCwith”BigData”?

April5th

Nowadays HPC and Big Data stacks are different:

• Performances: Big Data exhibits larger latency compared to HPC. BD scaling implies often larger tasks.

• HPC is optimized since almost 50 years, 10 years for BigData• HPC code is often optimized for its hardware platform, while BigData is not that

deep into the architecture• Languages and philosophy used in both may be different

Many projects have failed due to the complexity of integrating both stacks.

Some projects so far have addressed the problem – mainly based on Hadoop + MPI with no real large adoption:

• Integrating HPC tools into Hadoop environment (Hamster, ...): most of the time not accurate due to application latency, disappointing results

• Deploying Hadoop on Demand on HPC resources (HoD,...): most of the time not accurate because of hardware requirements

Page 11: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 12: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Flu epidmic : Proposed PageRank-like model (Nahid Emad)

PageRank-like model PageRank modelAn individual in a social graph A webpage in a web graphe

A virus A walker

Propagation of the virus Promenade of the walker

PageRank of an individual is the probability to be infected by the virus in the course of epidemic

PageRank of a specific page is the probability of the presence of the walker on the page

Withamathematicalformalismanddominanteigenvectorcomputation

Weranktheindividualwithrespecttotheviruspropagation

Page 13: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Mathematical formalismFrobenius theorem à λ=1 is the largest eigenvalue of the matrix P.Then, there is a stationary distribution for the final state of epidemicspread: Px=x.xi the probability that individual i be infected during epidemicx= (x1, x2 ,.., xn) the stationary distribution (infection vector) for thewhole population is independent of starting distribution and verifiesPx=x.The impact of infection vector x in social graph is similar to that ofPageRank vector in web graph.

Problem Solution

Dangling individual add a loop to itself

Small worldnon-uniqueness of ranking vector

add a jumping vector to the random virus propagation process

Page 14: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Computational algorithms

A = αP + (1-α)vzT

A is disease transition matrixv is the teleportation vectorz is the vector (1, …, 1)T

α (<1) damping factor1-α jumping rate; the probability for the virus to jump from any individual to any other individual in a social graph.

Then,computationofthedominanteigenvectorofA(verylarge,non-Hermitianandverysparse)

Page 15: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Computational algorithms: MIRAM

initial guess w3

Projection of(Pn) to Km3

Solving (Pm3)

Return in ¢n

initial guess w1

Projection of(Pn) to Km1

Solving (Pm1)

Return in ¢n

initial guess w2

Projection of(Pn) to Km2

Solving (Pm2)

Return in ¢n

Asynchronous information exchange

With n=1011, c=103, m=103, if iter=100, b =10, MIRAM needs:iter ´ b =100 Exa flop (103 peta flop) and b ´ 10=100 peta bytes

Page 16: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

ExperimentsStochastic simulation using the infection vector

Time series of infection in an 7010-node power-law social graph ba, with ν= 0.2, δ= 0.24and x=5

Page 17: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Experiments

Convergencebehaviorforthe281903X281903Stanfordmatrix,α=0:85

Numberofmatrixvectorproductsforthe281903281903Stanfordgraph

Page 18: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

• 2 kg – more bacteria than human cells (60.1016)• An unknown organ: intestinal microbiota• Amount of sequence generated has increased 109 times in

20 years.

Big Data: microbiota (Nahid Emad with J.-M. BATTO - INRA MGP)

bacteria

protists

archae

virus

fungus

Page 19: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Big Data: microbiota

bacteria =~3000genes

parasites=~6000genes

Virus=~50genes

100 – 1000 individualsU

p to

10

mill

ions

of

gene

s

DNA preparation->Get Sequences->Compare to reference->Counting & analyzing

Page 20: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Big Data: microbiota

MetaProf→energy efficiency multiplied by 4.7 with the GPU implementation

genes

gene

s

Correlationmatrix

gene

s

Countingmatrix

Matrix106 genesby800samples

Samples

Principal Coordinates Analysis applied on the matrices of distances between samples,concentrating the major variations in the samples in a small space implies using manylinear algebra techniques. SVD computation of very large matrices

Numerical methods/algorithms & HPC techniques have to be defined/adapted to increase data-scalability.

Page 21: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

April5th 1st FJG workshop

Google

PageRankisoftenpartofalargerApplication

Page 22: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

April5th 1st FJG workshop

Indexing : MAP-REDUCE algorithm

Page 23: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

April5th 1st FJG workshop

Pagerank ranks all the pages : Computation of a non-Hermitian sparse matrix of size several billions, and growing

The ranking is done before the queries

Indexing : MAP-REDUCE algorithm

Page 24: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

April5th 1st FJG workshop

Pagerank ranks all the pages : Computation of a non-Hermitian sparse matrix of size several billions, and growing

The ranking is done before the queries

Indexing : MAP-REDUCE algorithm

Itexistsapplicationsmixingasynchronouslydifferentapproaches:• HPC• Indexing• DataOntology• …....

• Wehavedistributedasynchronousparallelcomputation

Page 25: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 26: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Applicationsmixinglinearalgebra(orHPC-likecomputation,includingPageRank,…)andMap-Reduce(orotherclassical“bigdata”algorithms

• Insequence• Inparallel,asynchrously• Iteratively

ItexistsnewapproachesforMap-Reduce(Yarn,Tez,…)

ItexistsnewlanguageanprogrammingparadigmsforHPC

Bothhavetomixdistributedandparallelcomputation,asynchronouslyornot

Wehavetotargetpost-petascale machineandfutureexascale computing

Wehavehavetoproposesolutionadaptedtosupercomputeranddatacenter

April5th 1stFJGworkshop

Page 27: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

April5th 1stFJGworkshop

Geoscience

Page 28: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Kirchhoff

April5th 1stFJGworkshop

Page 29: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Kirchhoff

April5th 1stFJGworkshop

Map-Reduce,orothersbigdataindexingorontologyanalysis

Page 30: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshop

Masnida Amami, Ali Setayesh, Nasrin Jaberi. Distributed Computing of Seismic Imaging Algorithms (Iran)

Rizvandi N B, Boloori A J, Kamyabpour N, et al.: MapReduce implementation of prestack Kirchhoff time migration (PKTM) on seismic data (Australia)

SparkversiondevelopedbyresearchersfromNanjingUniversity(China

April5th

Map-ReduceusetooptimizeI/OandcommunicationsforsomeKirchhoffmigrationmethod

Nevertheless,theMap-ReduceisdonebeforetheKirchhoffcomputationitself.

WhataboutcomputationalmethodsmixingMap-Reduce(orotherstypicalBigDataAlgorithms)and”classical”HPCcomputation(LinearAlgebra,FFT,convolution,…..)duringcomputation

• Mixinglanguage• Datastructure• Plateforms:clouds,grids,supercomputers,...• Systems,scheduling,....• I/O.....• .....

StartingfromtheBigdataside:

Firstexperiments,forsomeversionsofKirchhoff:

Page 31: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Yarn:containersContainersareclustersofnodes

Page 32: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Page 33: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Page 34: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Hardwarenodes

Node Manager’s containers

Node Managers

ResourceManager

Distributedandparalleltwolevelprogramming

WithDAG

Page 35: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Page 36: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Page 37: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Page 38: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

TEZ:DAGofMapReducetasks/jobs

Page 39: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 40: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Some elements on YML

• YML1 Framework is dedicated to develop and run parallel and distributed applications on Cluster, clusters of clusters, and supercomputers (schedulers and middleware would have to be optimized for more integrated computer – cf. “K” and OmnRPC for example).

• Independent from systems and middlewares– The end users can reused their code using another middleware– Actually the main system is OmniRPC3

• Components approach– Defined in XML– Three types : Abstract, Implementation (in FORTRAN, C or C++;XMP,..),

Graph (Parallelism)– Reuse and Optimized

• The parallelism is expressed through a graph description language, named Yvette (name of the river in Gif-sur-Yvette where the ASCI lab was). LL(1) grammar, easy to parse.

• Deployed in France, Belgium, Ireland, Japan (T2K, K), China, Tunisia, USA (LBNL, TOTAL-Houston).

1stFJGworkshop

Page 41: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Graph (n dimensions)of components/tasksYML

Begin nodeEnd nodeGraph node

Dependence

parcompute tache1(..);notify(e1);

//compute tache2(..); migrate matrix(..);notify(e2);

//wait(e1 and e2);Par (i :=1;n) do

parcompute tache3(..);

notify(e3(i));//if(I < n)then

wait(e3(i+1);compute tache4(..);notify(e4);

endif;//compute tache5(..); control robot(..);notify(e5); visualize mesh(…) ;end par

end do par//

wait(e3(2:n) and e4 and e5);compute tache6(..);compute tache7(..);

end par

Generic component node

April5th 1stFJGworkshop

Page 42: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

AbstractComponent<?xml version="1.0"?><componenttype="abstract" name="prodMat"description=“Matrix

MatrixProduct"><params><param name="matrixBkk"type="Matrix"mode="in"/><param name="matrixAki« type="Matrix"mode="inout"/><param name="blocksize"type="integer"mode="in"/></params>

</component>

Future:

<param name= "conv"type=" graph_param_float"mode= "inout"/>

1stFJGworkshop

Enduserswouldbeabletoaddsomeexpertisehere

<libraries=ScaLAPACk <NAG<PETSC>

Page 43: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Implementation Component<?xml version="1.0"?><componenttype="impl"name="prodMat"abstract="prodMat"description="Implementation

componentofaMatrixProduct"><impl lang="CXX"><header/><source><![CDATA[

int i,j,k;double**tempMat;//Allocationfor(k=0;k<blocksize ;k++)for(i=0;i<blocksize ;i++)

for(j=0;j<blocksize ;j++)tempMat[i][j]=tempMat[i][j]+matrixBkk.data[i][k]*matrixAki.data[k][j];

for(i=0;i<blocksize ;i++)for(j=0;j<blocksize ;j++)

matrixAki.data[i][j]=tempMat[i][j];//Desallocation

]]></source>

<footer /></impl>

</component>1stFJGworkshop

Endusersmayaddsomeexpertisehere(we’llseeexample)

<impl lang="XMP"nodes="CPU:(5,5)"libs=""><distribute>

<param template="block,block "name="A(100,100)" align="[i][j]:(j,i)" /><param template="block"name="Y(100);X(100)"align="[i]:(i,*)"/>

</distribute>

Aimplementationcomponentmaybedevelopedusingaparallellanguage(suchasXMP)

Page 44: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Graph component of Block Gauss-Jordan Method

44

1stFJGworkshop

Page 45: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

1stFJGworkshopApril5th

Tez and YML :• Share (more or less) the same DAG/Graph of Tasks vision• Each task in both may be parallel• Shares the same structure, with a two-part scheduler: one DAG-based part and

one responsible for resource allocation.• As Tez is maintaining a container pool, it can provide YML a stable platform.

It may not be perfect, but at least it can provide a prototype to have a better view of the inherent problems of scheduling

Page 46: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Graph (n dimensions)of components/tasksYML

Begin nodeEnd nodeGraph node

Dependence

parcompute tache1(..);notify(e1);

//compute tache2(..); notify(e2);

reduce tache8(…) ; notify(e8);//

wait(e1 and e2);Par (i :=1;n) do

parcompute tache3(..); notify(e3(i));

//if(I < n)then

wait(e3(i+1);compute tache4(..);notify(e4);

endif;//

wait (e8);map tache9(…..);

end parend do par

//wait(e3(2:n) and e4 and e5);compute tache6(..);compute tache7(..);

end par

Generic component node

April5th 1stFJGworkshop

Page 47: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

parcompute tache1(..);notify(e1);

//compute tache2(..); notify(e2);

mapreduce tache8(…) ; notify(e8);//

wait(e1 and e2);Par (i :=1;n) do

parcompute tache3(..); notify(e3(i));

//if(I < n)then

wait(e3(i+1);compute tache4(..);notify(e4);

endif;//

wait (e8);mapreduce tache9(…..);

end parend do par

//wait(e3(2:n) and e4 and e5);compute tache6(..);compute tache7(..);

end par1stFJGworkshop

Withinformations onthe“implementationcomponents”suchas“reduce”or“map”,,orothers

Itwouldbepossibletoavoidkeywordssuchas“compute”,“mapreduce”,“migration”orothersandtospecifythatontheabstractcomponents,butitwillbelessclearfortheend-userand/orprogrammer

Then,wemayhavegraphsoftask/componentmixingHPCandBidDatacomputation.

Thedifficultiesaretoefficientlyschedulethesetask/componentsandminimizecommunicationsandenergyconsumptionsandtoahaveagoodresilience

April5th

Page 48: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Outline

• Introduction

• BigDataand/versusHPC

• ThePageRankmethodandeigenvectorcomputations

• TheMap-Reducemethodanddistributedalgorithms

• ConvergencebetweenHPCandBigData

• Conclusion

1stFJGworkshopApril5th

Page 49: HPC and Big data convergence - events.science-japon.orgevents.science-japon.org/hpc17/slides/Serge Petiton... · HPC and Big data convergence Serge G. Petiton, UniversitéLille 1

Conclusion• Adaptedhighlevelprogrammingparadigmshavetobedeveloped,

allowingschedulerstomanageboththegraphsofcontrol-tasksandthedataflowgraphs;andallowingmixingHPCandBigDatacomputation

• Multi-levelprogrammingissharedbyHPCandBigData(graphadistributed”parallel”computation)

• ConvergenceofHPCandBigDatacomputationisanimportantchallenge:theimportantchallengeistoproposelanguage,programmingparadigmsandschedulerswhichwillallowtobeadaptedtoallthedifferentapproachesofthisconvergence.

• Applicationswithhugedatamodifiedduringcomputationaskingforoperationssuchasmap-reduceinparallelofclassicHPC wouldbemoreandmorepresentforthefutureimportantexascale challenges.

• Highperformanceartificialintelligence(askingforHPCandhugenumberofdata)wouldusesuchprogrammingparadigms

1stFJGworkshopApril5th