HPC platforms at Grenoble - Rhône-Alpes
Pierre Neyron, LIG/CNRS, 2015-04-07
Outline

● HPC platforms at Inria Grenoble-RA
– Beyond Inria: the academic HPC ecosystem
– Focus on the regional centers: mésocentres
– Focus on Grid'5000
– Quick comparison chart
Pyramid of the academic HPC platforms

[Figure: pyramid of the academic HPC platforms, ordered by computing power (flops)]
● Tier-0, Europe: ~10^15 flops (~1 Pflops); 6 machines
● Tier-1, centres nationaux: ~10^14-15 flops (~100 Tflops); 3 centres
● Tier-2, mésocentres: ~10^12-14 flops (~10 Tflops); >30 centres (see the next presentation)
● Local facilities, e.g. the SIC cluster

Tier-0/Tier-1: project calls, ~1/year (www.edari.fr)

http://www.genci.fr/fr/content/un-ecosysteme-a-l-echelle-europeenne
http://www.prace-ri.eu/prace-resources/
http://www.genci.fr/fr/content/calculateurs-et-centres-de-calcul
http://calcul.math.cnrs.fr/spip.php?rubrique6
HPC academic community: national → Grenoble / Lyon
● Groupe calcul, calcul.math.cnrs.fr (mailing list)
● Maison de la simulation, www.maisondelasimulation.fr
– Grenoble: MaiMoSine, www.maimosine.fr, grenoble-calcul.imag.fr
– Lyon : Centre Blaise Pascal, www.cbp.ens-lyon.fr, [email protected]
● Mésocentres (federating resources at the campus level / preparing for Tier-1/0)
– Grenoble : CIMENT, ciment.ujf-grenoble.fr
– Lyon : FLMSN (PSNM / P2CHPD), flmsn.univ-lyon1.fr
→ Good places to look for computing resources/time, as well as information (help), training sessions, ...
Focus on CIMENT (ciment.ujf-grenoble.fr)

● 14 clusters, 6780 cores
● Total: 120 Tflops
● 24 TB RAM, 1.5 PB storage (iRODS data store)

Froggy:
● 3216 cores
● 88 Tflops
● Lustre scratch 90 TB
● + 18 GPUs (21 Tflops)
● 1 fat node with 512 GB RAM
● 1 visualization node
● InfiniBand FDR
● PUE 1.18

Luke:
● Big data cluster
● Sequential jobs

Grenoble's HPC resource and knowledge sharing since 1998.
Open to academics from Grenoble and beyond.
New projects accepted all year round (reviewed once a year).
Project-based fair-sharing access to the resources (Froggy).
Grid middleware for bag-of-tasks campaigns (CiGri).
Grid'5000 at a glance (www.grid5000.fr)

● « A large-scale and versatile testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing including Cloud, HPC and Big Data »
→ Experimental validation of models, algorithms...

● Reconfigurability & deep control
Users deploy their own experimentation platform! → Hardware-as-a-Service
● Change/setup/tune your own HPC stack
● Control/monitor your own cloud computing infrastructure
● Test/benchmark your own Big Data storage protocols
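As an illustration of this reconfigurability, a typical Grid'5000 session reserves nodes in "deploy" mode with OAR, then installs a user-chosen system image with Kadeploy. A minimal sketch (the environment name, node count and walltime are illustrative placeholders, not prescriptive; run from a Grid'5000 frontend):

```shell
# Reserve 2 nodes in deploy mode for 2 hours, as an interactive job
oarsub -I -t deploy -l nodes=2,walltime=2:00:00

# Inside the job: deploy a base Debian environment on the reserved nodes
# ($OAR_NODE_FILE lists the reserved nodes; -k copies your SSH key to them)
kadeploy3 -e jessie-x64-base -f $OAR_NODE_FILE -k

# The nodes now run your image with root access:
# install your own HPC / cloud / Big Data stack on them
ssh root@"$(head -1 $OAR_NODE_FILE)"
```

Once the job's walltime expires, the nodes are wiped and redeployed with the default environment for the next user.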
Trade-offs for more resources

[Chart: computation power / size of platform (log scale) vs. distance to specific needs (distance to the administrator); from the team's machines, to the SIC cluster, to the mésocentres (Grenoble, Lyon) and beyond]
Grid'5000 vs. classic HPC platforms

[Chart: computation power / size of platform (log scale) vs. distance to specific needs (distance to the administrator); same trade-off as before, with Grid'5000 added as the case where the user is the administrator]
“Local” platforms' comparison chart

Primary usage
– SIC cluster: HPC for Inria Grenoble-RA teams; production workloads
– CIMENT: HPC for the whole Grenoble academic community; production workloads
– Grid'5000: experimentation for distributed computing research (national); production workloads « allowed »

Platform specificities
– SIC cluster: closest to the teams; basic functionalities; young platform
– CIMENT: optimized HPC software stack, many HPC packages available via modules; grid middleware for « bag of tasks »
– Grid'5000: Hardware-as-a-Service / reconfiguration; basic functionalities for HPC; users deploy their own platform

Machines
– SIC cluster: classic servers, mix of reused and new → 250 cores
– CIMENT: HPC-optimized hardware, ~7000 cores; Froggy: 3200-core cluster; GPUs, Xeon Phis
– Grid'5000: 11 sites, ~1000 nodes, ~6000 cores; various hardware, some HPC; GPUs, Xeon Phis

Network
– SIC cluster: basic Ethernet network (1 Gbps)
– CIMENT: InfiniBand up to FDR (Froggy)
– Grid'5000: Ethernet 1GE, some 10GE, some InfiniBand DDR/QDR

Storage
– SIC cluster: NFS homedir 4 GB (Inria, same as workstations); NFS scratch 2 TB
– CIMENT: NFS homedir 30 GB; Lustre scratch 90 TB; iRODS global storage 700 TB
– Grid'5000: NFS homedir 25 GB; Storage5k; DFS5k (Ceph, ...)

Accounts
– SIC cluster: Inria iLDAP
– CIMENT: CIMENT LDAP
– Grid'5000: Grid'5000 LDAP

Access
– SIC cluster: SSH; www from frontends only
– CIMENT: SSH + 1 visualization node; www from frontends only
– Grid'5000: SSH / REST API; whitelisted www from any node

OS
– SIC cluster: same as workstations, Linux / Fedora 20
– CIMENT: Linux / RedHat or Debian; modules for HPC software
– Grid'5000: default environment is Linux Debian; users deploy any OS/software

Resources management
– SIC cluster: OAR (1 for all machines)
– CIMENT: OAR (1/cluster); CiGri for grid campaigns; walltime < 4 days (Froggy)
– Grid'5000: OAR, Kadeploy, ... (1/site); advance reservations / interactive jobs; walltime < 2h, or night + week-end

Support
– SIC cluster: SIC team / Inria helpdesk
– CIMENT: [email protected] → helpdesk (GLPI) + MaiMoSine
– Grid'5000: mailing list + Bugzilla; [email protected]
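All three platforms rely on the OAR resource manager, so batch submission looks similar everywhere. A hedged sketch of a typical session (resource sizes, walltimes, the script name and the module name are placeholders; check each platform's documentation for the locally valid values):

```shell
# Submit a batch job asking for 2 nodes for 4 hours
oarsub -l nodes=2,walltime=4:00:00 ./my_job.sh

# Or start an interactive job on 8 cores of a single node
oarsub -I -l /nodes=1/core=8,walltime=1:00:00

# Check the state of your jobs in the queue
oarstat -u $USER

# On CIMENT, HPC software is provided through environment modules, e.g.:
module avail
module load openmpi    # module name is illustrative
```

The `-l` resource expression is hierarchical (e.g. `/nodes=1/core=8`), which is what lets the same scheduler serve a one-machine SIC cluster, the CIMENT clusters and the multi-site Grid'5000 testbed.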
CIMENT charter: https://ciment.ujf-grenoble.fr/wiki-pub/images/7/79/Charte_CIMENT_Utilisateurs_v1_2013.pdf
Grid'5000 charter: https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter