27
Introduction to the Argonne Training Program on Extreme-Scale Computing (ATPESC) Paul Messina Director of Science Argonne Leadership Compu7ng Facility Argonne Na7onal Laboratory

Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Introduction to the Argonne Training Program on Extreme-Scale Computing

(ATPESC)

Paul  Messina  Director  of  Science  Argonne  Leadership  Compu7ng  Facility  Argonne  Na7onal  Laboratory    

Page 2: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Welcome  §  A  few  words  about  Argonne  Na8onal  Laboratory  §  Mo8va8on  of  the  ATPESC  §  The  curriculum  §  Logis8cs  

Outline

Page 3: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Argonne – a part of DOE National Laboratory System

Page 4: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Argonne’s mission: To provide science-based solutions to pressing global challenges

Energy Science

Nuclear and National Security

Environmental Sustainability

Use-Inspired Science and Engineering…

…Discovery and Transformational Science and Engineering

Major User Facilities Science and Technology Programs

Page 5: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

AVIDAC: Argonne's Version of the Institute's Digital Arithmetic Computer: 1949-1953

“Moll”  Flanders,  Director  Jeffrey  Chu,  Chief  Engineer  

5  

§  AVIDAC:  based  on  prototype  at  the  Ins8tute  for  Advanced  Study  in  Princeton  

§  Margaret  Butler  wrote  AVIDAC’s  interpre8ve  floa8ng-­‐point  arithme8c  system  §  Memory  access  >me:  15  microsec  §  Addi>on:  10  microsec  §  Mul>plica>on:  1  millisec  

§  AVIDAC  press  release:    100,000  8mes  as  fast  as  a  trained  “Computer”  using  a  desk  calculator  

Page 6: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Early work on computer architecture

Margaret  Butler  helped  assemble  the  ORACLE  computer  with  ORNL  Engineer  Rudolph  Klein.  In  1953,  ORACLE  was  the  world’s  fastest  computer,  mul>plying  12-­‐digit  numbers  in    .0005  seconds  (2Kop/s).  Designed  at  Argonne,  it  was  constructed  at  Oak  Ridge.  

6  

Page 7: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Today’s  most  powerful  supercomputers  have  complex  hardware  architectures  and  soQware  environments  –  and  even  greater  complexity  is  on  the  horizon  from  next-­‐genera>on  and  exascale  systems  –  

§  The  scien8fic  and  engineering  applica8ons  that  are  tackled  with  these  systems  are  themselves  complex  

§  There  is  a  cri8cal  need  for  specialized,  in-­‐depth  training  for  the  computa8onal  scien8sts  poised  to  facilitate  breakthrough  science  and  engineering  using  these  systems  

Motivation for the ATPESC

Page 8: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

The DOE Leadership Computing Facility §  Collabora>ve,  mul>-­‐lab,  DOE/SC  

ini>a>ve  ranked  top  na>onal  priority  in  Facili&es  for  the  Future  of  Science:  A  Twenty-­‐Year  Outlook.  

§  Mission:  Provide  the  computa>onal  and  data  science  resources  required  to  solve  the  most  important  scien>fic  &  engineering  problems  in  the  world.    

§  Highly  compe>>ve  user  alloca>on  program  (INCITE,  ALCC).  

§  Projects  receive  100x  more  hours  than  at  other  generally  available  centers.  

§  LCF  centers  partner  with  users  to  enable  science  &  engineering  breakthroughs  (Liaisons,  Catalysts).  

Page 9: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Leadership Computing Facility systems

Argonne  LCF   Oak  Ridge  LCF  

System   IBM  Blue  Gene/Q   Cray  XK7  

Name   Mira   Titan  

Compute  nodes    49,152   18,688  

Node  architecture   PowerPC,  16  cores   AMD  Opteron,  16  cores  NVIDIA  K20x  (Kepler)  GPU  

Processing  Units   786,432  Cores   299,008  x86  Cores  +    18,688  GPUs  

Memory  per  node,  (gigabytes)   16   32  +  6  

Peak  performance,  (petaflops)   10 27

Page 10: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Mira  –  BG/Q  system  –  49,152  nodes  /  786,432  cores  –  786  TB  of  memory  –  Peak  flop  rate:  10  PetaFLOPs  –  3,145,728  hardware  threads  

§  Vesta  (T&D)  -­‐  BG/Q  system  –  2,048  nodes  /  32,768  cores  –  32  TB  of  memory  –  Peak  flop  rate:  420  TF  

§  Tukey  –  Nvidia  system    –  100  nodes  /  1600  x86  cores/  200  M2070  GPUs  –  6.4  TB  x86  memory  /  1.2  TB  GPU  memory  –  Peak  flop  rate:  220  TF  

§  Storage  –  Scratch:  28.8  PB  raw  capacity,  240  GB/s  bw  (GPFS)  –  Home:  1.8  PB  raw  capacity,  45  GB/s  bw  (GPFS)  –  Storage  upgrade  planned  in  2015  

10  

10 Petaflops Blue Bene/Q - Mira

Page 11: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Mira’s Ecosystem

11  11  

Vesta  (Dev)  2  racks/16K  cores  416  TF  

Mira  48  racks/768K  cores  10  PF  

Networks  –  100Gb  (via  ESnet,  internet2  UltraScienceNet,  )  

IB  Switch  

I/O  

I/O  

Infiniband  Switch  Com

plex  

(3)  DDN  12Ke  couplets  –  Home  –  1.8  PB  (raw),  45  GB/s      

(16)  DDN  12Ke  couplets  –  Scratch  –  28  PB  (raw),  240  GB/s  

(1)  DDN  12Ke  –  600  TB  (raw),  15  GB/s      

Tukey  (Viz)  100  nodes/1600  cores  200  NVIDIA  GPUs  220  TF    

Cetus    4  racks/64K  cores  832  TF  

I/O  

Tape  Library  –  16  PB  (raw)  

Page 12: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

BlueGene/Q Compute Chip

§  360  mm²    Cu-­‐45  technology    (SOI)  –   ~  1.47  B  transistors    

§  16  user  +  1  service  processors    – plus  1  redundant  processor  – all  processors  are  symmetric  – L1  I/D  cache  =  16kB/16kB  – L1  prefetch  engines    

§  Crossbar  switch  – Connects  cores  via  L1P  to  L2  slices  

§  Central  shared  L2  cache  – 32  MB  eDRAM  – 16  slices    

§  Dual  memory  controller    – 16  GB  external  DDR3  memory  – 42.6  GB/s    

§  Chip-­‐to-­‐chip  networking  – Router  logic  integrated  into  BQC  chip  – DMA, remote put/get, collective operations – 11  network  ports  

§  External  IO      – PCIe  Gen2  interface  

System-­‐on-­‐a-­‐Chip  design  :  integrates  processors,    memory  and  networking  logic  into  a  single  chip  

12  

Page 13: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

CORAL Joint NNSA & SC Leadership Computing Acquisition Project

Leadership  Computers  run  the  most  demanding  DOE  mission  applica>ons  and  advance  HPC  technologies  to  assure  con>nued  US/DOE  leadership  

Objec8ve  -­‐  Procure  3  leadership  computers  to  be  sited  at  ANL,  ORNL,  and  LLNL  in  CY17-­‐18  

Approach Competitive process - one RFP (issued by LLNL) leading to 3 computer procurement contracts For risk reduction and to meet a broad set of requirements, 2 architectural paths will be selected Each lab manages and negotiates its own computer procurement contract, and may exercise options to meet their specific needs

Understanding that long procurement lead time may impact architectural characteristics and designs of procured computers

Sequoia  (LLNL)            

Mira  (ANL)  Titan  (ORNL)      

Current  DOE  Leadership  Computers  

Page 14: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

ALCF-3 Project Description

§  Acquire  and  deploy  a  75  –  200PF  Leadership  Compu8ng  system  for  the  Department  of  Energy’s  Office  of  Science  Leadership  Compu8ng  Facility  –  Procure  the  compute  and  storage  system  –  Prepare  site  for  system  power  and  cooling  needs  –  Select  and  prepare  Early  Science  Program  code  teams  for  the  system  

–  Prepare  for  general  users  –  Deploy,  test,  and  accept  the  system  integrated  in  the  ALCF  environment  

14  

Page 15: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Gerty  Cori  was  the  first  American  woman  to  receive  the  Nobel  Prize  in  Science  (bio-­‐chemistry)  

§  Cori  will  be  a  Cray  XC  system  whose  nodes  are  based  on  Intel’s  next-­‐genera8on  Intel®  Xeon  Phi™  processor  –-­‐  code-­‐named  “Knights  Landing”    –  on-­‐package  high  bandwidth  memory  –  3  teraFLOPS  of  double-­‐precision  peak  performance  per  single  socket  node  

§  The  Knights  Landing  processor  will  have  72cores,  each  with  mul8ple  hardware  threads    

§  Cori  is  cheduled  for  delivery  in  mid-­‐2016  

15  

NERSC’s Cori* System: on the path to exascale *Gerty Cori was the first American woman to receive the Nobel Prize in Science

Page 16: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Architectures  –  Pete  Beckman  §  Programming  models  and  languages  –  Rusty  Lusk  and  Rajeev  

Thakur  §  Numerical  algorithms  and  soQware  -­‐-­‐  Lois  McInnes  and  Lori  

Diachin  §  Toolkits  and  frameworks  –  Kalyan  Kumaran  and  Scop  Parker  §  Visualiza8on  and  data  analysis  –  Mike  Papka  and  Joe  Insley  §  Data-­‐intensive  compu8ng  and  I/O  –  Rob  Ross  and  Rob  

Latham  §  Community  codes  and  soQware  engineering  –  Katherine  

Riley  and  Anshu  Dubey  

Curriculum tracks/sessions and their leaders

Page 17: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Purpose:  present  addi8onal  topics  that  will  probably  be  relevant  to  your  research  at  some  point  in  your  career  –  but  in  any  case  interes8ng  

§  Nine  dinner  talks  

Dinner talks

Page 18: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Many  lectures  every  day,  followed  by  evening  hands-­‐on  sessions  

§  Ideally  we  would  cover  all  topics  in  more  depth  but  the  result  would  be  a  six-­‐week  program  –  But  few  people’s  schedules  would  allow  them  to  par>cipate  

§  Note  the  8:30  a.m.  star8ng  8me,  dinner  at  5:30  p.m.  right  aQer  the  end  of  the  aQernoon  lectures,  evening  sessions  

§  Slides  will  be  posted  online  as  soon  as  available  

Yes, the ATPESC is an intense program

Page 19: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  This  training  program  was  made  possible  by  funding  from  the  Research  Division  of  the  Advanced  Scien8fic  Compu8ng  Research  program  of  the  Department  of  Energy’s  (DOE)  Office  of  Science  

§  The  funding  is  for  three  years  –  The  training  program  will  be  offered  again  in  2015  –  And  we  may  ask  for  funding  for  2016ff  

§  Help  us  improve  the  training  program  –  Track  evalua>ons  –  Overall  program  evalua>on  –  Conversa>ons  or  emails  to  any  of  us  

Thank you, DOE Office of Advanced Scientific Computing Research (ASCR)

Page 20: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Help  us  improve  the  training  program  –  Track  evalua>ons  –  Overall  program  evalua>on  –  Conversa>ons  or  emails  to  any  of  us  

§  Please  fill  out  the  online  evalua8on  surveys  on  each  track  and  the  overall  program  –  at  the  end  of  each  track,  you  will  receive  an  email  from  [email protected]  with  a  link  to  that  track’s  evalua>on  

–  Respond  by  the  morning  of  the  next  day  to  be  eligible  for  the  prize  raffle  

–  Chel  Lancaster  is  coordina>ng  the  evalua>ons  and  will  be  available  to  answer  ques>ons  or  help  

Surveys

Page 21: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Suggested  reading  before  the  ATPESC  §  Par8cipant  introduc8ons  

– Details  during  dinner  

Suggestions from last year’s surveys adopted this year

Page 22: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Posi8on:  Ph.D.  student,  applied  mathema8cs,  University  of  CIncinna8  

§  Research  background:  –  Solu>on  of  ellip>c  PDEs  with  singulari>es  –  Mathema>cal  sosware  

§  Research  interests:  –  Parallel  computer  architectures  

§  Personal  interests  –  Sailing  

§  Personal  background:  –  Had  lived  in  four  countries  by  the  >me  I  was  14  years  old  

Paul Messina

Page 23: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  All  lectures  and  hands-­‐on  sessions  in  Rembrandt  room  

§  All  meals  in  Utrillo  room  on  ground  floor  –  Lunch  and  dinner  presenta>ons  will  be  in  this  room  

§  Other  rooms  in  ground  floor  might  be  used  as  needed  

§  Wi-­‐fi  SSID  in  this  building  is  Argonne  §  Password  is  argonne14  

Logistics

Page 24: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

Ga

llery H

all

(Seco

nd

Flo

or

)

The collection of meeting room

s on the second floor feature high-speedInternet access, netw

orking capabilities, excellent sound insulation and work

well w

ith any type of meeting. Beautiful view

s of the golf course are framed

by large window

s with light-dim

inishing draperies.

No

tes:

Gallery Hall (Second Floor)

Scale In Feet

0 5 10

Matisse

Rembrandt Renoir

Corot Van Gogh Cezanne20’

20’-4”20

’19’-31/2”

20’

20’

20’

20’-4”

40’-8

30’

40’-8

61’

Restroom

RoomNam

eReception

ClassroomTheater

Hollow

SquareU-Shape

Dining

CeilingHeight

SquareFootage

Gallery

Hall

Rembrandt

300180

26064

52200

10’9”2,400

Renoir175

90160

4836

11010’9”

1,200 Cezanne

45 30

4024

18 40

10’9” 400

VanG

ogh 45

30 40

2418

4010’9”

400 Corot

45 30

4024

18 40

10’9” 400

Matisse

45 30

4024

18 40

10’9” 400

26

25

Diagram of Meeting Rooms: Second Floor

Rembrandt  

Page 25: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

With high-speed Internet access, netw

orking capabilities and special electrical configurations,G

alleryH

all is the perfect place for all yourtraining needs. This facility provides a collection of m

eeting rooms

featuring excellent sound insulation and convenient loading access.

Ga

ller

y H

all

(Fir

stF

loo

r)

No

te

s:

Gallery Hall (First Floor)

Scale In Feet

0 5 10

RobertWood PicassoRestroomRestroom

Patio

Storage

Storage

Chagall

GauguinVermeerUtrillo

20’

20’

20’

20’

20’

22’-2”

40’

30’-4”

40’

29’-11”

40’

29’-11”

Entrance

RoomNam

eReception

ClassroomTheater

Hollow

SquareU-Shape

Dining

CeilingHeight

SquareFootage

Gallery

Hall

Gauguin

17590

16048

36110

10’1,200

Vermeer

17590

16048

36110

10’1,200

Utrillo

17590

16048

36110

10’1,200

Chagall 45

30 40

2418

40 10’

400Picasso

4530

4024

18 40

10’ 400

RobertWood

4530

4024

18 40

10’ 400

24

23

Diagram of meeting rooms: Ground floor

Picasso  

Utrillo  

Page 26: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Local  arrangements  –  Cheryl  Zidel  –  Ashley  Boyle  –  Ginny  Doyle  –  (some>mes  they  will  be  in  Picasso  room  on  ground  floor)  

§  Surveys    –  Chel  Lancaster  

§  Compu>ng  issues  –  Ray  Loy  –  Robert  Scou  –  Adam  Scovel  –  Others  TBD  

Whom to ask for help

Page 27: Introduction to the Argonne Training Program on Extreme ...extremecomputingtraining.anl.gov/files/2014/08/Day... · Introduction to the Argonne Training Program on Extreme-Scale Computing

§  Thanks  in  advance  to  all  of  you  for  taking  two  weeks  of  your  summer  to  par8cipate  in  this  program  

§  Ques8ons?  

Summary