National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Basic High Performance Computing
Kenton McHenry
XSEDE
• Extreme Science and Engineering Discovery Environment
• http://www.xsede.org
• Collection of networked supercomputers
  • PSC Blacklight
  • NCSA Forge
  • SDSC Gordon
  • SDSC Trestles
  • NICS Kraken
  • TACC Lonestar
  • TACC Ranger
  • Purdue Steele
XSEDE
• Extreme Science and Engineering Discovery Environment
• http://www.xsede.org
• Collection of networked supercomputers
• Supported by NSF
• Follow-up to TeraGrid
• NCSA Ember
• …
Allocations
• Startups
  • Around 30,000 CPU hours
  • For experimentation
  • Can apply at any time during the year
  • Only 1 such allocation per user
• Research
  • 1 million+ CPU hours
  • Requires a research plan
  • Can apply only during certain periods in the year
  • Very competitive
  • Humanities-related work makes up a very small share of those given out
ECS
• Extended Collaborative Support Services
• Time from XSEDE support staff
• Ask for it in the allocation request
• Must be justified
Logging In
• Linux
• SSH
  • ember.ncsa.illinois.edu
• Head node vs. worker nodes
Space
• Local scratch
  • Temporary space during a program's execution
  • Cleared as soon as the process finishes
• Global scratch
  • Temporary user space
  • Untouched files are cleared periodically (e.g. after weeks)
• Mass store
  • Long-term storage
  • Tapes
Executing Code
• Naively or Embarrassingly Parallel
  • The problem decomposes into a number of independent tasks that can be executed separately from one another
  • No special steps needed to synchronize tasks or merge results (e.g. via MPI or MapReduce)
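The idea above can be sketched in a few lines of shell: each input file is an independent task launched in the background, and the only "synchronization" is waiting for all of them to finish. The file names and the per-task command (`wc -l`) are illustrative stand-ins, not part of the original slides.

```shell
# Embarrassingly parallel sketch: one independent task per input file,
# no communication between tasks (names and command are illustrative).
mkdir -p ep_demo
printf 'a\nb\n'    > ep_demo/in1.txt
printf 'c\n'       > ep_demo/in2.txt
printf 'd\ne\nf\n' > ep_demo/in3.txt
for f in ep_demo/in*.txt; do
    # each background job reads one file and writes one result file
    (wc -l < "$f" | tr -d ' \t' > "$f.count") &
done
wait    # only step needed: wait for every independent task to complete
cat ep_demo/in1.txt.count ep_demo/in2.txt.count ep_demo/in3.txt.count
# → 2, 1, 3 (one line count per file)
```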
Executing Code
• Step 1: Write your code on a non-HPC resource
  • For the Census project this involved months of research and development
  • Construct it to have only a command-line interface
  • Support flags for:
    • Setting the input data (either a folder or a database)
    • Setting the output location (either a folder or a database)
    • Customizing the execution and/or selecting a desired step
      • We had 3 steps
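A minimal sketch of that flag-driven interface pattern, using POSIX `getopts`: flags select the input, the output, and which step to run. The flag letters, the `run_tool` name, and the body are assumptions for illustration; the slides do not show the actual Census tool's interface.

```shell
#!/bin/sh
# Hypothetical command-line interface sketch (flag names assumed):
#   -i  input data (folder or database)
#   -o  output location (folder or database)
#   -s  which pipeline step to execute
run_tool() {
    OPTIND=1                      # reset so the function can be re-called
    input="" output="" step=""
    while getopts "i:o:s:" opt; do
        case "$opt" in
            i) input="$OPTARG" ;;
            o) output="$OPTARG" ;;
            s) step="$OPTARG" ;;
        esac
    done
    # stand-in for the real work: just report what was requested
    echo "step=$step input=$input output=$output"
}

run_tool -i data/in -o data/out -s segmentation
# → step=segmentation input=data/in output=data/out
```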
Executing Code
• Step 1: Write your code on a non-HPC resource
• Step 2: Organize data
  • Perhaps subfolders for each job
  • Move to global scratch space to avoid GridFS bottlenecks
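The per-job subfolder layout can be sketched as below. The directory and file names are illustrative; on a real system the job folders would live under the global scratch path.

```shell
# Sketch: one subfolder per job, each holding that job's input files
# (names are illustrative stand-ins for the real scratch layout).
mkdir -p scratch
touch page1.tif page2.tif page3.tif   # stand-ins for input images
i=1
for f in page*.tif; do
    job=$(printf 'job%05d' "$i")      # job00001, job00002, ...
    mkdir -p "scratch/$job"
    mv "$f" "scratch/$job/"           # one input file per job folder
    i=$((i + 1))
done
ls scratch    # lists job00001 job00002 job00003
```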
Executing Code
• Step 1: Write your code on a non-HPC resource
• Step 2: Organize data
• Step 3: Create scripts to execute jobs
  • Scripts
    • Portable Batch System (PBS)
    • [Example]
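A minimal PBS script sketch, consistent with the `qsub 00889.pbs` submission shown later in the deck. All directive values and the final command are illustrative assumptions; the slides do not show the actual Census job script.

```shell
#!/bin/bash
# Minimal PBS job script sketch (all values are illustrative).
#PBS -N S-00889             # job name, as seen in the qstat listing
#PBS -l nodes=1:ppn=12      # request one node with 12 cores
#PBS -l walltime=04:00:00   # maximum wall-clock time
#PBS -q normal              # submit to the "normal" queue

# PBS starts the job in $HOME; return to the submission directory.
cd "${PBS_O_WORKDIR:-.}"

echo "running job step on $(hostname)"   # stand-in for the real command
```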
Executing Code
• Step 1: Write your code on a non-HPC resource
• Step 2: Organize data
• Step 3: Create scripts to execute jobs
• Step 4: Run scripts
Execute
$ qsub 00889.pbs
This job will be charged to account: abc267950.ember
$ for f in *.pbs; do qsub $f; done
Monitor

$ qstat
Job id           Name             User      Time Use S Queue
---------------- ---------------- --------- -------- - --------
267794.ember     v15              ccguser   75:11:48 R gridchem
267795.ember     v16              ccguser   75:09:20 R gridchem
267796.ember     v17              ccguser   75:13:01 R gridchem
267870.ember     c4-ts1-freq      ccguser   279:03:2 R gridchem
267872.ember     c5-ts1-freq      ccguser   351:17:0 R gridchem
267873.ember     c5-ts1-ccsd      ccguser   228:50:0 R gridchem
267897.ember     c3-ts1-ccsdt     ccguser   267:04:0 R gridchem
267912.ember     FSDW103lnpvert   kpatten   2178:07: R normal
267943.ember     jobDP12          haihuliu  1506:40: R normal
267944.ember     PF31             haihuliu  920:44:4 R normal
267945.ember     jobDP8           haihuliu  1351:11: R normal
267946.ember     FLOOArTSre2.com  ccguser   91:32:30 R gridchem
267947.ember     FLOOArTSre3.com  ccguser   86:29:35 R gridchem
267949.ember     vHLBIHl1O5       ccguser   01:23:03 R normal
267950.ember     S-00889          kooper    00:00:00 R normal
Results
$ qstat -f 267950.ember
Job Id: 267950.ember
    Job_Name = S-00889
    Job_Owner = [email protected]
    resources_used.cpupercent = 396
    resources_used.cput = 00:02:26
    resources_used.mem = 4981600kb
    resources_used.ncpus = 12
    resources_used.vmem = 62051556kb
    resources_used.walltime = 00:01:02
    job_state = R
    queue = normal
    server = ember
    Account_Name = gf7
    Checkpoint = n
    ctime = Wed May 30 11:11:33 2012
    Error_Path = ember.ncsa.illinois.edu:/u/ncsa/kooper/scratch-global/census/1940/batch1/segmentation/S-00889.e267950
    exec_host = ember-cmp1/1*6+ember-cmp1/2*6
    exec_vnode = (ember-cmp1[11]:ncpus=6:mem=32505856kb)+(ember-cmp1[12]:ncpus=6:mem=27262976kb)
Image and Spatial Data Analysis Group
http://isda.ncsa.illinois.edu
Questions?