Parallel scripting with Swift for applications at the petascale and beyond
VecPar PEEPS Workshop
Berkeley, CA – June 22, 2010
Michael Wilde – [email protected]
Computation Institute, University of Chicago, and Argonne National Laboratory
www.ci.uchicago.edu/swift
• Many applications need loosely coupled scripting
• Swift harnesses parallel & distributed resources through a simple scripting language
• Productivity gains by enabling use of more powerful systems with less concern for the mechanics
Problems addressed by Swift
Modeling uncertainty for CIM‐EARTH
Parallel AMPL workflow by Joshua Elliott, Meredith Franklin, Todd Munson, Allan Espinosa.
Fast Ocean Atmosphere Model (MPI), NCAR

[Diagram: manual configuration, execution, and bookkeeping, versus the automated workflow using VDS on TeraGrid.]

Visualization courtesy Pat Behling and Yun Liu, UW Madison. Work of Veronica Nefedova and Rob Jacob, Argonne.
Problem: Drug screening at APS
2M+ ligands (Mike Kubal, Benoit Roux, and others)

[Workflow diagram: O(millions) of drug candidates are narrowed to O(tens) of fruitful candidates for wetlab & APS. From PDB protein descriptions (1 protein, 1 MB), DOCK6 and FRED receptor files are manually prepared (1 per protein, defining the pocket to bind to). DOCK6/FRED dock ~2M ZINC 3-D structures (6 GB): ~4M runs x 60 s x 1 CPU, ~60K CPU-hours. The best ~5K proceed to Amber prep and scoring (AmberizeLigand, AmberizeReceptor, AmberizeComplex, then a generated NAB script whose parameters define flexible residues and MD steps): ~10K runs x 20 min x 1 CPU, ~3K CPU-hours. The best ~500 go to GCMC: ~500 x 10 hr x 100 CPUs, ~500K CPU-hours.]

Work of Andrew Binkowski and Michael Kubal
Problem: preprocessing and analysis of neuroscience experiments

[Figure: the AIRSN workflow for automated image registration for spatial normalization, shown both in compact form and fully expanded. Many data files (anatomical volumes 3a–6a with header/image pairs, reference and atlas images) flow through many application programs: reorient, align_warp, reslice, alignlinear, reslice_warp, softmean, strictmean, combine_warp, binarize, gsmooth, slicer, and convert, producing atlas_x, atlas_y, and atlas_z images.]
Swift programs

• A Swift script is a set of functions
  – Atomic functions wrap and invoke application programs (on parallel compute nodes)
  – Composite functions invoke other functions (run in the Swift engine)
• Data is typed as composable arrays and structures of files and simple scalar types (int, float, string)
• Collections of persistent file structures are mapped into this data model as arrays and structures
• Variables are single-assignment
• Expressions and statements are executed in dataflow dependency order, yielding implicit concurrency
• Members of datasets can be processed in parallel
• Provenance is gathered as scripts execute
A simple Swift script

To run the ImageMagick app "convert":

  convert -rotate 180 $in $out

type imagefile { }   // Declare a "file" type.

app (imagefile output) rotate (imagefile input) {
  convert "-rotate" "180" @input @output;
}

imagefile image <"m101.2010.0601.jpg">;
imagefile newimage <"output.jpg">;

newimage = rotate(image);
Execution is driven by data flow

(int result) myproc (int input)
{
  j = f(input);
  k = g(input);
  result = j + k;
}

j=f() and k=g() are computed in parallel.
This parallelism is automatic, based on futures, and works recursively down the script's call graph.
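The same futures-based evaluation can be sketched in ordinary Python (an illustration of the concept only, not Swift's execution engine; f, g, and myproc stand in for the functions on the slide):

```python
from concurrent.futures import ThreadPoolExecutor

def f(x):
    return x * 2      # stand-in for a long-running application call

def g(x):
    return x + 3      # stand-in for another independent application call

def myproc(x):
    with ThreadPoolExecutor() as pool:
        j = pool.submit(f, x)       # both futures start immediately...
        k = pool.submit(g, x)       # ...because neither depends on the other
        return j.result() + k.result()  # the sum waits on both, like dataflow

print(myproc(10))  # 10*2 + (10+3) = 33
```

In Swift this scheduling is implicit: every variable is a future, so any statement whose inputs are ready may run.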
Parallelism via foreach { }

type imagefile;   // Declare a "file" type.

app (imagefile output) rotate (imagefile input) {
  convert "-rotate" "180" @input @output;
}

// Map inputs from the local directory
imagefile observations[ ] <simple_mapper; prefix="m101-raw">;

// Name outputs based on index
imagefile flipped[ ] <simple_mapper; prefix="m101-flipped">;

// Process all dataset members in parallel
foreach obs,i in observations {
  flipped[i] = rotate(obs);
}
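In a conventional language the foreach pattern corresponds to a parallel map (a sketch with a stand-in rotate step; the file names are illustrative, not produced by simple_mapper):

```python
from concurrent.futures import ThreadPoolExecutor

def rotate(obs):
    # stand-in for the app invocation: convert "-rotate" "180" @input @output
    return obs.replace("m101-raw", "m101-flipped")

observations = [f"m101-raw.{i:04d}.jpg" for i in range(4)]

# foreach obs,i in observations { flipped[i] = rotate(obs); }
with ThreadPoolExecutor() as pool:
    flipped = list(pool.map(rotate, observations))

print(flipped[0])  # m101-flipped.0000.jpg
```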
Many domains process structured datasets
Swift Data Mapping

type Study {
  Group g[ ];
}
type Group {
  Subject s[ ];
}
type Subject {
  Volume anat;
  Run run[ ];
}
type Run {
  Volume v[ ];
}
type Volume {
  Image img;
  Header hdr;
}
[Diagram: a mapping function or script connects the on-disk data layout to Swift's in-memory data model.]
Application: Protein structure prediction

[Diagram: the Swift app function predict() wraps the PSim application, consuming a FASTA sequence (seq), temperature (temp), and increment (dt), and producing a predicted structure (pg) and a log.]

To run:

  psim -s 1ubq.fas -pdb p -temp 100.0 -inc 25.0 >log

In Swift code:

app (PDB pg, File log) predict (Protein seq, Float temp, Float dt)
{
  psim "-s" @seq.fasta "-pdb" @pg "-temp" temp "-inc" dt;
}

Protein p <ext; exec="Pmap", id="1ubq">;
PDB structure;
File log;

(structure, log) = predict(p, 100.0, 25.0);
Encapsulation is the key to transparent distribution, parallelization, and provenance

Parallelism via foreach { }:

foreach sim in [1:1000] {
  (structure[sim], log[sim]) = predict(p, 100.0, 25.0);
}
result = analyze(structure);

[Diagram: 1000 parallel predict() calls feed a single analyze() step.]
Application: 3D Protein structure prediction

type Fasta;        // Primary protein sequence file in FASTA format
type SecSeq;       // Secondary structure file
type RamaMap;      // "Ramachandran" mapping info files
type RamaIndex;
type ProtGeo;      // PDB-format file; protein geometry: 3D atom coords
type SimLog;

type Protein {     // Input file struct to protein simulator
  Fasta fasta;     // sequence to predict structure of
  SecSeq secseq;   // initial secondary structure to use
  ProtGeo native;  // 3D structure from experimental data, when known
  RamaMap map;
  RamaIndex index;
}

type PSimCf {      // Science configuration parameters to simulator
  float st;
  float tui;
  float coeff;
}

type ProtSim {     // Output file struct from protein simulator
  ProtGeo pgeo;
  SimLog log;
}
Protein structure prediction

app (ProtGeo pgeo) predict (Protein pseq)
{
  PSim @pseq.fasta @pgeo;
}

(ProtGeo pg[ ]) doRound (Protein p, int n) {
  foreach sim in [0:n-1] {
    pg[sim] = predict(p);
  }
}

Protein p <ext; exec="Pmap", id="1af7">;
ProtGeo structure[ ];
int nsim = 10000;
structure = doRound(p, nsim);
Protein structure prediction

(ProtSim psim[ ]) doRoundCf (Protein p, int n, PSimCf cf) {
  foreach sim in [0:n-1] {
    psim[sim] = predictCf(p, cf.st, cf.tui, cf.coeff);
  }
}

(boolean converged) analyze (ProtSim prediction[ ], int r, int numRounds)
{
  if ( r == (numRounds-1) ) {
    converged = true;
  } else {
    converged = test_convergence(prediction);
  }
}
Protein structure prediction

ItFix (Protein p, int nsim, int maxr, float temp, float dt)
{
  ProtSim prediction[ ][ ];
  boolean converged[ ];
  PSimCf config;

  config.st = temp;
  config.tui = dt;
  config.coeff = 0.1;

  iterate r {
    prediction[r] = doRoundCf(p, nsim, config);
    converged[r] = analyze(prediction[r], r, maxr);
  } until ( converged[r] );
}
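The iterate/until construct can be sketched in Python (stand-in do_round_cf and analyze functions; this only illustrates the control flow, with convergence forced at the final round as in the slide's analyze()):

```python
def do_round_cf(p, nsim, config):
    # stand-in for nsim parallel predictCf() calls
    return [f"{p}-sim{i}" for i in range(nsim)]

def analyze(prediction, r, num_rounds):
    # as on the earlier slide: declare convergence at the final round
    return r == num_rounds - 1

def it_fix(p, nsim, maxr, temp, dt):
    config = {"st": temp, "tui": dt, "coeff": 0.1}
    prediction = []                  # prediction[r] holds round r's results
    r = 0
    while True:                      # iterate r { ... } until (converged[r])
        prediction.append(do_round_cf(p, nsim, config))
        if analyze(prediction[r], r, maxr):
            break
        r += 1
    return prediction

rounds = it_fix("1af7", nsim=5, maxr=3, temp=100.0, dt=25.0)
print(len(rounds))  # 3
```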
Protein structure prediction

Sweep( )
{
  int nSim = 1000;
  int maxRounds = 3;
  Protein pSet[ ] <ext; exec="Protein.map">;
  float startTemp[ ] = [ 100.0, 200.0 ];
  float delT[ ] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];
  foreach p, pn in pSet {
    foreach t in startTemp {
      foreach d in delT {
        ItFix(p, nSim, maxRounds, t, d);
      }
    }
  }
}

Sweep();

10 proteins x 1000 simulations x 3 rounds x 2 temps x 5 deltas = 300K tasks
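The quoted task count follows directly from the nested loops (a quick check; the 10-protein pSet size is taken from the slide):

```python
from itertools import product

n_proteins = 10                      # |pSet|, per the slide
n_sim, max_rounds = 1000, 3
start_temp = [100.0, 200.0]
del_t = [1.0, 1.5, 2.0, 5.0, 10.0]

# each (protein, temp, delta) combination runs nSim simulations per round
tasks = sum(n_sim * max_rounds
            for _ in product(range(n_proteins), start_temp, del_t))
print(tasks)  # 300000
```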
Using Swift

Swift is a self-contained application with cluster and grid client code: download, untar, and run.

[Diagram: from the submit host (laptop, login host, …), the Swift script, site list, and app list drive app invocations a1 and a2 on compute nodes, clusters, and clouds; files f1, f2, and f3 move via file transport to and from a data server; workflow status and logs, including the provenance log, flow back to the submit host.]
Architecture for petascale scripting

[Diagram: the Swift script drives a Falkon client (load balancing), which dispatches to Falkon services running on the BG/P I/O processors; BG/P processor sets use small, fast, local memory-based filesystems alongside a shared global filesystem.]
Collective data management is critical for petascale

• Applies "scatter/gather" concepts at the file-management level
• Seeks to avoid contention, maximize parallelism, and use petascale interconnects:
  – Broadcast common files to compute nodes
  – Place per-task data on local (RAM) filesystems
  – Gather output into larger sets (in time and space)
  – Aggregate small local filesystems into a large striped filesystem
• Still in the research phase: both paradigm and architectures
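The "gather" half of these ideas can be sketched locally: many small per-task outputs are combined into one larger set before touching the shared filesystem (a minimal illustration, not Swift's collective-I/O implementation; the file names are hypothetical):

```python
import os
import tarfile
import tempfile

def gather(task_outputs, archive_path):
    """Aggregate many small per-task files into one archive, so the global
    filesystem sees one large write instead of many tiny ones."""
    with tarfile.open(archive_path, "w") as tar:
        for path in task_outputs:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path

# usage sketch: three small per-task outputs -> one aggregate file
tmpdir = tempfile.mkdtemp()
outputs = []
for i in range(3):
    path = os.path.join(tmpdir, f"task{i}.out")
    with open(path, "w") as fh:
        fh.write(f"result {i}\n")
    outputs.append(path)

archive = gather(outputs, os.path.join(tmpdir, "round1.tar"))
with tarfile.open(archive) as tar:
    print(sorted(tar.getnames()))  # ['task0.out', 'task1.out', 'task2.out']
```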
Collective data management

[Diagram: compute nodes (CN) with local filesystems (LFS) are grouped under intermediate filesystem (IFS) nodes; a distributor/collector pair stages data in numbered steps between the global filesystem and the intermediate and local tiers.]
Performance: Molecular dynamics on BG/P

935,803 DOCK jobs with Falkon on BG/P in 2 hours
Performance: SEM for fMRI on Constellation

418K SEM tasks with Swift/Coasters on Ranger in 41 hours
Performance: Proteomics on BG/P

4,127 PTMap jobs with Swift/Falkon on BG/P in 3 minutes
Scaling the many-task model

[Diagram: a client (master) application and a master graph executor coordinate per-compute-unit graph executors over ultra-fast message queues; a virtual data store, backed by global persistent storage, serves the extreme-scale computing complex.]
Scaling many-task computing

• ADLB: tasks can be lightweight functions
  – Retains the RPC model of input-process-output
  – Fast, distributed, asynchronous load balancing
• Multi-level task manager
  – Must scale to massive computing complexes
• Transparent distributed management of local storage
  – Leverage local (RAM) filesystems, aggregate them, and make access more transparent through DHT methods
Conclusion: Motivation for Swift

• Enhance scientific productivity
  – Location and paradigm independence: the same scripts run on workstations, clusters, clouds, grids, and petascale supercomputers
  – Automation of dataflow, resource selection, and error recovery
• Enable and motivate collaboration
  – Community libraries of techniques, protocols, and methods
  – Designed to record the provenance of all data produced, to facilitate scientific processes
• Swift is a parallel scripting system for grids and clusters
  – for loosely coupled applications: application and utility programs linked by exchanging files
• Swift is easy to write: a simple, high-level, C-like functional language
  – Small Swift scripts can do large-scale work
• Swift is easy to run: contains all services for running grid workflows in one Java application
  – Untar and run; it acts as a self-contained grid client
• Swift is fast: Karajan provides Swift a powerful, efficient, scalable, and flexible execution engine
  – Scaling close to 1M tasks; 0.5M in live science work, and growing
• Swift usage is growing
  – Applications in neuroscience, proteomics, molecular dynamics, biochemistry, economics, statistics, and more
To learn more and try Swift…
• www.ci.uchicago.edu/swift
  – Quick Start Guide: http://www.ci.uchicago.edu/swift/guides/quickstartguide.php
  – User Guide: http://www.ci.uchicago.edu/swift/guides/userguide.php
  – Introductory Swift Tutorials: http://www.ci.uchicago.edu/swift/docs/index.php
http://www.ci.uchicago.edu/swift
IEEE COMPUTER, Nov 2009
Acknowledgments

• The Swift effort is supported in part by NSF grants OCI-721939, OCI-0944332, and PHY-636265, NIH DC08638, and the UChicago/Argonne Computation Institute
• The Swift team (present and former):
  – Ben Clifford, Allan Espinosa, Ian Foster, Mihael Hategan, Ioan Raicu, Sarah Kenny, Mike Wilde, Justin Wozniak, Zhao Zhang, Yong Zhao
• Java CoG Kit, used by Swift, developed by:
  – Mihael Hategan, Gregor von Laszewski, and many collaborators
• Falkon software
  – developed by Ioan Raicu and Zhao Zhang
• ZeptoOS
  – Kamil Iskra, Kazutomo Yoshii, and Pete Beckman
• Scientific application collaborators and users
  – U. Chicago Open Protein Simulator Group (Karl Freed, Tobin Sosnick, Glen Hocky, Joe DeBartolo, Aashish Adhikari)
  – U. Chicago Radiology and Human Neuroscience Lab (Dr. S. Small)
  – SEE/CIM-EARTH/Econ: Joshua Elliott, Meredith Franklin, Todd Munson, Tib Stef-Praun
  – PTMap: Yingming Zhao, Yue Chen