AUTOMATING ENVIRONMENTAL COMPUTING APPLICATIONS WITH SCIENTIFIC WORKFLOWS

Rafael Ferreira da Silva1, Ewa Deelman1, Rosa Filgueira2, Karan Vahi1, Mats Rynge1, Rajiv Mayani1, Benjamin Mayer3

Environmental Computing Workshop (ECW'16), Baltimore, MD – October 23rd, 2016

1 USC Information Sciences Institute
2 University of Edinburgh
3 Oak Ridge National Laboratory
OUTLINE

Introduction: Motivation, Scientific Computing, Experiment Execution
Scientific Workflows: Definition, Workflow Composition, Workflow Design
Workflow Management Systems: Workflow Systems, Workflow Execution
Pegasus WMS: Capabilities, Architecture, Heterogeneous Execution
Workflow Applications: CyberShake, Seismic Cross-Correlation, ACME
Summary: Conclusions, Future Research Directions
EXPERIMENT TIMELINE

[Figure: experiment timeline, from Scientific Problem (Earth Science, Astronomy, Neuroinformatics, Bioinformatics, etc.) through Analytical Solution (Models, Quality Control, Image Analysis, etc.), Computational Scripts (Shell scripts, Python, Matlab, etc.), Automation (Workflows, MapReduce, etc.), Distributed Computing (Clusters, HPC, Cloud, Grid, etc.), and Monitoring and Debug (Fault-tolerance, Provenance, etc.) to the Scientific Result.]
EXPERIMENT EXECUTION
Scientific Computing: extract information from data

Scientific Instruments:
- USArray Transportable Array
- LIGO Detectors
- SNS Detector
WHY SCIENTIFIC WORKFLOWS?
Automation
- Enables parallel, distributed computations
- Automatically executes data transfers

Reproducibility
- Reusable, aids reproducibility
- Records how data was produced (provenance)

Recover & Debug
- Handles failures to provide reliability
- Keeps track of data and files
SCIENTIFIC WORKFLOWS
A workflow is composed of jobs (command-line programs) connected by dependencies (usually data dependencies), and is typically represented as a DAG (directed acyclic graph).

Representations
- Directed Acyclic Graphs (DAGs)
- Conditional statements: if-then-else, exception handling
- Loops: for, while
- Steering (human in the loop)
JOBS & DEPENDENCIES
Some workflow systems require the implementation of the workflow mechanism as part of the application code.
A job often represents a program (or script) written in any programming language (black box).
Typically, dependencies are based on the data flow. They can also be expressed as conditions, exceptions, user-triggered actions, etc.
preprocess.py (splits the raw stream into individual trace files):

import pickle
from obspy.core import read

# Read the raw seismic stream and write each trace to its own pickle file
stream = read('20100206_045515_012.BGLD')
i = 0
for trace in stream:
    i += 1
    with open('trace_%s.trace' % i, 'wb') as output:
        pickle.dump(trace, output)

process.py (run once per trace file, e.g., trace_1.trace and trace_2.trace):

import pickle
import sys
from obspy.xseed import Parser

# Attach instrument response (poles and zeros) and station coordinates
# from the dataless SEED volume to the pickled trace
parser = Parser('dataless.seed.BW_BGLD')
with open(sys.argv[1], 'rb') as input_file:
    trace = pickle.load(input_file)
trace.stats.paz = parser.getPAZ(trace.stats.channel)
trace.stats.coordinates = parser.getCoordinates(trace.stats.channel)

preprocess.py produces trace_1.trace and trace_2.trace; each process.py job consumes one of these files and produces trace_1.out or trace_2.out. Dependencies define the order of execution (a generic DAG-execution sketch follows).
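To make the dependency structure concrete, here is a minimal, generic sketch (not the Pegasus API; the job table and the run_dag helper are illustrative assumptions) that expresses the two-stage example above as a DAG and starts each job only after its parents have completed:

import subprocess

# Each job lists its command and the jobs it depends on (its parents)
jobs = {
    "preprocess": {"cmd": ["python", "preprocess.py"], "parents": []},
    "process_1": {"cmd": ["python", "process.py", "trace_1.trace"], "parents": ["preprocess"]},
    "process_2": {"cmd": ["python", "process.py", "trace_2.trace"], "parents": ["preprocess"]},
}

def run_dag(jobs):
    done = set()
    while len(done) < len(jobs):
        for name, job in jobs.items():
            # A job becomes runnable once all of its parents have finished
            if name not in done and all(p in done for p in job["parents"]):
                subprocess.run(job["cmd"], check=True)
                done.add(name)

run_dag(jobs)

A real workflow management system adds to this scheme parallel execution, data staging, failure handling, and provenance tracking.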
WORKFLOW COMPOSITION / WORKFLOW DESCRIPTION

Abstract Workflow: high-level abstraction languages; no information about the underlying infrastructure. Fosters reproducibility.

Executable Workflow: workflow concretization maps the abstract workflow to the execution infrastructure (a rough sketch of this step follows below):
- Adds stage-in/out jobs
- Cleanup jobs
- Clustering
- etc.
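As a rough illustration of concretization (a generic sketch under assumed names; the concretize function, job dictionaries, and staging site are hypothetical, not the Pegasus planner), the snippet below wraps each compute job of an abstract workflow with stage-in and stage-out jobs:

def concretize(abstract_jobs, staging_site):
    # Turn an abstract job list into an executable one by adding data
    # stage-in/stage-out jobs around each compute job
    executable = []
    for job in abstract_jobs:
        for f in job["inputs"]:
            executable.append({"type": "stage-in", "file": f, "from": staging_site})
        executable.append({"type": "compute", "cmd": job["cmd"]})
        for f in job["outputs"]:
            executable.append({"type": "stage-out", "file": f, "to": staging_site})
    return executable

abstract = [
    {"cmd": ["python", "preprocess.py"], "inputs": ["raw.BGLD"], "outputs": ["trace_1.trace", "trace_2.trace"]},
]
print(concretize(abstract, staging_site="data.example.org"))

Cleanup jobs and job clustering would be added in the same spirit, by rewriting the executable job list rather than the abstract description.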
WORKFLOW PATTERNS AND DESIGN
Drag-and-drop (GUI): easy mechanism to create workflows. Challenge: visualizing and handling large-scale workflows.

Command-line: requires programming skills. Advantage: easily scales to tackle large, complex problems.
WORKFLOW MANAGEMENT SYSTEMS
Taverna – https://taverna.incubator.apache.org
Kepler – https://kepler-project.org
Makeflow – http://ccl.cse.nd.edu/software/makeflow
dispel4py – http://dispel4py.org
Nextflow – https://www.nextflow.io
KNIME – https://www.knime.org
Swift – http://swift-lang.org
VisTrails – http://vistrails.org
FireWorks – https://pythonhosted.org/FireWorks
Pegasus – http://pegasus.isi.edu
WORKFLOW EXECUTION
Execution Environment
- User's environment: laptop or desktop machine
- Heterogeneous environment: clouds, cyberinfrastructures
General Behavior

Parameter sweep studies are often composed of independent small tasks that can significantly benefit from grid or cloud computing environments (a minimal sketch of this fan-out pattern follows the figure below).

Parallel applications often require specialized systems such as clusters and supercomputers (for large-scale applications).
[Figure: example workflow spanning execution environments – split.sh runs locally, the parameter sweep analysis (process.R jobs) runs on grid or cloud computing, an MPI application (merge.c) runs on an HPC cluster, and visualization.py renders results locally; data transfers connect the sites.]
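A minimal sketch of the fan-out pattern behind such parameter sweeps (the process and merge placeholders are hypothetical, and local processes stand in for grid or cloud jobs):

from concurrent.futures import ProcessPoolExecutor

def process(param):
    # Placeholder for one sweep task (e.g., a single process.R invocation)
    return param ** 2

def merge(results):
    # Placeholder for the merge step (e.g., merge.c)
    return sum(results)

if __name__ == "__main__":
    params = range(10)  # the "split" step: one independent task per parameter value
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(process, params))
    print(merge(results))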
Automation
- Automates pipeline executions
- Parallel, distributed computations
- Automatically executes data transfers
- Heterogeneous resources
- Task-oriented model: the application is seen as a black box

Debug
- Workflow execution and job performance metrics
- Set of debugging tools to unveil issues
- Real-time monitoring, graphs, provenance

Optimization
- Job clustering
- Data cleanup

Recovery
- Job failure detection
- Checkpoint files
- Job retry (see the sketch below)
- Rescue DAGs
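As a generic illustration of the recovery features above (not Pegasus's actual implementation; the run_with_retries helper and its retries parameter are assumptions), a failed job can be retried a bounded number of times before the workflow is declared failed and a rescue point is recorded:

import subprocess

def run_with_retries(cmd, retries=3):
    # Re-run a failing command a few times before giving up
    for attempt in range(1, retries + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return True  # job succeeded
        print("attempt %d of %d failed" % (attempt, retries))
    return False  # permanently failed; a rescue DAG could resume the workflow from here

run_with_retries(["python", "process.py", "trace_1.trace"])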
PEGASUS WMS
PEGASUS ARCHITECTURE
HETEROGENEOUS ENVIRONMENTS

[Figure: Pegasus deployment across heterogeneous environments – a submit host (e.g., the user's laptop) drives workflows on Compute site A (with a shared filesystem) and Compute site B (with object storage); input data sites, data staging sites, and output data sites connect to the compute sites.]
MONITORING

Pegasus Dashboard: web interface for monitoring and debugging workflows.

Real-time monitoring of workflow executions. It shows the status of the workflows and jobs, job characteristics, statistics, and performance metrics. Provenance data is stored in a relational database.

Real-time monitoring, reporting, debugging, troubleshooting, RESTful API (a hypothetical query sketch follows).
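As a hedged illustration of what querying such a RESTful monitoring interface could look like (the host, endpoint path, and JSON fields below are hypothetical and not the actual Pegasus dashboard API):

import json
import urllib.request

def workflow_status(base_url, workflow_id):
    # Fetch the job list for one workflow and print each job's state
    url = "%s/workflows/%s/jobs" % (base_url, workflow_id)
    with urllib.request.urlopen(url) as response:
        jobs = json.load(response)
    for job in jobs:
        print(job["name"], job["state"])

workflow_status("http://monitoring.example.org/api", "wf-1234")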
FINE-GRAINED WORKFLOWS ON HPC SYSTEMS

[Figure: submit host (e.g., the user's laptop) and an HPC system; the workflow is wrapped as an MPI job.]

Allows sub-graphs of a Pegasus workflow to be submitted as monolithic jobs to remote resources.
Builders ask seismologists: What will the peak ground motion be at my new building in the next 50 years?
Seismologists answer this question using Probabilistic Seismic Hazard Analysis (PSHA)
CYBERSHAKE

CyberShake is a high-performance computing platform that uses 3D waveform modeling to calculate PSHA estimates for populated areas of California.
CYBERSHAKE WORKFLOW
Phase 1
- Constructs and populates a 3D mesh of ~1.2 billion elements with seismic velocity data to compute Strain Green Tensors (SGTs)
- SGT simulations use parallel wave propagation (requires ~4,000 processors)

Phase 2
- Computes individual contributions from over 400,000 different earthquakes
- Constructs a hazard map (~100 million loosely coupled jobs; a toy PSHA sketch follows below)
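To make the PSHA computation concrete, here is a toy sketch of how per-rupture contributions combine into a hazard estimate (the rupture rates and exceedance probabilities are made-up values, and this is the textbook PSHA formulation rather than CyberShake's actual code):

import math

# (annual rupture rate, P(ground motion > threshold | rupture)) -- toy values
ruptures = [(0.01, 0.30), (0.002, 0.80), (0.05, 0.05)]

# Annual rate of exceedance: sum over ruptures of rate_i * P(exceed | rupture_i)
annual_rate = sum(rate * p_exceed for rate, p_exceed in ruptures)

# Poisson assumption: probability of at least one exceedance in 50 years
prob_50yr = 1.0 - math.exp(-annual_rate * 50)
print("P(exceedance in 50 years) = %.3f" % prob_50yr)

In CyberShake, the conditional exceedance probabilities come from 3D waveform simulations, which is what makes the workflow so large.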
CYBERSHAKE WORKFLOW EXECUTION
Executed on Titan and Blue Waters

Facts:
- 336 CyberShake workflows
- 1.4 million node-hours
- 500 TB of data
- Job mix: single-core jobs, ~4,000-core MPI jobs, ~800 GPU jobs, and ~3,800-core MPI jobs
SEISMIC AMBIENT NOISE CROSS-CORRELATION
Preprocesses and cross-correlates traces (sequences of measurements of acceleration in three dimensions) from multiple seismic stations (IRIS database).

Phase 1 (pre-process): data preparation, using statistics to extract information from the noise.
Phase 2 (cross-correlation): computes correlations, identifying the time for signals to travel between stations; infers properties of the intervening rock (a toy cross-correlation sketch follows).
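A minimal numpy sketch of the cross-correlation step (synthetic traces and a toy sampling rate, not the actual workflow code): the lag that maximizes the cross-correlation estimates the time for a signal to travel between two stations.

import numpy as np

fs = 20.0  # sampling rate in Hz (toy value)
rng = np.random.default_rng(0)
station_a = rng.standard_normal(2000)
station_b = np.roll(station_a, 40) + 0.5 * rng.standard_normal(2000)  # delayed copy plus noise

# Demean the traces before correlating (a typical preprocessing step)
a = station_a - station_a.mean()
b = station_b - station_b.mean()

corr = np.correlate(a, b, mode="full")
lag = np.argmax(corr) - (len(b) - 1)  # lag in samples
print("estimated travel time: %.2f s" % (abs(lag) / fs))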
SEISMIC WORKFLOW
The workflow is implemented as a hybrid approach to enable the execution of data-intensive, stream-based workflow applications across different e-Infrastructures.

Single Run: the workflow requests one hour of data from IRIS services (USArray TA, data from 394 stations)
- Preprocess phase: 8 minutes (OpenMPI cluster)
- Cross-correlation phase: 2 hours (Apache Storm cluster)

Stream-based: the workflow reads data from IRIS services every 2 hours.
[Figure: hybrid deployment across three Docker containers – Container 1 runs Pegasus and dispel4py and holds the Pegasus workflow (DAX); the dispel4py preprocess workflow is automatically translated into an MPI application running on Container 2 (OpenMPI cluster); the dispel4py postprocess workflow is automatically translated into a Storm topology running on Container 3 (Storm cluster). The preprocess pipeline reads traces and applies decimation, detrending, response removal, demeaning, filtering, whitening, and FFT computation before writing preprocessed data; the cross-correlation stage reads the preprocessed data and the list of stations and writes the cross-correlated results.]
SEISMIC WORKFLOW EXECUTION
[Figure: execution setup – the submit host (e.g., the user's laptop) coordinates the run; input data (~150 MB) is fetched from the IRIS database (stations); Phase 1 runs on Compute site A (MPI-based) and Phase 2 on Compute site B (Apache Storm); output data (~40 GB) is produced, with data transfers between sites performed by Pegasus.]
ACME WORKFLOW
Climate and Earth System Modeling: coupled climate models with ocean, land, atmosphere, and ice
ACME Workflow

Each stage of the workflow runs the ACME model for a few time steps, which helps keep simulations within batch queue limits (a rough staging sketch follows below).

Running on Cori @ NERSC, Titan @ OLCF, and Mira @ ANL (in progress).

A simple workflow execution on Titan (OLCF), with four different parameter values for a number of simulation time units (5 years per stage), consumes over 50,000 CPU hours.
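A rough sketch of that staging idea (the run_acme_stage command and restart file names are hypothetical): each batch job advances the simulation by five model years and hands a restart file to the next stage.

import subprocess

restart = None
for start_year in range(1, 41, 5):
    # One 5-year stage per batch job, so each stage fits the queue wall-time limit
    cmd = ["run_acme_stage", "--years", "%d-%d" % (start_year, start_year + 4)]
    if restart:
        cmd += ["--restart", restart]  # continue from the previous stage
    subprocess.run(cmd, check=True)
    restart = "restart_year_%d.nc" % (start_year + 4)  # produced by this stage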
[Figure: staged ACME workflow – stages for years 1-5, 6-10, ..., 36-40, each followed by a climatologies step run at CADES.]
SUMMARY
Conclusions and Future Research Directions
Workflows have been used by several science domains to efficiently manage their computations on a number of different resources
We have presented an introduction to scientific workflows, emphasizing their main properties and how scientists may model their experiments as workflows
Although workflows have been used by a small group of environmental scientists, we strongly believe that more researchers can benefit from these technologies
AUTOMATING ENVIRONMENTAL COMPUTING APPLICATIONS WITH SCIENTIFIC WORKFLOWS
Rafael Ferreira da Silva, Ph.D.
Research Assistant Professor
Department of Computer Science
University of Southern California
[email protected] – http://rafaelsilva.com
Thank You
Questions?
http://pegasus.isi.edu