38
Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd , 2012

Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Embed Size (px)

Citation preview

Page 1: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Science Gateway Group Overview

Marlon Pierce, Suresh Marru, Raminder SinghPresentation to PTI CREST Lab

May 2nd, 2012

Page 2: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

IU Science Gateway Group Overview• IU Science Gateway Group members

– Marlon Pierce: Group Lead– Suresh Marru: Principal Software Architect– Raminder Singh, Yu Ma, Jun Wang, Lahiru Gunathilake,

(Chathura Herath): Senior team members– Five interns and research assistants

• NSF SDCI funding of Open Gateway Computing Environments project – TACC (M. Dahan), SDSC (N. Wilkins-Diehr), SDSU (M.

Thomas), NCSA (S. Pamidighantam), UIUC (S. Wang), Purdue (C. Song), UTHSCSA (E. Brookes)

– XSEDE Extended Collaboration Support Services

Page 3: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Science Gateway Group Focus Areas

• Open source, open community software for cyberinfrastructure– Apache Rave: http://rave.apache.org– (portal software)– Apache Airavata: http://www.airavata.org

(workflow software)• Extended collaborations with application

scientists– Astronomy, astrophysics, biophysics, chemistry,

nuclear physics, bio informatics…

Page 4: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Science Gateways outreach & democratize Research and Education

Developers

Researchers

Cyberinfrastructure

Research & Education Community

Lowering the barrier for using complex end-to-end

computational technologies

Democratize

Empower

Facilitate

Page 5: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Possibilities for Collaboration• Scientific workflows exhibit a number of

distributed execution patterns.– Not just DAGs.– Workflow start as an abstraction, but need system,

application, library level interactions.– We are trying to generalize our execution framework

over a number of applications. – This is parallel, complementary to work in CREST.

• Collaborations can be mediated through Apache, students’ independent study, targeted papers.

Page 6: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache Airavata

Open Community Software for Scientific Workflows

airavara.org

Page 7: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache Airavata

• Science Gateway software framework to– Compose, manage, execute, and monitor

computational workflows on Grids and Clouds– Web service abstractions to legacy command line-

driven scientific applications– Modular software framework to use as individual

components or as an integrated solution.• More Information

– Airavata Web Site: airavata.org– Developer Mailing Lists: airavata-

[email protected]

Page 8: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache Airavata High Level Overview

Page 9: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

A Classic Scientific Workflow• Workflows are composite applications built out of independent parts.

– Executables wrapped as network accessible services• Example: codes A, B, and C need to be executed in a specific sequence.

– A, B, C: codes compiled and executable on a cluster, supercomputer, etc. through schedulers.

• A, B, and C do not need to be co-located• A, B, and C may be sequential or parallel• A, B and C may have data or control dependencies

– Data may need to be staged in and out• Some variations on ABC:

– Conditional execution branches, interactivity– Dynamic execution resource binding– Iterations (Do-while, For-Each) over all or parts of the sequence– Triggers, events, data streams

Page 10: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

On-DemandGrid Computing

Linked Environment for Atmospheric Discovery

StreamingObservations

Storms Forming

Forecast Model

Data Mining

Refine forecast grid

Instrument Steering

Envisioned by a multi-disciplinary team from OU, IU, NCSA, Unidata, UAH, Howard, Millersville, Colorado State, RENCI

Page 11: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

1111

OVP/RST/ MIG

OGCERe-engineer, Generalize,

Build, Test and Release

LEAD

GridChem

TeraGridUser Portal

OGCE NMI & SDCI Funding

Atmospheric Science

LEAD, OLAM

Open Grid/Gateway Computing Environments

Molecular Chemistry

GridChem, ParamChem,

OREChem

Bio Physics

Bio Informatics BioVLAB, mCpG

Astronomy ODI, DES-SimWG

Nuclear Physics LCCI

Ultrascan

Projects in the pipe

QuakeSim, VLAB, Einstein Gateway

XSEDE ECSS Gateway Projects

Page 12: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Realizing the Universe for the Dark Energy Survey (DES) Using XSEDE SupportPrincipal Investigators: Prof. August Evrard (University of Michigan), Prof. Andrey Kravtsov (University of Chicago)

Background & Explanation: The Dark Energy Survey (DES) is an upcoming international experiment that aims to constrain the properties of dark energy and dark matter in the universe using a deep, 5000-square degree survey of cosmic structure traced by galaxies.

To support this science, the DES Simulation Working Group is generating expectations for galaxy yields in various cosmologies. Analysis of these simulated catalogs offers a quality assurance capability for cosmological and astrophysical analysis of upcoming DES telescope data.

Our large, multi-staged computations are a natural fit for workflow control atop XSEDE resources.

Fig. 2: A synthetic 2x3 arcmin DES sky image showing galaxies, stars, and observational artifacts. Courtesy Huan Lin, FNAL.

Fig. 1 The density of dark matter in a thin radial slice as seen by a synthetic observer located in the 8 billion light-year computational volume. Image courtesy Matthew Becker, University of Chicago.

Page 13: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

DES Application

Component Description

CAMB Code for Anisotropies in the Microwave Background is a serial FORTRAN code that computes the power spectrum of dark matter (simulation initial conditions). Output is a small ASCII file describing the power spectrum.

2LPTic Second-order Lagrangian Perturbation Theory initial conditions code is an MPI-based C code that computes the initial conditions for the simulation from parameters and an input power spectrum generated by CAMB. Output is a set of binary files that vary in size from ~80-250 GB depending on the simulation resolution.

LGadget MPI based C code that uses a TreePM algorithm to evolve a gravitational N-body system. The outputs of this step are system state snapshot files, as well as light-cone files, and some properties of the matter distribution, including the power spectrum at various timesteps. The total output from LGadget depends on resolution and the number of system snapshots stored, and approaches 10~TB for large DES simulation boxes.

ADDGALS Creates a synthetic galaxy catalog for science analysis

Page 14: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Case Study: Dark Energy Survey• Long running code: Based on simulation box

size L-gadget can run for 3 to 5 days using more than 1024 cores on TACC Ranger.

• Do-While Construct: Restart service support is needed to work around queue time restrictions. Do-while construct was developed to address the need.

• Data size and file transfer challenges: L-gadget produces 10~TB for large DES simulation boxes in system scratch so data need to moved to persistent storage ASAP

• File system issues: More than 10,000 lightcone files are doing continues file I/O. Ranger has one Luster metadata server to serve 300 I/O nodes. Sometime metadata server can’t fine these lightcone files, which make simulations to stop. We have wasted ~50k SU this month struggling with I/O issues and to get recommendation to use MPI I/O

Figure: Processing steps to build a synthetic galaxy catalog. Xbaya workflow currently controls the top-most element (N-body simulations) which consists of methods to sample a cosmological power spectrum (ps), generating an initial set of particles (ic) and evolving the particles forward in time with Gadget (N-body). The remaining methods are run manually on distributed resources.

Page 15: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012
Page 16: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Case Study: ParamChem• ParamChem researchers try to optimize the geometry of

new molecules which may or may not converge with in a given time or number of steps.

• Factors that include the mathematical convergence issues in solutions for partial integro-differential equations to potential shallowness of an energy landscape.

• The intermediate outputs from model iterations can be used to determine convergence.

Complex graph executions with support for long running and interactive executions to address non-deterministic convergence problems.

Page 17: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Case Study: LCCI• The Leadership Class Configuration Interaction (LCCI) project is targeted to

accurately predict properties of nuclei for astrophysical and fusion energy processes.

– James Vary’s group at Iowa State

• One of the PetaScale Apps– Use DOE INCITE and NSF Blue Waters awarded resources– Currently using 55 million processor hours on ORNL Cray XK6 machine and Argonne Blue

Gene/P.

• Workflow Goals– Democratizing science: Reduce the learning curve associated with running simulations – Controlled access: avoid inappropriate use of super computing resources– Reproducibility of results– Avoiding waste: needless duplication; minimize erroneous use of codes and erroneous

exchanging of intermediate files– Sustainability: Ensure long-term preservation of applications, configurations and results– Provenance: Provide the ability to track down the provenance of results as well as reuse

previously completed results where applicable without recalculating– Parametric sweeps: Allow components to run over a range of dataset such that applications

may produce richer simulations

Page 18: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Example Workflow: Nuclear Physics

Courtesy of collaboration with Prof. James Vary and team, Iowa State

Page 19: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Next Generation Workflow Systems• Scientific workflow systems and compiled workflow

languages have focused on modeling, scheduling, data movement, dynamic service creation and monitoring of workflows.

• Building on these foundations we extend to a interactive and flexible workflow systems.– interactive ways of steering the workflow execution– interpreted workflow execution model– high level instruction set supporting diverse execution

patterns– flexibility to execute individual workflow activity and

wait for further analysis.

Page 20: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Various State Changes can tap into lower layers

Page 21: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Uncertainties in Workflows• Mathematical uncertainty:

– PDE’s may not converge for certain conditions– Statistical techniques lead to nondeterministic results, propagation of

uncertainties.– CLoser observation at computational output ensure acceptability of results.

• Domain uncertainty:– Optimization execution patterns: Scenarios of running against range of

parameter values in an attempt to find the most appropriate input set. – Initial execution providing estimate of the accuracy of the inputs and facilitating

further refinement.– Outputs are diverse and nondeterministic

• Resource uncertainty:– Failures in distributed systems are norm than an exception– Transient failures can be retried if computation is side-effect free/Idempotent. – Persistent failures require migration

Page 22: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Next Steps • Workflow start as an abstraction, but need

system, application, library level interactions.– We are trying to generalize our execution

framework over a number of applications. – This is parallel, complementary to work in CREST.

• Collaborations can be mediated through Apache, students’ independent study, targeted papers.

Page 23: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Backup Slides

Page 24: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache Rave

Page 25: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

1

Apache Rave

• Open Community Software for Enterprise Social Networking, Shareable Web Components, and Science Gateways

• Founding members:• Mitre Software• SURFnet• Hippo Software• Indiana University

• More information• Project Website: http://rave.apache.org/• Mailing List: [email protected]

Page 26: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Gadget Dashboard View

Gadget Store View

Page 27: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Extending Rave for Science Gateways

• Rave is designed to be extended.– Good design (interfaces, easily pluggable

implementations) and code organization are required.– It helps to have a diverse, distributed developer

community• How can you work on it if we can’t work on it?

• Rave is also packaged so that you can extend it without touching the source tree.

• GCE11 paper presented 3 case studies for Science Gateways

Page 28: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache Software Foundation and Cyberinfrastructure

Page 29: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Why Apache for Gateway Software?• Apache Software Foundation is a neutral playing field

– 501(c)(3) non-profit organization.– Designed to encourage competitors to collaborate on foundational

software.– Includes a legal cell for legal issues.

• Foundation itself is sustainable– Incorporated in 1999– Multiple sponsors (Yahoo, Microsoft, Google, AMD, Facebook, IBM, …)

• Proven governance models– Projects are run by Program Management Committees.– New projects must go through incubation.

• Provides the social infrastructure for building communities.• Opportunities to collaborate with other Apache projects outside

the usual CI world.

Page 30: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

The Apache Way• Projects start as incubators with 1 champion and several mentors.

– Making good choices is very important

• Graduation ultimately is judged by the Apache community.– +1/-1 votes on the incubator list

• Good, open engineering practices required– DEV mailing list design discussions, issue tracking– Jira contributions– Important decisions are voted on

• Properly packaged code– Build out of the box– Releases are signed– Licenses, disclaimers, notices, change logs, etc.– Releases are voted

• Developer diversity– Three or more unconnected developers– Price is giving up sole ownership, replace with meritocracy

Page 31: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache and Science Gateways

• Apache rewards projects for cross-pollination.– Connecting with complementary Apache projects

strengthens both sides.– New requirements, new development methods

• Apache methods foster sustainability– Building communities of developers, not just users– Key merit criterion

• Apache methods provide governance– Incubators learn best practices from mentors– Releases are peer-reviewed

Page 32: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Apache Contributions Aren’t Just Software

• Apache committers and PMC members aren’t just code writers.

• Successful communities also include– Important users– Project evangelists – Content providers: documentation, tutorials– Testers, requirements providers, and constructive

complainers • Using Jira and mailing lists

– Anything else that needs doing.

Page 33: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Case Study: LEAD• To create an Integrated, Scalable Geosciences

Framework, LEAD among things resulted in a developing a flexible Scientific workflow system.

• The initial goal was to realize WOORDS: Workflow Orchestration for On-Demand Real-Time Dynamically-Adaptive System.

• The system enables execution of legacy scientific codes and facilities sophisticated coupling while interacting with data and provenance sub-systems.

Page 34: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Case Study: One Degree Imager• A single investigation requires multiple night observations• Each night takes hours of observations with multiple exposures• An exposure is divided into 64 Orthogonal Transfer Array (OTAs)• Each OTA is an 8x8 collection of 512x512 pixel CCD images.• Reducing these data sets require workflow planning taking advantage of

system architectures.• Currently we take advantage of threaded parallelism at node level

branching out to multiple node executions.

Page 35: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Pipeline Parallelization

OTAs

Exposures

Night/Filter

Campaign TOP

FTR

EXP

OTA OTA

EXP

FTR. . .

. . .

. . .

Page 36: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Illustrating Interactivity

Page 37: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Execution PatternsParametric Sweeps

Dot vs Cartisian

Page 38: Science Gateway Group Overview Marlon Pierce, Suresh Marru, Raminder Singh Presentation to PTI CREST Lab May 2 nd, 2012

Interactivity Contd.• Deviations during workflow execution that do not

affect the structure of the workflow– dynamic change workflow inputs, workflow rerun.

interpreted workflow execution model.– dynamic change in point of execution, workflow smart

rerun.– Fault handling and exception models.

• Deviations that change the workflow DAG during runtime– Reconfiguration of activity.– Dynamic addition of activities to the workflow.– Dynamic remove or replace of activity to the workflow