
Page 1: Experience with Fusion Workflows

SDM Center

Experience with Fusion Workflows

Norbert Podhorszki, Bertram Ludäscher

Department of Computer Science

University of California, Davis


kepler-project.org

Page 2: Experience with Fusion Workflows

New Challenges

• The CPES project brought new challenges for Kepler and the workflow automation team
• Remote computations, services, and tools

• Long running simulations, large amounts of data

• One-time passwords

• Workflow = “glue”
• Scientists only need to connect individual components together

• Automate tedious processes (logins, copies of data, control, start-stop)

• Do it reliably

• Show what is going on

Page 3: Experience with Fusion Workflows

Workflows

• Real-time monitoring of the simulation (a rough sketch of this loop follows the list):
• Transfer the current data set to a secondary resource
• Execute short analysis/visualization routines
• Display the result

• Archival and post-processing
• Transfer, pack, and archive data sets on the fly
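The slides give no code for the monitoring loop; the following is a minimal Python sketch of the cycle under stated assumptions: the host name, output directory, plotting script, and viewer command are all hypothetical, and in the real system these steps are composed from Kepler actors rather than a polling script.

```python
# Hypothetical sketch of the real-time monitoring cycle (not the actual
# Kepler workflow): poll the simulation machine, copy the newest data set,
# run a short analysis/plot step, and display the image.
import subprocess
import time

SIM_HOST = "user@simulation.host.example"   # assumed login, not from the slides
REMOTE_DIR = "/scratch/run01"               # assumed simulation output directory

def latest_remote_file():
    """Return the newest file in the remote output directory (via ssh + ls)."""
    result = subprocess.run(
        ["ssh", SIM_HOST, f"ls -1t {REMOTE_DIR} | head -1"],
        capture_output=True, text=True, check=True)
    name = result.stdout.strip()
    return f"{REMOTE_DIR}/{name}" if name else None

def monitor(poll_seconds=60):
    last_seen = None
    while True:
        newest = latest_remote_file()
        if newest and newest != last_seen:
            last_seen = newest
            # 1. transfer the current data set to the secondary resource
            subprocess.run(["scp", f"{SIM_HOST}:{newest}", "incoming/"], check=True)
            local = "incoming/" + newest.rsplit("/", 1)[-1]
            # 2. execute a short analysis/visualization routine (hypothetical script)
            subprocess.run(["python", "quick_plot.py", local, "-o", "latest.png"],
                           check=True)
            # 3. display the result (viewer command chosen arbitrarily)
            subprocess.Popen(["display", "latest.png"])
        time.sleep(poll_seconds)
```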

Page 4: Experience with Fusion Workflows

Kepler actors for CPES

• Job submission to various resource managers
• Permanent SSH connection to perform tasks on a remote machine
• Generalized actors (workflows themselves) for specific tasks:

• Watch a remote directory for simulation timesteps

• Execute an external command on a remote machine

• Tar and archive data in large chunks to HPSS

• Transfer a remote image file and display on screen

• Control a running SCIRun server remotely

• The above actors do logging/checkpointing
• the final workflow can therefore be stopped and restarted (see the watcher sketch below)
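As a rough illustration of the directory-watching and checkpointing behavior, here is a Python sketch that keeps one persistent SSH connection open (using the paramiko library, an assumption; the real actors live inside Kepler and are written in Java) and records already-handled files in a small JSON checkpoint so a restart does not reprocess them.

```python
# Illustrative sketch only: watch a remote directory over one long-lived SSH
# session and remember handled files so the workflow can be stopped/restarted.
import json
import os
import time

import paramiko  # assumed SSH library; not what the Kepler actor uses

CHECKPOINT = "watched_files.json"   # hypothetical checkpoint file


def load_checkpoint():
    return set(json.load(open(CHECKPOINT))) if os.path.exists(CHECKPOINT) else set()


def save_checkpoint(done):
    with open(CHECKPOINT, "w") as f:
        json.dump(sorted(done), f)


def watch_remote_dir(host, user, directory, handle, poll_seconds=30):
    """Call handle(path) exactly once for every new file appearing remotely."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user)          # one permanent connection
    done = load_checkpoint()
    try:
        while True:
            _, stdout, _ = client.exec_command(f"ls -1 {directory}")
            for name in stdout.read().decode().split():
                path = f"{directory}/{name}"
                if path not in done:
                    handle(path)                 # e.g. hand off to the transfer step
                    done.add(path)
                    save_checkpoint(done)        # checkpoint after each file
            time.sleep(poll_seconds)
    finally:
        client.close()
```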

Page 5: Experience with Fusion Workflows

Archival Workflow

[Workflow diagram: Monitor → Transfer → Convert → Archive]

• Plasma physics simulation on 2048 processors on Seaborg@NERSC (LBL)
• Gyrokinetic Toroidal Code (GTC) to study energy transport in fusion devices (plasma microturbulence)

• Generating 800 GB of data (3000 files, 6000 timesteps, 267 MB/timestep) over a 30+ hour simulation run

• Under workflow control:
• Monitor (watch) simulation progress (via remote scripts)
• Transfer from NERSC to ORNL concurrently with the simulation run
• Convert each file to HDF5
• Archive files in 4 GB chunks into HPSS (see the chunking sketch below)
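The following Python sketch shows one way the 4 GB chunking step could look, assuming the converted HDF5 files sit in a local directory and the HPSS hsi command-line client is available; the slides do not say which HPSS interface the actor actually uses, and all paths and names below are hypothetical.

```python
# Illustrative sketch: pack converted HDF5 files into ~4 GB tar archives and
# push each archive to HPSS with the 'hsi' client (interface assumed).
import os
import subprocess
import tarfile
from pathlib import Path

CHUNK_BYTES = 4 * 1024**3           # ~4 GB per archive, as on the slide
HPSS_DEST = "/hpss/project/gtc"     # hypothetical HPSS destination path


def flush_chunk(files, chunk_no):
    """Tar one chunk of files locally, then store it in HPSS."""
    name = f"gtc_chunk_{chunk_no:04d}.tar"
    with tarfile.open(name, "w") as tar:
        for f in files:
            tar.add(f, arcname=f.name)
    # 'hsi put local : remote' copies the archive into HPSS
    subprocess.run(["hsi", f"put {name} : {HPSS_DEST}/{name}"], check=True)
    os.remove(name)


def archive_in_chunks(src_dir):
    files, size, chunk_no = [], 0, 0
    for f in sorted(Path(src_dir).glob("*.h5")):
        files.append(f)
        size += f.stat().st_size
        if size >= CHUNK_BYTES:      # chunk is full: archive it, start a new one
            flush_chunk(files, chunk_no)
            files, size, chunk_no = [], 0, chunk_no + 1
    if files:                        # archive the last, partially filled chunk
        flush_chunk(files, chunk_no)
```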

Page 6: Experience with Fusion Workflows

Monitoring Workflow

Page 7: Experience with Fusion Workflows

Future Plans

• Currently we have specialized actors that should be generalized for other disciplines and systems:
• “watching for” simulation output

• safe and robust transfer, recovery from failure

• archiving to different MSS, with different security policies, robust to failures and maintenance periods

• The next workflow is cyclic, not just streaming:
• couple two simulations on two resources, transfer data and control between them (a rough coupling sketch follows this list)

• use local job manager for code execution

• What about provenance management?
• a main reason to use a scientific workflow system, e.g. in bioinformatics workflows; needed for debugging runs, interpreting results, etc.
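To make the cyclic-coupling plan concrete, here is a very rough Python sketch assuming two codes that exchange one file per cycle, each started through a PBS-style qsub on its own machine; every host name, path, and script below is a hypothetical placeholder rather than part of the CPES design.

```python
# Hypothetical sketch of a cyclic workflow coupling two simulations on two
# resources: run code A, move its output to code B, run code B, repeat.
import subprocess

HOST_A = "user@resourceA.example"   # placeholder hosts
HOST_B = "user@resourceB.example"


def run_remote(host, command):
    subprocess.run(["ssh", host, command], check=True)


def couple(cycles):
    for i in range(cycles):
        # run one cycle of code A through the local job manager on resource A
        # (-W block=true makes PBS-style qsub wait until the job finishes)
        run_remote(HOST_A, "cd /scratch/codeA && qsub -W block=true run_codeA.pbs")
        # transfer A's output to B via the workflow host (staging path assumed)
        subprocess.run(["scp", f"{HOST_A}:/scratch/codeA/out_{i}.dat",
                        "/tmp/exchange.dat"], check=True)
        subprocess.run(["scp", "/tmp/exchange.dat",
                        f"{HOST_B}:/scratch/codeB/in_{i}.dat"], check=True)
        # run one cycle of code B, whose output would feed A's next cycle
        run_remote(HOST_B, "cd /scratch/codeB && qsub -W block=true run_codeB.pbs")
```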

Page 8: Experience with Fusion Workflows


There is more, e.g., how to get from messy to neat & reusable designs?

Author: Tim McPhillips, UC Davis

Page 9: Experience with Fusion Workflows

The Answer (YMMV)

• Collection-Oriented Modeling & Design (COMAD)
• embrace an assembly-line metaphor
• data = tagged nested collections

• e.g. represented as flattened, pipelined token streams, as sketched below:
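As a small illustration of the flattened token-stream idea, the sketch below turns tagged nested collections into a flat stream in which each collection is bracketed by an opening and a closing delimiter token, with data tokens in between. The class and label names are made up for illustration and are not the actual COMAD/Kepler token types.

```python
# Illustrative flattening of tagged nested collections into a token stream;
# the token classes here are hypothetical, not Kepler's COMAD types.
from dataclasses import dataclass


@dataclass
class Open:            # start-of-collection delimiter token
    label: str


@dataclass
class Close:           # end-of-collection delimiter token
    label: str


@dataclass
class Data:            # ordinary data token
    value: object


def flatten(label, items):
    """Yield a flat, pipelineable token stream for one tagged collection."""
    yield Open(label)
    for item in items:
        if isinstance(item, tuple):      # (label, sub-items) = nested collection
            yield from flatten(*item)
        else:
            yield Data(item)
    yield Close(label)


# Example: a run with two timesteps, each holding one data file, becomes
# Open(run) Open(timestep) Data(t0.h5) Close(timestep)
#           Open(timestep) Data(t1.h5) Close(timestep) Close(run)
stream = list(flatten("run", [("timestep", ["t0.h5"]), ("timestep", ["t1.h5"])]))
```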