
SCHOOL OF INFORMATION, UNIVERSITY OF MICHIGAN

GriPhyN: Grid Physics Network and iVDGL: International Virtual Data Grid Laboratory

Collaboratory Basics

• Two NSF-funded Grid projects spanning HENP (high energy and nuclear physics) and computer science
  – MPS and CISE have oversight

• GriPhyN and iVDGL are too closely related to discuss one without discussing the other
  – One is CS research and test applications; the other builds an international-scale facility to run these tests and to address other goals as well
  – They share vision, personnel, and components

• These two collaboratories are part of a larger effort to develop the components and infrastructure for supporting data-intensive science

Some Science Drivers

• Computation is becoming an increasingly important tool of scientific discovery
  – Computationally intense analyses
  – Global collaborations
  – Large datasets

• The increasing importance of computation in science is more pronounced in some fields
  – Complex (e.g. climate modeling) and high-volume (HEP) simulations
  – Detailed rendering (e.g. biomedical informatics)
  – Data-intensive science (e.g. astronomy and physics)

• GriPhyN and iVDGL were founded to provide the models and software for the data management infrastructure of four large projects

SDSS / NVO

• SDSS / NVO are in full production

• Explore how the Grid can be used in astronomy
  – What’s the benefit?
  – How to integrate?
  – How can the Grid be used for future sky surveys?
  – Data processing pipelines are complex
  – Has made the most sophisticated use of the virtual data concept

LIGO

• Not in full production, but real data is being taken
  – LIGO I Engineering Runs: 35 TB since 1999 and growing
  – LIGO I Science Runs: 62 TB in two science runs; an additional planned run will generate 135 TB, with eventual constant operation at 270 TB/year
  – LIGO II Upgrade: eventual operation at 1-2 PB/year

• Needs the distributed computing power of the Grid

• Needs virtual data catalogs for efficient dissemination of data and management of workflow

CMS / ATLAS

• CMS and ATLAS are two experiments being developed for the Large Hadron Collider at CERN

• Two projects, two cultures, but:
  – Similar data challenges
  – Similar geographic distribution
  – Moving closer to common tools through the LCG computing grid

• Petabytes of data per year (100 PB by 2012)

Function Types

• GriPhyN
  – Distributed Research Center

• iVDGL
  – Community Data System

GriPhyN

GriPhyN Funding

• Funded in 2000 through NSF ITR program

• $11.9M + $1.6M matching

GriPhyN Project Team

• Led by U. Florida and U. Chicago
  – PDs: Paul Avery (UF) and Ian Foster (UC)

• 22 participating institutions
  – 13 funded
  – 9 unfunded

• Roughly 82 people involved

• 2/3 of activity is computer science, 1/3 physics

• Funded Institutions
  – U. Florida

– U. Chicago

– Caltech

– U. Wisconsin - Madison

– USC / ISI

– Indiana U.

– Johns Hopkins U.

– Texas A & M

– UT Brownsville

– UC Berkeley

– U Wisconsin Milwaukee

– SDSC

• Unfunded Institutions
  – Argonne NL
  – Fermi NAL
  – Brookhaven NL
  – UC San Diego
  – U. Pennsylvania
  – U. Illinois - Chicago
  – Stanford
  – Harvard
  – Boston U.
  – Lawrence Berkeley Lab

Technology

• GriPhyN’s science drivers demand timely access to very large datasets, plus the compute cycles and information management infrastructure needed to manipulate and transform those datasets in a meaningful way

• Data Grids are an approach to data management and resource sharing in environments where datasets are very large
  – Policy-driven resource sharing, distributed storage, distributed computation, replication, and provenance tracking

• GriPhyN and iVDGL aim to enable petascale virtual data grids

Petascale Virtual Data Grids

[Architecture diagram: interactive user tools serve single researchers, workgroups, and production teams. Beneath them sit virtual data tools, request planning & scheduling tools, and request execution & management tools, layered over resource management services, security and policy services, and other Grid services. These mediate access to transforms, distributed resources (code, storage, CPUs, networks), and the raw data source. Scale targets: PetaOps of performance over Petabytes of data.]

GriPhyN Datagrid Contributions

• GriPhyN has three areas of contribution toward achieving the DataGrid vision:

• Conducting CS research
  – Virtual data as a unifying concept
  – Planning, execution, and performance monitoring
  – Integrating these facilities in a transparent, high-productivity manner: making the Grid as easy to use as a workstation and the web

• Disseminating this research through the Virtual Data Toolkit and other tools
  – Chimera
  – Pegasus

• Integrating CS research results into the GriPhyN science projects
  – The GriPhyN experiments serve as an exciting but demanding CS and HCI laboratory

Virtual Data Toolkit: VDT

• A suite of tools developed by the CS team to support science on the Grid

• The uniting theme is virtual data
  – Nearly all data in physics and astronomy is virtual data: derivations of a large, well-known dataset
  – Derived data can be represented as the set of instructions that created it
  – There is no need to always copy a derived dataset; it can be recomputed if you have the workflow (a toy version of this idea is sketched below)
  – Virtual data also has a number of beneficial side effects, e.g. data provenance, discovery, re-creation, and workflow automation

• Many packages; a few are unique to GriPhyN, others are common across many Grid projects
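
To make the recompute-instead-of-copy idea concrete, here is a minimal Python sketch of a virtual data catalog. It illustrates the concept only; the class and method names are invented for this sketch and are not the actual VDT API.

# Sketch of the virtual-data idea: a derived dataset is stored as the
# recipe (transformation + inputs) that produced it, not as bytes.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Derivation:
    """A derived dataset, represented by the instructions that create it."""
    transform: Callable            # the transformation program
    inputs: tuple                  # names of the datasets it consumes
    params: dict = field(default_factory=dict)

class VirtualDataCatalog:
    def __init__(self):
        self.materialized = {}     # raw (or cached) datasets by name
        self.recipes = {}          # dataset name -> Derivation

    def get(self, name):
        """Return a dataset, recomputing it from its recipe if needed."""
        if name in self.materialized:
            return self.materialized[name]
        d = self.recipes[name]
        result = d.transform(*(self.get(i) for i in d.inputs), **d.params)
        self.materialized[name] = result   # cache; safe to discard later
        return result

    def provenance(self, name):
        """A side effect of virtual data: derivation history for free."""
        if name not in self.recipes:
            return [name]
        return [name] + [p for i in self.recipes[name].inputs
                         for p in self.provenance(i)]

vdc = VirtualDataCatalog()
vdc.materialized["raw"] = [3, 1, 2]
vdc.recipes["sorted"] = Derivation(sorted, ("raw",))
print(vdc.get("sorted"))           # recomputed on demand -> [1, 2, 3]
print(vdc.provenance("sorted"))    # ['sorted', 'raw']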

Motivations

[Diagram: the virtual data model relates Transformation, Derivation, and Data, with edges labeled execution-of (a derivation executes a transformation), created-by, and consumed-by/generated-by (relating data to derivations).]

• “I’ve detected a calibration error in an instrument and want to know which derived data to recompute.”

• “I’ve come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes.”

• “I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won’t have to write one from scratch.”

• “I want to apply an astronomical analysis program to millions of objects. If the results already exist, I’ll save weeks of computation.”

Slide courtesy Ian Foster

Chimera

• The Chimera Virtual Data System is one of the core tools of the GriPhyN Virtual Data Toolkit

• Virtual Data Catalog
  – Represents transformation procedures and derived data

• Virtual Data Language Interpreter
  – Translates user requests into Grid workflow (a conceptual sketch follows below)
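
As a rough illustration of the interpreter's job (not Chimera's actual code, nor the VDL grammar), the sketch below walks a catalog of derivation recipes and emits jobs in dependency order, which is essentially an abstract workflow DAG. All names and data structures here are assumptions for the example.

# Conceptual sketch: given a target dataset and a catalog of recipes,
# emit the derivations needed to (re)create it, prerequisites first.
def plan(recipes, target, planned=None):
    """recipes maps dataset -> (transform name, list of input datasets)."""
    planned = set() if planned is None else planned
    if target in planned or target not in recipes:
        return []                  # raw data, or already scheduled
    planned.add(target)
    transform, inputs = recipes[target]
    jobs = []
    for inp in inputs:             # depth-first walk of the DAG
        jobs += plan(recipes, inp, planned)
    jobs.append((transform, inputs, target))
    return jobs

recipes = {
    "calibrated": ("calibrate", ["raw"]),
    "catalog":    ("extract", ["calibrated"]),
    "clusters":   ("find_clusters", ["catalog"]),
}
for job in plan(recipes, "clusters"):
    print(job)    # calibrate, then extract, then find_clusters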

Pegasus

• Planning Execution in Grids

• A tool for mapping complex workflows onto the Grid

• Converts an abstract Chimera workflow into a concrete workflow, which is sent to DAGMan for execution
  – DAGMan is the Condor meta-scheduler
  – Pegasus determines execution sites and the data transfers they require (a toy version is sketched after the pipeline diagram below)

Virtual Data Processing Tools

[Pipeline diagram: VDLx is expanded by the abstract planner into an XML DAG; the Pegasus concrete planner converts that into a Condor DAG for Grid execution, while a local shell planner can instead produce a shell DAG for local runs.]
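
A toy sketch of the concrete-planning step: pick an execution site for each abstract job and add the data transfers it implies. The site-selection heuristic here (run where most inputs already live) is a simplifying assumption for illustration, not Pegasus's actual algorithm, which consults replica and transformation catalogs and richer policies.

# Toy concrete planner: turn abstract jobs (what to run) into a plan of
# transfers and site-bound executions (where to run). Illustrative only.
def make_concrete(abstract_jobs, replicas, sites):
    """abstract_jobs: list of (transform, inputs, output).
    replicas: dataset name -> site currently holding it."""
    plan = []
    for transform, inputs, output in abstract_jobs:
        # Simplistic site selection: most inputs already present wins.
        site = max(sites, key=lambda s: sum(replicas.get(i) == s
                                            for i in inputs))
        for inp in inputs:         # stage in whatever lives elsewhere
            src = replicas.get(inp)
            if src is not None and src != site:
                plan.append(("transfer", inp, src, site))
        plan.append(("run", transform, site))
        replicas[output] = site    # register the new replica
    return plan

jobs = [("calibrate", ["raw"], "calibrated"),
        ("extract", ["calibrated"], "catalog")]
print(make_concrete(jobs, {"raw": "fnal"}, ["fnal", "uchicago"]))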

Example: Sloan Galaxy Cluster Finder

[Figure: the Chimera-generated DAG for the cluster finder, applied to Sloan data, shown alongside a log-log plot of the galaxy cluster size distribution (number of clusters, 1 to 100,000, vs. number of galaxies per cluster, 1 to 100).]

Jim Annis, Steve Kent, Vijay Sehkri (Fermilab); Michael Milligan, Yong Zhao (Chicago)


Resource Diagram

International Virtual Data Grid Laboratory: iVDGL

Some Context

• There is much more to the DataGrid world than GriPhyN

• Broad problem space, with many cooperative projects
  – U.S.
    • Particle Physics Data Grid (PPDG)
    • GriPhyN
  – Europe
    • DataTAG
    • EU DataGrid
  – International
    • iVDGL

Background and Goals

• U.S. portion funded in 2001 as a Large ITR through NSF
  – $13.7M + $2M matching

• International partners are responsible for their own funding

• Aims of iVDGL
  – Establish a global Grid laboratory
  – Conduct DataGrid tests at scale
  – Promote interoperability
  – Promote testbeds for non-physics applications

Relationship to GriPhyN

• Significant overlap
  – Common management, personnel overlap
    • Roughly 80 people on each project, 120 total
  – Tight technical coordination
    • VDT
    • Outreach
    • Testbeds
  – Common External Advisory Committee

• Different focus - domain challenges
  – GriPhyN: 2/3 CS, 1/3 physics; IT research
  – iVDGL: 1/3 CS, 2/3 physics; testbed deployment and operation

Project Composition

• CS Research
  – U.S. iVDGL institutions
  – UK e-science programme
  – DataTAG
  – EU DataGrid

• Testbeds
  – ATLAS / CMS
  – LIGO
  – National Virtual Observatory
  – SDSS

iVDGL Institutions

• T2 / Software
  – U Florida (CMS)
  – Caltech (CMS, LIGO)
  – UC San Diego (CMS, CS)
  – Indiana U (ATLAS, iGOC)
  – Boston U (ATLAS)
  – U Wisconsin, Milwaukee (LIGO)
  – Penn State (LIGO)
  – Johns Hopkins (SDSS, NVO)

• CS support
  – U Chicago (CS)
  – U Southern California (CS)
  – U Wisconsin, Madison (CS)

• T3 / Outreach
  – Salish Kootenai (Outreach, LIGO)
  – Hampton U (Outreach, ATLAS)
  – U Texas, Brownsville (Outreach, LIGO)

• T1 / Labs (not funded)
  – Fermilab (CMS, SDSS, NVO)
  – Brookhaven (ATLAS)
  – Argonne Lab (ATLAS, CS)

US-iVDGL Sites

[Map: US-iVDGL sites by tier (Tier 1, Tier 2, Tier 3), including Fermilab, BNL, Argonne, UF, Caltech, UCSD/SDSC, Wisconsin, Indiana, Boston U, J. Hopkins, PSU, Hampton, Brownsville, SKC, FIU, FSU, Arlington, Michigan, LBL, Oklahoma, Vanderbilt, and NCSA. Prospective partners: EU, CERN, Brazil, Australia, Korea, Japan.]

Component Projects

• iVDGL contains several core projects
  – iGOC: International Grid Operations Center
  – GLUE: Grid Laboratory Uniform Environment
  – WorldGrid: 2002 international demo
  – Grid3: 2003 deployment effort

iGOC

• International Grid Operations Center
  – iVDGL “headquarters”
  – Analogous to a Network Operations Center
  – Located at Indiana University

• Single point of contact for iVDGL operations

• Database of contact information

• Centralized information about storage, network, and compute resources

• Directory for monitoring services at iVDGL sites

GLUE

• Grid Laboratory Uniform Environment
  – A grid interface subset specification that permits applications to run on grids built from VDT and EDG sources

• An effort to ensure interoperability across numerous physics grid projects
  – GriPhyN, iVDGL, PPDG
  – EU DataGrid, DataTAG, CrossGrid, etc.

• The interoperability effort focuses on:
  – Software
  – Configuration
  – Documentation
  – Test suites

WorldGrid

• An effort at a worldwide DataGrid that is easy to deploy and administer
  – Middleware based on VDT
  – Chimera development
  – Scalability

• Demo at SC2002
  – United DataTAG and iVDGL

Resource Diagram

iVDGL Management

[Organization chart: US Project Directors and a US External Advisory Committee sit over a Project Coordination Group, which spans a U.S. piece (a US Project Steering Group plus Core Software, Facilities, Operations, Applications, and Outreach Teams) and an international piece. A GLUE Interoperability Team links to collaborating Grid projects (TeraGrid, EDG, DataTAG, LCG?, Asia) and to experiment grids (BTeV, D0, PDC, CMS HI, ALICE, Bio, Geo). GriPhyN liaison: Mike Wilde.]

Issues across projects

• Technical readiness

• Infrastructure readiness

• Collaboration readiness

• Common ground

• Coupling of tasks

• Incentives

Technical readiness

• Very high

• Physics and CS are both very high on the adoption curve, generally

• Long history of infrastructure development to support national and global experiments

Infrastructure readiness

• Also quite high

• Not all of the pieces are in place to meet demand

• The expertise to build and maintain the necessary infrastructure exists within these communities
  – The community is inventing the infrastructure

• Real understanding within the project that interoperability and standards are part of infrastructure

Collaboration readiness

• Again, quite high

• Physicists have a long history of large scale collaboration

• CS collaborations are built on old relationships with long-time collaborators

Common ground

• Perhaps a bit too high

• What you can do with a physics background:
  – Win the ACM Turing Award
  – Co-invent the World Wide Web
  – Direct the development of the Abilene backbone

• Because the application community has a strong understanding of the required work and its technical aspects, there is some friction over how the work is divided
  – There is a history of physicists building computational tools, e.g. ROOT

Coupling of tasks

• Tasks decompose into subtasks that are somewhat tightly coupled
  – Locate tightly coupled tasks at individual sites

Incentives

• Both groups are well motivated, but for different reasons

• CS is engaged in cutting-edge research across a large range of activities
  – Funded for deployment as well as development

• Physics is structurally committed to global collaborations

Some successes

• Lessons in infrastructure development

• Outreach and engagement

• Community buy-in / investment

• Achieving the CS research goals for Virtual Data and Grid execution

Infrastructure Dev

• Looking at the history of the Grid (electrical, not computational)
  – Long phases:
    • Invention
    • Initial production use
    • Adaptation
    • Standardization / regulation
  – Geographically bounded dominant designs (e.g. 220 V vs. 110 V)

Infrastructure Dev

• We don’t see this with GriPhyN / iVDGL
  – Projects concurrent, not consecutive
  – Pipeline approach to phases of infrastructure development
  – Real efforts at cooperation with other DataGrid communities

• Why?
  – Deep understanding at high levels of the project that building it alone is not enough
  – Directive and funding from NSF to do deployment

Outreach

• The GriPhyN / iVDGL community is extremely active in outreach to other projects and communities
  – Evangelizing virtual data
  – Distributed tools

• This is a huge win for building CI that others can use

Community buy-in

• Together, these projects are funded at nearly $30M over 6 years

• This does not represent the total investment that was needed to make this work
  – Leveraged FTEs
  – Unfunded testbed sites
  – International partners
  – Lots of collaboration with PPDG; starting some with Alliance Expeditions, etc.

• This kind of community commitment is necessary for a project of this size to succeed

Challenges

• Staying relevant

• Building infrastructure with term limited funding

Staying Relevant (1)

• The application communities are fast-paced, high-powered groups of people
  – There is a real danger of those communities developing tools that satisfice while they wait for tools that are optimal and fit into a greater cyberinfrastructure
  – Each experiment ideally wants tools perfectly tailored to its needs

• Maintaining user engagement and meeting the needs of each community is critical, but difficult

Staying Relevant (2)

• In addition to staying relevant to the experiments, GriPhyN must also be relevant to the greater scientific community
  – To CS researchers
  – To similarly data-intensive projects

• Easy-to-understand code, concepts, APIs, etc.

• How do you accommodate both a focused client community and the broader scientific community?

• Common challenge across many CI initiatives

Limited Term Investment

• These projects are both funded under the NSF ITR mechanism, which carries a 5-year limit

• Would you buy your telephone service from a company that was going to shut down after 5 years?

• Challenge to find a sustainable support mechanism for CI