Knowledge Environments for Science: Representative Projects Ian Foster Argonne National Laboratory...

Knowledge Environments for Science:

Representative Projects

Ian Foster

Argonne National Laboratory

University of Chicago

http://www.mcs.anl.gov/~foster

Symposium on Knowledge Environments for Science, November 26, 2002

foster@mcs.anl.gov ARGONNE CHICAGO

Comments Informed By Participation in …

E-science/Grid application projects, e.g.

– Earth System Grid: environmental science

– GriPhyN, PPDG, EU DataGrid: physics

– NEESgrid: earthquake engineering

Grid technology R&D projects

– Globus Project and the Globus Toolkit

– NSF Middleware Initiative

Grid infrastructure deployment projects

– Alliance, TeraGrid, DOE Sci. Grid, NASA IPG

– Intl. Virtual Data Grid Laboratory (iVDGL)

Global Grid Forum: community & standards

Data Grids for High Energy Physics

Enable community to access & analyze petabytes of data

Coordinated intl projects

– GriPhyN, PPDG, iVDGL, EU DataGrid, DataTAG

Challenging computer science research

Real deployments and applications

Defining analysis architecture for LHC

NEESgrid Earthquake Engineering Collaboratory

[Figure: Network for Earthquake Engineering Simulation. Diagram labels: Field Equipment; Laboratory Equipment (Faculty and Students); Remote Users; Remote Users (K-12 Faculty and Students); (Faculty, Students, Practitioners); High-Performance Network(s); Instrumented Structures and Sites; Leading Edge Computation; Curated Data Repository; Global Connections (fully developed FY 2005 – FY 2014); U. Nevada Reno]

www.neesgrid.org

Communities Need Not be Large: E.g., Astronomical Data Analysis

Size distribution of galaxy clusters?

[Figure: Galaxy cluster size distribution. Number of Clusters (logarithmic axis, ticks 1 to 100000) plotted against Number of Galaxies (logarithmic axis, ticks 1 to 100)]

Chimera Virtual Data System + GriPhyN Virtual Data Toolkit + iVDGL Data Grid (many CPUs)

www.griphyn.org/chimera
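
The "virtual data" idea named on this slide can be illustrated with a small sketch. This is not Chimera's interface or the GriPhyN Virtual Data Toolkit; the class `VirtualDataCatalog`, the function `size_distribution`, and the toy galaxy-to-cluster assignments are all hypothetical, invented here for illustration. The sketch only assumes the general idea that a derived product (here, a cluster size distribution) is described by the program and inputs that produce it, and is computed when first requested.

```python
from collections import Counter


class VirtualDataCatalog:
    """Hypothetical in-memory catalog (illustration only, not a Chimera API)."""

    def __init__(self):
        self.derivations = {}   # dataset name -> (transformation, input names)
        self.materialized = {}  # dataset name -> value already computed or stored

    def add_data(self, name, value):
        # Register a dataset that already exists.
        self.materialized[name] = value

    def add_derivation(self, name, transform, inputs):
        # Record how a dataset could be produced, without producing it yet.
        self.derivations[name] = (transform, list(inputs))

    def get(self, name):
        # Compute the dataset on first request, then reuse the stored result.
        if name not in self.materialized:
            transform, inputs = self.derivations[name]
            args = [self.get(input_name) for input_name in inputs]
            self.materialized[name] = transform(*args)
        return self.materialized[name]


def size_distribution(cluster_of_galaxy):
    # Count galaxies per cluster, then count clusters per size.
    galaxies_per_cluster = Counter(cluster_of_galaxy.values())
    return dict(Counter(galaxies_per_cluster.values()))


catalog = VirtualDataCatalog()
# Toy input: galaxy id -> cluster id (made-up values).
catalog.add_data(
    "galaxy_clusters",
    {"g1": "A", "g2": "A", "g3": "B", "g4": "C", "g5": "C", "g6": "C"},
)
catalog.add_derivation("size_distribution", size_distribution, ["galaxy_clusters"])

# Nothing is computed until the derived dataset is requested.
distribution = catalog.get("size_distribution")
```

In this toy setup the distribution maps cluster size to number of clusters. Keeping the derivation alongside the data is what lets other members of a community see how a result was produced and regenerate it.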

A “Knowledge Environment” is a System For …

“Interpersonal collaboration”

“Integrating data”

“Accessing specialized devices”

“Enabling large-scale computation”

“Sharing information”

“Accessing services”

“Large communities”

“Small teams”

It’s All of the Above: Enabling “Post-Internet Science”

Pre-Internet science

– Theorize &/or experiment, in small teams

Post-Internet science

– Construct and mine very large databases

– Develop computer simulations & analyses

– Access specialized devices remotely

– Exchange information within distributed multidisciplinary teams

Need to manage dynamic, distributed infrastructures, services, and applications

Enabling Infrastructure for Knowledge Environments for Science

(aka “The Grid”)

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

Grid Infrastructure

What?

– Broadly deployed services in support of fundamental collaborative activities

– Services, software, and policies enabling on-demand access to critical resources

Open standards, software, infrastructure

– Open Grid Services Architecture (GGF)

– Globus Toolkit (Globus Project: ANL, USC/ISI)

– NMI, iVDGL, TeraGrid

Grid infrastructure R&D&ops is itself a distributed & international community

Lessons Learned (1)

Importance of standard infrastructure

– Software: facilitate construction of systems, and construction of interoperable systems

– Services: authentication, discovery, …, …

– Needs investment in research, development, deployment, operations, training

Building & operating infrastructure is hard

– Challenging technical & policy issues

– Requisite skills not always available

– Can challenge existing organizations

Lessons Learned (2)

Importance of community engagement

– “Maine and Texas must have something to communicate”

– Big science traditions seem to help

– Discipline champions certainly help

– Effective projects are often true collaborations between disciplines and computer scientists

Importance of international cooperation

– Science is international, so is expertise

– Challenging, requires incentives & support

Lessons Learned (3)

Collaborative science/Grids are a wonderful source of computer science problems

– E.g., “virtual data grid” (GriPhyN): data, programs, derivations as community resources

– E.g., security within virtual organizations

Work in this space can be of intense interest to industry

– E.g., current rapid uptake of Grid technologies
