Virtual Data Toolkit
R. Cavanaugh
GriPhyN Analysis Workshop
Caltech, June 2003
06.23.2003 Caltech Analysis Workshop 2
Very Early GriPhyN Data Grid Architecture
Figure: an Application sits atop a Planner and an Executor; Catalog Services, Info Services, Monitoring, Replica Management, a Reliable Transfer Service, and Policy/Security mediate access to Compute and Storage Resources. The components are implemented with DAGMan, Condor-G, GRAM, GridFTP, SRM, GSI, CAS, MDS, MCAT, the GriPhyN catalogs, GDMP, and Globus; a marker in the original figure indicates where an initial solution is operational.
Currently Evolved GriPhyN Picture
Picture Taken from Mike Wilde
Current VDT Emphasis
Current reality
– Easy grid construction
  > Strikes a balance between flexibility and “easibility”
  > Purposefully errs (just a little bit) on the side of “easibility”
– Long running, high-throughput, file-based computing
– Abstract description of complex workflows
– Virtual Data Request Planning
– Partial provenance tracking of workflows

Future directions (current research) include:
– Policy based scheduling
  > With notions of Quality of Service (advance reservation of resources, etc.)
– Dataset based (arbitrary type structures)
– Full provenance tracking of workflows
– Several others…
Current VDT Flavors
Client
– Globus Toolkit 2
  > GSI
  > globusrun
  > GridFTP Client
– CA signing policies for DOE and EDG
– Condor-G 6.5.1 / DAGMan
– RLS 1.1.8 Client
– MonALISA Client (soon)

Chimera 1.0.3 SDK
– Globus
– ClassAds
– RLS 1.1.8 Client
– Netlogger 2.0.13

Server
– Globus Toolkit 2.2.4
  > GSI
  > Gatekeeper
  > job-managers and GASS Cache
  > MDS
  > GridFTP Server
– MyProxy
– CA signing policies for DOE and EDG
– EDG Certificate Revocation List
– Fault Tolerant Shell
– GLUE Schema
– mkgridmap
– Condor 6.5.1 / DAGMan
– RLS 1.1.8 Server
– MonALISA Server (soon)
Chimera Virtual Data System
Virtual Data Language
– textual
– XML

Virtual Data Catalog
– MySQL or PostgreSQL based
– File based version available
Virtual Data Language
TR CMKIN( out a2, in a1 )
{
argument file = ${a1};
argument file = ${a2};
}
TR CMSIM( out a2, in a1 )
{
argument file = ${a1};
argument file = ${a2};
}
DV x1->CMKIN( a2=@{out:file2}, a1=@{in:file1});
DV x2->CMSIM( a2=@{out:file3}, a1=@{in:file2});
Figure: derivation chain file1 → x1 (CMKIN) → file2 → x2 (CMSIM) → file3. Picture taken from Mike Wilde.
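The two DV statements above imply a dependency graph: x2 consumes the file that x1 produces. A toy sketch (not the real Chimera implementation) of how that edge follows from the in/out declarations:

```python
# Toy model of the VDL example above: each derivation consumes its
# "in" files and produces its "out" files, so a shared file chains
# two derivations together.

derivations = {
    # name: (inputs, outputs), taken from the DV statements
    "x1": ({"file1"}, {"file2"}),   # DV x1 -> CMKIN(...)
    "x2": ({"file2"}, {"file3"}),   # DV x2 -> CMSIM(...)
}

def dependency_edges(dvs):
    """Return (producer, consumer) pairs: a derivation depends on
    another if one of its inputs is among the other's outputs."""
    edges = []
    for consumer, (ins, _) in dvs.items():
        for producer, (_, outs) in dvs.items():
            if producer != consumer and ins & outs:
                edges.append((producer, consumer))
    return edges

print(dependency_edges(derivations))   # [('x1', 'x2')]
```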
Virtual Data Request Planning
Abstract Planner
– Graph traversal of (virtual) data dependencies
– Generates the graph with maximal data dependencies
– Somewhat analogous to Build style

Concrete (Pegasus) Planner
– Prunes execution steps for which data already exists (RLS lookup)
– Binds all execution steps in the graph to a site
– Adds “housekeeping” steps
  > Create environment, stage-in data, stage-out data, publish data, clean-up environment, etc.
– Generates a graph with minimal execution steps
– Somewhat analogous to Make style
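The make-style pruning step can be sketched as follows (invented names, not Pegasus code): drop any job whose outputs are already registered in the replica catalog, then re-add any producer whose outputs are still needed but unregistered.

```python
# Illustrative sketch of make-style pruning in a concrete planner:
# jobs whose outputs all exist in the catalog need not run again.

jobs = {
    "x1": {"inputs": {"file1"}, "outputs": {"file2"}},
    "x2": {"inputs": {"file2"}, "outputs": {"file3"}},
}

def prune(jobs, registered):
    """Return the names of jobs that must actually execute."""
    # keep jobs with at least one unregistered output
    needed = {n for n, j in jobs.items()
              if not j["outputs"] <= registered}
    # re-add producers of any needed input that is not in the catalog
    changed = True
    while changed:
        changed = False
        required = set().union(*(jobs[n]["inputs"] for n in needed)) \
            if needed else set()
        for name, j in jobs.items():
            if name not in needed and j["outputs"] & (required - registered):
                needed.add(name)
                changed = True
    return needed

# file2 is already registered, so only x2 must run:
print(prune(jobs, registered={"file1", "file2"}))  # {'x2'}
```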
Chimera Virtual Data System: Mapping Abstract Workflows onto Concrete Environments
Abstract DAGs (virtual workflow)
– Resource locations unspecified
– File names are logical
– Data destinations unspecified
– build style

Concrete DAGs (stuff for submission)
– Resource locations determined
– Physical file names specified
– Data delivered to and returned from physical locations
– make style
Figure: the Chimera pipeline — VDL definitions enter the Virtual Data Catalog (VDC); the Abstract Planner emits a logical, XML DAX; the Concrete Planner consults the RLS and emits a physical, XML DAG, which DAGMan executes.
In general there is a full range of planning steps between abstract workflows and concrete workflows
Picture Taken from Mike Wilde
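Part of concretizing a DAG is turning logical file names into physical replicas. A toy stand-in for that replica-catalog lookup (the real RLS has its own client API; these names and URLs are invented for illustration):

```python
# Toy replica-catalog lookup: map each logical file name (LFN) to one
# physical replica; names with no replica must be produced upstream.

replica_catalog = {
    "file1": ["gsiftp://se.site-a.example/store/file1"],
    "file2": ["gsiftp://se.site-b.example/store/file2"],
}

def concretize(logical_names, catalog):
    """Pick one physical replica per LFN; None means the file does
    not exist yet and a job must generate it."""
    plan = {}
    for lfn in logical_names:
        replicas = catalog.get(lfn, [])
        plan[lfn] = replicas[0] if replicas else None
    return plan

print(concretize(["file1", "file2", "file3"], replica_catalog))
```

A real planner would also choose among multiple replicas (e.g. by site proximity) rather than taking the first.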
Figure: a tree of virtual datasets, each node keyed by its parameters —
mass=200
mass=200, decay=WW
mass=200, decay=ZZ
mass=200, decay=bb
mass=200, plot=1
mass=200, event=8
mass=200, decay=WW, stability=1
mass=200, decay=WW, stability=3
mass=200, decay=WW, plot=1
mass=200, decay=WW, event=8
mass=200, decay=WW, stability=1, event=8
mass=200, decay=WW, stability=1, plot=1

A virtual space of simulated data is generated for future use by scientists...
Supercomputing 2002
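The virtual-data idea on this slide amounts to memoizing derivations by their parameter settings. A minimal sketch (toy code, not Chimera): a request either finds an existing derivation in the catalog or records a new branch.

```python
# Each virtual dataset is identified by its parameter settings; a
# request derives it on first use and serves it from the catalog after.

catalog = {}   # parameter key -> dataset (here just a description)

def key(**params):
    # canonical, order-independent key for a parameter combination
    return tuple(sorted(params.items()))

def request(**params):
    """Return the dataset for these parameters, deriving (and
    memoizing) it on first request."""
    k = key(**params)
    if k not in catalog:
        catalog[k] = f"derived{k}"   # stand-in for running the workflow
    return catalog[k]

request(mass=200, decay="WW")
request(mass=200, decay="WW", stability=1)   # a new derived branch
request(mass=200, decay="WW")                # served from the catalog
print(len(catalog))   # 2
```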
Figure: the same tree of virtual datasets, now extended with a new derived branch —
mass=200, decay=WW, stability=1, LowPt=20, HighPt=10000

Scientists may add new derived data branches...
Supercomputing 2002
Example CMS Data/Workflow

Figure: a CMS production chain — Generator → Simulator → Formator → Digitiser (fed by a Calib. DB), with writeESD, writeAOD, and writeTAG steps storing products in POOL, consumed by Analysis Scripts.
Figure: the same CMS data/workflow chain, annotated with the groups that own each stage — Online Teams, (Re)processing Team, MC Production Team, and Physics Groups.

Data/workflow is a collaborative endeavour!
A “Concurrent Analysis Versioning System”:
Complex Data Flow and Data Provenance in HEP
Figure: real data and simulated data each flow Raw → ESD → AOD → TAG → plots, tables, fits, which then feed comparisons of plots, tables, and fits.
– Family History of a Data Analysis
– Collaborative Analysis Development Environment
– “Check-point” a Data Analysis
– Analysis Development Environment (like CVS)
– Audit a Data Analysis
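The check-point-and-audit idea can be sketched with a minimal provenance record (invented names, not CAVES or Chimera code): every derived product stores the transformation and inputs that produced it, so the family history of an analysis can be replayed or audited like a CVS history.

```python
import hashlib
import json

provenance = {}   # product name -> derivation record

def record(product, transformation, inputs, params):
    """Check-point a derivation: who made this product, from what."""
    entry = {"transformation": transformation,
             "inputs": list(inputs), "params": params}
    entry["id"] = hashlib.sha1(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()[:12]
    provenance[product] = entry
    return entry["id"]

def audit(product):
    """Walk the full family history of a product."""
    seen, stack = [], [product]
    while stack:
        p = stack.pop()
        if p in provenance:
            seen.append(p)
            stack.extend(provenance[p]["inputs"])
    return seen

record("esd", "reconstruct", ["raw"], {"version": 3})
record("aod", "summarize", ["esd"], {})
print(audit("aod"))   # ['aod', 'esd']
```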
Current Prototype GriPhyN “Architecture” (Picture)
Picture Taken from Mike Wilde
Post-talk: My wandering mind…
Typical VDT Configuration
Single public head-node (gatekeeper)
– VDT-server installed

Many private worker-nodes
– Local scheduler software installed
– No grid-middleware installed

Shared file system (e.g. NFS)
– User area shared between head-node and worker-nodes
– One or many RAID systems, typically shared
Figure: default middleware configuration from the Virtual Data Toolkit — on the submit host, Chimera feeds DAGMan and Condor-G (via gahp_server); on the remote host, the gatekeeper hands jobs to a local scheduler (Condor, PBS, etc.), which drives the compute machines.
EDG Configuration (for comparison)

CPU separate from storage
– CE: single gatekeeper for access to cluster
– SE: single gatekeeper for access to storage

Many public worker-nodes (at least NAT)
– Local scheduler installed (LSF or PBS)
– Each worker-node runs a GridFTP client

No assumed shared file system
– Data access is accomplished via globus-url-copy to local disk on the worker-node
Why Care?
Data analyses would benefit from being fabric independent!

But… the devil is (still) in the details!
– Assumptions in job descriptions/requirements currently lead to direct fabric-level consequences, and vice versa.

Are existing middleware configurations sufficient for data analysis (“scheduled” and “interactive”)?
– Really need input from groups like this one!
– What kind of fabric layer is necessary for “interactive” data analysis using PROOF or JAS?

Does the VDT need multiple configuration flavors?
– Production, batch oriented (current default)
– Analysis, interactive oriented