Overview of GT4 Data Services
Globus Data Services Talk Outline
Summarize capabilities and plans for data services in the Globus Toolkit Version 4.0.2
Extensible I/O (XIO) system: a flexible framework for I/O
GridFTP: fast, secure data transport
The Reliable File Transfer Service (RFT): data movement services for GT4
The Replica Location Service (RLS): a distributed registry that records locations of data copies
The Data Replication Service (DRS): integrates RFT and RLS to replicate and register files
The eXtensible Input / Output (XIO) System
GridFTP
The Reliable File Transfer Service (RFT)
Bill Allcock, ANL
Technology Drivers
Internet revolution: 100M+ hosts; collaboration & sharing the norm
Universal Moore's law: ×10³ every 10 years; sensors as well as computers
Petascale data tsunami: the gating step is analysis, & our old infrastructure?
[Figure: growth in sequenced genomes: 114 complete, 735 in progress]
Slide courtesy of Ian Foster
Extensible I/O (XIO) system
Provides a framework that implements a Read/Write/Open/Close abstraction
Drivers are written that implement the functionality (file, TCP, UDP, GSI, etc.)
Different functionality is achieved by building protocol stacks
GridFTP drivers will allow third-party applications to easily access files stored under a GridFTP server
Other drivers could be written to allow access to other data stores
Changing drivers requires minimal change to the application code
Globus XIO Framework
Moves the data from user to driver stack.
Manages the interactions between drivers.
Assists in the creation of drivers.
Internal API for passing operations down the stack
[Figure: Globus XIO architecture; the user API sits on top of the framework, which drives a driver stack of transform drivers layered over a single transport driver]
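A minimal Python sketch of the driver-stack idea (conceptual only, not the Globus XIO C API): every driver exposes the same open/write/read/close abstraction and wraps the driver below it, so an application can swap protocol stacks with little or no change to its code.

```python
class TransportDriver:
    """Bottom of the stack: stands in for a real transport (e.g., TCP)."""
    def open(self, contact):  print(f"transport: connect to {contact}")
    def write(self, data):    print(f"transport: send {len(data)} bytes"); return len(data)
    def read(self, n):        return b"x" * n
    def close(self):          print("transport: closed")

class TransformDriver:
    """A transform driver (e.g., compression or GSI) wraps the driver below it."""
    def __init__(self, below): self.below = below
    def open(self, contact):   self.below.open(contact)
    def write(self, data):     return self.below.write(data)   # transform data here
    def read(self, n):         return self.below.read(n)        # untransform data here
    def close(self):           self.below.close()

# Build a protocol stack: two transforms over one transport, as in the figure above.
stack = TransformDriver(TransformDriver(TransportDriver()))
stack.open("host.example.org:2811")   # hypothetical contact string
stack.write(b"hello")
stack.close()
```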
GridFTP
A secure, robust, fast, efficient, standards-based, widely accepted data transfer protocol
GridFTP Protocol: defined through the Global Grid Forum (now Open Grid Forum); multiple independent implementations can interoperate
The Globus Toolkit supplies a reference implementation: a server, client tools (globus-url-copy), and development libraries
GridFTP Protocol
The FTP protocol is defined by several IETF RFCs
Start with the most commonly used subset: standard FTP get/put operations and third-party transfer
Implement standard but often unused features: GSS binding, extended directory listing, simple restart
Extend in various ways, while preserving interoperability with existing servers: parallel data channels, striped transfers, partial file transfers, automatic & manual TCP buffer setting, progress monitoring, extended restart
GridFTP Protocol (cont.)
Existing standards:
RFC 959: File Transfer Protocol
RFC 2228: FTP Security Extensions
RFC 2389: Feature Negotiation for the File Transfer Protocol
Draft: FTP Extensions
GridFTP: Protocol Extensions to FTP for the Grid, Grid Forum Recommendation GFD.20
http://www.ggf.org/documents/GWD-R/GFD-R.020.pdf
GT4 GridFTP Implementation
100% Globus code; no licensing issues
Striping support is provided in 4.0
IPv6 support included
Based on XIO: extremely modular to allow integration with a variety of data sources (files, mass stores, etc.)
Storage Resource Broker (SRB) DSI: allows use of GridFTP to access data stored in SRB systems
High Performance Storage System (HPSS) DSI: provides GridFTP access to hierarchical mass storage systems (HPSS 6.2) that include tape storage
[Figure: GridFTP server architecture; clients (globus-url-copy, the C library, RFT third-party requests) talk to the GridFTP server, which separates control and data channels and provides striping, fault tolerance, metrics collection, and access control; XIO drivers (TCP, UDT over UDP, parallel streams, GSI, SSH) handle network I/O, and Data Storage Interfaces (DSIs: POSIX, SRB, HPSS, NEST) connect to file systems]
www.gridftp.org
Control and Data Channels
[Figure: three GridFTP deployment configurations, each showing the control and data channels: a typical installation, separate control and data processes, and a striped server]
GridFTP (and FTP) use at least two separate socket connections: a control channel for carrying the commands and responses, and a data channel for actually moving the data
The control channel and data channel can optionally be completely separate processes
Striped Server Mode
Multiple nodes work together on a single file and act as a single GridFTP server
An underlying parallel file system allows all nodes to see the same file system and must deliver good performance (usually the limiting factor in transfer speed); i.e., NFS does not cut it
Each node then moves (reads or writes) only the pieces of the file that it is responsible for
This allows multiple levels of parallelism: CPU, bus, NIC, disk, etc.
Critical if you want to achieve better than 1 Gb/s without breaking the bank
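A minimal Python sketch of the partitioning idea behind striped mode, under the simplifying assumption of fixed-size, round-robin block assignment (the actual GridFTP striping layout is configurable and not specified here): each node computes which byte ranges of the file it is responsible for and moves only those.

```python
def stripe_ranges(file_size, node_index, num_nodes, block_size=1 << 20):
    """Yield (offset, length) byte ranges this node moves, round-robin by block."""
    offset = node_index * block_size
    while offset < file_size:
        yield offset, min(block_size, file_size - offset)
        offset += num_nodes * block_size

# Example: a 10 MB file striped across 4 nodes; node 1 moves blocks 1, 5, 9, ...
for off, length in stripe_ranges(10 * (1 << 20), node_index=1, num_nodes=4):
    print(f"node 1 moves {length} bytes at offset {off}")
```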
TeraGrid Striping Results
Ran a varying number of stripes; ran both memory-to-memory and disk-to-disk transfers
Memory-to-memory gave extremely good (nearly 1:1) linear scalability; we achieved 27 Gb/s on a 30 Gb/s link (90% utilization) with 32 nodes
Disk-to-disk we were limited by the storage system, but still achieved 17.5 Gb/s
[Figure: memory-to-memory bandwidth vs. degree of striping over a 30 Gigabit/s network (San Diego to Urbana); bandwidth (Mbps) vs. degree of striping (0 to 70), with curves for 1, 2, 4, 8, 16, and 32 streams per stripe; gridlines at 10, 20, and 30 Gb/s]
[Figure: disk-to-disk bandwidth vs. degree of striping over a 30 Gigabit/s network (San Diego to Urbana); bandwidth (Mbps) vs. degree of striping (0 to 70), with curves for 1, 2, 4, 8, 16, and 32 streams per stripe; gridlines at 10 and 20 Gb/s]
Scalability Results
[Figure: GridFTP server performance while 1800 clients each upload a 10 MB file to ned-6.isi.edu:/dev/null; the plot tracks the number of concurrent machines, response time (seconds), throughput (MB/s), CPU %, load, and memory used over time]
Lots of Small Files (LOSF)
Pipelining: many transfer requests outstanding at once; the client sends the second request before the first completes, so request latency is hidden in the data transfer time
Cached data channel connections: reuse established data channels (Mode E), with no additional TCP or GSI connect overhead
[Figure: “Lots of Small Files” (LOSF) optimization; sending 1 GB partitioned into equal-sized files over a 60 ms RTT, 1 Gbit/s WAN with a 16 MB TCP buffer; throughput (Mbit/s) vs. file size (KB) / number of files]
John Bresnahan et al., Argonne
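A toy Python model (illustrative only, not a GridFTP measurement) of why pipelining and cached data channels matter for lots of small files, using the figure's parameters of a 60 ms RTT and a 1 Gbit/s WAN and assuming one round trip of per-file request overhead when requests are not pipelined:

```python
RTT = 0.060            # seconds, wide-area round-trip time
LINK = 1e9 / 8         # 1 Gbit/s link, in bytes per second
TOTAL = 1 * 1024**3    # send 1 GB in total

def transfer_time(file_size, pipelined):
    n_files = TOTAL // file_size
    data_time = TOTAL / LINK
    # Without pipelining, every file waits a full RTT before its data flows;
    # with pipelining, requests overlap data transfer, so only one RTT is exposed.
    return data_time + (RTT if pipelined else n_files * RTT)

for size_kb in (10, 100, 1000):
    size = size_kb * 1024
    print(f"{size_kb:5d} KB files: serial {transfer_time(size, False):7.1f}s, "
          f"pipelined {transfer_time(size, True):6.1f}s")
```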
Reliable File Transfer Service (RFT)
A service that accepts requests for third-party file transfers
Maintains state in a database about ongoing transfers and recovers from RFT service failures; storing state in a database increases reliability
Service interface: the client can submit the transfer request and then disconnect and go away
Similar to a job scheduler for transfer jobs
Two ways to check status: subscribe for notifications, or poll for status (which can also catch missed notifications)
Reliable File Transfer (cont.)
RFT accepts a SOAP description of the desired transfer and writes it to a database
It then uses the Java GridFTP client library to initiate third-party transfers on behalf of the requestor
Restart markers are stored in the database to allow restart in the event of an RFT failure
Supports concurrency, i.e., multiple files in transit at the same time, which gives good performance on many small files
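A conceptual Python sketch of that usage pattern (a toy stand-in, not the GT4 RFT client API; the class and method names are hypothetical): the client submits a request, the service persists it in a database so it can survive failures, and the client can disconnect and poll later.

```python
import sqlite3

class ToyRFT:
    """Toy stand-in for RFT: persists transfer state in a database."""
    def __init__(self, db_path="rft_state.db"):
        self.db = sqlite3.connect(db_path)
        self.db.execute("CREATE TABLE IF NOT EXISTS transfers "
                        "(id INTEGER PRIMARY KEY, src TEXT, dst TEXT, status TEXT)")

    def submit(self, src, dst):
        cur = self.db.execute("INSERT INTO transfers (src, dst, status) VALUES (?, ?, ?)",
                              (src, dst, "pending"))
        self.db.commit()
        return cur.lastrowid          # client can now disconnect and come back later

    def poll(self, transfer_id):
        row = self.db.execute("SELECT status FROM transfers WHERE id = ?",
                              (transfer_id,)).fetchone()
        return row[0] if row else None

rft = ToyRFT()
tid = rft.submit("gsiftp://siteA.example.org/data/f1", "gsiftp://siteB.example.org/data/f1")
print("status:", rft.poll(tid))   # "pending"; a real transfer engine would update this
```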
Reliable File Transfer: Third Party Transfer
[Figure: the RFT client exchanges SOAP messages (and optional notifications) with the RFT service, which drives two GridFTP servers on the requestor's behalf; each server runs a protocol interpreter, master and slave DSIs connected by IPC links, and data channels that move the data directly between the two sites]
Fire-and-forget transfer; Web services interface; many files & directories; integrated failure recovery; has transferred 900K files
Data Transfer Comparison
[Figure: data transfer comparison; with globus-url-copy the client itself holds the control channels to both GridFTP servers for the duration of the transfer, while with RFT the client only exchanges SOAP messages (and optional notifications) with the RFT service, which manages the control and data channels on the client's behalf]
Globus Replica Location Service
Replica Management in Grids
Data-intensive applications produce terabytes or petabytes of data stored as millions of data objects
Replicate data at multiple locations for reasons of:
Fault tolerance: avoid single points of failure
Performance: avoid wide-area data transfer latencies; achieve load balancing
Need tools for registering the existence of data items and discovering them, and for replicating data items to new locations
A Replica Location Service
A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows replica discovery
Must perform and scale well: support hundreds of millions of objects and hundreds of clients
E.g., the LIGO (Laser Interferometer Gravitational Wave Observatory) project runs RLS servers at 8 sites, maintaining associations between 3 million logical file names and 30 million physical file locations
RLS is one component of a replica management system; other components include consistency services, replica selection services, reliable data transfer, etc.
A Replica Location Service
A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows discovery of replicas
RLS maintains mappings between logical identifiers and target names
An RLS framework was designed in a collaboration between the Globus project and the DataGrid project (SC2002 paper)
[Figure: RLS framework; multiple Local Replica Catalogs (LRCs) send soft state updates to Replica Location Index (RLI) nodes]
RLS Framework
• Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings
• Replica Location Index (RLI) nodes aggregate information about one or more LRCs
• LRCs use soft state update mechanisms to inform RLIs about their state: relaxed consistency of index
• Optional compression of state updates reduces communication, CPU and storage overheads
• Membership service registers participating LRCs and RLIs and deals with changes in membership
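A conceptual Python sketch of the LRC/RLI division of labor (the data structures are illustrative, not the RLS schema): each LRC holds authoritative logical-to-target mappings and periodically pushes a soft-state summary of its logical names to an RLI, which only answers "which LRCs know this logical name".

```python
class LocalReplicaCatalog:
    def __init__(self, name):
        self.name = name
        self.mappings = {}                      # logical name -> set of target URLs

    def add(self, lfn, pfn):
        self.mappings.setdefault(lfn, set()).add(pfn)

    def soft_state(self):
        return set(self.mappings)               # full LFN list (uncompressed update)

class ReplicaLocationIndex:
    def __init__(self):
        self.index = {}                         # logical name -> set of LRC names

    def receive_update(self, lrc_name, lfns):
        # Soft state: replace whatever we previously believed about this LRC.
        self.index = {k: v - {lrc_name} for k, v in self.index.items()}
        for lfn in lfns:
            self.index.setdefault(lfn, set()).add(lrc_name)

    def query(self, lfn):
        return self.index.get(lfn, set())       # which LRCs to ask for actual targets

lrc = LocalReplicaCatalog("lrc.caltech")
lrc.add("H-R1-0001.gwf", "gsiftp://caltech.example.org/data/H-R1-0001.gwf")
rli = ReplicaLocationIndex()
rli.receive_update(lrc.name, lrc.soft_state())
print(rli.query("H-R1-0001.gwf"))              # {'lrc.caltech'}
```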
Replica Location Service In Context
[Figure: the Replica Location Service in context; a layered data management architecture that also includes GridFTP, a Reliable Data Transfer Service, a Metadata Service, a Reliable Replication Service, and Replica Consistency Management Services]
The Replica Location Service is one component in a layered data management architecture
Provides a simple, distributed registry of mappings; consistency management is provided by higher-level services
Components of RLS Implementation
Common server implementation for LRC and RLI
Front-end server: multi-threaded, written in C, supports GSI authentication using X.509 certificates
Back-end server: MySQL, PostgreSQL, or Oracle relational database, or an embedded SQLite DB
Client APIs: C, Java, Python
Client command-line tool
[Figure: RLS implementation; clients connect to the LRC/RLI server, which talks through ODBC (libiodbc) and MyODBC to a MySQL server and its database]
RLS Implementation Features
Two types of soft state updates from LRCs to RLIs: a complete list of the logical names registered in the LRC, or compressed updates (Bloom filter summaries of the LRC)
Immediate mode: incremental updates
User-defined attributes may be associated with logical or target names
Partitioning (without Bloom filters): divide LRC soft state updates among RLI index nodes using pattern matching of logical names
Currently, static membership configuration only (no membership service)
Alternatives for Soft State Update Configuration
LFN list
Send the list of logical names stored on the LRC
Can do exact and wildcard searches on the RLI
Soft state updates get increasingly expensive as the number of LRC entries increases: space, network transfer time, CPU time on the RLI
E.g., with 1 million entries, it takes 20 minutes to update MySQL on a dual-processor 2 GHz machine (CPU-limited)
Bloom filters
Construct a summary of LRC state by hashing logical names, creating a bitmap
Compression: updates are much smaller and faster; supports a higher query rate
Small probability of false positives (lossy compression)
Lose the ability to do wildcard queries
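A minimal Python sketch of the Bloom filter summary idea; the bitmap size and the three hash functions below are illustrative choices, not the parameters RLS uses. The LRC hashes each logical name into a bitmap; the receiver can then test membership with a small false-positive probability but no wildcard support.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes=3):
        self.num_bits, self.num_hashes = num_bits, num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, name):
        for i in range(self.num_hashes):
            h = hashlib.md5(f"{i}:{name}".encode()).hexdigest()
            yield int(h, 16) % self.num_bits

    def add(self, name):
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, name):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(name))

# LRC side: summarize registered logical names; RLI side: membership tests only.
summary = BloomFilter(num_bits=10_000)
summary.add("H-R1-0001.gwf")
print(summary.might_contain("H-R1-0001.gwf"))   # True
print(summary.might_contain("H-R1-9999.gwf"))   # almost certainly False
```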
Immediate Mode for Soft State Updates
Immediate mode: send updates after 30 seconds (configurable) or after a fixed number (100 by default) of updates
Full updates are sent at a reduced rate
The tradeoff depends on the volatility of the data and the frequency of updates
Immediate mode updates the RLI quickly and reduces the period of inconsistency between LRC and RLI content
Immediate mode usually sends less data because full updates are less frequent
Usually advantageous; an exception would be the initial loading of a large database
Performance Testing
Extensive performance testing is reported in the HPDC 2004 paper
Performance of individual LRC (catalog) or RLI (index) servers: a client program submits operation requests to the server
Performance of soft state updates: client LRC catalogs send updates to index servers
Software versions: Replica Location Service 2.0.9, Globus Packaging Toolkit 2.2.5, libiODBC 3.0.5, MySQL 4.0.14, MyODBC (with MySQL) 3.51.06
Testing Environment
Local area network tests:
100 Megabit Ethernet
Clients (either the client program or LRCs) on a cluster of dual Pentium III 547 MHz workstations with 1.5 GB of memory running Red Hat Linux 9
Server: dual Intel Xeon 2.2 GHz with 1 GB of memory running Red Hat Linux 7.3
Wide area network tests (soft state updates):
LRC clients (Los Angeles): cluster nodes
RLI server (Chicago): dual Intel Xeon 2.2 GHz machine with 2 GB of memory running Red Hat Linux 7.3
LRC Operation Rates (MySQL Backend)
[Figure: operation rates for an LRC with 1 million entries in the MySQL back end, multiple clients, multiple threads per client, database flush disabled; operations per second vs. number of clients (1 to 10), showing query, add, and delete rates with 10 threads per client]
• Up to 100 total requesting threads
• Clients and server on LAN
• Query: request the target of a logical name
• Add: register a new <logical name, target> mapping
• Delete a mapping
Comparison of LRC to Native MySQL Performance
[Figure: operation rates for the native MySQL database with 1 million entries in the back end, multiple clients, multiple threads per client, database flush disabled; operations per second vs. number of clients (1 to 10), showing query, add, and delete rates with 10 threads per client]
LRC Overheads
Highest for queries: the LRC achieves 70-80% of native rates
Adds and deletes: ~90% of native performance for 1 client (10 threads)
Similar or better add and delete performance with 10 clients (100 threads)
Bulk Operation Performance
For user convenience, the server supports bulk operations, e.g., 1000 operations per request
Adds and deletes are combined to maintain an approximately constant DB size
For a small number of clients, bulk operations increase rates; e.g., 1 client (10 threads) performs 27% more queries and 7% more adds/deletes
[Figure: bulk vs. non-bulk operation rates, 1000 operations per request, 10 request threads per client; operation rates vs. number of clients (1 to 10) for bulk query, bulk add/delete, non-bulk query, non-bulk add, and non-bulk delete]
Uncompressed Soft State Updates
[Figure: average time for uncompressed LFN updates in the LAN to a single RLI as the size (10K, 100K, 1M entries) and number (1 to 8) of LRCs increase; log-scale update time in seconds]
• Uncompressed updates perform poorly when multiple LRCs update an RLI
• E.g., with 6 LRCs of 1 million entries each updating an RLI, the average update takes ~5102 seconds in the local area
• Limiting factor: the rate of updates to the RLI database
• Advisable to use incremental updates
Bloom Filter Compression
Construct a summary of each LRC’s state by hashing logical names, creating a bitmap
The RLI stores in memory one bitmap per LRC
Advantages: updates are much smaller and faster; supports a higher query rate, satisfied from memory rather than the database
Disadvantages: lose the ability to do wildcard queries, since logical names are not sent to the RLI; small probability of false positives (configurable); relaxed consistency model
Bloom Filter Performance: Single Wide Area Soft State Update (Los Angeles to Chicago)

LRC Database Size | Avg. time to send soft state update (seconds) | Avg. time for initial Bloom filter computation (seconds) | Size of Bloom filter (bits)
100,000 entries | Less than 1 | 2 | 1 million
1 million entries | 1.67 | 18.4 | 10 million
5 million entries | 6.8 | 91.6 | 50 million
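The bitmap sizes in the table correspond to roughly 10 bits per LRC entry. As a rough check using the standard Bloom filter analysis (and assuming a near-optimal number of hash functions, which these slides do not specify), a filter of $m$ bits holding $n$ names with $k$ hash functions has false-positive probability

$$p \approx \left(1 - e^{-kn/m}\right)^{k}, \qquad k_{\text{opt}} = \frac{m}{n}\ln 2 \approx 6.9 \;\Rightarrow\; p \approx 2^{-6.9} \approx 0.8\%,$$

so 10 bits per entry keeps false positives below about 1%.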
Scalability of Bloom Filter Updates
14 LRCs with 5 million mappings each send Bloom filter updates continuously in the wide area (unlikely in practice; represents a worst case)
Update times increase when 8 or more clients send updates
2 to 3 orders of magnitude better performance than uncompressed updates (e.g., 5102 seconds with 6 LRCs)
[Figure: average time to perform continuous Bloom filter updates from an increasing number of LRC clients (1 to 14); average client update time in seconds]
Bloom Filter Compression Supports Higher RLI Query Rates
[Figure: RLI Bloom filter query rate, with each Bloom filter holding 1 million mappings and multiple clients with 3 threads per client; average query rate vs. number of clients (1 to 10), with 1, 10, and 100 Bloom filters at the RLI]
• Uncompressed updates: about 3000 queries per second
• Higher rates with Bloom filter compression
• Scalability limit: significant overhead to check 100 bitmaps
• Practical deployments: fewer than 10 LRCs updating an RLI
RLS Performance Summary
Individual RLS servers perform well and scale up to millions of entries and one hundred requesting threads
Soft state updates of the distributed index scale well when using Bloom filter compression
Uncompressed updates slow as the size of the catalog grows; immediate mode is advisable
Current Work
Ongoing maintenance and improvements to RLS
RLS is a stable component: good performance and scalability, no major changes to existing interfaces
Recently added features:
WS-RLS: a WS-RF compatible web services interface to the existing RLS service
Embedded SQLite database for easier RLS deployment
Pure Java client implementation completed
Wide Area Data Replication for Scientific Collaborations
Ann Chervenak, Robert Schuler, Carl Kesselman (USC Information Sciences Institute)
Scott Koranda (Univa Corporation)
Brian Moe (University of Wisconsin-Milwaukee)
Motivation
Scientific application domains spend considerable effort managing large amounts of experimental and simulation data
They have developed customized, higher-level Grid data management services
Examples:
Laser Interferometer Gravitational Wave Observatory (LIGO): Lightweight Data Replicator system
High Energy Physics projects: EGEE system, gLite, LHC Computing Grid (LCG) middleware
Portal-based coordination of services (e.g., Earth System Grid)
Motivation (cont.)
Data management functionality varies by application, but these systems share several requirements:
Publish and replicate large datasets (millions of files)
Register data replicas in catalogs and discover them
Perform metadata-based discovery of datasets
May require the ability to validate correctness of replicas
In general, data updates and replica consistency services are not required (i.e., read-only accesses)
These systems provide production data management services to individual scientific domains
Each project spends considerable resources to design, implement & maintain its data management system, which typically cannot be re-used by other applications
Motivation (cont.)
Long-term goals:
Generalize the functionality provided by these data management systems
Provide a suite of application-independent services
The paper describes one higher-level data management service: the Data Replication Service (DRS)
DRS functionality is based on the publication capability of the LIGO Lightweight Data Replicator (LDR) system
Ensures that a set of files exists on a storage site; replicates files as needed and registers them in catalogs
DRS builds on lower-level Grid services, including the Globus Reliable File Transfer (RFT) service and the Replica Location Service (RLS)
Outline
Description of the LDR data publication capability
Generalization of this functionality: define the characteristics of an application-independent Data Replication Service (DRS)
DRS design
DRS implementation in the GT4 environment
Evaluation of DRS performance in a wide area Grid
Related work
Future work
A Data-Intensive Application Example: The LIGO Project
The Laser Interferometer Gravitational Wave Observatory (LIGO) collaboration seeks to measure gravitational waves predicted by Einstein
Collects experimental datasets at two LIGO instrument sites in Louisiana and Washington State
Datasets are replicated at other LIGO sites
Scientists analyze the data and publish their results, which may be replicated
Currently LIGO stores more than 40 million files across ten locations
The Lightweight Data Replicator
LIGO scientists developed the Lightweight Data Replicator (LDR) system for data management
Built on top of standard Grid data services: the Globus Replica Location Service and the GridFTP data transport protocol
LDR provides a rich set of data management functionality, including:
a pull-based model for replicating necessary files to a LIGO site
efficient data transfer among LIGO sites
a distributed metadata service architecture
an interface to local storage systems
a validation component that verifies that files on a storage system are correctly registered in a local RLS catalog
LIGO Data Publication and Replication
Two types of data publishing:
1. Detectors at Livingston and Hanford produce data sets
Approximately a terabyte per day during LIGO experimental runs
Each detector produces a file every 16 seconds; files range in size from 1 to 100 megabytes
Data sets are copied to the main repository at Caltech, which stores them in a tape-based mass storage system
LIGO sites can acquire copies from Caltech or from one another
2. Scientists also publish new or derived data sets as they perform analysis on existing data sets
E.g., data filtering or calibration may create new files
These new files may also be replicated at LIGO sites
Some Terminology
A logical file name (LFN) is a unique identifier for the contents of a file
Typically, a scientific collaboration defines and manages the logical namespace, guaranteeing uniqueness of logical names within that organization
A physical file name (PFN) is the location of a copy of the file on a storage system
The physical namespace is managed by the file system or storage system
The LIGO environment currently contains more than 25 million unique logical files and more than 145 million physical files stored at ten sites
Components at Each LDR Site
Local storage system
GridFTP server for file transfer
Metadata catalog: associations between logical file names and metadata attributes
Replica Location Service: a Local Replica Catalog (LRC) stores mappings from logical names to storage locations; a Replica Location Index (RLI) collects state summaries from LRCs
Scheduler and transfer daemons with a prioritized queue of requested files
[Figure: LDR site components; Local Replica Catalog, Replica Location Index, GridFTP server, metadata catalog, MySQL database, scheduler daemon, transfer daemon, prioritized list of requested files, and the site storage system]
LDR Data Publishing
A scheduling daemon runs at each LDR site
Queries the site's metadata catalog to identify logical files with specified metadata attributes
Checks the RLS Local Replica Catalog to determine whether copies of those files already exist locally
If not, puts the logical file names on a priority-based scheduling queue
A transfer daemon also runs at each site
Checks the queue and initiates data transfers in priority order
Queries the RLS Replica Location Index to find sites where the desired files exist
Randomly selects a source file from among the available replicas
Uses the GridFTP transport protocol to transfer the file to the local site
Registers the newly copied file in the RLS Local Replica Catalog
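A conceptual Python sketch of this pull-based publication loop; the catalogs and the gridftp_copy stand-in are illustrative, not LDR or Globus APIs.

```python
import random
from queue import PriorityQueue

# Tiny stand-ins for the site's catalogs; real LDR uses a metadata catalog and RLS.
metadata_catalog = {"H-R1-0001.gwf": 1, "H-R1-0002.gwf": 2}   # LFN -> priority
local_replicas   = {}                                          # LFN -> local PFN
replica_index    = {"H-R1-0001.gwf": ["gsiftp://caltech.example.org/d/H-R1-0001.gwf"],
                    "H-R1-0002.gwf": ["gsiftp://hanford.example.org/d/H-R1-0002.gwf"]}

def gridftp_copy(src, dst):
    print(f"GridFTP transfer: {src} -> {dst}")                 # stand-in for the real transfer

queue = PriorityQueue()

# Scheduling daemon pass: queue files that match the metadata query but are missing locally.
for lfn, priority in metadata_catalog.items():
    if lfn not in local_replicas:
        queue.put((priority, lfn))

# Transfer daemon pass: move files in priority order and register the new replicas.
while not queue.empty():
    _, lfn = queue.get()
    source = random.choice(replica_index[lfn])                 # random choice among replicas
    dest = f"file:///ligo/data/{lfn}"
    gridftp_copy(source, dest)
    local_replicas[lfn] = dest                                 # register in the local catalog
```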
Generalizing the LDR Publication Scheme
We want to provide a similar capability that is independent of LIGO infrastructure and useful for a variety of application domains
Capabilities include:
An interface to specify which files are required at the local site
Use of Globus RLS to discover whether replicas exist locally and where they exist in the Grid
Use of a selection algorithm to choose among available replicas
Use of the Globus Reliable File Transfer service and GridFTP data transport protocol to copy data to the local site
Use of Globus RLS to register the new replicas
Relationship to Other Globus Services
At the requesting site, deploy:
WS-RF services: Data Replication Service, Delegation Service, Reliable File Transfer Service
Pre-WS-RF components: Replica Location Service (Local Replica Catalog, Replica Location Index), GridFTP server
[Figure: the local site runs a Web service container hosting the Data Replication Service (with its Replicator resource), the Reliable File Transfer Service (with its RFT resource), and the Delegation Service (with the delegated credential), alongside the Local Replica Catalog, Replica Location Index, and GridFTP server]
DRS Functionality
Initiate a DRS request
Create a delegated credential
Create a Replicator resource
Monitor the Replicator resource
Discover replicas of the desired files in RLS and select among them
Transfer data to the local site with the Reliable File Transfer Service
Register the new replicas in RLS catalogs
Allow client inspection of DRS results
Destroy the Replicator resource
DRS is implemented in Globus Toolkit Version 4 and complies with the Web Services Resource Framework (WS-RF)
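A conceptual Python sketch of the Replicator lifecycle as seen from the client (purely illustrative; the real DRS is a WS-RF service written in Java, and nothing here is its actual API): create the resource with a delegated credential, monitor its Stage and Status properties, inspect the results, and let the resource be destroyed.

```python
class ToyReplicator:
    """Toy stand-in for a DRS Replicator resource; stages mirror the DRS workflow."""
    STAGES = ["discover", "transfer", "register"]

    def __init__(self, lfns, credential):
        self.lfns, self.credential = lfns, credential
        self.stage_index, self.status = 0, "Active"
        self.results = {}

    def step(self):                       # in the real service this happens asynchronously
        stage = self.STAGES[self.stage_index]
        for lfn in self.lfns:
            self.results.setdefault(lfn, []).append(stage)
        if self.stage_index == len(self.STAGES) - 1:
            self.status = "Finished"
        else:
            self.stage_index += 1

# Client-side lifecycle: delegate a credential, create the resource, poll, inspect, destroy.
credential = "delegated-proxy-EPR"                    # stand-in for a Delegation Service EPR
replicator = ToyReplicator(["H-R1-0001.gwf"], credential)
while replicator.status != "Finished":
    print("stage:", replicator.STAGES[replicator.stage_index])   # monitor resource properties
    replicator.step()
print("results:", replicator.results)
del replicator                                        # destroy the resource when done
```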
WSRF in a Nutshell
Service
State management: Resource, Resource Property
State identification: Endpoint Reference
State interfaces: GetRP, QueryRPs, GetMultipleRPs, SetRP
Lifetime interfaces: SetTerminationTime, ImmediateDestruction
Notification interfaces: Subscribe, Notify
ServiceGroups
[Figure: a WS-RF service exposes GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime and Destroy operations against resources with resource properties (RPs), each identified by an endpoint reference (EPR)]
Create Delegated Credential
[Figure: the client interacts with a Web service container at the local site that hosts the Delegation service, the Data Replication service and RFT, alongside the RLS Replica Index and Replica Catalogs, GridFTP servers and MDS; the same setting is used in the following walkthrough steps]
• Initialize the user proxy certificate
• Create a delegated credential resource and set its termination time
• The credential EPR is returned to the client
Create Replicator Resource
• Create the Replicator resource, passing the delegated credential EPR, and set its termination time
• The Replicator EPR is returned to the client
• The Replicator accesses the delegated credential resource
Monitor Replicator Resource
• The client periodically polls the Replicator resource properties via GetRP or GetMultRP
• The Replicator resource is added to the MDS Information service Index
• Subscribe to ResourceProperty changes for the “Status” RP and the “Stage” RP
• Conditions may trigger alerts or other actions (Trigger service not pictured)
Query Replica Information
• Notification that the “Stage” RP value has changed to “discover”
• The Replicator queries the RLS Replica Index to find catalogs that contain the desired replica information
• The Replicator queries the RLS Replica Catalog(s) to retrieve mappings from logical names to target names (URLs)
Transfer Data
• Notification that the “Stage” RP value has changed to “transfer”
• Create the RFT Transfer resource, passing the credential EPR and setting its termination time; the Transfer resource EPR is returned
• The Transfer resource accesses the delegated credential resource
• RFT sets up the GridFTP server transfer of the file(s); data is transferred between the GridFTP server sites
• Periodically poll the “ResultStatus” RP via GetRP; when “Done”, get state information for each file transfer
Register Replica Information
• Notification that the “Stage” RP value has changed to “register”
• The Replicator registers the new file mappings in the RLS Replica Catalog
• The RLS Replica Catalog sends an update of the new replica mappings to the Replica Index
Client Inspection of State
• Notification that the “Status” RP value has changed to “Finished”
• The client inspects the Replicator state information for each replication in the request
Resource Termination
• The termination time (set by the client) eventually expires
• The resources are destroyed (Credential, Transfer, Replicator)
Performance Measurements: Wide Area Testing
The destination for the pull-based transfers is located in Los Angeles: a dual-processor, 1.1 GHz Pentium III workstation with 1.5 GB of memory and 1 Gbit Ethernet
It runs a GT4 container and deploys services including RFT and DRS, as well as GridFTP and RLS
The remote site where the desired data files are stored is located at Argonne National Laboratory in Illinois: a dual-processor, 3 GHz Intel Xeon workstation with 2 GB of memory and 1.1 TB of disk
It runs a GT4 container as well as GridFTP and RLS services
DRS Operations Measured
Create the DRS Replicator resource
Discover source files for replication using the local RLS Replica Location Index and remote RLS Local Replica Catalogs
Initiate a Reliable File Transfer operation by creating an RFT resource
Perform the RFT data transfer(s)
Register the new replicas in the RLS Local Replica Catalog
Experiment 1: Replicate 10 Files of Size 1 Gigabyte
Component of Operation | Time (milliseconds)
Create Replicator Resource | 317.0
Discover Files in RLS | 449.0
Create RFT Resource | 808.6
Transfer Using RFT | 1186796.0
Register Replicas in RLS | 3720.8

Data transfer time dominates: a wide-area data transfer rate of 67.4 Mbits/sec
Experiment 2: Replicate 1000 Files of Size 10 Megabytes
Component of Operation | Time (milliseconds)
Create Replicator Resource | 1561.0
Discover Files in RLS | 9.8
Create RFT Resource | 1286.6
Transfer Using RFT | 963456.0
Register Replicas in RLS | 11278.2

The time to create the Replicator and RFT resources is larger: state must be stored for 1000 outstanding transfers
Data transfer time still dominates: a wide-area data transfer rate of 85 Mbits/sec