Upload
lewis-thomas
View
214
Download
0
Tags:
Embed Size (px)
Citation preview
3 August 2005 The Ohio State University 2
Outline
Introduction Features Motivation Architecture Globus XIO Experimental Results
3 August 2005 The Ohio State University 3
What is GridFTP?
In Grid environments, access to distributed data is very important
Distributed scientific and engineering applications require: Transfers of large amounts of data between storage
systems, and Access to large amounts of data by many geographically
distributed applications and users for analysis, visualization etc
GridFTP - a general-purpose mechanism for secure, reliable and high performance data movement.
3 August 2005 The Ohio State University 4
Features Standard FTP features get/put etc, third
party control of data transfer User at one site to initiate, monitor and
control data transfer operation between two other sites
Authentication, data integrity, data confidentiality Support Generic Security Services (GSS)-API
authentication of the connections Support user-controlled levels of data
integrity and/or confidentiality
3 August 2005 The Ohio State University 5
Features
Parallel data transfer Multiple transport streams between a single
source and destination Striped data transfer
1 or more transport streams between m network endpoints on the sending side and n network endpoints on the receiving side
3 August 2005 The Ohio State University 6
Features
Partial file transfer Some applications can benefit from
transferring portions of file For example, analyses that require access to
subsets of massive object-oriented database files
Manual/Automatic control of TCP buffer sizes
Support for reliable and restartable data transfer
3 August 2005 The Ohio State University 7
Motivation
Continued commoditization of end system devices means the data sources and sinks are often clusters A common configuration might have
individual nodes connected by 1 Gbit/s Ethernet connection to a switch connected to external network at 10 Gbit/s or faster
Striped data movement - data distributed across a set of computers at one end is transferred to remote set of computers
3 August 2005 The Ohio State University 8
Motivation
Data sources and sinks come in many shapes and sizes Clusters with local disks, clusters with
parallel file system, geographically distributed data sources, archival storage systems
Enable clients to access such sources and sinks via a uniform interface
Make it easy to adapt our system to support different sources and sinks
3 August 2005 The Ohio State University 9
Motivation Standard protocol for network data transfer
remains TCP. TCP’s congestion control algorithm can lead to poor
performance Particularly in default configurations and on paths
with high bandwidth and high round trip time Solutions include:
Careful tuning of TCP parameters, multiple TCP connections and substitution of alternate UDP based reliable protocols
We want to support such alternatives
3 August 2005 The Ohio State University 10
Architecture GridFTP (and normal FTP) use (at least) two separate
socket connections: A control channel for carrying the commands and
responses A Data Channel for actually moving the data
GridFTP (and normal FTP) has 3 distinct components: Client and server protocol interpreters which handle
control channel protocol Data Transfer Process which handles the accessing of
actual data and its movement via the data channel
3 August 2005 The Ohio State University 11
Architecture
These components can be combined in various ways to create servers with different capabilities. For example, combining the server PI and DTP
components in one process creates a conventional FTP server
A striped server might use one server PI on the head node of a cluster and a DTP on all other nodes.
3 August 2005 The Ohio State University 12
Globus GridFTP Architecture
D a t a C h a n n e l
S e r v e r P I
D T P
D e s c r i p t i o n o f t r a n s f e r : c o m p l e t e l y
s e r v e r - i n t e r n a l c o m m u n i c a t i o n .
P r o t o c o l i s u n s p e c i f i e d a n d l e f t u p
t o t h e i m p l e m e n t a t i o n .
S e r v e r P I
D T P
I n t e r n a l I P C A P I I n t e r n a l I P C A P I
C l i e n t P II n f o o n t r a n s f e r : r e s t a r t
m a r k e r s , p e r f o r m a n c e
m a r k e r s , e t c . S e r v e r P I
o p t i o n a l l y p r o c e s s e s
t h e s e , t h e n s e n d s
t h e m t o t h e c l i e n t P I
C o n t r o l
C h a n n e l s
3 August 2005 The Ohio State University 13
Architecture
The server PI handles the control channel exchange.
In order for a client to contact a GridFTP server Either the server PI must be running as a
daemon and listening on a well known port (2811 for GridFTP), or
Some other service (such as inetd) must be listening on the port and be configured to invoke the server PI.
The client PI then carries out its protocol exchange with the server PI.
3 August 2005 The Ohio State University 14
Architecture
When the server PI receives a command that requires DTP activity, the server PI passes the description of the transfer to DTP
After that, DTP can carry out the transfer on its own.
The server PI then simply acts as a relay for transfer status information. Performance markers, restart markers, etc.
3 August 2005 The Ohio State University 15
Architecture
PI-to-DTP communications are internal to the server Protocol used can evolve with no impact on the
client. The data channel communication structure is
governed by data layout. If the number of nodes at both ends is equal, each
node communicates with just one other node. Otherwise, each sender makes a connection to each
receiver, and sends data based on data offsets.
3 August 2005 The Ohio State University 16
MODE ESPAS (Listen) - returns list of host:port pairsSTOR <FileName>
MODE ESPOR (Connect) - connect to the host-port pairsRETR <FileName>
18-Nov-03
GridFTP Striped Transfer
Host Z
Host Y
Host A
Block 1
Block 5
Block 13
Block 9
Host B
Block 2
Block 6
Block 14
Block 10
Host C
Block 3
Block 7
Block 15
Block 11
Host D
Block 4
Block 8 - > Host D
Block 16
Block 12 -> Host D
Host X
Block1 -> Host A
Block 13 -> Host A
Block 9 -> Host A
Block 2 -> Host B
Block 14 -> Host B
Block 10 -> Host B
Block 3 -> Host C
Block 7 -> Host C
Block 15 -> Host C
Block 11 -> Host C
Block 16 -> Host D
Block 4 -> Host D
Block 5 -> Host A
Block 6 -> Host B
Block 8
Block 12
3 August 2005 The Ohio State University 17
Data transfer pipeline
D a t a
A c c e s s
M o d u l e
D a t a
P r o c e s s i n g
M o d u l e
D a t a
C h a n n e l
P r o t o c o l
M o d u l e
D a t a
s o u r c e
o r s i n k
D a t a
c h a n n e l
Data Transfer Process is architecturally, 3
distinct pieces: Data Access Module, Data processing module
and Data Channel Protocol Module
3 August 2005 The Ohio State University 18
Data access module
Number of storage systems in use by the scientific and engineering community Distributed Parallel Storage System (DPSS) High Performance Storage System (HPSS) Distributed File System (DFS) Storage Resource Broker (SRB) HDF5
Use incompatible protocols for accessing data and require the use of their own clients
3 August 2005 The Ohio State University 19
Data access module
It provides a modular pluggable interface to data storage systems.
Conceptually, the data access module is very simple.
It consist of several function signatures and a set of semantics.
When a new module is created, programmer implements the functions to provide the semantics associated with them.
3 August 2005 The Ohio State University 20
Data processing module
The data processing module - provides ability to manipulate the data prior to transmission. Compression, on-the-fly concatenation of
multiple files etc Currently handled via the data access
module In future we plan to make this a separate
module
3 August 2005 The Ohio State University 21
Data channel protocol module
This module handles the operations required to fetch data from, or send data to, the data channel. A single server may support multiple data channel
protocols the MODE command is used to select the protocol to
be used for a particular transfer. We use Globus eXtensible Input/Output (XIO)
system as the data channel protocol module interface Currently support two bindings: Stream-mode TCP
and Extended Block Mode TCP
3 August 2005 The Ohio State University 22
Extended block mode
Striped and parallel data transfer require support for out-of-order data delivery
Extended block mode supports out-of-sequence data delivery
Extended block mode header
Descriptor is used to indicate if this block is a EOD marker, restart marker etc
Descriptor
8 bits
Byte Count
64 bits
Offset
64 bits
3 August 2005 The Ohio State University 23
Globus XIO
Simple Open/Close/Read/Write API Make single API do many types of IO Many Grid IO needs can be treated as a
stream of bytes Open/close/read/write functionality
satisfies most requirements Specific drivers for specific
protocols/devices
3 August 2005 The Ohio State University 26
Experimental results
Three settings LAN - 0.2 msec RTT and 622 Mbit/s MAN - 2.2 msec RTT and 1 Gbit/s WAN - 60 msec RTT and 30 Gbit/s MAN - Distributed Optical Testbed in the
Chicago area WAN - TeraGrid link between NCSA in Illinois
and SDSC in California - each individual has a 1Gbit/s bottleneck link
3 August 2005 The Ohio State University 27
Experimental results - LAN
0
100
200
300
400
500
600
700
0 10 20 30 40
number of streams
bandwidth (Mbit/s)
iperf globus mem
globus disk bonnie
3 August 2005 The Ohio State University 28
Experimental results - MAN
0100200300400500600700800900
1000
0 10 20 30 40
number of streams
bandwidth (Mbit/s)
iperf globus mem
globus disk bonnie
3 August 2005 The Ohio State University 29
Experimental results - WAN
0
200
400
600
800
1000
1200
1400
0 10 20 30 40
number of streams
bandwidth (Mbit/s)
iperf globus mem
globus disk bonnie
3 August 2005 The Ohio State University 30
Memory to MemoryStriping Performance
0
5000
10000
15000
20000
25000
30000
0 10 20 30 40 50 60 70
Degree of Striping
Bandwidth (Mbit/s)
# Stream = 1 # Stream = 2 # Stream = 4 # Stream = 8 # Stream = 16 # Stream = 32
3 August 2005 The Ohio State University 31
Disk to Disk Striping Performance
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
20000
0 10 20 30 40 50 60 70
Degree of Striping
Bandwidth (Mbit/s)
# Stream = 1 # Stream = 2 # Stream = 4 # Stream = 8 # Stream = 16 # Stream = 32
3 August 2005 The Ohio State University 32
Scalability tests
Evaluate performance as a function of the number of clients
DiPerf test framework to deploy the clients Ran server on a 2-processor 1125 MHz x86
machine running Linux 2.6.8.1 1.5 GB memory and 2 GB swap space 1 Gbit/s Ethernet network connection and 1500 B
MTU Clients created on hosts distributed over PlanetLab
and at the University of Chicago (UofC)
3 August 2005 The Ohio State University 33
Scalability Results
0
2 0 0
4 0 0
6 0 0
8 0 0
1 0 0 0
1 2 0 0
1 4 0 0
1 6 0 0
1 8 0 0
2 0 0 0
0 5 0 0 1 0 0 0 1 5 0 0 2 0 0 0 2 5 0 0 3 0 0 0 3 5 0 0
T im e ( s e c )
0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
1 0 0
T h r o u g h p u t ( M b y t e / s ) _
_ R e s p o n s e T im e ( s )
_ L o a d ( c o n c u r r e n t c l ie n t s )
C P U % _
_ M e m o r y ( M b y t e s )
Left axis - load, response time, memory allocated
Right axis - Throughput and CPU %
3 August 2005 The Ohio State University 34
Scalability results
1800 clients mapped in a round robin fashion on 100 PlanetLab hosts and 30 UofC hosts
A new client created once a second and ran for 2400 seconds During this time, repeatedly requests the
transfer of 10 Mbyte file from server’s disk to client’s /dev/null
Total of 150.7 Gbytes transferred in 15,428 transfers