ALCF Argonne Leadership Computing Facility GridFTP Roadmap Bill Allcock (on behalf of the GridFTP team) Argonne National Laboratory

GridFTP Roadmap



Page 1: GridFTP Roadmap

ALCF Argonne Leadership Computing Facility

GridFTP Roadmap

Bill Allcock (on behalf of the GridFTP team)
Argonne National Laboratory

Page 2: GridFTP Roadmap

– Usability & Performance
  • Packaging GridFTP as RPM
  • GWFTP
  • GridFTP GUI
  • Automatic Firewall Traversal
  • Sync feature for globus-url-copy

Page 3: GridFTP Roadmap

Packaging GridFTP as RPM

• Modify packaging of GridFTP and its dependencies
  – Make it suitable for packaging as an RPM
  – Make it compatible with major Linux distribution standards
• Eventually some distribution might pick it up
• GridFTP available as part of a standard Linux distribution
  – Attract a whole new set of users
  – Put it on par with scp and standard ftp in terms of availability

Page 4: GridFTP Roadmap

GridFTP Where there’s FTP (GWFTP)

• GridFTP has been in existence for some time and has proven to be quite robust and useful
• Only a few GridFTP clients are available
• FTP has innumerable clients
• GWFTP - created to leverage the FTP clients
• A proxy between FTP clients and GridFTP servers

Page 5: GridFTP Roadmap

GWFTP

[Diagram: FTP Client → GWFTP (GSI Credential) → GridFTP Server wiggum.mcs.anl.gov (2811)]

• FTP client to GWFTP: USER <GWFTP username> ::gsiftp://wiggum.mcs.anl.gov:2811/ and PASS
• GWFTP to GridFTP server: GSI Authentication
• Get request: FTP client → GWFTP → GridFTP server
• Data: GridFTP server → GWFTP → FTP client
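The USER line in the GWFTP diagram carries both the GWFTP username and the target GridFTP server. A minimal sketch of how a proxy might split that argument is shown here; the function name `parse_gwftp_user`, the example username `alice`, and the fallback to port 2811 when the URL omits a port are assumptions for illustration, not GWFTP's actual code.

```python
from urllib.parse import urlsplit


def parse_gwftp_user(argument):
    """Split '<username> ::gsiftp://host:port/' into (username, host, port).

    Hypothetical helper: the real GWFTP parsing rules are not given
    on the slide.
    """
    name, sep, url = argument.partition("::")
    if not sep:
        raise ValueError("expected '<username>::<gsiftp URL>'")
    parts = urlsplit(url.strip())
    if parts.scheme != "gsiftp" or not parts.hostname:
        raise ValueError("target must be a gsiftp:// URL")
    # Assumption: default to 2811, the control port shown in the diagram.
    return name.strip(), parts.hostname, parts.port or 2811


print(parse_gwftp_user("alice ::gsiftp://wiggum.mcs.anl.gov:2811/"))
```

After this step the proxy would authenticate to the extracted host with its GSI credential and relay the FTP client's requests.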

Page 6: GridFTP Roadmap

GUI Client

08/14/2008, Computation Institute

Page 7: GridFTP Roadmap

GridFTP GUI

• A Java Web Start Application
  – Updates automatically
  – Users always use the latest release
• Transfer files and directories
• Third-party transfer
• Multiple concurrent transfers
• Support authentication through MyProxy
• Manage local and remote files and directories
  – Browse
  – Create and delete

Page 8: GridFTP Roadmap


Automatic Firewall Traversal

• Control channel port is statically assigned

• Data channel ports are dynamically assigned

GridFTP Protocol Changes

• New commands to communicate the 4 tuple (src ip, src port, dst ip, dst port) to both ends of transfer

• Use simultaneous open/TCP splicing, or use a broker to open ports temporarily

• Hooks in GridFTP to contact a broker at the right time

Page 9: GridFTP Roadmap

Firewall

[Diagram: the Client holds control channel connections (TCP 2811) to the GridFTP Source Server and the GridFTP Dest Server; the DATA connection runs between the two servers.]

Page 10: GridFTP Roadmap

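Automatic traversal using a connection Broker

[Diagram: the Client holds control channel connections (TCP 2811) to the GridFTP Source Server and the GridFTP Dest Server. A connection broker (CB) sits at each server; each CB receives the IP 4 tuple and opens a temporary hole for the DATA connection between the two servers.]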
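The broker idea can be sketched as a table of 4 tuples with expiry times: a hole exists only for the flow that was announced and only for a limited time. The class and method names below (`FourTuple`, `ConnectionBroker`, `open_hole`, `allows`) and the addresses in the usage are hypothetical; the slides do not specify the broker's interface.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class FourTuple:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class ConnectionBroker:
    """Toy stand-in for a broker that opens firewall holes temporarily."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._holes = {}  # FourTuple -> expiry time

    def open_hole(self, flow, ttl_seconds):
        # Called when GridFTP announces the 4 tuple of an upcoming transfer.
        self._holes[flow] = self._clock() + ttl_seconds

    def allows(self, flow):
        # A flow passes only while its hole exists and has not expired.
        expiry = self._holes.get(flow)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._holes[flow]
            return False
        return True


broker = ConnectionBroker()
flow = FourTuple("192.0.2.10", 50000, "198.51.100.20", 50001)
broker.open_hole(flow, ttl_seconds=30)
print(broker.allows(flow))
```

Keying the table on the full 4 tuple means a hole opened for one transfer does not admit traffic from any other source or port.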

Page 11: GridFTP Roadmap

Sync feature for globus-url-copy

• Check for the existence of a file at the destination before transferring
• If it exists, determine whether the source version is different from that of the destination
• Based on how much the source has changed, optimize the transfer
• Research into developing a logic that does not involve any changes to the GridFTP protocol
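The decision steps listed for the sync feature can be sketched as a small pure function. This is only an illustration: `FileInfo`, `sync_action`, and the choice of size plus checksum as the comparison are assumptions, since the slide leaves the actual comparison logic as a research item.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    size: int
    checksum: str  # assumption: some digest is obtainable for both ends


def sync_action(source: FileInfo, dest: Optional[FileInfo]) -> str:
    """Return 'copy', 'update', or 'skip' for one source/destination pair."""
    if dest is None:
        # Nothing at the destination: transfer the whole file.
        return "copy"
    if source.size != dest.size or source.checksum != dest.checksum:
        # Versions differ: this is where a partial transfer could be chosen.
        return "update"
    return "skip"


print(sync_action(FileInfo(10, "abc"), None))
print(sync_action(FileInfo(10, "abc"), FileInfo(10, "abc")))
```

The "update" branch is the place where the amount of change would drive the optimization mentioned on the slide.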

Page 12: GridFTP Roadmap

– Reliability & Security
  • Improved restart mechanism
  • Improved memory management algorithm
  • Load balancing
  • Data channel security for SSH based GridFTP
  • GUMS authorization callout

Page 13: GridFTP Roadmap


Improved Restart Mechanism

• globus-url-copy can recover from server and network failures

• It cannot recover from its own failure

• A number of users, including ESG, APS and SNS, use this client to transfer large data sets with complex directory structures

• Develop methods to enable globus-url-copy to recover from its own failure
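One way a client could recover from its own failure is to journal each completed file to disk and reload that journal on restart. The sketch below illustrates the idea only; `TransferJournal`, its file format, and the file names are hypothetical and do not describe how globus-url-copy works.

```python
import json
import os
import tempfile


class TransferJournal:
    """Records finished files so a restarted client can skip them."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.exists(path):
            with open(path) as f:
                for line in f:
                    try:
                        self.done.add(json.loads(line)["file"])
                    except (json.JSONDecodeError, KeyError):
                        break  # ignore a record cut short by a crash

    def mark_done(self, name):
        with open(self.path, "a") as f:
            f.write(json.dumps({"file": name}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the record survive a client crash
        self.done.add(name)

    def pending(self, names):
        return [n for n in names if n not in self.done]


with tempfile.TemporaryDirectory() as tmp:
    journal_path = os.path.join(tmp, "transfer.journal")
    first = TransferJournal(journal_path)
    first.mark_done("a/1.dat")
    # Simulate a client crash and restart: a new object reloads the journal.
    resumed = TransferJournal(journal_path)
    remaining = resumed.pending(["a/1.dat", "a/2.dat"])
print(remaining)
```

With complex directory structures the journal grows by one line per file, so resuming costs a single pass over the journal rather than a re-scan of the destination.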

Page 14: GridFTP Roadmap

Gfork architecture

[Diagram: on the Server Host, the GFork Server with its GridFTP Plugin forks GridFTP Server Instances. Labels: Fork, Inherited Links, State Sharing Link, Control Channel Connections. Three Clients are shown, each connected to a GridFTP Server Instance.]

Page 15: GridFTP Roadmap

Memory Management

• Optimistic memory provisioning by operating system
  – Possible that under heavy loads the GridFTP server can consume all of the system's memory resources
• Gfork – xinetd-like super server daemon
  – Allows state to be maintained across connections
• GridFTP plugin for Gfork has a simple memory limiting option
  – 90% of the memory to the first 10% of the allowed connections
  – Remaining connections receive half of what is available
• Develop an improved memory management algorithm
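The simple memory limiting rule can be worked through numerically. The sketch below follows one possible reading of the slide, and both parts of that reading are assumptions: the 90% pool is split evenly among the first 10% of connections, and each later connection receives half of whatever is still unallocated when it arrives. The function name `memory_limits` is hypothetical.

```python
def memory_limits(total_bytes, max_connections):
    """Per-connection memory limits under one reading of the slide's rule."""
    early = max(1, max_connections // 10)      # "first 10%" of connections
    per_early = (total_bytes * 9 // 10) // early
    limits = [per_early] * early
    available = total_bytes - per_early * early
    for _ in range(max_connections - early):
        share = available // 2                 # "half of what is available"
        limits.append(share)
        available -= share
    return limits


print(memory_limits(1000, 20)[:5])
```

Under this reading the later connections shrink geometrically, which is presumably part of the motivation for developing an improved algorithm.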

Page 16: GridFTP Roadmap


Load balancing capabilities

The separation of processes buys the ability to proxy

– Allows for load balancing

– Frontend can choose from a pool of DPIs to service a client request

[Diagram: the Client connects to the Frontend, which communicates over IPC with a pool of four DPIs.]
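How the frontend chooses from the pool is not stated on the slide. As a minimal sketch, assuming a least-loaded policy, the selection could look like this; the `Frontend` class and the DPI names are hypothetical.

```python
class Frontend:
    """Toy frontend that hands each client request to the least-busy DPI."""

    def __init__(self, dpi_names):
        # Active request count per DPI, in the order the DPIs were listed.
        self.active = {name: 0 for name in dpi_names}

    def assign(self):
        # Least-loaded choice; ties go to the DPI listed first.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        # Called when the request served by this DPI has finished.
        self.active[name] -= 1


frontend = Frontend(["dpi-a", "dpi-b", "dpi-c"])
print([frontend.assign() for _ in range(4)])
```

Round-robin or a load-aware policy would fit the same interface; only `assign` would change.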

Page 17: GridFTP Roadmap

SSH based GridFTP (GridFTP-Lite)

[Diagram: the Client connects to sshd on Port 22 (ROOT); the GridFTP Server runs as USER; the control channel is carried over ssh stdin/out. Port 2811 is also labeled.]

Page 18: GridFTP Roadmap

Data Channel Security for SSH based GridFTP

• SSH based GridFTP does not have data channel security
• Investigate and prototype a way to let a client send a shared secret to both source and destination GridFTP servers
  – Used to secure the data channel(s) between the two servers
  – Shared secret can be used to authenticate, integrity-protect and encrypt the data channel
• This feature will increase the adoption of SSH based GridFTP
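To illustrate how a shared secret can authenticate and integrity-protect data, the sketch below appends an HMAC-SHA256 tag to each block and verifies it on receipt. It is an assumption-laden illustration: the slide does not name a mechanism, the helper names `protect` and `verify` are hypothetical, and encryption is left out because the Python standard library has no cipher for it.

```python
import hashlib
import hmac
import secrets

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def protect(secret, block):
    """Append an HMAC tag so the receiver can check origin and integrity."""
    tag = hmac.new(secret, block, hashlib.sha256).digest()
    return block + tag


def verify(secret, message):
    """Return the data block if its tag matches, else raise ValueError."""
    block, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(secret, block, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("data block failed integrity check")
    return block


# The client would generate the secret and send it to both servers
# over the already-secured control channels.
shared_secret = secrets.token_bytes(32)
message = protect(shared_secret, b"payload")
print(verify(shared_secret, message))
```

Only a server holding the same secret can produce a valid tag, so a block that verifies was sent by the other authorized endpoint and was not altered in transit.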

Page 19: GridFTP Roadmap

GUMS Authorization Callout

• GUMS – Grid User Management System
  – Grid identity mapping service
  – Maps grid identity to local site identity
  – Used in OSG

[Diagram: Client, GridFTP with GUMS callout, GUMS server, Disk]

1. Authentication
2. Data transfer operations
3. Obtain local identity from GUMS server: /DC=org/DC=doegrids/OU=People/CN=John Bresnahanz → bresnaha
4. Access data as local identity

Page 20: GridFTP Roadmap

GUMS Authorization Callout

• Role based authorization using VOMS extended proxy

[Diagram: Client, GridFTP with GUMS callout, GUMS server, Disk]

1. Authentication
2. Data transfer operations
3. Obtain local identity from GUMS server: /DC=org/DC=doegrids/OU=People/CN=John Bresnahanz with /VO=ATLAS/Group=USATLAS/Role=developer → usatlasdev
4. Access data as local identity
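The two GUMS slides amount to a lookup from a grid identity, optionally qualified by a VOMS role, to a local account. A minimal sketch of that lookup follows, using the identities shown on the slides; the in-memory `MAPPINGS` table and the function `map_identity` are hypothetical stand-ins for the GUMS server, whose real interface is not shown here.

```python
DN = "/DC=org/DC=doegrids/OU=People/CN=John Bresnahanz"
ROLE = "/VO=ATLAS/Group=USATLAS/Role=developer"

# Hypothetical stand-in for the GUMS server's mapping policy.
MAPPINGS = {
    (DN, None): "bresnaha",    # plain grid identity
    (DN, ROLE): "usatlasdev",  # same identity presenting a VOMS role
}


def map_identity(dn, fqan=None):
    """Return the local account for a grid identity, or refuse access."""
    try:
        return MAPPINGS[(dn, fqan)]
    except KeyError:
        raise PermissionError("no local identity for this grid identity")


print(map_identity(DN))
print(map_identity(DN, ROLE))
```

The same person maps to different local accounts depending on the role in the proxy, which is the point of role based authorization.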

Page 21: GridFTP Roadmap

– Quality of Service
  • Information provider
  • Provision end-point GridFTP resources
  • Integrate network provisioning
  • Integrate storage provisioning
  • Co-schedule data transfer resources

Page 22: GridFTP Roadmap

Information Provider

• GridFTP information provider service
  – Max connections
  – Open connections
  – Load
• Higher level services can utilize this information for scheduling data transfers
  – Help with selecting the appropriate replica of data
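As an illustration of how a higher level service might use the three published values to pick a replica, the sketch below filters out servers that are at their connection limit and prefers the lowest load. The `ServerInfo` record, the `pick_replica` function, the host names, and the ranking rule are all assumptions; the slide does not define a selection policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerInfo:
    host: str
    max_connections: int
    open_connections: int
    load: float


def pick_replica(servers):
    """Choose the server to read a replica from, or None if all are full."""
    candidates = [s for s in servers if s.open_connections < s.max_connections]
    if not candidates:
        return None
    # Prefer low load; break ties by the fraction of connections in use.
    return min(
        candidates,
        key=lambda s: (s.load, s.open_connections / s.max_connections),
    )


servers = [
    ServerInfo("gridftp-a.example.org", 10, 10, 0.1),  # full
    ServerInfo("gridftp-b.example.org", 10, 4, 0.7),
    ServerInfo("gridftp-c.example.org", 10, 5, 0.3),
]
print(pick_replica(servers).host)
```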

Page 23: GridFTP Roadmap

Provision end-point resources

[Diagram: a Data Point contains the GridFTP Server, the GridFTP Info Provider, a Resource Limiter (CPU, Memory, BW) and a GFTP Resource Broker. The Data Movement Service (RFT replacement) is connected through the labels Ad, Control Channel and Provision GridFTP.]

Page 24: GridFTP Roadmap

Integrate Network Provisioning

[Diagram: a Data Point contains the GridFTP Server, the GridFTP Info Provider, a Resource Limiter (CPU, Memory, BW) and a GFTP Resource Broker. The Data Movement Service is connected through the labels Ad, Control Channel and Provision GridFTP. A Network Reservation Service is added, with the labels Reserve Bandwidth and Bandwidth Token.]

Page 25: GridFTP Roadmap

Integrate Storage Provisioning

[Diagram: a Data Point contains the GridFTP Server, the GridFTP Info Provider, a Resource Limiter (CPU, Memory, BW) and a GFTP Resource Broker. The Data Movement Service is connected through the labels Ad, Control Channel and Provision GridFTP. The Network Reservation Service carries the labels Provision Bandwidth and Bandwidth Token. A File System with Lotman is added, with the label Provision Storage.]

Page 26: GridFTP Roadmap

Co-schedule Data Transfer Resources

[Diagram: the Data Movement Service connects to the Network Reservation Service (Provision Bandwidth) and to the Source Data Point and the Destination Data Point (each labeled Provision GridFTP and Storage resources).]