An Architectural Approach to Managing Data in Transit Micah Beck Director & Associate Professor Logistical Computing and Internetworking Lab Computer Science Department University of Tennessee DOE Data Management Workshop 3/17/2004




An Architectural Approach to Managing Data in Transit

Micah Beck Director & Associate Professor

Logistical Computing and Internetworking Lab

Computer Science Department

University of Tennessee

DOE Data Management Workshop 3/17/2004


“Data in Transit”

» After being generated by an instrument or supercomputer

» Not stored in a permanent archive

» Serving the diverse purposes of a community of users and applications

» Being transferred, processed and stored to meet changing and unanticipated needs
  • Visualization
  • Data Mining
  • Collaboration
  • Distributed Computing


Interoperability via a Common Interface

» Span heterogeneous physical resources, operating systems, local management schemes

» Serve changing and unexpected application requirements; enable application autonomy

» We measure success in terms of infrastructure deployment scalability
  • In networks and distributed systems, this means number, distribution, global reach, spanning administrative domains…
  • The Internet is the gold standard of infrastructure deployment scalability


Layering as an Architectural Approach

» Abstractions at each layer can hide differences at lower layers

» Exposed approaches avoid creating overly complex mechanisms at lower layers

» The E2E Principle: Attributes of lower layers implemented on shared infrastructure enable deployment scalability
  • Generality: Serve diverse application needs, model diverse lower layer resources
  • Weak semantics: Don’t give too much away at one time!
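The interplay between weak semantics at a lower layer and the end-to-end principle can be illustrated with a small sketch. The names `BestEffortLink` and `reliable_send` are hypothetical and do not come from the talk. The lower layer here promises only best-effort delivery, and the endpoint adds reliability on top by retrying. This is a toy model under those assumptions, not a description of any real protocol stack.

```python
import random


class BestEffortLink:
    """Hypothetical lower layer with weak semantics: a send may silently fail."""

    def __init__(self, loss_rate, seed=0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded so that runs are repeatable
        self.delivered = []

    def send(self, packet):
        """Try once to deliver; return True on success, False if the packet is lost."""
        if self.rng.random() < self.loss_rate:
            return False
        self.delivered.append(packet)
        return True


def reliable_send(link, packet, max_attempts=50):
    """End-to-end reliability built above the weak layer: retry until delivered.

    Returns the number of attempts used, or raises if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if link.send(packet):
            return attempt
    raise RuntimeError("packet not delivered after %d attempts" % max_attempts)


if __name__ == "__main__":
    link = BestEffortLink(loss_rate=0.5)
    attempts = [reliable_send(link, n) for n in range(10)]
    print("attempts per packet:", attempts)
```

The lower layer stays simple and general because it guarantees little; applications that need stronger guarantees pay for them at the endpoints, and applications that do not need them pay nothing.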


The IP Network Stack

[Figure: the IP network stack. Layers from bottom to top: Physical, Link, Network, Transport, Application. The common interface (IP) is at the Network layer.]


IP’s Failure of Scalability

» Today, IP is failing as a common interface

» The design of IP is out of date
  • Application communities are more diverse
  • Link layer technologies violate IP assumptions

» Application communities are defining their own common interfaces for general resource sharing, deploying their own infrastructure (e.g. the Grid)

» Some networking communities have abandoned interoperability at the network layer between widely divergent link layer technologies (e.g. optical switching & IP)


The Transit Layer: A New Location for Interoperability

» Expand the link layer to a local layer to model transfer, storage and processing resources

» Insert a new transit layer between the local and network layers to implement a common interface to diverse technologies at the local layer

» Adopt a highly general common interface at the transit layer, providing a uniform view of all of the resources of the network node

» Build diverse network services on top of this common interface to model diverse application requirements

» “Locating Interoperability in the Network Stack”, Micah Beck & Terry Moore, UT-CS-04-520, Univ. of TN CS Dept Tech Rpt
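The idea of a single, highly general interface over diverse local-layer technologies can be sketched as follows. Every name here (`LocalResource`, `MemoryBuffer`, `AppendLog`, `TransitNode`) is hypothetical and invented for illustration; the talk does not define this interface. The sketch assumes two unlike storage technologies and shows a node exposing the same storage, transfer and processing operations regardless of which one sits underneath.

```python
from abc import ABC, abstractmethod


class LocalResource(ABC):
    """Hypothetical local-layer technology that can hold bytes under a key."""

    @abstractmethod
    def put(self, key, data):
        """Store the bytes under the key."""

    @abstractmethod
    def get(self, key):
        """Return the bytes stored under the key."""


class MemoryBuffer(LocalResource):
    """One technology: a dictionary of separate byte strings."""

    def __init__(self):
        self._cells = {}

    def put(self, key, data):
        self._cells[key] = bytes(data)

    def get(self, key):
        return self._cells[key]


class AppendLog(LocalResource):
    """A different technology: one contiguous log plus an offset index."""

    def __init__(self):
        self._log = bytearray()
        self._index = {}

    def put(self, key, data):
        self._index[key] = (len(self._log), len(data))
        self._log.extend(data)

    def get(self, key):
        start, length = self._index[key]
        return bytes(self._log[start:start + length])


class TransitNode:
    """Hypothetical transit-layer node: one uniform view of a node's resources."""

    def __init__(self, resource):
        self._resource = resource

    def store(self, key, data):
        self._resource.put(key, data)

    def load(self, key):
        return self._resource.get(key)

    def transfer(self, key, other):
        """Copy the bytes under key to another node, whatever its local technology."""
        other.store(key, self.load(key))

    def process(self, key, operation, result_key):
        """Apply a bytes-to-bytes operation at the node and store the result there."""
        self.store(result_key, operation(self.load(key)))


if __name__ == "__main__":
    a = TransitNode(MemoryBuffer())
    b = TransitNode(AppendLog())
    a.store("x", b"hello")
    a.transfer("x", b)
    b.process("x", bytes.upper, "X")
    print(b.load("X"))
```

Services built on `TransitNode` never see whether a node is backed by `MemoryBuffer` or `AppendLog`, which is the property the common interface is meant to provide.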


The Transit Network Stack

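[Figure: the transit network stack. Layers from bottom to top: Physical, Local, Transit, Network, Transport, Application. The common interface is at the Transit layer; the Local layer models transfer, storage and processing.]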


Transit Networking: A Unified View

“… memory locations … are just wires turned sideways in time”

Dan Hillis, 1982, Why Computer Science is No Good


Logistical Networking: An Overlay Implementation of the Transit Layer

» Logistical Networking is an overlay implementation of transit layer functionality built on top of the IP network

» The Internet Backplane Protocol is the common transit layer interface for Logistical Networking

» Network nodes are IBP “depots” that run as user-level processes and communicate using TCP/IP as well as other link and network layer protocols

» Depots also serve storage and processing resources to Logistical Networking clients
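To make the notion of a depot as a user-level process reachable over TCP/IP more concrete, here is a minimal sketch of a toy storage server. It does not implement the Internet Backplane Protocol: the one-line text protocol (`PUT <key> <hex-data>`, `GET <key>`) and the names `DepotHandler`, `start_depot` and `request` are assumptions made up for this example, and it covers storage only, using the Python standard library.

```python
import socket
import socketserver
import threading


class DepotHandler(socketserver.StreamRequestHandler):
    """Serve one request line: 'PUT <key> <hex-data>' or 'GET <key>'."""

    def handle(self):
        parts = self.rfile.readline().decode("ascii", "replace").split()
        store = self.server.store
        reply = "ERR"
        if len(parts) == 3 and parts[0] == "PUT":
            try:
                store[parts[1]] = bytes.fromhex(parts[2])
                reply = "OK"
            except ValueError:
                pass  # malformed hex payload: keep the ERR reply
        elif len(parts) == 2 and parts[0] == "GET" and parts[1] in store:
            reply = "OK " + store[parts[1]].hex()
        self.wfile.write((reply + "\n").encode("ascii"))


def start_depot():
    """Start a toy depot on a free localhost port and return the server object."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), DepotHandler)
    server.store = {}  # in-memory storage shared by all requests
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request(address, line):
    """Send one request line to a depot and return its one-line reply."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall((line + "\n").encode("ascii"))
        with conn.makefile("r", encoding="ascii") as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    depot = start_depot()
    address = depot.server_address
    print(request(address, "PUT greeting " + b"hello".hex()))
    print(request(address, "GET greeting"))
    depot.shutdown()
    depot.server_close()
```

Because the toy depot needs no kernel support or privileged access, it can be started by any user on any host, which is the deployment property an overlay built from user-level processes relies on.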


LN Tools and Deployment

» The Logistical Runtime System (LoRS) is a set of tools based on IBP that enable users to take advantage of the resources of IBP depots

» Logistical Distribution Network (LoDN) is a data directory, monitoring and management system

» The Logistical Backbone is a resource discovery service and global experimental IBP testbed
  • Over 35 TB of storage available
  • Over 300 depots in 21 countries
  • Leverages the resources of PlanetLab

» Additional depots deployed at ORNL & NERSC
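One way a client-side tool could make use of many depots at once is to split a dataset into blocks and spread them across depots. The slides do not say how LoRS places data, so the round-robin striping below is an assumption chosen purely for illustration, with depots modeled as plain dictionaries and the function names `upload` and `download` invented for this sketch.

```python
def upload(data, depots, block_size):
    """Split data into blocks, place them round-robin, and return a block map.

    Each depot is modeled as a dict from key to bytes (a stand-in, not a real
    depot). The block map lists (depot_index, key) pairs in file order.
    """
    block_map = []
    for offset in range(0, len(data), block_size):
        depot_index = (offset // block_size) % len(depots)
        key = "block-%d" % offset
        depots[depot_index][key] = data[offset:offset + block_size]
        block_map.append((depot_index, key))
    return block_map


def download(block_map, depots):
    """Reassemble the original bytes by fetching blocks in block-map order."""
    return b"".join(depots[index][key] for index, key in block_map)


if __name__ == "__main__":
    depots = [{}, {}, {}]
    payload = bytes(range(10))
    block_map = upload(payload, depots, block_size=4)
    print(block_map)
    print(download(block_map, depots) == payload)
```

The block map is the only state the client must keep; the data itself lives entirely on the depots.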


L-Bone: August 2003 (20TB)


Example LN Applications

» Astrophysics: Terascale Supernova Initiative (A. Mezzacappa, ORNL; J. Blondin, NCSU)
  • Management of massive datasets

» Fusion Energy Research (S. Klasky, PPPL)
  • Streaming of simulation data during generation

» Viewset-Based Visualization
  • Prestaging & caching of distant data

» Content Distribution
  • Heroic data distribution problems (Linux ISOs)

» Multimedia Networking
  • Creation, management & delivery of high value content


LN Futures and Directions

» Storage
  • Implementation of file system services
  • Moving data through firewalls at line speed
  • QoS in highly controlled environments

» Networking
  • Interoperability at ultrascale
  • Advanced services (e.g. multicast)

» Computation
  • Offloading visualization to IBP depots
  • Developing sets of operations to support application communities


Thank you!

[email protected]

http://loci.cs.utk.edu