Plethora: A Wide-Area Read-Write Storage Repository Design Goals, Objectives, and Applications Suresh Jagannathan, Christoph Hoffmann, Ananth Grama Computer Sciences, Purdue University.

Plethora: A Wide-Area Read-Write Storage Repository

Design Goals, Objectives, and Applications

Suresh Jagannathan, Christoph Hoffmann, Ananth Grama

Computer Sciences, Purdue University.

Plethora: Design Goals

● To build a wide-area read-write object repository from semi-static peers for supporting a single seamless distributed storage resource.

● To support desirable features of end-user performance, global resource utilization, robustness, and application support.

Plethora: Motivation

● A number of applications require:

– Large aggregate storage;

– Supporting distributed access to data;

– Collaborative operations on shared datasets;

– Content-based retrieval;

– Distributed services infrastructure; and

– High degree of availability and robustness.

Such applications motivate design decisions in Plethora.

Trends in Storage Software

Sample Applications: GriPhyN

● The Grid Physics Network (GriPhyN) is a classic example of an application built around a large dataset that is accessed by many people.

● Data is generated at the rate of roughly 1 PB/year in the form of high-energy physics experiment readouts (each experiment corresponds to roughly 1 MB of data).

● Researchers across the world access selected experiments.

Sample Applications: GriPhyN

Tier 0 is the data source (in this case CERN), tier 1 is a national center (Fermilab), tier 2 comprises regional centers, tier 3 consists of workgroup servers, and tier 4 consists of individual desktops.

Sample Application: Collaborative Design

● The volume of data associated with typical product lifecycle stages (concept, design, analysis, manufacturing, and field support) grows exponentially.

● At the same time, the need for effective data access, sharing, capture, and protection becomes increasingly important.

● Scalable and distributed solutions to these problems are critical components of product lifecycle management (PLM).

Sample Application: Collaborative Design

● Desirable Characteristics:

– Complexity and Interoperability

– Distributed Design Collaboration

– Reuse and Versioning

– Availability

– Performance

Collaborative Design: State-of-the-art

Source: www.netapp.com

Collaborative Design: State-of-the-art

● The client-server model is ideal for local-area environments but does not scale to a larger number of installations.

● Availability relies on conventional mechanisms such as snapshots, which do not facilitate real-time recovery or account for network failures.

● Minimal client-side support for end-user performance.

● Little or no support for content-based location, application-specific consistency mechanisms, or versioning techniques.

Plethora: System Overview

● Plethora Routing Core: Routing data requests to appropriate sites.

● Robustness: Novel erasure coding schemes.

● Versioning Semantics: Supporting read-write access efficiently and data reuse over wide-area networks.

● Content-based Location and Placement: Routing queries on content.

Plethora Routing Core

● Design goals: reliability, performance, end-user latency.

– Locality-enhancing multi-level overlays of participating sites.

– Efficient caching techniques to reduce end-user latency.

– Network maintenance via redundant overlay links and real-time monitoring and updates.

Plethora: Robustness

● Novel erasure coding techniques:

– Conventional (n,m) techniques can reconstruct data if any m of n total data blocks can be accessed.

– These techniques are resilient to multiple network and disk failures.

– These techniques, however, have considerable communication and computing overhead for block updates, block reconstruction, and for reconstituting the code.

– Plethora relies on novel codes that minimize these overheads.
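
The conventional (n,m) property described above can be illustrated with a minimal sketch. This is not Plethora's code (the slides do not specify it); it is a textbook polynomial-evaluation scheme over the prime field GF(257), and the names `encode`, `decode`, and `_interpolate` are hypothetical. Each block is a single integer symbol for brevity.

```python
P = 257  # prime modulus (an assumption for this sketch); any byte value 0..255 fits

def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Return n (index, value) blocks; blocks 0..m-1 carry the data itself,
    blocks m..n-1 carry redundancy."""
    m = len(data)
    assert m <= n < P
    points = list(enumerate(data))  # the polynomial passes through (i, data[i])
    return [(x, _interpolate(points, x)) for x in range(n)]

def decode(blocks, m):
    """Recover the m data symbols from any m blocks with distinct indices."""
    assert len(blocks) >= m
    subset = blocks[:m]
    return [_interpolate(subset, x) for x in range(m)]

# Usage: n = 7 blocks, any m = 4 of them suffice.
data = [72, 105, 33, 200]
blocks = encode(data, 7)
survivors = [blocks[6], blocks[1], blocks[4], blocks[5]]  # three blocks lost
assert decode(survivors, 4) == data
```

In this sketch, changing a single data symbol changes every one of the n - m redundancy blocks, and reconstruction always reads m blocks; this is the kind of update and reconstruction overhead the slide refers to.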

Plethora: Versioning Semantics

● Scaling to wide-area systems requires alternate concurrent data access semantics. Plethora relies on versioning semantics to facilitate performance.

– Each access is to a version of an object.

– Updates to objects are not reflected globally unless they are committed.

– The resulting version tree for each object can be reconciled in an application-specific manner.

● Versioning systems are ideally suited to high-latency environments with real-time applications. They also facilitate version-based data reuse.
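
A minimal in-memory sketch of these semantics follows. The slides do not describe Plethora's interface, so the class and method names here (`VersionedObject`, `checkout`, `commit`, `reconcile`) are hypothetical; the sketch only shows private updates, commits that grow a version tree, and reconciliation through an application-supplied merge function.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Optional

@dataclass(eq=False)
class Version:
    vid: int
    data: dict
    parent: Optional["Version"] = None
    children: list = field(default_factory=list)

class VersionedObject:
    """Hypothetical version tree for a single object."""

    def __init__(self, data):
        self._next_id = 0
        self.root = self._new(data, None)

    def _new(self, data, parent):
        v = Version(self._next_id, dict(data), parent)
        self._next_id += 1
        return v

    def checkout(self, version):
        """Each access is to a version: return a private working copy."""
        return dict(version.data)

    def commit(self, base, data):
        """Make an update visible as a new child of the version it was based on."""
        v = self._new(data, base)
        base.children.append(v)
        return v

    def leaves(self):
        """Current tips of the version tree, ordered by version id."""
        stack, found = [self.root], []
        while stack:
            v = stack.pop()
            if v.children:
                stack.extend(v.children)
            else:
                found.append(v)
        return sorted(found, key=lambda v: v.vid)

    def reconcile(self, merge):
        """Fold all tips together with an application-specific merge function."""
        return reduce(merge, [dict(v.data) for v in self.leaves()])

# Usage: two clients update the same base version concurrently.
obj = VersionedObject({"width": 10})
draft = obj.checkout(obj.root)
draft["width"] = 12                      # uncommitted, not visible to others
assert obj.root.data == {"width": 10}
obj.commit(obj.root, draft)
obj.commit(obj.root, {"width": 10, "height": 5})
merged = obj.reconcile(lambda a, b: {**a, **b})  # "later version wins" policy
```

The merge policy is deliberately left to the caller: a design application might merge geometry edits, while another application might simply pick one branch.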

Plethora: Content-Based Location

● Content-based location is critical for supporting design applications.

– Each data object has keys corresponding to searchable attributes installed in the Plethora routing core (keys are derived using conventional hashing techniques).

– The routing core is then used to route queries generated at clients (using the same hash function) to locate data objects.

– By giving applications the ability to install keys, Plethora supports powerful content-based searching.
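
The key-installation and query-routing steps above can be sketched as follows. The slides do not specify the hash function or the overlay, so this sketch assumes SHA-1 for key derivation and a simple consistent-hashing ring as a stand-in for the routing core; `key_for`, `RoutingCore`, `install`, and `query` are hypothetical names.

```python
import hashlib
from bisect import bisect_left

def key_for(attribute, value):
    """Derive a routing key from a searchable attribute (assumed: SHA-1, 64 bits)."""
    digest = hashlib.sha1(f"{attribute}={value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

class RoutingCore:
    """Stand-in for the routing core: each key is owned by the first node
    clockwise from it on a hash ring."""

    def __init__(self, node_names):
        self.ring = sorted((key_for("node", n), n) for n in node_names)
        self.index = {n: {} for n in node_names}  # per-node key -> object ids

    def _owner(self, key):
        pos = bisect_left(self.ring, (key, ""))
        return self.ring[pos % len(self.ring)][1]  # wrap around the ring

    def install(self, object_id, attributes):
        """Install one key per searchable attribute of the object."""
        for attr, value in attributes.items():
            key = key_for(attr, value)
            self.index[self._owner(key)].setdefault(key, set()).add(object_id)

    def query(self, attr, value):
        """Hash the query with the same function and route it to the owning node."""
        key = key_for(attr, value)
        return self.index[self._owner(key)].get(key, set())

# Usage: an application installs keys, then clients search by content.
core = RoutingCore(["site-a", "site-b", "site-c"])
core.install("bracket-v3", {"material": "steel", "type": "bracket"})
core.install("gear-v1", {"material": "steel"})
steel_parts = core.query("material", "steel")
```

Because installers and clients use the same hash function, a query reaches the node holding the matching keys without any central directory.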

Plethora: Deliverables

● Fully functional Plethora client.

● Extensive system-level and application-level scaling studies and performance characterization (simulations and deployment).

● Sample applications demonstrating large storage capabilities, access performance, collaboration facilities, and mobile applications that maximize value for sponsor.