Plethora: A Wide-Area Read-Write Storage Repository
Design Goals, Objectives, and Applications
Suresh Jagannathan, Christoph Hoffmann, Ananth Grama
Computer Sciences, Purdue University.
Plethora: Design Goals
● To build a wide-area read-write object repository from semi-static peers for supporting a single seamless distributed storage resource.
● To support desirable features of end-user performance, global resource utilization, robustness, and application support.
Plethora: Motivation
● A number of applications require:
– Large aggregate storage;
– Supporting distributed access to data;
– Collaborative operations on shared datasets;
– Content-based retrieval;
– Distributed services infrastructure; and
– High degree of availability and robustness.
Such applications motivate design decisions in Plethora.
Trends in Storage Software
Sample Applications: GriPhyN
● The Grid Physics Network (GriPhyN) is a classic example of an application built around a large dataset that is accessed by many people.
● Data is generated at the rate of roughly 1 PB/year in the form of high-energy physics experiment readouts (each experiment corresponds to roughly 1 MB of data).
● Researchers across the world access selected experiments.
Sample Applications: GriPhyN
Tier 0 is the data source (in this case CERN); tier 1 is a national center (Fermilab); tier 2 consists of regional centers; tier 3 consists of workgroup servers; and tier 4 consists of individual desktops.
Sample Application: Collaborative Design
● The volume of data associated with typical product lifecycle stages (concept, design, analysis, manufacturing, and field support) grows exponentially.
● At the same time, the need for effective data access, sharing, capture, and protection becomes increasingly important.
● Scalable and distributed solutions to these problems are critical components of PLM.
Sample Application: Collaborative Design
● Desirable Characteristics:
– Complexity and Interoperability
– Distributed Design Collaboration
– Reuse and Versioning
– Availability
– Performance
Collaborative Design: State-of-the-art
Source: www.netapp.com
Collaborative Design: State-of-the-art
● The client-server model is ideal for local-area environments but does not scale to a larger number of installations.
● Availability relies on conventional mechanisms such as snapshots, which do not facilitate real-time recovery or account for network failures.
● Minimal client-side support for end-user performance.
● Little or no support for content-based location, application-specific consistency mechanisms, or versioning techniques.
Plethora: System Overview
● Plethora Routing Core: Routing data requests to appropriate sites.
● Robustness: Novel erasure coding schemes.
● Versioning Semantics: Supporting read-write access efficiently and data reuse over wide-area networks.
● Content-based Location and Placement: Routing queries on content.
Plethora Routing Core
● Design goals: reliability, performance, end-user latency.
– Locality-enhancing multi-level overlays of participating sites.
– Efficient caching techniques for end-user latency.
– Network maintenance via redundant overlay links and real-time monitoring and updating.
Plethora: Robustness
● Novel erasure coding techniques:
– Conventional (n,m) techniques can reconstruct data if any m of n total data blocks can be accessed.
– These techniques are resilient to multiple network and disk failures.
– These techniques, however, have considerable communication and computing overhead for block updates, block reconstruction, and for reconstituting the code.
– Plethora relies on novel codes that minimize these overheads.
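
To make the (n,m) property and the update overhead concrete, here is a minimal sketch of the simplest conventional code, a single XOR parity block with n = m + 1. It is not Plethora's code, which the slides do not specify; the function names are hypothetical and chosen for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return n = m + 1 stored blocks: the m data blocks plus one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stored, missing_index):
    """Rebuild the block at missing_index from the other m blocks."""
    survivors = [b for i, b in enumerate(stored) if i != missing_index]
    return xor_blocks(survivors)


def update(stored, index, new_block):
    """Overwrite one data block; the parity block must be rewritten as well."""
    old_block = stored[index]
    stored[-1] = xor_blocks([stored[-1], old_block, new_block])
    stored[index] = new_block


stored = encode([b"abcd", b"efgh", b"ijkl"])   # m = 3, n = 4
recovered = reconstruct(stored, 1)             # pretend block 1 is unreachable
```

Even in this toy code, every update touches two blocks and every reconstruction reads m blocks, which illustrates the kind of communication overhead the slide refers to.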
Plethora: Versioning Semantics
● Scaling to wide-area systems requires alternate concurrent data access semantics. Plethora relies on versioning semantics to facilitate performance.
– Each access is to a version of an object.
– Updates to objects are not reflected globally unless they are committed.
– The resulting version tree for each object can be reconciled in an application-specific manner.
● Versioning systems are ideally suited to high-latency environments with real-time applications. They also facilitate version-based data reuse.
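
The versioning rules above can be sketched as a small in-memory version tree. This is an illustrative model under stated assumptions, not Plethora's interface: the class `VersionedObject`, its method names, and the rule that reconciliation merges the committed leaf versions are all hypothetical.

```python
import itertools


class VersionedObject:
    """Toy version tree: every access names a version; updates stay private until committed."""

    def __init__(self, data):
        self._ids = itertools.count(1)
        self.versions = {0: {"parent": None, "data": data, "committed": True}}

    def read(self, version_id):
        """Each access is to a specific version of the object."""
        return self.versions[version_id]["data"]

    def update(self, parent_id, data):
        """Create an uncommitted child version; it is not globally visible yet."""
        version_id = next(self._ids)
        self.versions[version_id] = {"parent": parent_id, "data": data, "committed": False}
        return version_id

    def commit(self, version_id):
        self.versions[version_id]["committed"] = True

    def visible_versions(self):
        return sorted(v for v, rec in self.versions.items() if rec["committed"])

    def leaves(self):
        """Committed versions that have no committed child."""
        parents = {rec["parent"] for rec in self.versions.values() if rec["committed"]}
        return [v for v in self.visible_versions() if v not in parents]

    def reconcile(self, merge_fn):
        """Apply an application-supplied merge function to the committed leaves."""
        return merge_fn([self.read(v) for v in self.leaves()])


obj = VersionedObject("base design")
a = obj.update(0, "design with thicker wall")
b = obj.update(0, "design with lighter alloy")
obj.commit(a)
obj.commit(b)
merged = obj.reconcile(lambda branches: " + ".join(branches))
```

Passing the merge function in from the caller mirrors the slide's point that reconciliation is application specific rather than fixed by the storage layer.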
Plethora: Content-Based Location
● Content-based location is critical for supporting design applications.
– Each data object has keys corresponding to searchable attributes installed in the Plethora routing core (keys are derived using conventional hashing techniques).
– The routing core is then used to route queries generated at clients (using the same hash function) to locate data objects.
– By giving applications the ability to install keys, Plethora supports a powerful content-based search capability.
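
The key-installation and query steps above can be sketched as follows. The slides say only that keys come from "conventional hashing techniques", so the use of SHA-1, the hash-ring placement of keys on nodes, and the names `key_for` and `RoutingCore` are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left


def key_for(attribute, value):
    """Hash an attribute/value pair to a 64-bit key (SHA-1 is an assumed choice)."""
    digest = hashlib.sha1(f"{attribute}={value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class RoutingCore:
    """Toy stand-in for a routing core: keys map to nodes on a hash ring."""

    def __init__(self, node_names):
        self.ring = sorted((key_for("node", name), name) for name in node_names)
        self.index = {name: {} for name in node_names}

    def _owner(self, key):
        """First node clockwise from the key, wrapping around the ring."""
        positions = [pos for pos, _ in self.ring]
        return self.ring[bisect_left(positions, key) % len(self.ring)][1]

    def install(self, object_id, attributes):
        """Install one key per searchable attribute of the object."""
        for attribute, value in attributes.items():
            key = key_for(attribute, value)
            self.index[self._owner(key)].setdefault(key, set()).add(object_id)

    def query(self, attribute, value):
        """Clients hash with the same function, so the query reaches the same node."""
        key = key_for(attribute, value)
        return self.index[self._owner(key)].get(key, set())


core = RoutingCore(["site-a", "site-b", "site-c"])
core.install("gearbox-v3", {"material": "steel", "part": "gear"})
core.install("bracket-v1", {"material": "steel"})
steel_objects = core.query("material", "steel")
```

Because installation and lookup share one hash function, a query never needs to know where an object lives; it only needs the attribute value it is searching for.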
Plethora: Deliverables
● Fully functional Plethora client.
● Extensive system-level and application-level scaling studies and performance characterization (simulations and deployment).
● Sample applications demonstrating large storage capabilities, access performance, collaboration facilities, and mobile applications that maximize value for sponsor.