CSE 598D Storage Systems, Spring 2007 Object Based Storage Presented By: Kanishk Jain


Page 1: CSE 598D Storage Systems, Spring 2007 Object Based Storage

CSE 598D Storage Systems, Spring 2007

Object Based Storage

Presented By: Kanishk Jain

Page 2: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Introduction

Object Based Storage

ANSI T10 Object-based Storage Devices Standard

Storage object: a logical collection of bytes on a storage device, with well-known methods for access, attributes describing characteristics of the data, and security policies that prevent unauthorized access.

“intelligent data layout”

Page 3: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Object Storage Interface

OSD model is simply a rearrangement of existing data management functions

OSD is a level higher than block access but one level below file access

Page 4: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Background – NAS sharing

NAS being used to share files among a number of clients

The files themselves may be stored on a fast SAN

The file server is used to intermediate all requests and thus becomes the bottleneck!

Page 5: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Background – SAN sharing

The files themselves are stored on a fast SAN (e.g., iSCSI) to which the clients are also attached

While the file server is removed as a bottleneck, security is a concern!

Page 6: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Object-based storage security architecture

Metadata managers grant capabilities to clients; clients present these capabilities to the devices on every I/O to ensure security

Secure separation of control and data path!
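
The capability flow on this slide can be sketched roughly as follows. This is a simplified illustration, not the T10 OSD wire protocol: the shared key, the field names, and the functions `issue_capability` and `osd_check` are all hypothetical, and a real device would also have the client sign each request rather than simply present the tag.

```python
import hashlib
import hmac

# Hypothetical secret shared by the metadata manager and the OSD, never the client.
SHARED_KEY = b"manager-osd-shared-secret"


def issue_capability(object_id: int, rights: str, expires_at: int) -> dict:
    """Metadata manager side: sign (object, rights, expiry) with the shared key."""
    payload = f"{object_id}:{rights}:{expires_at}".encode()
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return {"object_id": object_id, "rights": rights,
            "expires_at": expires_at, "tag": tag}


def osd_check(cap: dict, object_id: int, op: str, now: int) -> bool:
    """OSD side: recompute the tag, then check object, operation and expiry."""
    payload = f"{cap['object_id']}:{cap['rights']}:{cap['expires_at']}".encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, cap["tag"])  # capability was not altered
        and cap["object_id"] == object_id          # it names this object
        and op in cap["rights"]                    # e.g. "r" within "rw"
        and now < cap["expires_at"]                # it has not expired
    )
```

The point of the sketch is that the device can validate each I/O on its own, without contacting the metadata manager, which is what keeps the manager off the data path.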

Page 7: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Development of OSD

Most initial work on object storage devices (OSD) was done at the Parallel Data Lab at CMU

Focused on developing underlying concepts in two closely related areas: NASD and Active Disks

Proposed as part of the same project as NASD

Standardized in 2004: drafted by the Storage Networking Industry Association (SNIA) OSD working group and ratified by ANSI T10.

Page 8: CSE 598D Storage Systems, Spring 2007 Object Based Storage

OSD v/s Active Disks

The OSD standard only talks about the interface. It does not assume anything about the processing power at the disk.

OSD intelligence is software/firmware running at the disk (no specifications for this)

Processing power of an OSD can be scaled to meet the requirements of the functions of an active disk

Page 9: CSE 598D Storage Systems, Spring 2007 Object Based Storage

File System – Application side (User Component only)

Because the OSD has the intelligence to perform basic data management functions such as space allocation and free space management, those functions are no longer part of the application-side file system.

Thus the application-side file system is reduced to a manager: an abstraction layer between the user application and the OSD.

It only provides security and backward compatibility

Page 10: CSE 598D Storage Systems, Spring 2007 Object Based Storage

File System - On the Device (Storage Component)

Workload offered to OSDs may be quite different from that of general-purpose file systems

At the OSD level, objects typically have no logical relationship, presenting a flat name space

General-purpose file systems, which are usually optimized for workloads exhibiting relatively small variable-sized files, relatively small hierarchical directories, and some degree of locality, are not effective in this case

Page 11: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Object based File System

Separation of metadata and data paths: Separate metadata servers (MDS) manage the directory hierarchy, permissions and file to object mapping.

Distribution and replication of a file across a sequence of objects on many OSDs.

Example file systems: Lustre, Panasas, Ceph

Page 12: CSE 598D Storage Systems, Spring 2007 Object Based Storage
Page 13: CSE 598D Storage Systems, Spring 2007 Object Based Storage
Page 14: CSE 598D Storage Systems, Spring 2007 Object Based Storage
Page 15: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Some Optimizations in Ceph

Partitioning the directory tree: To efficiently balance load, the MDSs partition the directory tree across the cluster. A client guesses which metadata server is responsible for a file, and contacts that server to open the file. That MDS will forward the request to the correct MDS if necessary.

Distribution and replication of a file across a sequence of objects on many OSDs.

Limit on object size and use of regions: Ceph limits objects to a maximum size (e.g., 1 MB), so files are a sequence of bytes broken into chunks on the maximum object size boundary. Since only the MDSs hold the directory tree, OSDs do not have directory information to suggest layout hints for file data. Instead, the OSDs organize objects into small and large object regions, using small block sizes (e.g., 4 KB or 8 KB) for small objects and large block sizes (e.g., 50–100% of the maximum object size) for large objects.

Use of a specialized mapping algorithm: A file handle returned by the metadata server describes which objects on which OSDs contain the file data. A special algorithm, RUSH, maps a sequence index to the OSD holding the object at that position in the sequence, distributing the objects in a uniform way.
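
A rough sketch of the last two points, under stated assumptions: `locate` splits a file offset on the 1 MB example boundary, and `place` is a plain hash-modulo stand-in for RUSH, not the RUSH algorithm itself. The function names and the `file_id` parameter are hypothetical.

```python
import hashlib

MAX_OBJECT_SIZE = 1 << 20  # 1 MB, the example limit quoted above


def locate(file_offset: int) -> tuple[int, int]:
    """Return (index of the object in the file's sequence, offset inside it)."""
    return divmod(file_offset, MAX_OBJECT_SIZE)


def place(file_id: int, object_index: int, num_osds: int) -> int:
    """Stand-in for RUSH: hash (file, index) to an OSD number.

    This only illustrates a deterministic, roughly uniform mapping that any
    client can compute without asking the metadata server.
    """
    digest = hashlib.sha256(f"{file_id}:{object_index}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_osds
```

For example, `locate(3 * MAX_OBJECT_SIZE + 5)` returns `(3, 5)`: byte 5 of the fourth object. One caveat with this stand-in is that simple modulo hashing relocates most objects whenever the number of OSDs changes, which is the kind of reshuffling a specialized algorithm such as RUSH is meant to limit.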

Page 16: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Possible Performance Results

OBFS outperforms Linux Ext2 and Ext3 by a factor of two or three, and while OBFS is 1/25 the size of XFS, it provides only slightly lower read performance and 10%–40% higher write performance

Page 17: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Possible Performance Results (contd..)

Page 18: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Database Storage Management

Object attributes are also the key to giving storage devices an awareness of how objects are being accessed, so that they can use this information to optimize disk layout specific to the application.

Database software often has very little detailed information about the storage subsystem

Previous research took the view that a storage device can provide relevant characteristics to applications

Device-specific information is known to the storage subsystem, and thus it is better-equipped to manage low-level storage tasks

Page 19: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Database Storage Management (contd..)

Object attributes can contain information about the expected behavior of an object such as expected read/write ratio, access pattern (sequential vs. random), or expected size, dimension, and content of the object.

Using OSD, a DBMS can inform the storage subsystem of the geometry of a relation, thereby passing responsibility for low-level data layout to the storage device.

The dependency between the metadata and storage system/application is removed. This assists with data sharing between different storage applications
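
As a toy illustration of attribute-driven layout, the sketch below shows an application attaching hints to an object and a device-side policy acting on them. Everything here is hypothetical: the `ObjectHints` fields mirror the attributes listed above, while the layout names and thresholds in `choose_layout` are illustrative assumptions, not part of the OSD standard.

```python
from dataclasses import dataclass


@dataclass
class ObjectHints:
    """Hypothetical attribute set an application attaches to an object."""
    read_write_ratio: float  # expected reads per write
    sequential: bool         # expected access pattern
    expected_size: int       # expected size in bytes


def choose_layout(hints: ObjectHints) -> str:
    """Toy device-side policy; thresholds are illustrative assumptions."""
    if hints.sequential and hints.expected_size >= 1 << 20:
        return "contiguous-extent"   # large sequential scans
    if hints.read_write_ratio < 1.0:
        return "log-structured"      # write-heavy objects
    return "small-block"             # everything else
```

A DBMS could, for instance, mark a large relation that is scanned sequentially with `ObjectHints(10.0, True, 8 << 20)` and leave the actual placement decision to the device.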

Page 20: CSE 598D Storage Systems, Spring 2007 Object Based Storage

OSD Objects and Attributes

Page 21: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Scalability

Scalability – what does that word really mean?

Capacity: number of bytes, number of objects, number of files, etc. OSD aggregation techniques will allow for hierarchical representations of more complex objects that consist of larger numbers of smaller objects.

Performance: Bandwidth, Transaction rate, Latency. OSD performance management can be used in conjunction with OSD aggregation techniques to more effectively scale each of these three performance metrics and maintain required QoS levels on a per-object basis.

Connectivity: number of disks, hosts, arrays, etc. Since the OSD model requires self-managed devices and is transport agnostic, the number of OSDs and hosts can grow to the size limits of the transport network.

Geographic: LAN, SAN, WAN, …etc. Again, since the OSD model is transport agnostic and since there is a security model built into the OSD architecture, the geographic scalability is not bounded.

Processing Power: OSD processing power can be scaled.

Page 22: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Other Advantages

Manageability: the OSD management model relies on self-managed, policy-driven storage devices that can be centrally managed and locally administered (i.e., central policies, local execution).

Density: OSD on individual storage devices can optimize densities by abstracting the physical characteristics of the underlying storage medium

Cost: address issues such as $/MB, $/sqft, $/IOP, $/MB/sec, TCO, …etc.

Adaptability: to changing applications. Can the OSD be repurposed to different uses such as from a film editing station to mail serving?

Capability: can add functionality for different applications. Can additional functionality be added to an OSD to increase its usefulness?

Page 23: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Other Advantages (contd..)

Availability: Fail-over capabilities between cooperating OSD devices. 2-way failover versus N-way failover?

Reliability: Connection-integrity capabilities

Serviceability: Remote monitoring, remote servicing, hot-plug capability, genocidal sparing. When an OSD dies and a new one is put in its place, how does it get “rebuilt”? How automated is the service process?

Interoperability: Supported by many OS vendors, file system vendors, storage vendors, middleware vendors.

Power: decrease the power per unit volume by relying on the policy-driven self management schemes to “power down” objects (i.e. move them to disks and spin those disks down).

Page 24: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Cluster Computing

Traditionally a 'divide-and-conquer' approach: the problem to be solved is decomposed into thousands of independently executed tasks using the problem's inherent data parallelism. The data partitions that comprise each individual task are identified, and each task and its corresponding partition are then distributed to the compute nodes for processing.

Data from a shared storage system is staged (copied) to the compute nodes, processing is performed, and results are de-staged from the nodes back to shared storage when done. In many applications, the staging setup time can be appreciable, up to several hours for large clusters.

Page 25: CSE 598D Storage Systems, Spring 2007 Object Based Storage

OSD for Cluster Computing

Object-based storage clustering is useful in unlocking the full potential of these Linux compute clusters.

Intrinsic ability to linearly scale in capacity and performance to meet the demands of the supercomputing applications.

High bandwidth parallel data access between thousands of Linux cluster nodes and a unified storage cluster over standard TCP/IP networks.

Page 26: CSE 598D Storage Systems, Spring 2007 Object Based Storage

Commercial Products

Page 27: CSE 598D Storage Systems, Spring 2007 Object Based Storage

OSD Commands