DESIGN AND EVALUATION OF AN I/O CONTROLLER FOR DATA PROTECTION: DARC
PRESENTED BY: ZARA TARIQ (REG # 1537119), SAPNA KUMARI (REG # 1537131)
MOTIVATION
Data integrity for data at rest, through error detection and error correction
Protection against human errors, through transparent online versioning
Protection against storage device failures, through evolving RAID techniques
OUR APPROACH: DATA PROTECTION IN THE CONTROLLER
Use persistent checksums for error detection
  If an error is detected, use the second copy of the mirror for recovery
Use versioning for dealing with human errors
  After a failure, revert to a previous version
Perform both techniques transparently to devices: any type of (low-cost) device can be used
Potential for high-rate I/O
  Make use of specialized data-path & hardware resources
  Perform (some) computations on data while they are in transit
SYSTEM DESIGN & ARCHITECTURE
1. Host-controller I/O path
2. Buffer management
3. Context scheduling
4. Error detection and correction
5. Storage virtualization
6. Controller on-board cache
BUFFER MANAGEMENT
Buffer pools
  Pre-allocated, fixed-size
  2 classes: 64KB for application data, 4KB for control information
  Trade-off between space-efficiency and latency
I/O allocation/de-allocation overhead
  Lazy de-allocation
  De-allocate when idle, or under extreme memory pressure
Command & completion FIFO queues
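The buffer-pool idea above can be sketched roughly as follows. This is a minimal illustration, not DARC's implementation: the class name BufferPool, its methods, and the trim() policy are hypothetical; only the two buffer sizes come from the slide.

```python
from collections import deque

DATA_BUF_SIZE = 64 * 1024     # application data class (from the slide)
CTRL_BUF_SIZE = 4 * 1024      # control information class (from the slide)


class BufferPool:
    """Fixed-size buffer pool with lazy de-allocation (hypothetical sketch)."""

    def __init__(self, buf_size, prealloc):
        self.buf_size = buf_size
        # Pre-allocate buffers up front so the I/O path avoids allocation.
        self.free = deque(bytearray(buf_size) for _ in range(prealloc))
        self.in_use = 0

    def get(self):
        # Reuse a free buffer if one exists; allocate only when the pool is empty.
        buf = self.free.popleft() if self.free else bytearray(self.buf_size)
        self.in_use += 1
        return buf

    def put(self, buf):
        # Lazy de-allocation: return the buffer to the free list, do not free it.
        self.in_use -= 1
        self.free.append(buf)

    def trim(self, keep):
        # Called when idle or under memory pressure: drop surplus free buffers.
        released = 0
        while len(self.free) > keep:
            self.free.pop()
            released += 1
        return released
```

Keeping one pool per size class means a 4KB control message never occupies a 64KB buffer, at the cost of managing two free lists.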
CONTEXT SCHEDULING
Multiple in-flight I/O commands at any one time
I/O command processing proceeds in discrete stages, with several events/notifications being triggered at each
1. Option-I: Event-driven
  Design (and tune) a dedicated FSM
  Many events during I/O processing
  E.g.: DMA transfer start/completion, disk I/O start/completion, …
2. Option-II: Thread-based
  Encapsulate I/O processing stages in threads, schedule threads
We have used the thread-based option, running a full Linux OS
  Programmable, with infrastructure in place to build advanced functionality more easily
  But more s/w layers, with less control over the timing of events/interactions
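A rough sketch of the thread-based option: each processing stage runs in its own thread, and commands move between stages through FIFO queues. The stage names and the functions stage_worker and run_pipeline are assumptions made for illustration; the actual stages in DARC are not listed on the slide.

```python
import queue
import threading

STAGES = ["dma_in", "disk_io", "complete"]   # hypothetical stage names


def stage_worker(name, inbox, outbox):
    """Run one processing stage in its own thread."""
    while True:
        cmd = inbox.get()
        if cmd is None:            # sentinel: shut down and pass it on
            outbox.put(None)
            return
        cmd["trace"].append(name)  # stand-in for the real work of this stage
        outbox.put(cmd)


def run_pipeline(num_cmds):
    # One queue in front of each stage, plus a final queue for finished commands.
    queues = [queue.Queue() for _ in range(len(STAGES) + 1)]
    threads = [
        threading.Thread(target=stage_worker, args=(s, queues[i], queues[i + 1]))
        for i, s in enumerate(STAGES)
    ]
    for t in threads:
        t.start()
    for i in range(num_cmds):
        queues[0].put({"id": i, "trace": []})
    queues[0].put(None)
    done = []
    while True:
        cmd = queues[-1].get()
        if cmd is None:
            break
        done.append(cmd)
    for t in threads:
        t.join()
    return done
```

Several commands can be in flight at once, one per stage, while the scheduling itself is left to the operating system's thread scheduler. That is the trade-off noted above: less code to write than a hand-tuned FSM, but less control over timing.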
ERROR DETECTION AND CORRECTION
The DARC approach to correcting errors combines two mechanisms:
1. Error detection through the calculation of data checksums, which are persistently stored and checked on every read command
2. Error correction through data reconstruction using available data redundancy schemes.
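The two mechanisms can be illustrated with a small sketch over a two-way mirror. It assumes CRC32 as the checksum (the slides do not name the algorithm) and uses in-memory dictionaries in place of disks; the class MirroredStore is hypothetical.

```python
import zlib


class MirroredStore:
    """Two-way mirror with a per-block checksum kept alongside the data."""

    def __init__(self):
        self.copies = [{}, {}]     # two mirror copies: block number -> bytes
        self.checksums = {}        # block number -> CRC32 of the written data

    def write(self, blk, data):
        # Store the checksum once, then write the data to both mirror copies.
        self.checksums[blk] = zlib.crc32(data)
        for copy in self.copies:
            copy[blk] = bytes(data)

    def read(self, blk):
        expected = self.checksums[blk]
        for i, copy in enumerate(self.copies):
            data = copy[blk]
            if zlib.crc32(data) == expected:
                # Repair any copy that failed the check before this one.
                for bad in self.copies[:i]:
                    bad[blk] = data
                return data
        raise IOError(f"block {blk}: all mirror copies fail the checksum")
```

A read that finds a bad first copy returns the data from the second copy and rewrites the first, so the caller never sees the corruption. If both copies fail the check, the error is reported rather than returning bad data.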
SYSTOR 2010 - DARC
ERROR CORRECTION PROCEDURE IN THE CONTROLLER IO PATH
HOST-CONTROLLER I/O PATH
I/O commands [transferred via Host-initiated PIO]
  SCSI command descriptor block + DMA segments
  DMA segments reference host-side memory addresses
I/O completions [transferred via Controller-initiated DMA]
  Status code + reference to originally issued I/O command
Options for transfer of commands: PIO vs DMA
  PIO: simple, but with high CPU overhead
  DMA: high throughput, but completion detection is complicated
    Options: Polling, Interrupts
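Completion detection by polling can be sketched with a single-producer, single-consumer ring. The class CompletionRing and its layout are assumptions for illustration only; in a real controller the post() side would be a DMA write into host memory rather than a method call.

```python
class CompletionRing:
    """Ring of (status, command_tag) completion entries (hypothetical sketch)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.size = size
        self.head = 0   # next slot the host will read
        self.tail = 0   # next slot the controller will write

    def post(self, status, tag):
        # Controller side: stands in for a DMA write into host memory.
        nxt = (self.tail + 1) % self.size
        if nxt == self.head:
            return False            # ring full, controller must retry later
        self.slots[self.tail] = (status, tag)
        self.tail = nxt
        return True

    def poll(self):
        # Host side: drain every completion posted since the last poll.
        done = []
        while self.head != self.tail:
            done.append(self.slots[self.head])
            self.head = (self.head + 1) % self.size
        return done
```

Each entry carries a status code plus a tag that refers back to the originally issued command, matching the completion format described above. One slot is left unused so that a full ring can be told apart from an empty one.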
IO issue and Completion Path in DARC
STORAGE VIRTUALIZATION
DARC uses the Violin block-driver framework for volume virtualization & versioning (M. Flouris and A. Bilas, Proc. MSST, 2005)
Violin is located above the SCSI (Small Computer System Interface) drivers in the controller
Violin already provides versioning and RAID modules
VIOLIN
  Provides new virtualization functions for extension modules
  Combines these functions in storage hierarchies with rich semantics
  Meta-data persistence
  Supports asynchronous I/O (improves performance, but is challenging)
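To make volume versioning concrete, here is a minimal sketch of block-level versioning with revert. It is not Violin's versioning module: the class VersionedVolume and its per-version delta layout are assumptions chosen for brevity.

```python
class VersionedVolume:
    """Block volume that keeps one delta map per version (hypothetical sketch)."""

    def __init__(self):
        # One dict of block -> data per version; the last one is the current version.
        self.deltas = [{}]

    def write(self, blk, data):
        self.deltas[-1][blk] = data

    def snapshot(self):
        """Freeze the current version and return its number."""
        self.deltas.append({})
        return len(self.deltas) - 2

    def read(self, blk, version=None):
        # Search from the requested version back towards the oldest one.
        last = len(self.deltas) - 1 if version is None else version
        for delta in reversed(self.deltas[: last + 1]):
            if blk in delta:
                return delta[blk]
        return None

    def revert(self, version):
        """Discard everything written after the given snapshot."""
        del self.deltas[version + 1:]
        self.deltas.append({})
```

Because writes after a snapshot go to a fresh delta, older versions are never overwritten, and reverting after a human error only requires dropping the newer deltas.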
CONTROLLER ON-BOARD CACHE
Typically, I/O controllers have an on-board cache:
  Exploit temporal locality (recently-accessed data blocks)
  Read-ahead for spatial locality (prefetch adjacent data blocks)
  Coalescing small writes (e.g. partial-stripe updates with RAID-5/6)
Many design decisions needed
RAID affects cache implementation
  Performance
  Failures (degraded RAID operation)
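The first two cache roles, temporal locality and read-ahead, can be sketched with an LRU block cache. The class BlockCache, its backend callable, and the read_ahead parameter are hypothetical; write coalescing and RAID interactions are left out.

```python
from collections import OrderedDict


class BlockCache:
    """LRU read cache with simple sequential read-ahead (hypothetical sketch)."""

    def __init__(self, backend, capacity, read_ahead=1):
        self.backend = backend          # callable: block number -> data
        self.capacity = capacity
        self.read_ahead = read_ahead
        self.blocks = OrderedDict()     # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def _insert(self, blk, data):
        self.blocks[blk] = data
        self.blocks.move_to_end(blk)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict least recently used

    def read(self, blk):
        if blk in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(blk)
            return self.blocks[blk]
        self.misses += 1
        data = self.backend(blk)
        self._insert(blk, data)
        # Prefetch adjacent blocks to exploit spatial locality.
        for nxt in range(blk + 1, blk + 1 + self.read_ahead):
            if nxt not in self.blocks:
                self._insert(nxt, self.backend(nxt))
        return data
```

With read_ahead=1, a sequential scan misses only on every other block, since each miss also brings in the next block.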
SUMMARY OF DESIGN CHOICES
I/O STACK IN DARC - “DATA PROTECTION CONTROLLER”
[Figure: I/O stack. Layers shown: User-Level Applications, System Calls, Virtual File System (VFS), File System, Buffer Cache, Raw I/O, Block-level Device Drivers, SCSI Layer, Storage Controller]
CONCLUSION
I/O controllers are limited not so much by host connectivity capabilities as by internal resources and their allocation and management policies.
Data protection features can be incorporated in a commodity I/O controller:
  Integrity protection using persistent checksums
  Versioning of storage volumes
Several challenges remain in implementing an efficient I/O path between the host machine & the controller
REFERENCES
[1] T10 DIF (Data Integrity Field) standard. http://www.t10.org.
[2] Intel. Intel Xscale IOP Linux Kernel Patches. http://sourceforge.net/projects/xscaleiop/files/.
[3] M. D. Flouris and A. Bilas. Violin: A framework for extensible block-level storage. In Proceedings of 13th IEEE/NASA Goddard (MSST2005) Conference on Mass Storage Systems and Technologies, pages 128-142, Monterey, CA, Apr. 2005.
[4] M. D. Flouris and A. Bilas. Clotho: transparent data versioning at the block i/o level. In Proceedings of 12th IEEE/NASA Goddard (MSST2004) Conference on Mass Storage Systems and Technologies, pages 315-328, 2004.
[5] G. A. Gibson, D. F. Nagle, K. Amiri, J. Butler, F. W. Chang, H. Gobioff, C. Hardin, E. Riedel, D. Rochberg, and J. Zelenka. A cost-effective, high-bandwidth storage architecture. In Proc. of the 8th ASPLOS Conference. ACM Press, Oct. 1998.
[6] E. K. Lee and C. A. Thekkath. Petal: distributed virtual disks. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII), pages 84-93. ACM SIGARCH/SIGOPS/SIGPLAN, Oct. 1996.
[7] A. Krioukov, L. N. Bairavasundaram, G. R. Goodson, K. Srinivasan, R. Thelen, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. Parity lost and parity regained. In Proc. of the 6th USENIX Conf. on File and Storage Technologies (FAST08), pages 127-141, 2008.
[8] Microsoft. Optimizing Storage for Microsoft Exchange Server 2003. http://technet.microsoft.com/en-us/exchange/default.aspx
[9] C.-H. Moh and B. Liskov. Timeline: a high performance archive for a distributed object store. In NSDI, pages 351-364, 2004.
[10] V. Prabhakaran, L. N. Bairavasundaram, N. Agrawal, H. S. Gunawi, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. IRON file systems. In Proc. of the 20th ACM Symposium on Operating Systems Principles (SOSP 05), pages 206-220, Brighton, United Kingdom, October 2005.
[11] Markos Fountoulakis, Manolis Marazakis, Michail D. Flouris, and Angelos Bilas. DARC: Design and Evaluation of an I/O Controller for Data Protection. Foundation of Research and Technology Hellas (FORTH), Greece, May 2010.