43
Coding Techniques for Efficient, Reliable Networked Distributed Storage in Data Centers Anwitaman Datta Joint work with Frédérique Oggier (SPMS) School of Computer Engineering Nanyang Technological University Infocomm Professional Development Forum 7 th July 2011, Singapore

Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Coding Techniques for Efficient, Reliable g q ,Networked Distributed Storage in Data Centers

Anwitaman DattaJoint work with Frédérique Oggier (SPMS)

School of Computer EngineeringNanyang Technological University

Infocomm Professional Development Forum 7th July 2011, Singaporey , g p

Page 2: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Self-* Aspects of Networked Distributed SystemsSelf Aspects of Networked Distributed Systems

Me, myself & SANDS& SANDS

htt //sands sce ntu d sg/

© 2011 A. DattaNTU Singapore

http://sands.sce.ntu.edu.sg/

http://www.ntu.edu.sg/home/anwitaman/

Page 3: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

OutlineOo Preliminaries Data Center is at the center of the Cloud Data Center is at the center of the Cloud Failure is not an option, but …

• Known unknowns and unknown unknowns• Existing contingencies• Existing contingencies

o Fault-tolerance using erasure codes (ECs) Background

• RAID vs. ECs used as “Blackbox” o Erasure codes tailor-made for distributed networked

storageg Pyramid & hierarchical codes Regenerating codes Self-repairing codes

© 2011 A. DattaNTU Singapore

Self-repairing codes o Wrap-up

Page 4: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Preliminaries

o CloudPreliminaries- Cloud o Cloud

o Data center/Storage systemsDi t ib t d fil t

Cloud- Storage systems

Fault-tolerance

Tailor-made d Distributed file system

• Distributed data management– Key-Value store NoSQL Hadoop etc

codes

Wrap up

– Key-Value store, NoSQL, Hadoop, etc.

Reliable storage

© 2011 A. DattaNTU Singapore

Page 5: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

What is the Cloud?Wo At least we can all agreePreliminaries

- Cloud o At least, we can all agree Cloud is something “big” and happening! It’s all of these

Cloud- Storage systems

Fault-tolerance

Tailor-made d Old It s all of these …

• … and some more!codes

Wrap up

SaaS

Old wine

IaaSData

center

PaaS %*$aaS

© 2011 A. DattaNTU Singapore

Page 6: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

NIST Definition for Cloud ComputingN p gPreliminaries- Cloud o Cloud computing is a model for enablingCloud- Storage systems

Fault-tolerance

Tailor-made d

o Cloud computing is a model for enablingubiquitous, convenient, on-demand networkaccess to a shared pool of configurablecodes

Wrap up

access to a shared pool of configurablecomputing resources (e.g., networks,servers storage applications andservers, storage, applications, andservices) that can be rapidly provisionedand released with minimal managementand released with minimal managementeffort or service provider interaction.

© 2011 A. DattaNTU Singapore

Page 7: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Two Sides of the Cloud CoinPreliminaries- Cloud o Outside viewCloud- Storage systems

Fault-tolerance

Tailor-made d

o Outside view A “single/exclusive” entity

• Access through a “demilitarized zone”codes

Wrap up

• Access through a demilitarized zone …• API based• Agnostic to multi-tenancyg y

Infinite/elastic resources• Pay per use, on-demand, … y p

Browser based access (often)• Anytime, anywhere, any device …

© 2011 A. DattaNTU Singapore

Page 8: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Two Sides of the Cloud CoinPreliminaries- Cloud o Inside viewCloud- Storage systems

Fault-tolerance

Tailor-made d

o Inside view Pool of resources

• In flux: New compute units joining, old ones codes

Wrap up

p j gretiring

• Self-*: Load-balancing, fault-tolerance, auto-configuration,configuration, …

Multi-tenancy• Virtualization, transparent migration, …

Distributed file system and data-management• Google’s GFS, Amazon’s Dynamo, Yahoo!’s Pnuts

© 2011 A. DattaNTU Singapore

• Map Reduce/Hadoop, Pig, Chubby, …

Page 9: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

The New Stack Preliminaries- Cloud

ApplicationsCloud

- Storage systems

Fault-tolerance

Tailor-made d

NoSQLe g Map‐Reduce

SQL Implementationse g PIG (relationalcodes

Wrap up

e.g., Map Reduce, Hadoop, BigTable, Hbase, Cassandra …

e.g., PIG (relational algebra), HIVE, …

Distributed File System (e.g., Key‐Value Store)

Distributed Physical Infrastructure: Storage/Compute Nodes

Reliable storage service

© 2011 A. DattaNTU Singapore

Distributed Physical Infrastructure: Storage/Compute Nodes

Disclaimer: This is a personal “view”, and may not be standard/universal 

Page 10: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Data CenterPreliminaries- Cloud o Essentially a networked distributed storageCloud- Storage systems

Fault-tolerance

Tailor-made d

o Essentially a networked distributed storagesystem

codes

Wrap up

© 2011 A. DattaNTU Singapore

Source of topology: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/vmware/VMware.html

Page 11: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Data Center Design EvolutionGen 1  DC Collocation

Gen 2 Gen 3 Gen 4 (future)Modular Data Center

g

Deployment Scale UnitDeployment Scale Unit

ServerServer RackRack ContainersContainers PrePre‐‐Assembled Assembled 

Density and 

ScalabilityTh d   f 

Right Time to Market,

ComponentsComponents

© 2011 A. DattaNTU Singapore

Capacityand Sustainability

Thousands of Servers 

Lower TCO (PUE)Scalable Data Centers

Slide courtesy Roger Barga (Microsoft) from his P2P 2009 Keynote talk

Page 12: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Failure (of individual nodes) is inevitableFailure (of individual nodes) is inevitable

o But failure of the system is not an option!Preliminaries- Cloud o But, failure of the system is not an option!

o Solution: RedundancyCloud

- Storage systems

Fault-tolerance

Tailor-made dcodes

Wrap up

© 2011 A. DattaNTU Singapore

Page 13: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Is the Danger Real? Yesg

Preliminaries- CloudCloud- Storage systems

Fault-tolerance

Tailor-made dcodes

Wrap up

Cloud is 

© 2011 A. DattaNTU Singapore

NSFW

Page 14: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Is the Danger Real? Yeso There is also the danger of data falling in

“wrong hands”, e.g. due to security breach

g

Preliminaries- Cloud g , g yCloud- Storage systems

Fault-tolerance

Tailor-made dcodes

Wrap up

o Security/privacy issues are out of the scope ofo Security/privacy issues are out of the scope of this talk@ SANDS we work on those issues also

© 2011 A. DattaNTU Singapore

A*Star TSRP project pCloud• http://sands.sce.ntu.edu.sg/pCloud/

Page 15: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Data Center Fault-TolerancePreliminaries o Faults are omnipresentFault-tolerance- Existing approaches- Has EC a role?

o Faults are omnipresent Hardware, network,

software, human, Tailor-made codes

Wrap up

misconfiguration, …o Cascade of failures in

interdependent networks Power failure => Network

switches stop workingswitches stop working Network failure => Control

system for power system i ff ti

© 2011 A. DattaNTU Singapore

ineffective

Page 16: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Redundancy Based Fault ToleranceRedundancy Based Fault Tolerance

o Replicate dataPreliminarieso Replicate data e.g., 3 or more copies In nodes on different racks

Fault-tolerance- Existing approaches- Has EC a role?

In nodes on different racks• Can deal with switch failures

o Power back up using battery between racks

Tailor-made codes

Wrap up

o Power back-up using battery between racks (Google)

© 2011 A. DattaNTU Singapore

Page 17: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Redundancy Based Fault ToleranceRedundancy Based Fault Tolerance

o Using “independent” physical infrastructurePreliminarieso Using independent physical infrastructure Over different availability zones (Amazon AZ)

• How independent are components in a complex

Fault-tolerance- Existing approaches- Has EC a role?

• How independent are components in a complex network?

Over multiple geographical regions

Tailor-made codes

Wrap up

p g g p g

© 2011 A. DattaNTU Singapore

Page 18: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Amazon’s AWS: Availability Zones yPreliminaries

Fault-tolerance- Existing approaches- Has EC a role?

Tailor-made codes

Wrap up

© 2011 A. DattaNTU Singapore Note: The recent (April 2011) AWS outage was the first region‐wide failure

Page 19: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Five Levels of RedundancyFive Levels of Redundancy

o PhysicalPreliminarieso Physicalo Virtual resource

A il bili

Fault-tolerance- Existing approaches- Has EC a role?

o Availability zoneo Region

Tailor-made codes

Wrap up

o Cloud

© 2011 A. DattaNTU Singapore

From: http://broadcast.oreilly.com/2011/04/the‐aws‐outage‐the‐clouds‐shining‐moment.html

Page 20: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

At What Cost?At What Cost?Preliminaries

o Failure is not an option butFault-tolerance- Existing approaches- Has EC a role?

o Failure is not an option, but … … are the overheads acceptable?

Tailor-made codes

Wrap up

© 2011 A. DattaNTU Singapore

Page 21: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Reducing the Overheads of Redundancy

o Erasure codes

Reducing the Overheads of RedundancyPreliminaries

o Erasure codes Much lower storage overhead High level of fault tolerance

Fault-tolerance- Existing approaches- Has EC a role?

High level of fault-tolerance • In contrast to replication or RAID based systems

o Has the potential to significantly improve

Tailor-made codes

Wrap up

o Has the potential to significantly improve the “bottomline”

C it h t h th f d ? Can it however match the performance needs?• An open question

© 2011 A. DattaNTU Singapore

Page 22: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Erasure Codes (ECs)Erasure Codes (ECs)

o Originally designed for communication

B1

B

o Originally designed for communication EC(n,k)

Receive any  k’ (≥ k) blocks 

ge ata

O1

O2

B2 O1

O2

Encoding Decoding…… …

a = messag

nstruct D

a

Data

Reco

Ok Ok

Lost blocks

© 2011 A. DattaNTU Singapore

n encoded blocksOriginalk blocks k blocks

k

Bn

k

Page 23: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Erasure Codes for Networked StorageErasure Codes for Networked Storage

OB1 Retrieve any   O

ject

O1

O2B2

k’ (≥ k) blocks 

 Data

O1

O2

Data = Obj

Encoding… …

construct

DecodingBl

D

Ok

… Lost blocks O i i l

Re

Ok

k blocks Bn

n encoded blocks

… Lost blocks Originalk blocks

© 2011 A. DattaNTU Singapore

(stored in storage devices in a network)

Page 24: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Static ResilienceStatic Resilienceo Replicated r times

F lt th t b t l t d 1

objectFor f=0.1its 10‐3Preliminaries

Faults that can be tolerated: r-1 Probability of failure: f r

Storage efficiency: 1/r

replic

replic

replic

Fault-tolerance- Existing approaches- Has EC a role?

g y Access: Find any one good replica

o Erasure coded (k of n)

ca ca ca

object

For f=0.13 of 9 code

Tailor-made codes

Wrap up

Faults that can be tolerated: n-k Probability of failure:

k k

object

Blk Blk Blk

its ~3*10‐6

Storage efficiency: k/n

 jkjkn

k

jff

jknn

)1(

1

 jkjkn

k

jff

jknn

)1(

1

Blk

Blk

Blk

Blk

Blk

Blk

© 2011 A. DattaNTU Singapore

Storage efficiency: k/n Access: Find k good blocks

o Assumption: Peer failure is i.i.d. with failure probability f

Page 25: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Replenishing Lost Redundancy for ECs p g yo Repairs needed for long-term resilience

B2

B1

Retrieve any  k’ (≥ k) blocks

O1

O2Recreate lost blocks…

…Decoding EncodingBl

lost blocks

Re‐insert

Reinsert in (new) d i

Bn

… Lost blocks Originalk blocks

Ok storage devices, so that there is (again) n encoded blocks

© 2011 A. DattaNTU Singapore

k blocksn encoded blocks

o Repairs are expensive!

Page 26: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Can We Do Better?

o What is the best one can do (w.r.to repairs)?

Can We Do Better?Preliminaries

Minimize bandwidth usage per repair Minimize number of live nodes used per repair

Fault-tolerance- Existing approaches- Has EC a role?

o Erasure codes have some other drawbacks Coding/Decoding is “Expensive”

Tailor-made codes

Wrap upg g p

• In contrast to replication or RAID/XOR based systems• Systematic codes can help (with decoding/access)!

N t d t h l d b l i i l i !!– Not adequate when load-balancing is also an issue!!

More complex system design We do not attempt to address these explicitly

© 2011 A. DattaNTU Singapore

We do not attempt to address these explicitly• But, some solution we will arrive at will be amenable!

Page 27: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Can We Do Better?Can We Do Better?Preliminaries o Erasure codes tailor-made for distributed Fault-tolerance

Tailor-made codes- Pyramid codes

networked storage

- Regenerating Codes - Self-repairing codes

Wrap upPyramid codes

Wrap up

Regenerating codes

Self‐repairing codes

© 2011 A. DattaNTU Singapore

Page 28: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Pyramid & Hierarchical CodesPyramid & Hierarchical CodesPreliminaries o Essentially, “nested” use of erasure codesFault-tolerance

Tailor-made codes- Pyramid codes

Oa1

…Oa2

Bla1 loca

redund

subgro

- Regenerating Codes - Self-repairing codes

Wrap up

…Bl

an’

a2

Bg1

aldancy

gred

oup

Rep

global rethe c

Wrap up

Bgr’

1 globaldundancy

plicate cod

edundancy code group

Os1

Os2

localredundan…

Bls1

subgrou

…de group

from

ps

© 2011 A. DattaNTU Singapore

ncyBlsn’

up

Code group Multi-hierarchical extension

Page 29: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Pyramid & Hierarchical CodesPyramid & Hierarchical CodesPreliminaries

o Pros If `small’ number of faults

Fault-tolerance

Tailor-made codes- Pyramid codes

• Communication restricted within the `hierarchy’ suffice• Progressively go higher-up for larger number of faults• Isolated faults can be repaired independently

- Regenerating Codes - Self-repairing codes

Wrap up

p p y Naturally maps to hierarchical data-center design?

o ConsWrap up Asymmetry

• Different encoded blocks have different importance• Difficult to analyze• Complex algorithm (for decoding/repair) and system design

© 2011 A. DattaNTU Singapore

Another (essentially identical, but independent) proposal:  Hierarchical Codes: How to Make Erasure Codes Attractive for Peer‐to‐Peer Storage Systems from Eurecom

Page 30: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Regenerating CodesRegenerating CodesPreliminaries o Network information flow based argumentsFault-tolerance

Tailor-made codes- Pyramid codes

o Network information flow based arguments to determine “optimal” trade-off of storage/repair-bandwidth

- Regenerating Codes - Self-repairing codes

Wrap up

sto age/ epa ba dw dt

Wrap up

© 2011 A. DattaNTU Singapore

Page 31: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Regenerating CodesRegenerating CodesPreliminaries o Example code (w/ Functional Repair)Fault-tolerance

Tailor-made codes- Pyramid codes

p ( p ) Based on random linear network coding

• Exact repairing codes have also been proposed- Regenerating Codes - Self-repairing codes

Wrap upWrap up

© 2011 A. DattaNTU Singapore

Page 32: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Regenerating Codesg g

Preliminaries o ProsFault-tolerance

Tailor-made codes- Pyramid codes

Network information flow analysis determines optimal (w.r.to repair bandwidth)

• Catch: Subject to MDS property of code (more on this, later)- Regenerating Codes - Self-repairing codes

Wrap up

Some proposed codes apply network coding on top of ECs

• Inherit the properties for EC for de/codingWrap up Inherit the properties for EC for de/coding

o Cons Codes for only specific points on trade-off curve

• Information flow analysis itself does not suggest any code Restrictive

• Some proposed codes can carry out repair only for one fault

© 2011 A. DattaNTU Singapore

p p y p y• Needs to contact all live nodes for repair to be optimal

Not simple: Algorithmic as well as system design

Page 33: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Self-repairing Codes (SRC)p g ( )

Preliminaries o Self-repairing codes are (n; k) codes s.t.Fault-tolerance

Tailor-made codes- Pyramid codes

p g ( ) encoded fragments can be repaired directly

from other subsets of encoded fragments.- Regenerating Codes - Self-repairing codes

Wrap up

a fragment can be repaired from a fixed number of encoded fragments, independently of which

Wrap up specific blocks are missing• Analogous to erasure codes supporting

reconstruction using any n k losses independentlyreconstruction using any n - k losses, independently of which

© 2011 A. DattaNTU Singapore

Page 34: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Self-repairing Codes: Blackbox Viewp g

Preliminaries B1 Retrieve some  Fault-tolerance

Tailor-made codes- Pyramid codes

B2

k” (< k) blocks (e.g. k”=2)to recreate a lost block

- Regenerating Codes - Self-repairing codes

Wrap up

Bl

Re‐insert

Wrap up

… Lost blocksReinsert in (new) storage devices, so h h ( )

Bn

n encoded blocks

… Lost blocks that there is (again) n encoded blocks

© 2011 A. DattaNTU Singapore

n encoded blocks(stored in storage 

devices in a network)

Page 35: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Self-repairing Codesp g

Preliminaries o Two families of SRCs have been proposedFault-tolerance

Tailor-made codes- Pyramid codes

p p

- Regenerating Codes - Self-repairing codes

Wrap upWrap up

© 2011 A. DattaNTU Singapore

Page 36: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Self-repairing Codesp g

Preliminaries o There is at least one pair to repair a node, Fault-tolerance

Tailor-made codes- Pyramid codes

p pfor up to (n -1)/2 simultaneous failures Parallel & fast repair of multiple fairs

- Regenerating Codes - Self-repairing codes

Wrap up

p po PSRC (projective geometric SRC) example Data object split in four parts: PSRC(5 3)Wrap up Data object split in four parts: PSRC(5,3)

© 2011 A. DattaNTU Singapore

Page 37: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Toy Example: PSRC(5,3) repairy p ( , ) p

Preliminaries

Fault-tolerance

Tailor-made codes- Pyramid codes- Regenerating Codes - Self-repairing codes

Wrap up

(o1+o2+o4) + (o1) => o2+o4R i i t dWrap up (o3) + (o2+o3) => o2

(o1) + (o2) => o1+ o2

Repair using two nodes

Four pieces needed to regenerate two pieces 

Say N1 and N3

(o +o +o ) + (o ) => o +o(o2) + (o4) => o2+ o4Repair using three nodes

© 2011 A. DattaNTU Singapore

(o1+o2+o4) + (o4) => o1+o2

Three pieces needed to regenerate two pieces 

Say N2, N3 and N4

Page 38: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Toy Example: PSRC(5,3) reconstructiony p ( , )

Preliminaries

Fault-tolerance

Tailor-made codes- Pyramid codes- Regenerating Codes - Self-repairing codes

Wrap upWrap up o3o4(o3) + (o1+o3) => o1(o1) +(o4)+(o1+o2+o4) => o2(o1) +(o4)+(o1+o2+o4)  > o2

Reconstruction, say using N3, N4 and N5

© 2011 A. DattaNTU Singapore

Page 39: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Symmetry in SRCsy y

o All encoded blocks have symmetric rolePreliminarieso All encoded blocks have symmetric role Equivalent importance of all blocks

• for both data reconstruction & repair

Fault-tolerance

Tailor-made codes- Pyramid codes • for both data reconstruction & repair

o Symmetry is goodE t l d t d d i l t

- Regenerating Codes - Self-repairing codes

Wrap up Easy to analyze, understand and implement Simpler algorithm and system design

Wrap up

© 2011 A. DattaNTU Singapore

Page 40: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Maximum Distance Separable (MDS)?Maximum Distance Separable (MDS)?o SRC is not MDS (and can not be!)

D i ?Preliminaries

Does it matter?• Not much• In practice access will be “planned”

Fault-tolerance

Tailor-made codes- Pyramid codes • In practice, access will be “planned” …

PSRC needs less bandwidth than `optimal’ RGC!- Regenerating Codes - Self-repairing codes

Wrap up

This is with random access

Wrap up

© 2011 A. DattaNTU Singapore

PSRC(21,3)

Page 41: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Practical propertiesp po (Current) SRCs are not systematic PSRC is “like systematic”

Preliminaries PSRC is like systematic Need to contact more nodes (than k)

• To obtain systematic `pieces’

Fault-tolerance

Tailor-made codes- Pyramid codes

• Same total bandwidth usage– Parallel download for access can even be an `advantage’

• `mixed’ strategies for access, i.e. get some systematic

- Regenerating Codes - Self-repairing codes

Wrap up g , g ypieces, and some others …

– Power saving (by switching off nodes) strategies possible

Wrap up

© 2011 A. DattaNTU Singapore

Page 42: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Practical propertiesp po Self-repair implies

h “l ll d d bl ”Preliminaries

somewhat “locally decodable”• If access to only part of the whole object is desired

C di /d di i PSRC b h i

Fault-tolerance

Tailor-made codes- Pyramid codes

o Coding/decoding in PSRC are both using XOR operations only

- Regenerating Codes - Self-repairing codes

Wrap upWrap up

© 2011 A. DattaNTU Singapore

Page 43: Codinggq , Techniques for Efficient, Reliable Networked ...sands.sce.ntu.edu.sg/CodingForNetworkedStorage/pdf/IDA...o Fault-tolerance using erasure codes (ECs) Background • RAID

Outlook

o 2020: Self-repairing codes in a data-center near Preliminaries

you?o Ongoing:

Fault-tolerance

Tailor-made codes g g

Concepts/Implementation Prototype miniature

data center

Wrap up

data-center Template for

preassembled component of a modular 4G+ data center

o Interested to Follow:

© 2011 A. DattaNTU Singapore

http://sands.sce.ntu.edu.sg/CodingForNetworkedStorage/ Get involved: {anwitaman,frederique}@ntu.edu.sg