
Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes


Page 3: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Part of OceanStore...

Page 4: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

[Figure: architecture diagram with Client, Naming/Location, and Cache components]

Page 5: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Erasure Codes

• Erasure code: a form of data coding that allows lost portions of data to be recovered

• The idea is similar to ECC (error-correcting codes), except that the algorithm must be told which portions of the data are missing

• Reed-Solomon codes are a common type of erasure code, but they are computationally expensive and are usually implemented in hardware
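The difference from ECC can be illustrated with the simplest possible erasure code, a single XOR parity block. This is only a minimal sketch and not the code used in Typhoon: the block size and the names `make_parity` and `recover_block` are made up for illustration. Note that the caller has to say which block is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8   /* illustrative only */

/* parity = XOR of all n data blocks */
void make_parity(unsigned char blocks[][BLOCK_SIZE], size_t n,
                 unsigned char *parity)
{
    memset(parity, 0, BLOCK_SIZE);
    for (size_t b = 0; b < n; ++b)
        for (size_t i = 0; i < BLOCK_SIZE; ++i)
            parity[i] ^= blocks[b][i];
}

/* Rebuild blocks[missing] from the parity block and the surviving blocks.
 * The index of the lost block is an input: the code cannot find it itself. */
void recover_block(unsigned char blocks[][BLOCK_SIZE], size_t n,
                   const unsigned char *parity, size_t missing)
{
    assert(missing < n);
    memcpy(blocks[missing], parity, BLOCK_SIZE);
    for (size_t b = 0; b < n; ++b)
        if (b != missing)
            for (size_t i = 0; i < BLOCK_SIZE; ++i)
                blocks[missing][i] ^= blocks[b][i];
}
```

A single parity block tolerates only one lost block; Reed-Solomon and Tornado codes generalize the idea to many losses.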

Page 6: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Tornado Codes: A Linear-Time Probabilistic Family of Erasure Codes

• Tornado Codes are linear time, but use probabilistic assumptions to “guarantee” that the decoding process will succeed

• A rate-1/2 erasure code doubles the size of a file

• With an ideal rate-1/2 code, any half of the encoded file can be used to recreate the original data

• Tornado Codes require slightly more than half of the encoded file, thus trading network bandwidth for speed

– The inventors of Tornado Codes report that 5% extra is typical
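The size arithmetic on this slide can be written down as a small sketch. It assumes that the quoted 5% is measured relative to the original file size, which the slide does not state explicitly; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the encoded file for a rate-1/2 code: twice the original. */
uint64_t encoded_size(uint64_t file_bytes)
{
    return 2 * file_bytes;
}

/* Bytes that must be received before decoding is expected to succeed,
 * ASSUMING the overhead is a percentage of the original file size.
 * overhead_percent = 0 models an ideal code that needs exactly half. */
uint64_t bytes_needed(uint64_t file_bytes, unsigned overhead_percent)
{
    assert(overhead_percent <= 100);
    /* round the extra bytes up to a whole byte */
    return file_bytes + (file_bytes * overhead_percent + 99) / 100;
}
```

Under this reading, a 1,000,000-byte file becomes 2,000,000 encoded bytes; an ideal code needs any 1,000,000 of them, while a Tornado Code needs about 1,050,000.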

Page 7: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Encoding Process

• The file is divided into nodes of equal size (e.g. 512 bytes)

• Data nodes are associated with check nodes using a series of bipartite graphs

• The contents of a check node are the XOR of its neighbors

• The bipartite graphs are constructed to satisfy mathematical constraints that “guarantee” the recovery process will successfully recover the file

[Figure: the data file is split into data nodes, which are connected to check nodes]
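The XOR rule for check nodes can be sketched as follows. This is an illustration under assumptions, not the Typhoon source: the data nodes are stored back to back in one buffer, the neighbor list is passed in directly, and `compute_check_node` is a hypothetical name. Choosing the graph itself is the hard part and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODE_SIZE 512   /* node size from the example above */

/* check = XOR of the data nodes whose indices are in neighbors[0..degree-1].
 * data holds num_data nodes of NODE_SIZE bytes each, stored contiguously. */
void compute_check_node(const unsigned char *data, size_t num_data,
                        const size_t *neighbors, size_t degree,
                        unsigned char *check)
{
    memset(check, 0, NODE_SIZE);
    for (size_t k = 0; k < degree; ++k) {
        assert(neighbors[k] < num_data);
        const unsigned char *node = data + neighbors[k] * NODE_SIZE;
        for (size_t i = 0; i < NODE_SIZE; ++i)
            check[i] ^= node[i];
    }
}
```

The cost is proportional to the number of edges times the node size, which is why a sparse graph gives linear-time encoding.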

Page 8: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Encoding Process

[Figure: data file, data nodes, and check nodes]

• Once a file is encoded, the data nodes and check nodes are randomly distributed to a set of recipients

Page 9: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

• There are eight MMX registers

• Data in the 64-bit registers can be divided into four different sizes: eight bytes, four 16-bit words, two 32-bit doublewords, or one 64-bit quadword

Page 10: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

• MMX has 57 instructions for 6 types of operations:

– ADD

– SUBTRACT

– MULTIPLY

– MULTIPLY THEN ADD

– COMPARISON

– LOGICAL: AND, AND NOT (PANDN), OR, XOR

Page 11: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

Baseline: byte-wise XOR of two 512-byte nodes.

char array1[512];
char array2[512];

for (int i = 0; i < 512; ++i)
    array1[i] = array1[i] ^ array2[i];

The MMX version is 2.3 times faster than this (1.9 times without pipeline scheduling).

Page 12: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

Word-wise XOR: the same buffers processed one long at a time.

char array1[512];
char array2[512];

/* assumes both arrays are suitably aligned for long */
long *array1ptr = (long *)array1;
long *array2ptr = (long *)array2;

for (int i = 0; i < (int)(512 / sizeof(long)); ++i)
    array1ptr[i] = array1ptr[i] ^ array2ptr[i];

The MMX version is 50% faster than this (22% without scheduling).

Page 13: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

MMX XOR: 32 bytes per call.

char array1[512];
char array2[512];

/* xor32fast handles 32 bytes per call, so step through the buffers in
   32-byte strides and pass the destination explicitly */
for (int i = 0; i < 512; i += 32)
    xor32fast((long *)(array1 + i), (long *)(array2 + i), (long *)(array1 + i));

Page 14: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

/* Unscheduled version: all loads, then all XORs, then all stores.
   The caller should execute emms after the last call, before any
   floating-point code. */
inline void xor32bytes(long *array1reg, long *array2reg, long *destreg)
{
    _asm {
        mov  eax, [array1reg]
        mov  ecx, [array2reg]
        movq mm0, [eax]
        movq mm1, [ecx]
        movq mm2, [eax+8]
        movq mm3, [ecx+8]
        movq mm4, [eax+16]
        movq mm5, [ecx+16]
        movq mm6, [eax+24]
        movq mm7, [ecx+24]
        pxor mm0, mm1        ; 64-bit xor
        pxor mm2, mm3        ; 64-bit xor
        pxor mm4, mm5        ; 64-bit xor
        pxor mm6, mm7        ; 64-bit xor
        mov  ecx, [destreg]
        movq [ecx], mm0      ; store result
        movq [ecx+8], mm2    ; store result
        movq [ecx+16], mm4   ; store result
        movq [ecx+24], mm6   ; store result
    }
}

Page 15: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

MMX: SIMD or Marketing?

/* Scheduled version: loads, XORs, and stores are interleaved so that
   instructions can pair in the Pentium U and V pipelines. */
inline void xor32fast(long *array1reg, long *array2reg, long *destreg)
{
    _asm {
        mov  eax, [array1reg]
        mov  ebx, [array2reg]
        mov  ecx, [destreg]
        movq mm0, [eax]      ; load 1a   U
        movq mm1, [ebx]      ; load 1b   U
        movq mm2, [eax+8]    ; load 2a   U
        pxor mm0, mm1        ; xor 1     V
        movq mm3, [ebx+8]    ; load 2b   U
        movq [ecx], mm0      ; store 1   U
        pxor mm2, mm3        ; xor 2     V
        movq mm4, [eax+16]   ; load 3a   U
        movq mm5, [ebx+16]   ; load 3b   U
        movq mm6, [eax+24]   ; load 4a   U
        pxor mm4, mm5        ; xor 3     V
        movq mm7, [ebx+24]   ; load 4b   U
        movq [ecx+8], mm2    ; store 2   U
        pxor mm6, mm7        ; xor 4     V
        movq [ecx+16], mm4   ; store 3   U
        movq [ecx+24], mm6   ; store 4   U
    }
}
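For readers without an MSVC x86 toolchain, the same 32-byte XOR can be sketched in portable C. This is not the code that was measured in the slides; `xor32_portable` and `xor_node512` are hypothetical names, and `memcpy` is used so that no alignment or aliasing assumptions are needed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* XOR 32 bytes of a with 32 bytes of b into dest, 64 bits at a time.
 * Both inputs are copied before dest is written, so dest may alias a or b. */
void xor32_portable(const void *a, const void *b, void *dest)
{
    uint64_t wa[4], wb[4];
    assert(a && b && dest);
    memcpy(wa, a, sizeof wa);
    memcpy(wb, b, sizeof wb);
    for (int i = 0; i < 4; ++i)
        wa[i] ^= wb[i];          /* one 64-bit XOR, like one pxor */
    memcpy(dest, wa, sizeof wa);
}

/* XOR two 512-byte nodes in place: node1 ^= node2. */
void xor_node512(unsigned char *node1, const unsigned char *node2)
{
    for (int off = 0; off < 512; off += 32)
        xor32_portable(node1 + off, node2 + off, node1 + off);
}
```

Whether this matches the hand-scheduled MMX routine depends on the compiler and target, so it should be benchmarked rather than assumed.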

Page 16: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Encoding Process

• Server sends a storage announcement to a particular set of servers

– The set can be determined/specified using multicast groups, a server list, or some form of DNS address lookup

[Figure: storage announcement sent via UDP]

Page 17: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Encoding Process

[Figure: the same storage announcement sent via multicast]

Page 18: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Encoding Process

• Server encodes the file

• During the encoding process, the data nodes and check nodes are [randomly] distributed to other servers
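One way to picture the random distribution step is a shuffled round-robin assignment. The slides do not say how Typhoon picks a server for each node, so everything here is an assumption: the balanced assignment, the toy random number generator, and the name `distribute_nodes`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small linear congruential generator so the sketch is reproducible.
 * A real system would want a better random source. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;
}

/* Fill owner[i] with the server (0..num_servers-1) that stores node i.
 * Nodes are dealt round-robin and then shuffled (Fisher-Yates), so each
 * server gets either floor or ceil of num_nodes / num_servers nodes. */
void distribute_nodes(size_t *owner, size_t num_nodes,
                      size_t num_servers, uint32_t seed)
{
    assert(num_servers > 0);
    for (size_t i = 0; i < num_nodes; ++i)
        owner[i] = i % num_servers;
    for (size_t i = num_nodes; i > 1; --i) {
        size_t j = lcg_next(&seed) % i;   /* 0 <= j < i */
        size_t tmp = owner[i - 1];
        owner[i - 1] = owner[j];
        owner[j] = tmp;
    }
}
```

Spreading nodes evenly means that losing any one server removes roughly the same fraction of the encoded file.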

Page 19: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Decoding Process

• A set of nodes is received, ideally a random subset of the encoded file

• Check nodes can be used to recover missing data nodes

• Only check nodes that are missing exactly one neighbor can recreate a data node

• The structure of the graph ensures [w.h.p.] that the decoding process will succeed

– The graph is designed so that there is always at least one check node that is missing only one child

– Data nodes can also be used to recover check nodes, but this is not important

[Figure: data file nodes and check nodes; legend: node received, node not received]
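The recovery rule can be sketched as a simple "peeling" loop over one bipartite layer. This is an illustration under assumptions rather than the Typhoon decoder: nodes are single bytes for brevity, `CheckNode` and `peel_decode` are hypothetical names, and the loop rescans all check nodes on every pass, whereas a genuinely linear-time decoder would keep a work queue of check nodes that have one missing neighbor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEGREE 4

typedef struct {
    size_t neighbors[MAX_DEGREE];  /* indices of adjacent data nodes */
    size_t degree;
    unsigned char value;           /* XOR of all neighbors */
    bool received;
} CheckNode;

/* Repeatedly find a received check node with exactly one missing neighbor
 * and rebuild that neighbor. Returns true if every data node is known
 * when no further progress can be made. */
bool peel_decode(unsigned char *data, bool *have, size_t num_data,
                 const CheckNode *checks, size_t num_checks)
{
    bool progress = true;
    while (progress) {
        progress = false;
        for (size_t c = 0; c < num_checks; ++c) {
            if (!checks[c].received)
                continue;
            size_t missing = 0, missing_idx = 0;
            unsigned char acc = checks[c].value;
            for (size_t k = 0; k < checks[c].degree; ++k) {
                size_t d = checks[c].neighbors[k];
                assert(d < num_data);
                if (have[d]) {
                    acc ^= data[d];     /* cancel the known neighbor */
                } else {
                    ++missing;
                    missing_idx = d;
                }
            }
            if (missing == 1) {         /* acc is now the missing node */
                data[missing_idx] = acc;
                have[missing_idx] = true;
                progress = true;
            }
        }
    }
    for (size_t d = 0; d < num_data; ++d)
        if (!have[d])
            return false;
    return true;
}
```

Each recovered data node can reduce the number of missing neighbors of other check nodes, which is what lets the process cascade. If every remaining check node is missing two or more neighbors, the loop stalls and decoding fails.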

Page 33: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Overview of Decoding Process

• Server sends a file request announcement to a particular set of servers

• Retrieves data from multiple servers simultaneously

• Recovery process can be performed in parallel with receive (network-based RAID-1)

• Depending on the data loss pattern, a particular subset of the servers can be selected

– Fastest servers (closest servers, or least utilized servers)

– Operational servers (i.e., some portion of the set is not functioning)

– All servers might be needed in some cases, such as network congestion / packet loss
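The server selection idea can be sketched as sorting the candidate servers and taking the fastest operational ones. The slides do not describe how Typhoon measures speed or detects failures, so the `Server` fields, the latency figures, and the name `select_servers` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int id;
    bool operational;
    double latency_ms;   /* hypothetical measured round-trip time */
} Server;

/* Failed servers sort last; otherwise lower latency comes first. */
static int by_latency(const void *a, const void *b)
{
    const Server *x = a, *y = b;
    if (x->operational != y->operational)
        return x->operational ? -1 : 1;
    return (x->latency_ms > y->latency_ms) - (x->latency_ms < y->latency_ms);
}

/* Sorts servers in place and returns how many of the first `wanted`
 * entries are usable (fewer than `wanted` if too many servers are down). */
size_t select_servers(Server *servers, size_t n, size_t wanted)
{
    assert(servers != NULL);
    qsort(servers, n, sizeof *servers, by_latency);
    size_t chosen = 0;
    while (chosen < n && chosen < wanted && servers[chosen].operational)
        ++chosen;
    return chosen;
}
```

If the returned count is smaller than the number of servers needed to collect enough nodes, the client has to fall back to asking every server, as the last bullet notes.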

Page 34: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Architecture

[Figure: architecture diagram with Client, Naming/Location, and Cache components]

Page 35: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Architecture

• What did we implement?

– Client, Cache, Naming and Location mechanism, Replication mechanism, Filestore

• What did we test?

– Communication: explicit communication (unicast request) and implicit communication (multicast request)

– Network: servers distributed throughout the Berkeley domain; network delay simulated by randomizing response time

– Caching: none, for the worst case

– Simulation: strained the Typhoon system by creating requests at the same rate as a 24-hour NFS trace over a 3-hour period

Page 36: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

[Figure: Tornado - GET avg_proc_time. Time (sec, 0 to 25) vs. file size (bytes, 0 to 1,500,000). Series: client, cache, namingloc, replication, filestore.]

[Figure: Reed-Solomon - GET avg_proc_time. Time (sec, 0 to 2000) vs. file size (bytes, 0 to 3,000,000). Series: client, cache, namingloc, replication, filestore.]

Page 37: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

[Figure: Tornado - Replication (decoding). Time (sec, 0 to 25) vs. file size (bytes, 0 to 1,500,000). Series: avg_proc_time, avg_dec_time, avg_comm_time.]

[Figure: Reed-Solomon - Replication (decoding). Time (sec, 0 to 2000) vs. file size (bytes, 0 to 3,000,000). Series: avg_proc_time, avg_dec_time, avg_comm_time.]

Page 38: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

[Figure: Reed-Solomon - PUT avg_proc_time. Time (sec, 0 to 4500) vs. file size (bytes, 0 to 4,000,000). Series: client, cache, namingloc, replication, filestore.]

[Figure: Tornado - PUT avg_proc_time. Time (sec, 0 to 6) vs. file size (bytes, 0 to 4,000,000). Series: client, cache, namingloc, replication, filestore.]

Page 39: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

[Figure: Reed-Solomon - Replication (encoding). Time (sec, 0 to 70) vs. file size (bytes, 0 to 600,000). Series: avg_proc_time, avg_enc_time, avg_comm_time.]

[Figure: Tornado - Replication (encoding). Time (sec, 0 to 6) vs. file size (bytes, 0 to 4,000,000). Series: avg_proc_time, avg_enc_time, avg_comm_time.]

Page 40: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Benefits of Typhoon

• Data is ultra-available: up to roughly half of the servers can fail before availability is affected

• Fast file retrieval: data can be retrieved simultaneously from multiple servers

– The system can choose to use the fastest machines in a set of servers

– Load balancing can be achieved because slow or heavily utilized servers are not used

– Information can be dispersed geographically, which increases the accessibility of data in the event of a major disaster, such as an earthquake, and can benefit people who travel to remote locations, since data may be closer to them

– Multicast can be used to reduce latency

• Low-overhead algorithms: the algorithms for encoding and decoding are linear-time

• The disk overhead of the system can be adjusted (it typically doubles the size of a file)

Page 41: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Conclusion

• Tornado Codes are significantly faster than Cauchy Reed-Solomon codes

• A Typhoon-based system can keep up with the request rate of a loaded NFS server

• Typhoon is a viable solution for increasing the reliability and accessibility of data

Page 42: Typhoon: An Ultra-Available Archive and Backup System Utilizing Linear-Time Erasure Codes

Architecture

• What did we implement?

– Client, Cache, Naming and Location mechanism, Replication mechanism, Filestore

• What did we test?

– Communication: explicit communication (TCP request, TCP response) and implicit communication (multicast request, TCP response)

– Network: servers distributed throughout the Berkeley domain; network delay simulated by randomizing response time

– Caching: none, for the worst case

– Simulation: strained the Typhoon system by creating requests at the same rate as a 24-hour NFS trace over a 3-hour period