The virtue of dependent failures in multi-site systems
Flavio Junqueira and Keith Marzullo, University of California, San Diego
Workshop on Hot Topics in System Dependability (HotDep), Yokohama, Japan, June 2005


The virtue of dependent failures in multi-site systems

Flavio Junqueira and Keith Marzullo

University of California, San Diego

Workshop on Hot Topics in System Dependability (HotDep), Yokohama, Japan, June 2005


Multi-site systems

Collection of sites across a WAN
  Multiple processors per site
    Storage nodes
    Computing nodes
  Share resources
  E.g. BIRN, Geon, TeraGrid
Failures
  Processors unavailable
  Services do not mask failures
Improve availability under failures
  Replication
  Minimize overhead


Introduction

Failures in multi-site systems
  Processor failures
    Software and hardware faults
  Site failures
    Processors of the site become unavailable
    Misconfigured software
    Shared resources
      1. Storage
      2. Power circuits
      3. Cooling pipes
      4. Air conditioning
      5. Network
  A new failure model
Availability through replication
  Replica placement
  Operations on replicas: quorums
    Replicated data: quorum update
    Replicated functionality: state-machine using Paxos
  Quorum constructions
Failure model in practice
  Implement the model
  Site availability in BIRN
  Model for processor failures within a site


A dependent failure model

Threshold model
  Limit on the number of processor failures
  Simple
  Models well homogeneous processors that fail independently
Multi-site: sites unavailable frequently enough
  Processor failures are not IID
  All processors become unavailable
The multi-site threshold model
  Two components
    Threshold on the number of site failures (fs)
    One threshold per site on processor failures (t)
  Assumptions
    Sites are homogeneous
    Processors within a site are homogeneous
    Processor failure = crash
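One way to read the two thresholds is as a test on a failure pattern. The sketch below is only an illustration: the function name `within_model`, the processor labels, and the rule that a site counts as failed once more than t of its processors are down are assumptions, not definitions taken from the slides.

```python
from typing import Dict, Set


def within_model(sites: Dict[str, Set[str]], down: Set[str], fs: int, t: int) -> bool:
    """Check a failure pattern against the two thresholds.

    Assumption (not stated on the slide): a site counts as failed once more
    than t of its processors are down.
    """
    failed_sites = sum(1 for procs in sites.values() if len(procs & down) > t)
    return failed_sites <= fs


# Hypothetical layout: three sites with three processors each.
sites = {
    "A": {"a1", "a2", "a3"},
    "B": {"b1", "b2", "b3"},
    "C": {"c1", "c2", "c3"},
}

# Site A is entirely down and site B has lost one processor.
print(within_model(sites, {"a1", "a2", "a3", "b1"}, fs=1, t=1))
# Two sites each exceed the per-site threshold.
print(within_model(sites, {"a1", "a2", "b1", "b2"}, fs=1, t=1))
```

With fs = 1 and t = 1, the first pattern stays within the model and the second does not, because two sites exceed the per-site threshold.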


Quorum systems

Quorum system Q
  Quorum system: set of quorums
  Quorum: set of processors
  Intersection property: every pair of quorums in Q intersect
Algorithms: access a quorum
Example: Majority system
  n processors
  Every subset of size ⌈(n+1)/2⌉ is a quorum
  Optimal availability for IID processor failures
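The majority system can be enumerated directly for a small n. The following is a minimal sketch; the helper names and processor labels are hypothetical, and rounding (n+1)/2 up for even n is an assumption.

```python
from itertools import combinations
from math import ceil


def majority_quorums(processors):
    """All subsets of size ceil((n + 1) / 2) of the given processors."""
    n = len(processors)
    size = ceil((n + 1) / 2)
    return [frozenset(c) for c in combinations(processors, size)]


def pairwise_intersect(quorums):
    """Intersection property: every pair of quorums shares a processor."""
    return all(a & b for a, b in combinations(quorums, 2))


quorums = majority_quorums(["p1", "p2", "p3", "p4", "p5"])
print(len(quorums), pairwise_intersect(quorums))
```

For five processors this yields the ten subsets of size three, and every pair of them overlaps.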


A quorum construction: QSite

QSite
  Select at least (2 fs + 1) sites: S
  Select at least (2t + 1) processors from each site in S
Quorum
  Majority of sites in S
  Majority of processors in each site
An example (fs = 1, t = 1)

[Figure: Site 1, Site 2, Site 3 and the resulting quorums]
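The construction can be sketched for the example with fs = 1 and t = 1. This is an illustration under assumptions: the function names and processor labels are hypothetical, the minimum numbers of sites and processors are used, and "majority of processors in each site" is read as applying to each site in the chosen majority of sites.

```python
from itertools import combinations, product


def majorities(items):
    """All subsets of `items` that contain a bare majority of them."""
    k = len(items) // 2 + 1
    return [frozenset(c) for c in combinations(items, k)]


def qsite_quorums(sites):
    """sites maps a site name to the list of its processors."""
    quorums = []
    for chosen_sites in majorities(sorted(sites)):
        per_site = [majorities(sites[s]) for s in sorted(chosen_sites)]
        # One majority of processors from every chosen site.
        for pick in product(*per_site):
            quorums.append(frozenset().union(*pick))
    return quorums


# fs = 1, t = 1: 2*fs + 1 = 3 sites with 2*t + 1 = 3 processors each.
sites = {s: [f"{s}{i}" for i in range(1, 4)] for s in ("A", "B", "C")}
quorums = qsite_quorums(sites)
print(len(quorums), {len(q) for q in quorums})
print(all(a & b for a, b in combinations(quorums, 2)))
```

Any two quorums share at least one site, and within that site both hold a majority of the processors, which is why the intersection check succeeds.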


QSite vs. Majority

Quorum sizes (Maj. = Majority)

fs    t = 1           t = 2
      Maj.   QSite    Maj.   QSite
1     5      4        8      6
2     8      6        13     9
3     11     8        18     12
4     14     10       23     15

Properties of multi-site threshold model hold
Same replicas for QSite and Majority
Availability
  fs unavailable sites
  Remaining fs + 1 sites: t unavailable processors
  Majority: no quorum available
    Requires: $2 f_s t + f_s + t + 1$
    Available: $f_s t + f_s + t + 1$
  QSite: one quorum available
  QSite has better availability
  Majority is not optimal
Quorum sizes
  QSite produces smaller quorums
  Reduces load
  Increases capacity
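The two size expressions can be checked against the table. The sketch below assumes the minimum replica counts, that is (2 fs + 1) sites with (2t + 1) processors each; the function names are hypothetical.

```python
def majority_size(fs, t):
    """Majority quorum size over all n = (2*fs + 1) * (2*t + 1) replicas."""
    n = (2 * fs + 1) * (2 * t + 1)
    return (n + 1) // 2  # n is odd, so this equals 2*fs*t + fs + t + 1


def qsite_size(fs, t):
    """fs + 1 sites with t + 1 processors each: fs*t + fs + t + 1."""
    return (fs + 1) * (t + 1)


def available_in_worst_case(fs, t):
    """Processors left when fs sites are down and every remaining site lost t."""
    return (fs + 1) * ((2 * t + 1) - t)


for t in (1, 2):
    for fs in (1, 2, 3, 4):
        print(fs, t, majority_size(fs, t), qsite_size(fs, t))
```

In the worst case the number of available processors equals the QSite quorum size and is below the Majority quorum size, which matches the availability comparison on the slide.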


Reducing quorum sizes and sites

QSite, fs = 2, t = 1:
  5 sites
  3 processors per site
  6 processors per quorum
Compromise availability

[Figure: Site 1, Site 2, Site 3, Site 4 and the resulting quorums]


Site availability

Goals
  Show that sites are unavailable frequently enough
  Threshold on the number of site failures
BIRN - Biomedical Informatics Research Network
  Test bed projects centered around brain imaging
  Currently: 19 universities, 26 research groups
Availability
  Monthly basis
  Pings (BIRN-CC)
  Storage broker logs
Site availability
  Jan/04-Aug/04
  Availability under 100%: on average in 5 out of the 8 months

$$\text{Availability} = \frac{\text{Total hours} - \text{Unplanned outages}}{\text{Total hours}} \times 100$$
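As a small worked example of the availability formula, the sketch below computes a monthly percentage. The function name and the outage figure are hypothetical and are not BIRN data.

```python
def availability_percent(total_hours, unplanned_outage_hours):
    """Availability = (total hours - unplanned outages) / total hours * 100."""
    return (total_hours - unplanned_outage_hours) / total_hours * 100


# Hypothetical month of 31 days (744 hours) with 12 hours of unplanned outages.
print(round(availability_percent(744, 12), 2))
```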


BIRN site availability

10 sites experience at least one outage

One site under 97%


Threshold on unavailable sites

Worst-case scenario
  Assumption: independent site failures
  n most unavailable sites in each month
  Probability that all n sites are unavailable
  Each 1% of unavailability is approximately 7 hours

Number of sites (n)    Unavailability in minutes
1                      3288 (979)
2                      87 (33)
3                      1.9 (1.0)
4                      0.017 (0.009)
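Under the independence assumption, the probability that all n sites are down at once is the product of their individual unavailabilities. The sketch below illustrates the arithmetic only: the per-site fractions are hypothetical, not the BIRN measurements, and a 30-day month is assumed.

```python
from math import prod

MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def joint_unavailability_minutes(unavailabilities):
    """Expected minutes per month with all given sites down simultaneously,
    assuming independent site failures."""
    return prod(unavailabilities) * MINUTES_PER_MONTH


# Hypothetical unavailability fractions of the most unavailable sites.
worst = [0.05, 0.03, 0.02, 0.01]
for n in range(1, len(worst) + 1):
    print(n, joint_unavailability_minutes(worst[:n]))

# 1% of a 30-day month expressed in hours, close to the 7 hours on the slide.
print(joint_unavailability_minutes([0.01]) / 60)
```

Each additional site multiplies the joint unavailability by a small fraction, which is the reason the figures in the table drop so quickly as n grows.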


Modeling failures in a site

Homogeneous set of processors
  Independent processor failures
  Identical probability of failure
Processors are repaired
  Repair probabilities change with number of failures
Markov chain
From the model: threshold on the number of failures (t)
  Desired degree of availability
  Stationary probabilities


An example

Three processors per site
Probabilities
  Failure probability much smaller than repair probabilities
  Repair probabilities increase with failures

$$\lim_{n \to \infty} P^{n}_{i0} = 0.96695$$
$$\lim_{n \to \infty} P^{n}_{i1} = 0.03223$$
$$\lim_{n \to \infty} P^{n}_{i2} = 0.00080$$
$$\lim_{n \to \infty} P^{n}_{i3} = 0.00002$$

Availability 0.001: t = 1
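The limits above can be approximated with a small Markov chain over the number of failed processors. The sketch rests on assumptions that the slides do not spell out: at most one processor fails per step with probability p, a site with k failed processors repairs one with probability r_(k-1), and "Availability 0.001" is read as a bound on the probability of more than t failures. The values p = 0.01, r0 = 0.3, r1 = 0.4, r2 = 0.5 are taken from the deck's "Equations" slide; the function names are hypothetical.

```python
def stationary(p, repair, steps=2000):
    """Approximate lim P^n by repeated vector-matrix multiplication."""
    m = len(repair)  # number of processors in the site
    # States 0..m count failed processors (assumed birth-death structure).
    P = [[0.0] * (m + 1) for _ in range(m + 1)]
    for k in range(m + 1):
        up = p if k < m else 0.0
        down = repair[k - 1] if k > 0 else 0.0
        if k < m:
            P[k][k + 1] = up
        if k > 0:
            P[k][k - 1] = down
        P[k][k] = 1.0 - up - down
    dist = [1.0] + [0.0] * m  # start with no failed processors
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(m + 1)) for j in range(m + 1)]
    return dist


def threshold(dist, target):
    """Smallest t such that P(more than t failed) <= target."""
    for t in range(len(dist)):
        if sum(dist[t + 1:]) <= target:
            return t


pi = stationary(0.01, [0.3, 0.4, 0.5])
print([round(x, 5) for x in pi])
print(threshold(pi, 0.001))
```

Under these assumptions the result is close to the values on the slide, though not necessarily identical in the last digit, and the chosen threshold is t = 1.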


Discussion & Future work

Multi-site systems: important class of distributed systems
  Share resources
  Collaboration among distant groups
Improve availability through replication
  A useful abstraction: quorum systems
  Algorithms built on top of quorum systems
Dependent failures
  Site failures
  Enable smaller, more highly available quorums
Lessons to learn
  Considering dependent failures may improve results
  Models are not necessarily complex
Future work
  Validate model, evaluate constructions in practice, more constructions, etc.


END


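Equations

$$\lim_{n \to \infty} P^{n}_{i0} = 0.96695$$
$$\lim_{n \to \infty} P^{n}_{i1} = 0.03223$$
$$\lim_{n \to \infty} P^{n}_{i2} = 0.00080$$
$$\lim_{n \to \infty} P^{n}_{i3} = 0.00002$$

$$p = 0.01, \quad r_0 = 0.3, \quad r_1 = 0.4, \quad r_2 = 0.5$$

$$\text{Availability} = \frac{\text{Total hours} - \text{Unplanned outages}}{\text{Total hours}} \times 100$$

$$2 f_s t + f_s + t + 1$$

$$f_s t + f_s + t + 1$$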


Introduction

Failures in multi-site systems
  Processor failures
    Software and hardware faults
  Site failures
    Processors of the site become unavailable
      1. Software incompatibility, misconfiguration
      2. Shared resources (e.g. storage)
      3. Power failures
      4. Broken pipes
      5. Loss of air conditioning
      6. Network problems
  A new failure model
Availability through replication
  Replica placement
  Operations on replicas: quorums
    Replicated data (quorum update)
    Replicated functionality (state-machine using Paxos)
  Quorum constructions
Failure model in practice
  Implementability of the model
  Real system for site availability (BIRN)
  Model for processor failures within a site


Introduction

Failures in multi-site systems
  Processor failures
    E.g. HW failures
  Site failures
    1. Software incompatibility, misconfiguration
    2. Shared resources (e.g. storage)
    3. Power failures
    4. Broken pipes
    5. Loss of air conditioning
    6. Network problems
Strategies for replica placement
  Large number of sites and nodes
Updates
  Naïve approach: every non-faulty replica up to date
  Quorum update: contact a quorum of processors
Distributed shared register (replicated data)
  Multiple copies of a data set (Quorum Update)
  E.g. Brain images (BIRN); Geological data (Geon)
Consensus (replicated functionality)
  State-machine approach (Paxos algorithm)
  E.g.: Parallel computation (TeraGrid)


Why sites fail

1. Software incompatibility, misconfiguration
2. Shared resources (e.g. storage)
3. Power failures
4. Broken pipes
5. Loss of air conditioning
6. Network problems


Quorums in a multi-site system

Data replication
  Multiple copies of data sets
Functionality replication
  State-machine approach
  Paxos (Coteries for Classic Paxos)
Question: How do we choose nodes to replicate?
  Flat organization
  Organization into sites


Quorum systems

Quorum system Q
  Quorum system: set of quorums
  Quorum: set of processors
  Intersection property: every pair of quorums in Q intersect
  Algorithms: access a quorum when executing some operation
Examples
  Majority system:
    n processors
    Every subset of size ⌈(n+1)/2⌉ is a quorum
    Optimal availability for IID processor failures
  Multi-colored: colors as sites

[Figure: processors and quorums]


Quorum systems (cont.)

In multi-site systems
  Replicated data
    Multiple copies of a data set (Quorum update)
    E.g. Brain images (BIRN); Geological data (Geon)
  Replicated functionality
    State-machine approach (Paxos algorithm)
    E.g.: Parallel computation (TeraGrid)
Quorums for multi-site systems
  Replicating on every node is excessive
  Quorum construction
    Set of processors to replicate on
    Quorums


Examples of quorum systems

Majority system:
  n processors
  Every subset of size ⌈(n+1)/2⌉ is a quorum
Multi-colored: colors as sites
Majority has optimal availability for independent and identically distributed (IID) processor failures

[Figure: universe and quorum patterns]


BIRN site availability

10 sites have at least one outage

One site under 97%


Discussion & Future work

Multi-site systems: important class of distributed systems
  Share resources
  Collaboration among distant groups
Improve availability through replication
  A useful abstraction: quorum systems
  Algorithms built on top of quorum systems
Dependent failures
  Site failures
  Enable smaller, more highly available quorums
Future work
  Validate multi-site threshold model
  Evaluate proposed constructions in practice
  More constructions
  More issues with dependent failures