Computing with Unreliable Resources: Design, Analysis and Algorithms Da Wang, Ph.D Candidate Thesis Committee: Gregory W. Wornell (Advisor) Yury Polyanskiy Devavrat Shah Signals, Information and Algorithms Laboratory Thesis Defense May 8, 2014


Computing with Unreliable Resources: Design, Analysis and Algorithms

Da Wang, Ph.D. Candidate

Thesis Committee: Gregory W. Wornell (Advisor), Yury Polyanskiy, Devavrat Shah

Signals, Information and Algorithms Laboratory

Thesis Defense

May 8, 2014


Computing with unreliable resources: an emerging paradigm

Large amounts of data require large amounts of computing resources

- VLSI circuits
- cloud computing/data centers
- . . .

Why are resources unreliable?

- technological constraints
- cost constraints


A study via concrete applications

1. Circuit design with unreliable components

2. Scheduling parallel tasks with variable response times

3. Crowd-based ranking via noisy comparisons


Reliable circuit design with unreliable components


In the news—April 2013

AMD claims 20nm transition signals the end of Moore’s law

Chief product architect at AMD:

“If you print too few transistors your chip will cost too much per transistor . . .

. . . and if you put too many it will cost too much per transistor.”


New challenge: fabrication flaws

Fabrication flaws

1. process variations
2. fabrication defects

get worse as we approach physical limits!

Flash ADC design with imprecise comparators

- Captures process variations (this talk)

Digital circuit design with faulty components

- Captures fabrication defects (in thesis)


Reliable Analog-to-Digital Converter Design

with imprecise comparators

A theoretical framework

What are the fundamental performance limits?

Is optimal design for the precise comparators case still optimal?

Should we use imprecise comparators?

Joint work with: Yury Polyanskiy, Gregory Wornell

Acknowledgment: Frank Yaul, Anantha Chandrakasan


Analog-to-Digital Converter (ADC)

$x$ → ADC → 010 → $\hat{x}$

[Figure: transfer characteristic of a 3-bit ADC. Horizontal axis: input $v_{\text{in}}$, split by the reference voltages $v_1, \ldots, v_7$ between $v_{\text{lo}} = v_0$ and $v_8 = v_{\text{hi}}$, with step $v_{\text{LSB}}$. Vertical axis: ADC code 000 to 111, with reconstruction values $c_1, \ldots, c_8$.]

- $2^b$ reconstruction values
- $n = 2^b - 1$ reference voltages


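ADC and its key building block: comparator

Reconstruction values $c_1, c_2, \ldots, c_n, c_{n+1}$; reference voltages $v_1, v_2, \ldots, v_n$

Comparator (inputs $v_{\text{in}}$ and $v_{\text{ref}}$, output $y_{\text{out}}$):

$$y_{\text{out}} = \begin{cases} 1 & v_{\text{in}} > v_{\text{ref}} \\ 0 & v_{\text{in}} \le v_{\text{ref}} \end{cases}$$

The Flash ADC architecture

[Figure: $v_{\text{in}}$ is compared in parallel against $v_1, v_2, \ldots, v_n$; the comparator outputs $y_1, y_2, \ldots, y_n$ feed an encoder that produces $b$ bits.]

$n = 2^b - 1$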
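The comparator rule and the parallel comparison can be sketched in a few lines. This is only an illustration under assumptions not stated on the slide: the input range is taken to be $[-1, 1]$, the references are equally spaced, and the function names `flash_adc_code` and `uniform_references` are hypothetical.

```python
def flash_adc_code(v_in, refs):
    """Return the ADC code (0 .. n) for input v_in.

    Each comparator outputs 1 when v_in > v_ref, so the code is the
    number of references strictly below v_in (a thermometer-code sum).
    """
    outputs = [1 if v_in > v_ref else 0 for v_ref in refs]
    return sum(outputs)


def uniform_references(b, v_lo=-1.0, v_hi=1.0):
    """n = 2**b - 1 equally spaced references strictly inside (v_lo, v_hi).

    The range [-1, 1] is an assumption made for this sketch.
    """
    n = 2 ** b - 1
    step = (v_hi - v_lo) / (n + 1)
    return [v_lo + step * (i + 1) for i in range(n)]


refs = uniform_references(b=3)
code = flash_adc_code(0.3, refs)
print(len(refs), format(code, "03b"))  # prints: 7 101
```

With $b = 3$ there are 7 references, and an input of 0.3 exceeds five of them, giving code 101.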


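The imprecise comparator due to process variation

[Figure: comparator with an offset added to each input: $v_{\text{in}} + Z_{\text{in}}$ gives $V_{\text{in}}$, $v_{\text{ref}} + Z_{\text{ref}}$ gives $V_{\text{ref}}$; output $Y_{\text{out}}$.]

$Z_{\text{in}} \sim N(0, \sigma_1^2)$, $Z_{\text{ref}} \sim N(0, \sigma_2^2)$

$Z_{\text{in}}$ and $Z_{\text{ref}}$:

- offsets due to process variation
- variation increases as comparator size decreases
- independent, zero-mean Gaussian distributed [Kinget 2005, Nuzzo 2008]

Note:

- fixed after fabrication
- randomness: over a collection of comparators
- aggregate variation: $Z = Z_{\text{ref}} - Z_{\text{in}} \sim N(0, \sigma^2)$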


Reference voltages impact ADC performance

2-bit ADC

[Figure: input $x$ and reconstruction $\hat{x}$ for a 2-bit ADC with reference voltages $v_1, v_2, v_3$.]

error function:

$e(x) = x - \hat{x}$


A call for a mathematical framework

Existing theoretical error analysis (e.g., [Lundin 2005])

- assumes small process variation
- does not attempt to change the design

ADC design with imprecise comparators

Practice:

- ADC with redundancy [Flynn et al., 2003]
- ADC with redundancy, calibration and reconfiguration [Daly et al., 2008]

Theory: little prior work

- Related: scalar quantizer with random thresholds for uniform input [Goyal 2011]


System model: ADC with redundancy and calibration

b-bit ADC, classical:

- designed references $v^n$ and input $x$ → comparator bank → comparison outcomes $y^n$ → encoder → $\hat{x} \in \{c_1, \ldots, c_{2^b}\}$
- $n = 2^b - 1$ comparators

b-bit ADC with redundancy:

- fabrication turns the designed references $v^n$ into the actual references $V^n$:
  $V_i = v_i + Z_i$, $Z_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$, $i = 1, 2, \ldots, n$
- actual references $V^n$ and input $x$ → comparator bank → comparison outcomes $Y^n$ → encoder → $\hat{x} \in \{c_1, \ldots, c_{2^b}\}$
- $n = r \cdot (2^b - 1)$ comparators

b-bit ADC with redundancy and calibration:

- as above, plus: actual references $V^n$ → calibration → $V^n$

$r$: redundancy factor
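The fabrication model $V_i = v_i + Z_i$ with redundancy factor $r$ can be simulated directly. The sketch below rests on several assumptions of its own: the input range is $[-1, 1]$, calibration is modeled as the encoder simply knowing (and sorting) the actual references, and the reconstruction is the midpoint of the cell containing $x$, so it uses all $n + 1$ cells rather than mapping down to $2^b$ output values. The names `fabricate` and `calibrated_quantize` are hypothetical.

```python
import random


def fabricate(designed_refs, sigma, rng):
    """Actual references V_i = v_i + Z_i with Z_i i.i.d. N(0, sigma^2)."""
    return [v + rng.gauss(0.0, sigma) for v in designed_refs]


def calibrated_quantize(x, actual_refs, lo=-1.0, hi=1.0):
    """Reconstruct x as the midpoint of the cell that contains it.

    Assumes calibration reveals the actual references, so the encoder can
    sort them; references outside [lo, hi] are clipped to the range.
    """
    edges = [lo] + sorted(min(max(v, lo), hi) for v in actual_refs) + [hi]
    for left, right in zip(edges, edges[1:]):
        if left <= x <= right:
            return 0.5 * (left + right)
    raise ValueError("x outside [lo, hi]")


rng = random.Random(0)
b, r, sigma = 3, 4, 0.1
n = r * (2 ** b - 1)                      # n = r * (2^b - 1) comparators
designed = [-1.0 + 2.0 * (i + 1) / (n + 1) for i in range(n)]
actual = fabricate(designed, sigma, rng)

x = 0.3
x_hat = calibrated_quantize(x, actual)
print(n, abs(x_hat - x))
```

Setting `sigma` to 0 recovers the designed grid, which is a convenient sanity check before experimenting with larger variation.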


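Performance measures of ADC

[Figure: input $x$ and reconstruction $\hat{x}$ with reference voltages $v_1, v_2, v_3$.]

error function $e(x) = x - \hat{x}$

- mean-square error: $\mathrm{MSE} = \mathbb{E}_X[e(X)^2]$
- maximum quantization error: $e_{\max} = \max_x |e(x)|$

$v^n$ → $V^n$ → analyze performance: MSE, $e_{\max}$

- Is uniform $v_1, v_2, \ldots, v_n$ still optimal for uniform input?
- Is scaling down the size of comparators actually beneficial?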


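Challenge: randomness in reference voltages

What we design: $v_0, v_1, v_2, v_3, v_4$ at their designed positions

After fabrication:

[Figure: two possible realizations in which $v_1, v_2, v_3$ land at shifted positions between $v_0$ and $v_4$.]

or . . .

Observations

- Ordering may change → order statistics
- Random interval sizes → ?

interval sizes → density of references (high resolution approximation)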


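High resolution approximation

Assume $n \to \infty$

Represent $v^n$ by point density function $\tau(x)$:

$$\tau(x)\,dx \approx \frac{\text{number of } v^n \text{ in } [x, x + dx]}{n}$$

[Figure: references $v_1, v_2, \ldots, v_n$ on a line with the interval $[x, x + dx]$ marked, under the curve $\tau(\cdot)$.]

$V^n$: point density function $\lambda(x)$:

$$\lambda(x)\,dx \approx \frac{\mathbb{E}\left[\text{number of } V^n \text{ in } [x, x + dx]\right]}{n}$$

Point density function simplifies analysis!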


Point density function guides references design

- reference voltages $v^n$ → point density function $\tau(\cdot)$: high res. approx.
- point density function $\tau(\cdot)$ → reference voltages $v^n$: given $n$, “sample” $\tau(\cdot)$

Examples

- $\tau \sim \mathrm{Unif}([-1, 1])$: $v^n$ is an $n$-point uniform grid on $[-1, 1]$
- $\tau(x) = 0.5 \cdot \delta(x - a) + 0.5 \cdot \delta(x + a)$: $v^n$ has $n/2$ reference voltages at $+a$ and $n/2$ reference voltages at $-a$

[Figure: $\tau(x)$ plotted against $x$, with ticks at $-a$, $0$, $+a$ and a level of 0.5.]


With process variation, fabricated references matter

- $\tau(\cdot)$ ↔ $v^n$ (high res. approx. / “sample”): what we can design
- $\lambda(\cdot)$ ↔ $V^n$ (high res. approx. / sample): what determines performance

Performance characterization in $\lambda(\cdot)$

Want to find the optimal $\tau(\cdot)$

$V_i = v_i + Z_i$, $Z_i \overset{\text{i.i.d.}}{\sim} \phi(\cdot) \sim N(0, \sigma^2)$

[W., Polyanskiy & Wornell, ISIT’14]:

$$\lambda(x) = (\tau * \phi)(x) \qquad (*\text{: convolution})$$

[Figure: $\tau(x)$ and $\lambda(x)$ plotted for $x \in [-1, 1]$.]
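The convolution relation can be checked numerically for the uniform design $\tau \sim \mathrm{Unif}([-1, 1])$. The sketch below evaluates $(\tau * \phi)(x)$ with a midpoint rule and compares it with the closed form in terms of the Gaussian c.d.f.; the value $\sigma = 0.3$, the step count and the function names are arbitrary choices made here.

```python
import math


def phi(z, sigma):
    """Density of N(0, sigma^2)."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def tau_uniform(x):
    """Designed point density: uniform on [-1, 1]."""
    return 0.5 if -1.0 <= x <= 1.0 else 0.0


def fabricated_density(x, tau, sigma, lo=-1.0, hi=1.0, steps=4000):
    """lambda(x) = integral of tau(u) * phi(x - u) du, by the midpoint rule."""
    du = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        u = lo + (k + 0.5) * du
        total += tau(u) * phi(x - u, sigma)
    return total * du


def closed_form(x, sigma):
    """The same convolution for uniform tau, written with the Gaussian c.d.f."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 0.5 * (cdf((x + 1.0) / sigma) - cdf((x - 1.0) / sigma))


sigma = 0.3
for x in (0.0, 0.9, 1.2):
    print(x, fabricated_density(x, tau_uniform, sigma), closed_form(x, sigma))
```

The fabricated density is smoothed at the edges of the range and leaks outside $[-1, 1]$, which is why the best $\tau$ need not be uniform.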


Process variation increases MSE 6-fold

$\mathrm{MSE} = \mathbb{E}_X[e(X)^2]$, input $X \sim f_X(\cdot)$

classical case [Bennett 1948, Panter & Dite 1951]

$$\mathrm{MSE} \simeq \frac{1}{12 n^2} \int \frac{f_X(x)}{\lambda^2(x)}\,dx, \qquad \lambda = \tau$$

with process variations [W., Polyanskiy & Wornell, ISIT’14]

$$\mathrm{MSE} \simeq \frac{1}{2 n^2} \int \frac{f_X(x)}{\lambda^2(x)}\,dx, \qquad \lambda = \tau * \phi$$

Why 6 times?

uniform grid vs. random division of an interval (a topic in order statistics)

Optimal $\tau$

a necessary and sufficient condition [W., Polyanskiy & Wornell, ISIT’14]
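The factor of 6 can be seen in a small Monte Carlo experiment on the "uniform grid vs. random division of an interval" comparison. The sketch is a simplified stand-in, not the ADC model itself: the input is uniform on $[0, 1]$, the random thresholds are $n$ i.i.d. uniform points, and reconstruction is to the cell midpoint. All names and parameter values are illustrative.

```python
import random


def mse_given_edges(edges):
    """MSE for a uniform input on [0, 1] with midpoint reconstruction.

    A cell of width D contributes the integral of (x - mid)^2 over the
    cell, which equals D**3 / 12.
    """
    return sum((right - left) ** 3 for left, right in zip(edges, edges[1:])) / 12.0


def grid_mse(n):
    """n equally spaced thresholds inside [0, 1]."""
    edges = [k / (n + 1) for k in range(n + 2)]
    return mse_given_edges(edges)


def random_division_mse(n, trials, rng):
    """Average MSE when the n thresholds are i.i.d. uniform on [0, 1]."""
    total = 0.0
    for _ in range(trials):
        edges = [0.0] + sorted(rng.random() for _ in range(n)) + [1.0]
        total += mse_given_edges(edges)
    return total / trials


rng = random.Random(1)
n = 200
ratio = random_division_mse(n, trials=2000, rng=rng) / grid_mse(n)
print(ratio)
```

For this toy setup the exact ratio is $6(n+1)^2 / ((n+2)(n+3))$, which tends to 6 as $n$ grows, so the printed value should sit slightly below 6.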


MSE-optimal designs can be quite different

Uniform input distribution [W., Polyanskiy & Wornell, ISIT’14]

$f_X \sim \mathrm{Unif}([-1, 1])$

$\sigma_0 \approx 0.7228$

- $\sigma < \sigma_0$: locally optimal iterative optimization ⇒ $\tau^*(x)$
- $\sigma \ge \sigma_0$: the necessary and sufficient condition ⇒ $\tau^*(x) = \delta(x)$

[Figure: $\tau^*(x)$ and $\lambda^*(x)$ plotted for $x \in [-1, 1]$, one panel for each of $\sigma = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8$.]


Analyzing maximum quantization error

$e_{\max} = \max_x |e(x)|$

High resolution approximation for

- c.d.f. $P[e_{\max} \le x]$
- range $[a, b]$ such that $P[e_{\max} \in [a, b]] \approx 1$

Accurate for finite $n$ (e.g., when $n \ge 100$)

$e_{\max}$ often measured in $\mathrm{LSB} \triangleq \dfrac{\text{input range}}{2^b}$


MSE-optimal designs improve yield

5%–10% for 6-bit Flash ADC

$\text{Yield} \triangleq P[e_{\max} \le \Delta_{\max}]$, e.g. $\Delta_{\max} = 1\,\mathrm{LSB}$

[Figure: yield (0 to 1) versus $\sigma$ (0 to 1.2) for four designs: $r = 8$ MSE-optimal, $r = 8$ uniform, $r = 6$ MSE-optimal, $r = 6$ uniform.]


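Scaling down the size of comparators is beneficial

For circuit fabrication [Kinget 2005, Nuzzo 2008],

$$\text{process variation } \sigma^2 \propto \frac{1}{\text{component area}}$$

Given a fixed silicon area,

$$\text{\# components } n \propto \frac{1}{\text{component area}}$$

Uniform input distribution, when $\sigma \ge \sigma_0$,

$$\mathrm{MSE} \approx 2\pi\sigma^2 / n^2, \qquad e_{\max} \overset{\text{d.}}{=} \Theta\!\left(\sqrt{\pi/2}\,\sigma\,\frac{b}{n}\right)$$

$$\overset{\sigma^2 \propto n}{\Longrightarrow} \qquad \mathrm{MSE} = \Theta(1/n), \qquad e_{\max} = \Theta\!\left(\frac{b}{\sqrt{n}}\right)$$

Building an ADC with more, smaller but less precise comparators improves accuracy!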
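The substitution behind the arrow can be written out. Both proportionalities share the factor 1/(component area), so $\sigma^2 \propto n$; writing $\sigma^2 = \kappa n$ for a constant $\kappa > 0$ (a label introduced here only for the derivation), and reading the garbled $e_{\max}$ expression as $\sqrt{\pi/2}\,\sigma\,b/n$, the two scalings follow:

```latex
\begin{align*}
\mathrm{MSE} &\approx \frac{2\pi\sigma^2}{n^2}
  = \frac{2\pi\kappa n}{n^2}
  = \frac{2\pi\kappa}{n}
  = \Theta\!\left(\frac{1}{n}\right), \\
e_{\max} &= \Theta\!\left(\sqrt{\pi/2}\,\sigma\,\frac{b}{n}\right)
  = \Theta\!\left(\sqrt{\pi\kappa/2}\,\sqrt{n}\,\frac{b}{n}\right)
  = \Theta\!\left(\frac{b}{\sqrt{n}}\right).
\end{align*}
```

Both quantities shrink as $n$ grows, even though each individual comparator becomes less precise.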


Computing with unreliable resources: circuit design

[Diagram labels: mathematical abstraction; performance trade-offs; design guidance; redundancy vs. yield; distribution of fabrication variation; asymptotic analysis; quantization theory, order statistics.]


Scheduling parallel tasks with variable response times


Executing parallel tasks: simple yet important

Executing parallel tasks

- “Map” stage of MapReduce
- distributed parallel algorithms
  - ADMM, MCMC, . . .
- time series analysis
  - yearly data → weekly data
- crowd sourcing
- . . .

[Figure: task 1, task 2, task 3, ..., task n running in parallel.]

In common

Given: a large collection of tasks that can be run in parallel

Want: results for all tasks

But this could be slow!


Issue: variability in data centers

Observations

- The response time of a computer in a data center varies
- Latency: determined by the slowest machine
  - worse as we get more machines

In Google’s data center

- Shared machine [Dean 2012]
- Execution times [Dean 2013] for a large collection of tasks
  - median completion time for one: 1ms
  - median completion time for all: 40ms

How to reduce latency?


Backup tasks in Google MapReduce [Dean et al., 2008]

The backup task option

1. Run n tasks on n machines in parallel
2. Replicate: when 10% of the tasks left
3. Take the earliest results

Effectiveness

- Reduce latency significantly (e.g., 1/3 for distributed sort)
- Handles the issue of “stragglers”


A call for theoretical analysis

Considerable follow-up work in systems

1. [Zaharia et al., USENIX OSDI 2008]: adopted by Facebook
2. [Ananthanarayanan et al., USENIX OSDI 2010]: adopted by Microsoft Bing
3. [Ananthanarayanan et al., USENIX NSDI 2013]

Existing theoretical work inadequate

- Stochastic scheduling: task replication not considered
- Need to understand
  - latency reduction vs. additional resource usage
  - when and how replication could be beneficial


Scheduling problem formulation

Problem

Executing a collection of n parallel tasks.

n: hundreds or thousands [Reiss et al., 2012]

System model

- Execution time of each task $\overset{\text{i.i.d.}}{\sim} F_X$.
- Scheduling actions
  - Send a task to a new machine to run
  - Terminate all machines running a certain task
- Feedback: instantaneous feedback upon completion


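Latency

The j-th copy of task i:

- launched at time $t_{i,j}$
- execution time $X_{i,j} \overset{\text{i.i.d.}}{\sim} F_X$

Completion time for task i:

$$T_i \triangleq \min_j \,(t_{i,j} + X_{i,j})$$

Latency:

$$T \triangleq \max_i T_i$$

[Figure: timeline of four machines $N_1, \ldots, N_4$ running copies with execution times $X_{1,1}, X_{1,2}, X_{2,1}, X_{2,2}$; replicas are launched at $t_{1,2} = 2$ and $t_{2,2} = 5$, giving $T_1 = 8$ and $T_2 = 10$.]

$T = \max\{T_1, T_2\} = 10$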


Total machine time as cost measure

The j-th copy of task i:

- launched at time $t_{i,j}$
- execution time $X_{i,j} \overset{\text{i.i.d.}}{\sim} F_X$

In data centers

Total machine time

$$C \triangleq \sum_{i=1}^{n} \sum_{j=1}^{r_i + 1} \left| T_i - t_{i,j} \right|^{+}$$

[Figure: the same four-machine timeline, with $t_{1,2} = 2$, $t_{2,2} = 5$, $T_1 = 8$, $T_2 = 10$.]

Note: We focus on expected values $\mathbb{E}[T]$ and $\mathbb{E}[C]$
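The definitions of latency $T$ and total machine time $C$ translate into a short computation. In the sketch below, $|\cdot|^+$ is read as $\max(\cdot, 0)$ and the inner sum is taken over all copies of task $i$ (its original plus $r_i$ replicas); both readings are assumptions, since the slide does not spell them out. The launch times follow the figure, while the four execution times are hypothetical values picked so that $T_1 = 8$ and $T_2 = 10$.

```python
def completion_time(launch_times, exec_times):
    """T_i = min over copies j of (t_ij + X_ij)."""
    return min(t + x for t, x in zip(launch_times, exec_times))


def latency_and_cost(launch, execution):
    """Return (T, C) for lists of per-task launch and execution times.

    T = max_i T_i. C sums max(T_i - t_ij, 0) over all copies, i.e. each
    copy is charged from its launch until its task completes.
    """
    completion = [completion_time(t, x) for t, x in zip(launch, execution)]
    latency = max(completion)
    cost = sum(max(T_i - t_ij, 0.0)
               for T_i, times in zip(completion, launch)
               for t_ij in times)
    return latency, cost


# Launch times from the figure; execution times are hypothetical.
launch = [[0.0, 2.0], [0.0, 5.0]]
execution = [[8.0, 9.0], [12.0, 5.0]]
T, C = latency_and_cost(launch, execution)
print(T, C)  # prints: 10.0 29.0
```

Here the cost is $8 + 6 + 10 + 5 = 29$ units of machine time for a latency of 10, which shows how replication trades extra machine time for earlier completion.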


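Replication helps!

$$X = \begin{cases} 2 & \text{w.p. } 0.9 \\ 7 & \text{w.p. } 0.1 \end{cases}$$

No replication:

- $\mathbb{E}[T] = 2.5$
- $\mathbb{E}[C] = \mathbb{E}[T] = 2.5$

[Figure: distribution of $T$ over time $t$: 2 w.p. 0.9, 7 w.p. 0.1.]

Replicate task at $t_2 = 2$:

- $\mathbb{E}[T] = 2.23$
- $\mathbb{E}[C] = 2.46$

[Figure: distribution of $T$ over time $t$: 2 w.p. 0.9, 4 w.p. 0.1 × 0.9, 7 w.p. 0.1 × 0.1.]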
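The numbers on this slide can be reproduced by enumerating the four outcomes of the original copy and the replica. The sketch assumes a single task, a replica that is launched only if the original copy is still running at the launch time, and both copies being charged until the task completes; the function name is hypothetical.

```python
from itertools import product

PMF = {2.0: 0.9, 7.0: 0.1}  # execution time X: 2 w.p. 0.9, 7 w.p. 0.1


def expected_latency_and_cost(replica_launch=None):
    """Exact E[T] and E[C] for one task, with an optional single replica.

    The replica is launched at replica_launch only if the original copy
    is still running then; both copies stop when the first one finishes.
    """
    exp_T = exp_C = 0.0
    for (x1, p1), (x2, p2) in product(PMF.items(), repeat=2):
        prob = p1 * p2
        if replica_launch is None or x1 <= replica_launch:
            T, C = x1, x1                      # original finishes alone
        else:
            T = min(x1, replica_launch + x2)   # earliest copy wins
            C = T + (T - replica_launch)       # original + replica time
        exp_T += prob * T
        exp_C += prob * C
    return exp_T, exp_C


print(expected_latency_and_cost())      # no replication
print(expected_latency_and_cost(2.0))   # replicate at t = 2
```

Up to floating-point rounding, the two calls give $(2.5, 2.5)$ and $(2.23, 2.46)$: the replica lowers both the expected latency and the expected machine time in this example.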


Execution time modeling

Discrete random variables [W., Joshi & Wornell, SIGMETRICS’14] (in thesis)

- Arise directly from estimation (quantiles)
- Offers more flexible modeling

Continuous random variables (this talk)

- Pareto, Exponential
- Analysis: an important class of policies


Single-fork policy

- Run n tasks in parallel initially
- When there is p fraction of the tasks left, replicate the unfinished tasks r times, and either
  - let all the unfinished tasks keep running, or
  - relaunch all the unfinished tasks

Without relaunching: at the “fork”, task k keeps its original copy and adds replica 1, ..., replica r

With relaunching: at the “fork”, task k starts a new replica in place of the original copy and adds replica 1, ..., replica r

Note:

- after replication, r + 1 tasks in total
- when r = 0: only the relaunching case is interesting
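A Monte Carlo sketch of the single-fork policy makes the latency and cost trade-off concrete. It assumes Exp(1) execution times and particular values of $n$, $p$ and $r$, none of which come from the slide, and it charges each copy from its launch until its task completes. The function name `single_fork` and its signature are hypothetical.

```python
import random


def single_fork(n, p, r, relaunch, sample, rng):
    """Simulate one run of the single-fork policy; return (latency, cost).

    sample(rng) draws one execution time. The fork happens when the
    (1 - p) * n earliest tasks have finished.
    """
    times = sorted(sample(rng) for _ in range(n))
    done = n - int(round(p * n))            # tasks finished before the fork
    if done >= n:
        return times[-1], sum(times)
    fork = times[done - 1] if done > 0 else 0.0
    cost = sum(times[:done])
    latency = fork
    for x in times[done:]:
        fresh = [sample(rng) for _ in range(r + (1 if relaunch else 0))]
        candidates = fresh if relaunch else fresh + [x - fork]
        remaining = min(candidates)
        latency = max(latency, fork + remaining)
        # the original copy runs until the fork, then r + 1 live copies
        # run until the task completes
        cost += fork + (r + 1) * remaining
    return latency, cost


rng = random.Random(2)
draw = lambda g: g.expovariate(1.0)   # Exp(1) execution times: an assumption
n, p, runs = 400, 0.1, 300
mean_latency = {}
for r in (0, 1):
    results = [single_fork(n, p, r, False, draw, rng) for _ in range(runs)]
    mean_latency[r] = sum(t for t, _ in results) / runs
    mean_cost = sum(c for _, c in results) / runs / n
    print(r, mean_latency[r], mean_cost)
```

Running it with `relaunch=True` instead gives the relaunching variant; comparing the printed latency and per-task cost across `r` is one way to explore when replication pays off.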


Latency analysis

Order statistics: given random variables $X_1, X_2, \ldots, X_n$,

\[ X_{\min} = X_{1:n} \le X_{2:n} \le \ldots \le X_{n:n} = X_{\max} \]

Latency

Replicated tasks: new execution time distribution $F_Y$
- $F_Y = g(F_X, r, \text{relaunch or not})$

Replicate for the last p fraction of the tasks:

\[ T = X_{(1-p)n:n} + Y_{pn:pn} \]

where the first term is the time before forking and the second term is the time after forking.
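As a worked instance of $g$, consider the relaunching case under the assumption (not stated on the slide) that the $r + 1$ fresh copies of a task have i.i.d. execution times drawn from $F_X$ and that the task finishes when its fastest copy does. Then $Y$ is the minimum of $r + 1$ independent draws:

```latex
\begin{align*}
\Pr[Y > y] &= \Pr[X_1 > y, \ldots, X_{r+1} > y] = \bigl(1 - F_X(y)\bigr)^{r+1}, \\
F_Y(y) &= 1 - \bigl(1 - F_X(y)\bigr)^{r+1}.
\end{align*}
```

Without relaunching, the original copy has already been running until the fork, so its remaining time depends on the fork time and $F_Y$ does not reduce to this simple form.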

33 / 49

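Central value theorem

p-quantile of a distribution $F_X$:

\[ x_p \triangleq F_X^{-1}(p) \]

[Figure: CDF $F_X$ plotted against $x$, with the quantile $x_p$ marked at level $p$]

As $n \to \infty$,

\[ X_{pn:n} \to \mathcal{N}\left(x_p, \; \frac{1}{n} \, \frac{p(1-p)}{f_X^2(x_p)}\right) \]

Time before forking:

\[ E\left[X_{(1-p)n:n}\right] = x_{1-p} \triangleq F_X^{-1}(1-p) \]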
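The normal approximation can be checked numerically. The sketch below is only an illustration: it assumes Pareto(2, 2) execution times with n = 1000 and estimates the mean and variance of the 0.8-quantile order statistic by simulation. The helper names are hypothetical.

```python
import random
import statistics

def pareto_quantile(q, alpha, x_m):
    """Inverse CDF of Pareto(alpha, x_m): solves 1 - (x_m / x)**alpha = q."""
    return x_m * (1.0 - q) ** (-1.0 / alpha)

def pareto_pdf(x, alpha, x_m):
    """Density of Pareto(alpha, x_m) for x >= x_m."""
    return alpha * x_m ** alpha / x ** (alpha + 1)

def central_order_stat_samples(n, q, alpha, x_m, trials, seed=0):
    """Simulate the order statistic X_{qn:n} `trials` times."""
    rng = random.Random(seed)
    k = int(round(q * n))  # 1-indexed rank of the order statistic
    samples = []
    for _ in range(trials):
        draws = sorted(pareto_quantile(rng.random(), alpha, x_m) for _ in range(n))
        samples.append(draws[k - 1])
    return samples

n, q, alpha, x_m = 1000, 0.8, 2.0, 2.0
samples = central_order_stat_samples(n, q, alpha, x_m, trials=300)
x_q = pareto_quantile(q, alpha, x_m)
# Variance predicted by the theorem: q(1 - q) / (n * f(x_q)^2).
predicted_var = q * (1.0 - q) / (n * pareto_pdf(x_q, alpha, x_m) ** 2)

print("mean:", statistics.mean(samples), "predicted:", x_q)
print("variance:", statistics.variance(samples), "predicted:", predicted_var)
```

With these parameters the time before forking (p = 0.2) concentrates around $x_{0.8} = 2 / \sqrt{0.2} \approx 4.47$.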

34 / 49

Extreme value theorem

Theorem (Fisher-Tippett-Gnedenko theorem)

Given $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} F_X$, if there exist sequences of constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that

\[ P\left[\frac{X_{n:n} - b_n}{a_n} \le z\right] \to G(z) \]

as $n \to \infty$ and $G$ is a non-degenerate distribution, then $G$ belongs to one of the following families:

Gumbel law: \[ G(z) = \exp\{-\exp(-z)\} \quad \text{for } z \in \mathbb{R}, \]

Fréchet law: \[ G(z) = \begin{cases} 0 & z \le 0 \\ \exp\{-z^{-\alpha}\} & z > 0, \end{cases} \]

Weibull law: \[ G(z) = \begin{cases} \exp\{-(-z)^{\alpha}\} & z < 0 \\ 1 & z \ge 0, \end{cases} \]

where $\alpha > 0$.

Given $F_X$, the distribution of $X_{n:n}$ can be characterized as $n \to \infty$.
- Key: tail behavior of $1 - F_X$
- Also applicable to $X_{1:n}$
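Two of the three limit laws can be checked without simulation, because the CDF of the maximum of n i.i.d. variables is exactly $F_X(x)^n$. The sketch below is an illustration with standard normalizing constants chosen for the example: $a_n = 1$, $b_n = \ln n$ for Exponential(1), and $a_n = x_m n^{1/\alpha}$, $b_n = 0$ for Pareto($\alpha$, $x_m$). The function names are hypothetical.

```python
import math

def exp_max_cdf(z, n):
    """P[X_{n:n} - ln(n) <= z] for n i.i.d. Exponential(1) variables."""
    x = z + math.log(n)
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-x)) ** n

def gumbel_cdf(z):
    """Gumbel law G(z) = exp(-exp(-z))."""
    return math.exp(-math.exp(-z))

def pareto_max_cdf(z, n, alpha):
    """P[X_{n:n} / (x_m * n**(1/alpha)) <= z] for n i.i.d. Pareto(alpha, x_m) variables."""
    x_over_xm = z * n ** (1.0 / alpha)
    if x_over_xm < 1.0:
        return 0.0
    return (1.0 - x_over_xm ** (-alpha)) ** n

def frechet_cdf(z, alpha):
    """Frechet law G(z) = exp(-z**(-alpha)) for z > 0, and 0 otherwise."""
    return math.exp(-z ** (-alpha)) if z > 0.0 else 0.0

def max_gaps(n, alpha=2.0):
    """Largest absolute CDF gap to the limit law on a few test points."""
    gap_exp = max(abs(exp_max_cdf(z, n) - gumbel_cdf(z)) for z in (-1.0, 0.0, 1.0, 2.0))
    gap_par = max(abs(pareto_max_cdf(z, n, alpha) - frechet_cdf(z, alpha))
                  for z in (0.5, 1.0, 2.0, 4.0))
    return gap_exp, gap_par

for n in (10, 100, 10_000):
    print(n, max_gaps(n))
```

The gaps shrink as n grows. The light exponential tail leads to the Gumbel law and the heavy Pareto tail to the Fréchet law, which is the sense in which the tail of $1 - F_X$ determines the limit.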

35 / 49

Single-fork analysis

[Diagram: analysis pipeline]
- Execution time before forking: $F_X$
  - extreme value theorem: $X_{n:n} \to G_X$ and $X_{1:r} \to G'_X$
  - central value theorem: time before forking $X_{(1-p)n:n} \to F_X^{-1}(1-p)$
- Execution time after forking: $F_Y = g(F_X, r, \text{relaunch or not})$
  - extreme value theorem: time after forking $Y_{pn:pn} \to G_Y$
- Latency: $T = X_{(1-p)n:n} + Y_{pn:pn}$

Cost can be analyzed similarly.

36 / 49

Execution time: Pareto distribution

Pareto($\alpha$, $x_m$):

\[ F_X(x; \alpha, x_m) \triangleq \begin{cases} 1 - \left(\dfrac{x_m}{x}\right)^{\alpha} & x \ge x_m \\ 0 & x < x_m \end{cases} \]

- heavy-tail distribution
- observed in data centers [Reiss et al., 2012].

[Figure: density $f_{\text{Pareto}(2,2)}(x)$ for $x$ from 0 to 8]

Extremes

\[ X_{1:n} \sim \text{Pareto}(n\alpha, x_m) \]

\[ \frac{X_{n:n}}{x_m n^{1/\alpha}} \sim \text{Fréchet} \]

\[ E[X_{n:n}] = x_m n^{1/\alpha} \, \Gamma(1 - 1/\alpha) \propto n^{1/\alpha} \cdot E[X] \]
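The expression for $E[X_{n:n}]$ is an asymptotic one, and a quick numeric comparison shows how close it is for finite n. The sketch below is an illustration with hypothetical function names. It uses the exact mean $x_m \Gamma(1 - 1/\alpha) \, \Gamma(n+1) / \Gamma(n + 1 - 1/\alpha)$, which follows from writing $X = x_m U^{-1/\alpha}$ with $U$ uniform and integrating against the Beta(1, n) density of the smallest uniform. It requires $\alpha > 1$.

```python
import math

def expected_max_exact(n, alpha, x_m):
    """Exact E[X_{n:n}] for n i.i.d. Pareto(alpha, x_m) variables (alpha > 1)."""
    # Gamma(n + 1) / Gamma(n + 1 - 1/alpha), computed in log space to avoid overflow.
    log_ratio = math.lgamma(n + 1) - math.lgamma(n + 1 - 1.0 / alpha)
    return x_m * math.gamma(1.0 - 1.0 / alpha) * math.exp(log_ratio)

def expected_max_asymptotic(n, alpha, x_m):
    """Asymptotic form x_m * n**(1/alpha) * Gamma(1 - 1/alpha)."""
    return x_m * n ** (1.0 / alpha) * math.gamma(1.0 - 1.0 / alpha)

def expected_min(n, alpha, x_m):
    """E[X_{1:n}], using X_{1:n} ~ Pareto(n * alpha, x_m)."""
    shape = n * alpha
    return shape * x_m / (shape - 1.0)

for n in (10, 100, 400):
    exact = expected_max_exact(n, 2.0, 2.0)
    approx = expected_max_asymptotic(n, 2.0, 2.0)
    print(n, exact, approx, expected_min(n, 2.0, 2.0))
```

For Pareto(2, 2) the slowest of n tasks takes on the order of $\sqrt{n}$ times the mean, while the fastest stays close to $x_m$, which is the gap that replication exploits.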

37 / 49

Recall: single-fork policy

Parameters
- number of tasks: n
- fraction to replicate: p
- additional replicas: r

[Figure: at the “fork”, without relaunching the original copy of task k keeps running alongside replicas 1, ..., r; with relaunching the original copy is replaced by a new replica that runs alongside replicas 1, ..., r]

38 / 49

Asymptotic characterization accurate in finite regime

X ∼ Pareto (2, 2) and replication fraction p = 0.2

[Figure: latency versus n (200 to 1,000) for r = 1 no relaunch, r = 1 & relaunch, r = 2 no relaunch, and r = 2 & relaunch]
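For the relaunching case, an asymptotic latency estimate of this kind can be assembled from the central value and extreme value results. The sketch below is an assumption-laden illustration, not the thesis's formula: it assumes that relaunching replaces each straggler with r + 1 fresh i.i.d. Pareto($\alpha$, $x_m$) copies, so the post-fork time of a task is Pareto($(r+1)\alpha$, $x_m$). The function name `approx_latency_relaunch` is hypothetical.

```python
import math

def approx_latency_relaunch(n, p, r, alpha, x_m):
    """Asymptotic estimate of E[T] for single-fork with relaunching (illustrative)."""
    # Time before forking: the (1 - p)-quantile of Pareto(alpha, x_m).
    before = x_m * p ** (-1.0 / alpha)
    # Minimum of r + 1 fresh Pareto(alpha, x_m) copies is Pareto((r + 1) * alpha, x_m).
    shape = (r + 1) * alpha
    # Expected maximum over the p * n stragglers (needs shape > 1).
    after = x_m * (p * n) ** (1.0 / shape) * math.gamma(1.0 - 1.0 / shape)
    return before + after

for n in (200, 400, 1000):
    print(n, approx_latency_relaunch(n, 0.2, 1, 2.0, 2.0),
          approx_latency_relaunch(n, 0.2, 2, 2.0, 2.0))
```

The estimate grows only like $(pn)^{1/((r+1)\alpha)}$ in n, much more slowly than the $n^{1/\alpha}$ growth without replication.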

39 / 49

Latency & Cost vs. replication fraction p

X ∼ Pareto (2, 2) with n = 400

[Figure: expected latency E[T] versus p (0 to 0.6) for r = 0 & relaunch, r = 1 & relaunch, r = 1 no relaunch, r = 2 & relaunch, and r = 2 no relaunch]

[Figure: expected cost E[C_cloud] versus p (0 to 0.6) for the same five policies]

40 / 49

Latency-cost trade-off

X ∼ Pareto (2, 2) with n = 400

[Figure: latency (0 to 70) versus cost (3 to 9) for r = 0 & relaunch, r = 1 & relaunch, r = 1 no relaunch, r = 2 & relaunch, and r = 2 no relaunch]

41 / 49

Recap

A framework for analyzing single-fork policies
- Can be extended to multi-fork

Applicable to any distributions that can be analyzed via EVT
- Pareto, Exponential, Erlang, etc.

[Diagram: log files → (density estimation) → execution time $F_X$ → (our framework) → latency-cost trade-off → (application scenario) → replication strategy]

42 / 49

Computing with unreliable resources: scheduling

[Diagram nodes: mathematical abstraction, performance trade-offs, design guidance]
- latency vs. cost
- tail behavior of execution time distribution
- asymptotic analysis: order statistics, extreme value theory

43 / 49

Crowd-based ranking via noisy comparisons

Crowd-based ranking

Rank by score assignment
- Examples: IMDb.com, Yelp.com

Rank by pairwise comparison
- Examples: admission/recruiting, knockout stage of tournaments

Challenges:
- human comparisons are noisy: answer flipped w.p. ε
- human comparisons are expensive (economic cost, time, . . . )

What is the fundamental trade-off between ranking accuracy and the number of comparisons?

approximate sorting via (noisy) comparisons

44 / 49

Information-theoretic lower bounds on #comparisons

Distortion measure: $\ell_1$ distance of permutations

\[ d_{\ell_1}(\pi_1, \pi_2) \triangleq \sum_{i=1}^{n} |\pi_1(i) - \pi_2(i)| \]

To achieve distortion $D \le \Theta(n^{1+\delta})$:

Approximate sorting with noiseless comparisons
- [W., Mazumdar & Wornell, ISIT’14]: at least $(1-\delta) \, n \log n$ comparisons
- Tight: the multiple selection algorithm in [Kaligosi 2005] achieves this bound

Approximate sorting with noisy comparisons
- [W., Mazumdar & Wornell, ISIT’14]: at least $\dfrac{1-\delta}{1 - H_b(\varepsilon)} \, n \log n$ comparisons
- Existing algorithms only known to be $O(n \log n)$
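The distortion measure and the leading terms of the two bounds are easy to evaluate. The sketch below is an illustration with hypothetical function names. It assumes base-2 logarithms (the slide does not state the base) and treats the expressions as leading-order terms of asymptotic bounds, not exact comparison counts.

```python
import math

def l1_distance(pi1, pi2):
    """ell_1 distance between two permutations given as equal-length sequences of ranks."""
    return sum(abs(a - b) for a, b in zip(pi1, pi2))

def binary_entropy(eps):
    """Binary entropy H_b(eps) in bits."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1.0 - eps) * math.log2(1.0 - eps)

def comparison_lower_bound(n, delta, eps=0.0):
    """Leading term (1 - delta) / (1 - H_b(eps)) * n * log2(n); eps = 0 is the noiseless case."""
    capacity = 1.0 - binary_entropy(eps)
    if capacity <= 0.0:
        raise ValueError("comparisons flipped with probability 1/2 carry no information")
    return (1.0 - delta) / capacity * n * math.log2(n)

print(l1_distance([1, 2, 3], [3, 2, 1]))
print(comparison_lower_bound(1024, 0.5))
print(comparison_lower_bound(1024, 0.5, eps=0.1))
```

The factor $1 / (1 - H_b(\varepsilon))$ is the reciprocal of the capacity of a binary symmetric channel with crossover probability $\varepsilon$, so noisier comparisons inflate the bound by the amount of information each answer loses.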

45 / 49

More results

More distortion measures [W., Mazumdar & Wornell, ISIT’14]
- Kendall tau distance, Chebyshev distance, . . .
- Relationships among distortion measures

Other distributional model
- Mallows distributional model

46 / 49

Computing with unreliable resources: ranking

[Diagram nodes: mathematical abstraction, performance trade-offs, design guidance]
- accuracy vs. #comparisons
- error probability of noisy comparisons
- asymptotic analysis: rate-distortion theory, combinatorics

47 / 49

Computing with unreliable resources

[Diagram nodes: mathematical abstraction, performance trade-offs, design guidance]
- reliability vs. resource usage
- statistical properties of unreliability
- asymptotic analysis: suitable for large scale systems!

A unified “coding theory” for these computing problems?

A journey of a thousand miles begins with a single step.— Lao Tzu

48 / 49

This thesis is impossible without . . .

Thesis committee: Greg, Yury and Devavrat

Collaborators: Gauri Joshi, Arya Mazumdar

The community: SiA, RLE, LIDS, CSAIL, MTL, EECS, GSA, GSC, MIT, . . .
- friends
- colleagues
- constructive criticisers
- poker opponents
- . . .

My parents

Hengni

Thank you!

49 / 49
