
Computing with Unreliable Resources: Design, Analysis and Algorithms

Da Wang, Ph.D. Candidate

Thesis Committee: Gregory W. Wornell (Advisor), Yury Polyanskiy, Devavrat Shah

Signals, Information and Algorithms Laboratory

Thesis Defense

May 8, 2014

Computing with unreliable resources: an emerging paradigm

Large amounts of data require large amounts of computing resources:

- VLSI circuits
- cloud computing / data centers
- …

Why are resources unreliable? Technological constraints and cost constraints.

A study via concrete applications

1. Circuit design with unreliable components
2. Scheduling parallel tasks with variable response times
3. Crowd-based ranking via noisy comparisons

Reliable circuit design with unreliable components

In the news: April 2013

AMD claims the 20nm transition signals the end of Moore's law.

Chief product architect at AMD:

"If you print too few transistors your chip will cost too much per transistor . . . and if you put too many it will cost too much per transistor."

New challenge: fabrication flaws

Fabrication flaws:

1. process variations
2. fabrication defects

Both get worse as we approach physical limits!

Flash ADC design with imprecise comparators: captures process variations (this talk ✓)

Digital circuit design with faulty components: captures fabrication defects (in thesis)

Reliable Analog-to-Digital Converter Design with imprecise comparators

A theoretical framework:

- What are the fundamental performance limits?
- Is the optimal design for the precise-comparator case still optimal?
- Should we use imprecise comparators?

Joint work with: Yury Polyanskiy, Gregory Wornell
Acknowledgment: Frank Yaul, Anantha Chandrakasan


Analog-to-Digital Converter (ADC)

An ADC maps an analog input x to a digital code, which is decoded to a reconstruction value x̂.

[Figure: a 3-bit example. The input range [vlo = v0, v8 = vhi] is divided by n = 2^b − 1 = 7 reference voltages v1, …, v7 into cells with ADC codes 000–111 and 2^b = 8 reconstruction values c1, …, c8.]

- 2^b reconstruction values
- n = 2^b − 1 reference voltages

ADC and its key building block: comparator

Reconstruction values c1, c2, …, cn, c_{n+1} are separated by reference voltages v1, v2, …, vn.

Comparator: given input vin and reference vref,

yout = 1 if vin > vref, and yout = 0 if vin ≤ vref.

The Flash ADC architecture: n = 2^b − 1 comparators with references v1, …, vn compare the input vin in parallel; their outputs y1, …, yn feed an encoder that produces the b-bit code.

The imprecise comparator due to process variation

The comparator realizes Vin = vin + Zin and Vref = vref + Zref, with Zin ∼ N(0, σ1²) and Zref ∼ N(0, σ2²), and outputs Yout by comparing Vin against Vref.

Zin and Zref: offsets due to process variation

- variation ↗ as comparator size ↘
- independent, zero-mean, Gaussian distributed [Kinget 2005, Nuzzo 2008]

Note:

- fixed after fabrication
- randomness: over a collection of comparators
- aggregate variation: Z = Zref − Zin ∼ N(0, σ²)
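To make the model concrete, here is a minimal Monte Carlo sketch of a flash ADC built from comparators with fabrication offsets. It assumes a uniform designed grid on (−1, 1), midpoint reconstruction from calibrated references, and input range [−1, 1]; the function names are illustrative, not from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def fabricate(designed_refs, sigma):
    # V_i = v_i + Z_i with Z_i i.i.d. ~ N(0, sigma^2): aggregate offset per comparator
    return designed_refs + rng.normal(0.0, sigma, size=designed_refs.shape)

def flash_adc(x, actual_refs, lo=-1.0, hi=1.0):
    # Ideal flash encoding: each comparator outputs 1 iff its reference is below x.
    # With calibrated (known) references, reconstruct at the cell midpoint.
    v = np.sort(actual_refs)
    k = np.searchsorted(v, x)                 # number of comparators that fire
    edges = np.concatenate(([lo], v, [hi]))
    return 0.5 * (edges[k] + edges[k + 1])    # reconstruction value

b = 4
n = 2 ** b - 1                                # 15 comparators, no redundancy
designed = np.linspace(-1, 1, n + 2)[1:-1]    # uniform designed references
actual = fabricate(designed, sigma=0.05)
for x in (-0.8, 0.0, 0.3):
    print(x, flash_adc(x, actual))
```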

Reference voltages impact ADC performance

2-bit ADC with references v1, v2, v3: input x is mapped to reconstruction x̂, with error function

e(x) = x̂ − x


A call for a mathematical framework

Existing theoretical error analysis (e.g., [Lundin 2005]):

- assumes small process variation
- does not attempt to change the design

ADC design with imprecise comparators:

- Practice: ADC with redundancy [Flynn et al., 2003]; ADC with redundancy, calibration and reconfiguration [Daly et al., 2008]
- Theory: little prior work. Related: scalar quantizer with random thresholds for uniform input [Goyal 2011]


System model: ADC with redundancy and calibration

A b-bit ADC with redundancy and calibration:

- designed references vn go through fabrication: Vi = vi + Zi, with Zi i.i.d. ∼ N(0, σ²), i = 1, 2, …, n
- the comparator bank with actual references Vn maps the input x to comparison outcomes Yn
- calibration produces estimates V̂n of the actual references, which the encoder uses to output x̂ ∈ {c1, …, c_{2^b}}
- n = r · (2^b − 1) comparators, where r is the redundancy factor

Performance measures of ADC

- error function: e(x) = x̂ − x
- mean-square error: MSE = E_X[e(X)²]
- maximum quantization error: emax = max_x |e(x)|

We analyze the performance (MSE, emax) of the fabricated references Vn resulting from a design vn, and ask:

- Is a uniform v1, v2, …, vn still optimal for uniform input?
- Is scaling down the size of comparators actually beneficial?

Challenge: randomness in reference voltages

What we design: v0 v1 v2 v3 v4 (an ordered grid). After fabrication, the references are randomly displaced: the ordering may change, and the interval sizes become random.

Observations:

- Ordering may change → order statistics
- Random interval sizes → ?

interval sizes ↔ density of references ↔ high resolution approximation


High resolution approximation

Assume n → ∞. Represent vn by a point density function τ(x):

τ(x) dx ≈ (number of vn in [x, x + dx]) / n

Similarly, Vn has point density function λ(x):

λ(x) dx ≈ E[number of Vn in [x, x + dx]] / n

The point density function simplifies analysis!


Point density function guides references design

The high resolution approximation maps reference voltages vn to a point density function τ(·); conversely, given n, we "sample" τ(·) to obtain vn.

Examples (a sampling sketch follows):

- τ ∼ Unif([−1, 1]): vn is an n-point uniform grid on [−1, 1]
- τ(x) = 0.5 · δ(x − a) + 0.5 · δ(x + a): n/2 reference voltages at +a and n/2 at −a
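A minimal sketch of the "sampling" step, assuming references are placed at the i/(n + 1) quantiles of τ (the exact rule used in the thesis may differ); both examples above are reproduced.

```python
import numpy as np

def sample_references(tau_inverse_cdf, n):
    # Place reference i at the i/(n+1) quantile of the point density tau.
    return tau_inverse_cdf(np.arange(1, n + 1) / (n + 1))

# Example 1: tau ~ Unif([-1, 1])  ->  an n-point uniform grid on [-1, 1]
print(sample_references(lambda p: 2 * p - 1, n=7))

# Example 2: tau = 0.5 delta(x - a) + 0.5 delta(x + a)
#            ->  n/2 references at -a and n/2 at +a
a = 0.6
print(sample_references(lambda p: np.where(p <= 0.5, -a, a), n=8))
```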

With process variation, fabricated references matter

- τ(·) is what we can design: it is "sampled" to give vn.
- Fabrication maps vn to Vn: Vi = vi + Zi, with Zi i.i.d. ∼ φ(·) = N(0, σ²).
- λ(·), the point density of Vn, is what determines performance.

[W., Polyanskiy & Wornell, ISIT'14]:

λ(x) = (τ ∗ φ)(x)    (∗: convolution)

So we characterize performance in terms of λ(·) and want to find the optimal τ(·).

[Figure: a design density τ(x) on [−1, 1] and its smoothed version λ(x).]

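The convolution identity is easy to check numerically. The sketch below fabricates a large design drawn from τ ∼ Unif([−1, 1]) and compares the empirical density of Vn against (τ ∗ φ)(x), which for this τ equals 0.5(Φ((x + 1)/σ) − Φ((x − 1)/σ)).

```python
import numpy as np
from math import erf, sqrt

def Phi(t):                                      # standard normal CDF
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

rng = np.random.default_rng(1)
n, sigma = 200_000, 0.1

v = 2 * np.arange(1, n + 1) / (n + 1) - 1        # quantiles of tau ~ Unif([-1, 1])
V = v + rng.normal(0.0, sigma, size=n)           # fabrication: V_i = v_i + Z_i

for x in (-1.0, -0.5, 0.0):
    lam = 0.5 * (Phi((x + 1) / sigma) - Phi((x - 1) / sigma))
    emp = np.mean(np.abs(V - x) < 0.01) / 0.02   # empirical density of V^n at x
    print(f"x = {x:+.1f}   (tau*phi)(x) = {lam:.3f}   empirical = {emp:.3f}")
```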

Process variation increases MSE 6-fold

MSE = E_X[e(X)²], input X ∼ fX(·).

Classical case [Bennett 1948, Panter & Dite 1951]:

MSE ≈ (1/(12 n²)) ∫ fX(x)/λ²(x) dx,    λ = τ

With process variations [W., Polyanskiy & Wornell, ISIT'14]:

MSE ≈ (1/(2 n²)) ∫ fX(x)/λ²(x) dx,    λ = τ ∗ φ

Why 6 times? A uniform grid versus a random division of an interval (a topic in order statistics): each cell of length L contributes roughly L³/12 to the error, and random spacings are approximately exponential, for which E[L³] = 6 (E[L])³.

Optimal τ: a necessary and sufficient condition [W., Polyanskiy & Wornell, ISIT'14].
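The factor of 6 can be reproduced in a few lines: compare a uniform grid of thresholds with i.i.d. uniform random thresholds (a stand-in for a random division of the interval, an assumption of this sketch) for uniform input; the MSE ratio approaches 6 as n grows.

```python
import numpy as np

rng = np.random.default_rng(2)

def quantizer_mse(thresholds, xs, lo=0.0, hi=1.0):
    # Midpoint reconstruction; average squared error over the inputs xs.
    v = np.sort(thresholds)
    edges = np.concatenate(([lo], v, [hi]))
    k = np.searchsorted(v, xs)
    return np.mean((0.5 * (edges[k] + edges[k + 1]) - xs) ** 2)

n, trials = 200, 200
xs = rng.uniform(0, 1, 100_000)                  # uniform input on [0, 1]
grid = np.arange(1, n + 1) / (n + 1)             # uniform grid: MSE ~ 1/(12 n^2)
rand = np.mean([quantizer_mse(rng.uniform(0, 1, n), xs) for _ in range(trials)])
print(rand / quantizer_mse(grid, xs))            # -> about 6
```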

MSE-optimal designs can be quite different

Uniform input distribution [W., Polyanskiy & Wornell, ISIT'14]: fX ∼ Unif([−1, 1]), σ0 ≈ 0.7228.

- σ < σ0: a locally optimal iterative optimization ⇒ τ*(x)
- σ ≥ σ0: the necessary and sufficient condition ⇒ τ*(x) = δ(x)

[Figure: τ*(x) and λ*(x) on [−1, 1] for σ = 0.1, 0.2, …, 0.8.]


Analyzing maximum quantization error

emax = max_x |e(x)|

High resolution approximations for:

- the c.d.f. P[emax ≤ x]
- a range [a, b] with P[emax ∈ [a, b]] ≈ 1

Accurate for finite n (e.g., when n ≥ 100). emax is often measured in LSB ≜ (input range)/2^b.

MSE-optimal designs improve yield

Yield ≜ P[emax ≤ Δmax], e.g., Δmax = 1 LSB.

MSE-optimal designs improve yield by 5%–10% for a 6-bit Flash ADC.

[Figure: yield vs. σ for redundancy factors r = 6 and r = 8, MSE-optimal vs. uniform designs.]
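Yield for a given design is straightforward to estimate by Monte Carlo. A sketch for the uniform design, under the assumptions that reconstruction is at cell midpoints (so emax is half the largest cell) and offsets are clipped to the input range [−1, 1]; an MSE-optimal design would be scored the same way.

```python
import numpy as np

rng = np.random.default_rng(3)

def e_max(actual_refs, lo=-1.0, hi=1.0):
    # With midpoint reconstruction, e_max is half the largest cell width.
    edges = np.concatenate(([lo], np.sort(np.clip(actual_refs, lo, hi)), [hi]))
    return 0.5 * np.max(np.diff(edges))

def yield_uniform(b, r, sigma, delta_max, trials=2000):
    n = r * (2 ** b - 1)                          # r: redundancy factor
    designed = np.linspace(-1, 1, n + 2)[1:-1]    # uniform design
    ok = sum(e_max(designed + rng.normal(0, sigma, n)) <= delta_max
             for _ in range(trials))
    return ok / trials

lsb = 2.0 / 2 ** 6                                # input range / 2^b
print(yield_uniform(b=6, r=8, sigma=0.2, delta_max=lsb))
```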

Scaling down the size of comparators is beneficial

For circuit fabrication [Kinget 2005, Nuzzo 2008]:

process variation σ² ∝ 1/(component area)

Given a fixed silicon area:

# components n ∝ 1/(component area)

For a uniform input distribution, when σ ≥ σ0:

MSE ≈ 2πσ²/n²,    emax ≐ Θ(√(π/2) · σb/n)

With σ² ∝ n, this gives

MSE = Θ(1/n),    emax = Θ(b/√n)

Building an ADC with more, smaller, but less precise comparators improves accuracy!


Computing with unreliable resources: circuit design

mathematical abstraction (distribution of fabrication variation) → performance trade-offs (redundancy vs. yield) → design guidance, via asymptotic analysis (quantization theory, order statistics)

Scheduling parallel tasks with variable response times

Executing parallel tasks: simple yet important

Executing parallel tasks appears in:

- the "Map" stage of MapReduce
- distributed parallel algorithms: ADMM, MCMC, …
- time series analysis: yearly data → weekly data
- crowdsourcing
- …

In common. Given: a large collection of tasks that can be run in parallel. Want: results for all tasks. But this could be slow!


Issue: variability in data centers

Observations:

- The response time of a computer in a data center varies.
- Latency is determined by the slowest machine, and gets worse as we use more machines.

In Google's data centers: machines are shared [Dean 2012]. Execution times for a large collection of tasks [Dean 2013]: median completion time for one task is 1 ms; median completion time for all tasks is 40 ms.

How to reduce latency?

Backup tasks in Google MapReduce [Dean et al., 2008]

The backup task option:

1. Run n tasks on n machines in parallel.
2. Replicate: when 10% of the tasks are left.
3. Take the earliest results.

Effectiveness: reduces latency significantly (e.g., by 1/3 for distributed sort); handles the issue of "stragglers".

A call for theoretical analysis

Considerable follow-up work in systems:

1. [Zaharia et al., USENIX OSDI 2008]: adopted by Facebook
2. [Ananthanarayanan et al., USENIX OSDI 2010]: adopted by Microsoft Bing
3. [Ananthanarayanan et al., USENIX NSDI 2013]

Existing theoretical work is inadequate: stochastic scheduling does not consider task replication. We need to understand:

- latency reduction vs. additional resource usage
- when and how replication could be beneficial

Scheduling problem formulation

Problem: execute a collection of n parallel tasks, where n is in the hundreds or thousands [Reiss et al., 2012].

System model:

- Execution time of each task is i.i.d. ∼ FX.
- Scheduling actions: send a task to a new machine to run; terminate all machines running a certain task.
- Feedback: instantaneous feedback upon completion.

Latency

The j-th copy of task i is launched at time t_{i,j} and has execution time X_{i,j} i.i.d. ∼ FX.

Completion time for task i: Ti ≜ min_j (t_{i,j} + X_{i,j})

Latency: T ≜ max_i Ti

[Example timeline on machines N1–N4: copies of task 1 start at t_{1,1} = 0 and t_{1,2} = 2, copies of task 2 at t_{2,1} = 0 and t_{2,2} = 5; T1 = 8 and T2 = 10, so T = max{T1, T2} = 10.]

Total machine time as cost measure

The j-th copy of task i is launched at time t_{i,j} and has execution time X_{i,j} i.i.d. ∼ FX. In data centers, the natural cost is the total machine time

C ≜ Σ_{i=1}^{n} Σ_{j=1}^{r_i + 1} (Ti − t_{i,j})⁺

(r_i: number of additional copies of task i; (·)⁺: positive part)

Note: we focus on the expected values E[T] and E[C].

Replication helps!

X = 2 w.p. 0.9, and 7 w.p. 0.1.

No replication:

E[T] = 2.5, E[C] = E[T] = 2.5

Replicate the task at time t = 2 (if unfinished): the task completes at time 2 w.p. 0.9, at time 4 w.p. 0.1 × 0.9, and at time 7 w.p. 0.1 × 0.1, so

E[T] = 2.23, E[C] = 2.46
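The numbers above follow from a short exact enumeration of the two-point example (both copies are terminated once the task completes, matching the cost definition):

```python
pmf = {2: 0.9, 7: 0.1}      # execution time: 2 w.p. 0.9, 7 w.p. 0.1
t_fork = 2                  # launch a backup copy at time 2 if unfinished
ET = EC = 0.0
for x1, p1 in pmf.items():
    if x1 <= t_fork:                       # done before the backup launches
        ET += p1 * x1
        EC += p1 * x1
        continue
    for x2, p2 in pmf.items():             # backup copy's execution time
        T = min(x1, t_fork + x2)           # earliest result wins
        C = T + (T - t_fork)               # machine time of both copies
        ET += p1 * p2 * T
        EC += p1 * p2 * C
print(ET, EC)                              # 2.23 and 2.46
```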

Execution time modeling

Discrete random variables [W., Joshi & Wornell, SIGMETRICS'14] (in thesis):

- arise directly from estimation (quantiles)
- offer more flexible modeling

Continuous random variables (this talk ✓):

- Pareto, Exponential
- analysis of an important class of policies

Single-fork policy

- Run n tasks in parallel initially.
- When a p fraction of the tasks is left, replicate each unfinished task r times, either
  - without relaunching: let all the unfinished tasks keep running, or
  - with relaunching: relaunch all the unfinished tasks.

Note: after replication there are r + 1 copies of each unfinished task in total; when r = 0, only the relaunching case is interesting.


Latency analysis

Order statistics: given random variables X1, X2, …, Xn,

Xmin = X_{1:n} ≤ X_{2:n} ≤ … ≤ X_{n:n} = Xmax

Latency:

- Replicated tasks have a new execution time distribution FY, with FY = g(FX, r, relaunch or not).
- Replicating for the last p fraction of the tasks gives

T = X_{(1−p)n:n} (time before forking) + Y_{pn:pn} (time after forking)

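A simulation sketch of this decomposition, assuming Pareto(2, 2) execution times (as in the examples later in the talk); the policy is simulated directly, so the asymptotic formula can be checked against it.

```python
import numpy as np

rng = np.random.default_rng(4)

def pareto(alpha, xm, size):
    # Inverse-CDF sampling: F(x) = 1 - (xm/x)^alpha for x >= xm
    return xm / rng.uniform(size=size) ** (1.0 / alpha)

def single_fork_latency(n, p, r, relaunch, trials=500):
    lat = np.empty(trials)
    for t in range(trials):
        x = np.sort(pareto(2.0, 2.0, n))
        k = int((1 - p) * n)
        t_fork = x[k - 1]                  # X_{(1-p)n:n}: time before forking
        m = n - k                          # unfinished tasks
        copies = r + 1 if relaunch else r  # fresh copies started at the fork
        y = np.full(m, np.inf) if relaunch else x[k:] - t_fork
        if copies > 0:
            y = np.minimum(y, pareto(2.0, 2.0, (m, copies)).min(axis=1))
        lat[t] = t_fork + y.max()          # + Y_{pn:pn}: time after forking
    return lat.mean()

print(single_fork_latency(n=400, p=0.2, r=1, relaunch=False))
print(single_fork_latency(n=400, p=0.2, r=1, relaunch=True))
```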

Central value theorem

The p-quantile of a distribution FX: xp ≜ FX⁻¹(p).

As n → ∞,

X_{pn:n} → N(xp, p(1 − p)/(n fX²(xp)))

Time before forking:

E[X_{(1−p)n:n}] = x_{1−p} ≜ FX⁻¹(1 − p)
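A quick numerical check of the quantile asymptotics, using Exponential(1), for which xp = −log(1 − p) and fX(xp) = 1 − p:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, trials = 2000, 0.7, 2000

# Sample quantile X_{pn:n} across many trials
q = np.sort(rng.exponential(size=(trials, n)), axis=1)[:, int(p * n) - 1]
print(q.mean(), -np.log(1 - p))                  # mean -> x_p
print(n * q.var(), p * (1 - p) / (1 - p) ** 2)   # n * variance -> p(1-p)/f^2(x_p)
```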

Extreme value theorem

Theorem (Fisher–Tippett–Gnedenko theorem). Given X1, X2, …, Xn i.i.d. ∼ FX, if there exist sequences of constants an > 0 and bn ∈ ℝ such that

P[(X_{n:n} − bn)/an ≤ z] → G(z)

as n → ∞ and G is a non-degenerate distribution, then G belongs to one of the following families, where α > 0:

- Gumbel law: G(z) = exp{−exp(−z)} for z ∈ ℝ
- Fréchet law: G(z) = 0 for z ≤ 0, and exp{−z^{−α}} for z > 0
- Weibull law: G(z) = exp{−(−z)^α} for z < 0, and 1 for z ≥ 0

Given FX, the distribution of X_{n:n} can be characterized as n → ∞. The key is the tail behavior of 1 − FX. The theorem is also applicable to X_{1:n}.

Single-fork analysis

- Execution time before forking: FX. By the central value theorem, the time before forking X_{(1−p)n:n} → FX⁻¹(1 − p).
- Execution time after forking: FY, with FY = g(FX, r, relaunch or not). By the extreme value theorem, the time after forking Y_{pn:pn} → GY.
- Total: T = X_{(1−p)n:n} + Y_{pn:pn}.

Cost can be analyzed similarly.

Execution time: Pareto distribution

Pareto(α, xm):

FX(x; α, xm) ≜ 1 − (xm/x)^α for x ≥ xm, and 0 for x < xm

- a heavy-tail distribution
- observed in data centers [Reiss et al., 2012]

[Figure: density of Pareto(2, 2).]

Extremes:

X_{1:n} ∼ Pareto(nα, xm)

X_{n:n}/(xm n^{1/α}) ∼ Fréchet

E[X_{n:n}] = xm n^{1/α} Γ(1 − 1/α) ∝ n^{1/α} · E[X]
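The extreme-value formulas are easy to validate by simulation. The sketch below uses α = 3 rather than the α = 2 of the running example, since for α ≤ 2 the maximum has infinite variance and a Monte Carlo average of X_{n:n} converges very slowly:

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(6)
alpha, xm, n, trials = 3.0, 2.0, 1000, 5000

x = xm / rng.uniform(size=(trials, n)) ** (1 / alpha)   # Pareto(alpha, xm)
print(x.max(axis=1).mean())                             # empirical E[X_{n:n}]
print(xm * n ** (1 / alpha) * gamma(1 - 1 / alpha))     # xm n^{1/a} Gamma(1 - 1/a)
# X_{1:n} ~ Pareto(n alpha, xm), whose mean is xm * n a / (n a - 1)
print(x.min(axis=1).mean(), xm * n * alpha / (n * alpha - 1))
```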

Recall: single-fork policy

Parameters:

- number of tasks: n
- fraction to replicate: p
- additional replicas: r
- with or without relaunching

Asymptotic characterization is accurate in the finite regime

X ∼ Pareto(2, 2) and replication fraction p = 0.2.

[Figure: latency vs. n (200 to 1,000) for r = 1 and r = 2, with and without relaunching.]

Latency & cost vs. replication fraction p

X ∼ Pareto(2, 2) with n = 400.

[Figures: E[T] and E[Ccloud] vs. p ∈ [0, 0.6] for r = 0 with relaunching, and r = 1, 2 with and without relaunching.]

Latency-cost trade-off

X ∼ Pareto(2, 2) with n = 400.

[Figure: latency vs. cost for r = 0 with relaunching, and r = 1, 2 with and without relaunching.]


Recap

- A framework for analyzing single-fork policies; it can be extended to multi-fork.
- Applicable to any distribution that can be analyzed via EVT: Pareto, Exponential, Erlang, etc.

Application scenario: log files → (density estimation) → execution time FX → (our framework) → latency-cost trade-off → replication strategy.

Computing with unreliable resources: scheduling

mathematical abstraction (tail behavior of the execution time distribution) → performance trade-offs (latency vs. cost) → design guidance, via asymptotic analysis (order statistics, extreme value theory)

Crowd-based ranking via noisy comparisons

Crowd-based ranking

Rank by score assignment. Examples: IMDb.com, Yelp.com.

Rank by pairwise comparison. Examples: admission/recruiting, knockout stages of tournaments. Challenges:

- human comparisons are noisy: an answer is flipped w.p. ε
- human comparisons are expensive (economic cost, time, …)

What is the fundamental trade-off between ranking accuracy and the number of comparisons? This is approximate sorting via (noisy) comparisons.

Information-theoretic lower bounds on #comparisons

Distortion measure: the ℓ1 distance of permutations,

d_{ℓ1}(π1, π2) ≜ Σ_{i=1}^{n} |π1(i) − π2(i)|

To achieve distortion D ≤ Θ(n^{1+δ}):

- Approximate sorting with noiseless comparisons [W., Mazumdar & Wornell, ISIT'14]: at least (1 − δ) n log n comparisons. This is tight: the multiple selection algorithm in [Kaligosi 2005] achieves the bound.
- Approximate sorting with noisy comparisons [W., Mazumdar & Wornell, ISIT'14]: at least ((1 − δ)/(1 − Hb(ε))) n log n comparisons. Existing algorithms are only known to be O(n log n).

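A small sketch of the distortion measure and the noisy lower bound; logs are taken base 2 here, matching the binary entropy Hb (an assumption about the intended base):

```python
import math

def d_l1(pi1, pi2):
    # l1 distance between two permutations (as sequences of ranks)
    return sum(abs(a - b) for a, b in zip(pi1, pi2))

def binary_entropy(eps):
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)

def noisy_lower_bound(n, delta, eps):
    # At least (1 - delta) / (1 - Hb(eps)) * n log n comparisons
    return (1 - delta) / (1 - binary_entropy(eps)) * n * math.log2(n)

print(d_l1([0, 1, 2, 3], [1, 0, 3, 2]))           # 4
print(noisy_lower_bound(n=1000, delta=0.5, eps=0.1))
```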

More results

- More distortion measures [W., Mazumdar & Wornell, ISIT'14]: Kendall tau distance, Chebyshev distance, …; relationships among distortion measures.
- Another distributional model: the Mallows model.

Computing with unreliable resources: ranking

mathematical abstraction (error probability of noisy comparisons) → performance trade-offs (accuracy vs. #comparisons) → design guidance, via asymptotic analysis (rate-distortion theory, combinatorics)

Computing with unreliable resources

mathematical abstraction (statistical properties of unreliability) → performance trade-offs (reliability vs. resource usage) → design guidance, via asymptotic analysis (suitable for large-scale systems!)

A unified "coding theory" for these computing problems?

"A journey of a thousand miles begins with a single step." (Lao Tzu)


This thesis is impossible without . . .

- Thesis committee: Greg, Yury and Devavrat
- Collaborators: Gauri Joshi, Arya Mazumdar
- The community: SiA, RLE, LIDS, CSAIL, MTL, EECS, GSA, GSC, MIT, … (friends, colleagues, constructive critics, poker opponents, …)
- My parents
- Hengni

Thank you!

