81
Reliable Intermittent Computers Brandon Lucia | Assistant Professor, CMU ECE ABSTRACT Research Group - https:// intermittent.systems with PhD Students: Kiwan Maeng, Emily Ruppel, Vignesh Balaji Graham Gobieski, Milijana Surbatovich, Brad Denby, Harsh Desai PhD Alumni: Alexei Colin Academic Collaborators: Nathan Beckman (CMU), Michael Bond (OSU), Joseph Devietti (UPenn),… Industry Collaborators: Zac Manchester (Stanford/KickSat) Alanson Sample (Disney Research Pittsburgh)

Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Reliable Intermittent ComputersBrandon Lucia | Assistant Professor, CMU ECEABSTRACT Research Group - https://intermittent.systemswith PhD Students: Kiwan Maeng, Emily Ruppel, Vignesh Balaji

Graham Gobieski, Milijana Surbatovich, Brad Denby, Harsh DesaiPhD Alumni: Alexei ColinAcademic Collaborators: Nathan Beckman (CMU), Michael Bond (OSU), Joseph Devietti (UPenn),…Industry Collaborators: Zac Manchester (Stanford/KickSat)

Alanson Sample (Disney Research Pittsburgh)

Page 2: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 2

Wearables, sensors, “Internet of Things”, tiny satellites, medical

implants, environment monitoring, security, computational art,

interactive clothing, smart furniture, smart carpets, smart toilet paper

Emerging Devices Everywhere

Page 3: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 4

Tethered to a power source

Page 4: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 6

computer (microcontroller)

energy buffer (capacitor)

energy receiver

energy source

Energy harvesting untethers devices

empty space

Page 5: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 7

Page 6: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 8

No battery replacements

No battery lifetime limitation

No environmental battery waste

No bulky battery

Page 7: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 9

energy source

empty space

…but energy is intermittentsoftware becomes inherently unreliable.

—and as a result—

computer (microcontroller)

energy buffer (capacitor)

energy receiver

Page 8: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 10

Designing for IntermittenceSystem Support for Reliable Intermittence

The Intermittent Execution ModelReasoning About Intermittently-powered Devices

Intermittent & Physically Constrained Computing Platforms

Hardware, Programming Models, System Software

Page 9: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 11

Running on intermittent energy

Available

Energy

Time

Charging, dead.

Page 10: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 12

Running on intermittent energy

Available

Energy

Time

“Life Line”

Operating!

Page 11: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 13

Running on intermittent energy

Available

Energy

Time

“Death Line”Not charging, dead.

Page 12: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 14

Running on intermittent energy

Available

Energy

Time

Charging, dead.

Page 13: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 15

Running on intermittent energy

Available

Energy

Time

“Life Line”

Operating!

Page 14: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 16

Available

Energy

Time

“Death Line”

Computer

shuts down

Computer

reboots

“Life Line”

Intermittent Operation on Harvested Energy

Page 15: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 17

Available

Energy

Time

Compute ComputeRebootReboot

Page 16: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 18

Time

Compute Reboot Compute

The Intermittent Execution Model

Page 17: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 19

Time

The Intermittent Execution Model

Goal: Run programs that take longer than one green box

Page 18: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 20

void main(void){

for( i = 1 .. 10)

append()

}

void append(){

sz++

buf[sz] = ‘a’

}

/*Store a letter ‘a’ in buf[sz]

and increase sz by one*/

/*Store 10 letters ‘a’ in the buffer*/

Page 19: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 21

void main(void){

for( i = 1 .. 10)

append()

}

void append(){

sz++

buf[sz] = ‘a’

}

for( i = 1 )

append()sz++

buf[sz] = ‘a’

for( i = 2 )

append()sz++

buf[sz] = ‘a’

REBOOT

Page 20: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 22

void main(void){

for( i = 1 .. 10)

append()

}

void append(){

sz++

buf[sz] = ‘a’

}

for( i = 1 )

append()

for( i = 1 )

append()sz++

buf[sz] = ‘a’

for( i = 2 )

append()sz++

buf[sz] = ‘a’

REBOOT

REBOOT

Page 21: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 23

void main(void){

for( i = 1 .. 10)

append()

}

void append(){

sz++

buf[sz] = ‘a’

}

for( i = 1 )

append()

for( i = 2 )

append()

for( i = 3 )

append()

sz++

buf[sz] = ‘a’

sz++

buf[sz] = ‘a’

sz++

buf[sz] = ‘a’

for( i = 1 )

append()

for( i = 1 )

append()sz++

buf[sz] = ‘a’

for( i = 2 )

append()sz++

buf[sz] = ‘a’

REBOOT

REBOOT

REBOOT

Page 22: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 24

void main(void){

for( i = 1 .. 10)

append()

}

void append(){

sz++

buf[sz] = ‘a’

}

main()append()

for( i = …)

call append()

<loop>

end of main()

Control-flow Graph

buf[sz] = ‘a’

sz++

We can model intermittence as a control-flow problem

Page 23: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 25

main()append()

for( i = …)

call append()

<loop>

end of main()

Control-flow Graph

Failure Induces

Implicit Control-flow

Intermittent Execution

Challenge:

Implicit control-flow

“back in time” on reboot.

buf[sz] = ‘a’

sz++

Back in time!

Page 24: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 26

Mixture of Volatile & Non-volatile State

volatile memory (e.g.,

DRAM, SRAM registers)

non-volatile memory

(e.g., Flash, FRAM)

Access latencies growing

similar w/ new technology

Page 25: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 27

Time

Reboots clear volatile state and

preserve non-volatile state

Persists!

Volatile

State

Non-volatile

State

Clear! Clear! Clear! Clear!

Page 26: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 28

main()append()

for( i = …)

call append()

<loop>

end of main()

Control-flow Graph

Checkpoint introduce more

compleximplicit control-flow

Intermittent Execution

Challenge:

Implicit control-flow

“back in time” on reboot.

buf[sz] = ‘a’

sz++

Checkpoint Volatile State

Page 27: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 29

void main(void){

for( i = 1 .. 10)

append()

}

void append(){

sz++

buf[sz] = ‘a’

}

for( i = 1 )

append()

for( i = 2 )

append()

for( i = 3 )

append()

sz++

buf[sz] = ‘a’

sz++

buf[sz] = ‘a’

sz++

buf[sz] = ‘a’

for( i = 1 )

append()

for( i = 1 )

append()sz++

buf[sz] = ‘a’

for( i = 2 )

append()sz++

buf[sz] = ‘a’

REBOOT

REBOOT

REBOOT

Intermittence is a Challenge

Software that is correct with continuous power

can be incorrect in an intermittent execution

Page 28: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 31

main()append()

for( i = …)

call append()

<loop>

end of main()

buf[sz] = ‘a’

sz++

Intermittence Bug: Out-of-thin-air Values

sz

buf

-1

ckpt

[MSPC ’14; PLDI ‘15]

Page 29: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 32

main()append()

for( i = …)

call append()

<loop>

end of main()

buf[sz] = ‘a’

sz++

for( i = 1)

sz

buf

0

call append()sz++buf[sz] = ‘a’

‘a’

Intermittence Bug: Out-of-thin-air Values[MSPC ’14; PLDI ‘15]

Page 30: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 33

main()append()

for( i = …)

call append()

<loop>

end of main()

buf[sz] = ‘a’

sz++

sz

buf

1

‘a’

Error: ’a’ is appended to buf

twice on the i = 1 iteration

for( i = 1)

call append()sz++buf[sz] = ‘a’

for( i = 1)

call append()sz++buf[sz] = ‘a’

‘a’

Intermittence Bug: Out-of-thin-air Values[MSPC ’14; PLDI ‘15]

Page 31: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 34

for( i = 1)

call append()sz++buf[sz] = ‘a’

for( i = 1)

call append()sz++buf[sz] = ‘a’

Stuck in an implicit

loop for i = 1

for( i = 1)

call append()sz++buf[sz] = ‘a’

for( i = 1)

call append()sz++buf[sz] = ‘a’

for( i = 1)

call append()sz++buf[sz] = ‘a’

call append()

buf[sz] = ‘a’

sz++

for( i = 1)

Intermittence Bug: Out-of-thin-air Values[MSPC ’14; PLDI ‘15]

Page 32: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 35

sz

buf

11

‘a’‘a’ ‘a’‘a’‘a’‘a’ ‘a’‘a’ ‘a’‘a’ ‘a’‘a’

Stuck in an implicit

loop for i = 1

call append()

buf[sz] = ‘a’

sz++

for( i = 1)

Memory contents depend on the availability of

harvestable energy and capacitor size!

sz & buf contain corrupt values.

“Out of thin air” value not permitted by any

continuous execution!

Intermittence Bug: Out-of-thin-air Values[MSPC ’14; PLDI ‘15]

Page 33: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 37

main()

senseProcTx()for( i = …)

call senseProcTx()

<loop>

end of main()

process()

sense()

NV

for( i = 1)

Energy cost atomic span exceeds energy capacity

Intermittence Challenge: Livelock & Non-termination

Affected code need not even

modify NV storage!

[CASES ’15, ASPLOS ’16, ASPLOS ‘18, CC ‘18, OSDI ’18, PLDI ‘19]

Code reboots indefinitely on intermittent power.

If energy cost exceeds

full capacitor charge:

never completes 1 iteration

for( i = 1)

call senseProcTx()

transmit()

Variable E cost?

Unpredictable livelock

call senseProcTx()

call sense () call sense ()

call process () call process ()

Page 34: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 38

main()

checkpoint

Sense_pressure()

end of main()

Checkpoint between I/O operations that should

execute together leaves I/O inconsistent

Intermittence Challenge: I/O Atomicity & Timeliness

Inconsistent data read at different moments in time

No atomicity control w/ automatic checkpoints

Sense_temp()

checkpoint

Sense_pressure()restore

Sense_temp()

A very long time …

[ASPLOS ’18, PLDI ‘19]

Page 35: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 39

checkpoint

If( X > max) { BLINK = 1 }

Re-execution of I/O operations leads to non-

idempotent re-execution of part of a program

Intermittence Challenge: I/O Non-Idempotence & Event-Driven I/O

Variation in input values makes normally-idempotent writes non-idempotent

Not all I/O dependent data checkpointed

Sense_temp()checkpoint

Sense_ temp()restore

X = Sense_temp()

Restart from Checkpoint

If( X <= max) { NOBLINK = 1 }

X = Sense_temp()

BLINK = 1 NOBLINK = 1

assert( BLINK ^ NOBLINK )Power Failure

[PLDI ‘19]

Page 36: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 40

Intermittent software is

unreliable.

Need easy programming

and reliable execution

Page 37: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 41

Reliable Intermittent ExecutionTask-based Programming Models

Page 38: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 42

Task-based Intermittent Execution

Chinchilla[OSDI ’18]

DINO[PLDI’15] Chain[OOPSLA’16] Alpaca[OOPSLA’17]

Automatic Intermittent Execution

without Non-termination

Page 39: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 43

Task-based Intermittent Execution

Task Boundary

Task Boundary

Task Boundary

Task

Task Task

Task

Tasks have atomic all-or-

nothing semantics

Task boundaries selectively

checkpoint volatile and

non-volatile state

No implicit control-flow!

Page 40: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 44

Chain: Static data multi-versioning

out in in

Task

Channels statically multiversion data

separating inputs from outputs

[Colin, et al OOPSLA ’16]

Tasks exchange data via abstract

channels of non-volatile memory

out

Task

Task

Task

Programmer explicitly defines tasks

and task-flow graph in code

Key limitation: Programmer must

learn new programming model

X X’ X” X””

Y

Y’

Page 41: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 45

Task

Compiler adds code to automatically

commit copied state back to main memory

[Maeng, et al OOPSLA ’17]

Task

Task

Task

Compiler automatically privatizes data

creating a copy safe to use in a task.

Alpaca: Tasks with Dynamic Privatization

Privatize

Commit

Alpaca tasks are restartable

without a checkpoint of volatile state

Page 42: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

task Alert(){

report(S, Cnt);

Cnt++

NextTask Sense }

46

[Maeng, et al OOPSLA ’17]

origin task Sense(){

S=read_sensor()

NextTask CmpAvg }

task CmpAvg(){

for(int i=0; i < 5; i++){

sum += A[i] }

avg = sum / 5

A[head] = S

head = (head+1)%5

if( S > avg*2 ){

NextTask Alert

}

NextTask Sense }

Alpaca: Tasks with Dynamic Privatization

Shared S, Cnt,

A[5],head Alpaca selectively privatizes task-shared variables

Function-as-task means

no need to checkpoint

stack & regs

Page 43: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 47

for( i = 1)

call append()buf[sz] = ‘a’

sz++

task_boundary

for( i = 2)

task_boundary

call append()

sz++

task_boundary

-1sz

buf

0sz

-1sz

buf

‘a’

0sz

buf

buf[sz] = ‘a’‘a’buf Commit buf and sz

Correctly UpdatedOnly Once!

Tasks Eliminate Surprising Behavior

Task privatizes

buf and sz

Re-execute with consistent

buf and sz!

Implicit flow targetmade explicit

Page 44: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 48

int NV_Array[1000000];

task_boundary

for( i = 0; i < 1000000; i++){

NV_Array[i] = i;

}

task_boundary

Key Question:

Which data to selectively privatize?

Page 45: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 49

task_boundary

i = rand(…)

NV_Array[i]++;

task_boundary

NV_Array[1000000] = {0};

NV_Array[1000000] = {…};

int NV_Array[1000000];

task_boundary

i = rand(1000000);

NV_Array[i]++;

task_boundary

Failure-induced

NV-data flow!

Which task-shared data to privatize?

Data written before a failure that may be read after reboot

Failure-induced

NV-data flow!WAR

Dependence

Page 46: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 50

Prototype Implementations

DINO, Chain, Alpaca

Programming

Model

Task-aware

Compiler

Task Data-flow Analysis

Link to Task Runtime

Channels, Versioning,

Privatization

Task, Channel Definitions

Runtime Library

DINO: Checkpoint & Recovery

Chain: Task Management

Alpaca: Privatization/Commit

Page 47: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 51

Energy-harvesting

Platform Prototypes

WISP5 Energy-harvesting Device

RF Power Setup

ApplicationsActivity Recognition

(accelerometer+ML)

Cold-chain Equipment Monitor

Multi-granular Sensor Log

Key Result:Applications suffer errors & failures

without our system support for tasks

Deep Neural Networks (MNIST,

OKGoogle, Activity Recognition)[ASPLOS’19]

Custom PCBs

Page 48: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 52

0

2

4

6

8

10

12

14

16

18

CEM AR RSA CF BF BC

No

rma

lize

d R

un

Tim

e

Alpaca Chain DINO

Alpaca outperforms DINO and Chain

Page 49: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 53

Chinchilla[OSDI ’18]

Efficient Automatic Checkpoints

without Non-termination

Page 50: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 54

Chinchilla Supports Safe, Efficient

Checkpoint-based Intermittent Execution

i

A

sum

Non-volatile

i 1

A 2 3 4

sum 2

Page 51: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 55

Too Few Checkpoints May Cause Non-termination

Cannot reach the

next checkpoint

→ Non-termination

Page 52: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 57

Too Many Checkpoints Incur High Overhead

High overhead!

Page 53: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 58

Checkpoint Placement Determines

Efficiency and Correctness

High overhead!Non-terminating!

Page 54: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 59

Requirements for Checkpoint Insertion

Frequent Low energy variance Compiler-friendly

Basic Block checkpoints avoid non-

termination if no block exceeds device energy

E

Page 55: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

0.01

0.1

1

10

100

CEM CF RSA AR BF BC

En

erg

y (

uJ)

Min Avg Max

60

Chinchilla Checks Block Energy to Avoid Non-termination

Energy of an 47uF (small) Capacitor

Page 56: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 61

Block Energy is Far Below Device EnergyChinchilla’s Block Checkpoints Avoid Non-termination

Energy of an 47uF Capacitor

0.01

0.1

1

10

100

CEM CF RSA AR BF BC

En

erg

y (

uJ)

Min Avg Max

21x

headroom

Page 57: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 62

Checkpoint Placement Determines

Efficiency and Correctness

High overhead!Non-terminating!Safe!

Page 58: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 63

Checkpoint Placement Determines

Efficiency and Correctness

High overhead!Non-terminating!

Page 59: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 64

Chinchilla Selectively Checkpoints to Avoid High Overheads

• Key Idea:

– Checkpointing on each block is worst case reasoning.

– Average case: sufficient energy for multiple blocks to complete

– Chinchilla’s idea: selectively skip checkpoints to avoid high cost

– Gracefully Degrade to worst case checkpoints if necessary

Page 60: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 65

Chinchilla Uses Timer-Based Selective Checkpointing

Ti:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

Page 61: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 66

i:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

Skip

Chinchilla Uses Timer-Based Selective Checkpointing

Page 62: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 67

i:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

Chinchilla Uses Timer-Based Selective Checkpointing

Page 63: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

i:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

Skip

Chinchilla Uses Timer-Based Selective Checkpointing

Page 64: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 69

i:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

Skip

Chinchilla Uses Timer-Based Selective Checkpointing

Page 65: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 70

i:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

Chinchilla Uses Timer-Based Selective Checkpointing

Page 66: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 71

i:=0

sum:=0

i < A_LEN?

sum:=sum+A[i]

i:=i+1

…Checkpoint!

Chinchilla Uses Timer-Based Selective Checkpointing

Page 67: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 72

Chinchilla Adapts its Checkpointing Interval

• Dynamically find the right checkpointing interval

during execution by binary search

• Automatically adjusted on environmental changes

?

Page 68: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 73

Checkpoint Placement Determines

Efficiency and Correctness

Non-terminating!Safe!

High overhead!Fast!

Page 69: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

Chinchilla Is Safe and Efficient

Non-termination if

Alpaca task too large

[van Der Woude

et al OSDI ‘16

Page 70: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 75

[Lucia, et al PLDI ’15]

Use Chinchilla, Alpaca, Chain and DINO for your research!

http://intermittent.systems | http://github.com/CMUAbstract

Page 71: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

80

Intermittent & Physically Constrained Computing Platforms

Hardware, Programming Models, System Software

Page 72: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 81

Key Insight: Co-design PL, OS, Arch, and power system for intermittent operation

Intermittence-safe HW/SW Systems [ASPLOS 2018]

EDBSat Chip-scale

satellite• 3cmx3cmx4mm, solar power

• Magnetometer + gyro sensors

• Low-power radio (to earth)

• Custom “burst” power system

• Integrated HW/SW diagnostics

• Programmed in intermittence-

safe Chain programming lang.

• Launch partner: KickSat/NASA

• 5cmx5cm, solar/RF power

• 10 sensor array

• Bluetooth Low-energy Radio

• Software-switchable variable-

capacity “burst” power system

• Integrated HW/SW diagnostics

• Supports Alpaca & Chain

intermittent programming langs.

Capybara multi-

sensor platform

Page 73: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 82

EDBSat is in LEO now!

KickSat-2 Mission w/ Zac Manchester (Stanford)

Page 74: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 84

Capybara’s Co-designed HW & SW Meet

Dynamically Varying Energy DemandConfigurable HW

Energy StorageProgrammable Task Energy Modes

Page 75: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 85

EDB: The Energy-interference-free Debugging Toolchain

Energy-interference-free debugging and monitoring

for intermittent, energy-harvesting computers.

Energy-interference-free

debugger

Reboot… …bug!

AssertionsBreakpointsTracingInteractive debuggingEnergy tracingDebug-time I/O& More!

Simultaneously monitor & manipulate program state and energy state.

[Colin, et al. CASES ’15, ASPLOS ’16, IEEE Micro Top Picks ‘16]

Page 76: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Intelligence beyond the Edge

Deep neural network inference optimized for intermittent operation

First demonstration of DNN inference on an intermittent device

SpaceWildlife Monitoring

ASPLOS 2019

Software-Only Neural Intermittent Computing (SONIC) &

TI Accelerated Intermittent Linear algebra Solver (TAILS)

<256 KB Memory 16MHz Clock 100s of Failures/min

Challenge: Resource constrained & intermittent

Page 77: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Jetson TX2 Compute Module

Capacitor-based energy storage

3U CubeSat Nanosatellite

Orbital Edge Computing: Machine inference in camera-equipped computational nanosatellites[GoMACTech’19, CAL’19]

Page 78: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

Orbital Edge Computing

Distributed platform designs for

high-performance orbital edge

computing[GoMACTech’19, CAL’19]

Page 79: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing

Edge Graph Processing

Single-machine processing of

network structured data.[IISWC’18 Best Paper, HPDC ‘19 (in sub)]

Page 80: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Brandon Lucia - Carnegie Mellon University - Challenges of Intermittent Computing 105

Designing for IntermittenceSystem Support for Reliable Intermittence

The Intermittent Execution ModelReasoning About Intermittently-powered Devices

Intermittent Computing PlatformsHardware, Programming Models, System Software

Page 81: Reliable Intermittent Computers - Stanford University Talks/2019/Brandon_Lucia.… · NV for( i = 1) Energy cost atomic span exceeds energy capacity Intermittence Challenge: Livelock

Solving the System Challenges ofIntermittent Energy-harvesting DevicesBrandon Lucia | Assistant Professor, CMU ECEABSTRACT Research Group - https://intermittent.systemswith PhD Students: Kiwan Maeng, Emily Ruppel, Vignesh Balaji

Graham Gobieski, Milijana Surbatovich, Brad Denby, Harsh DesaiPhD Alumni: Alexei Colin,Industry Collaborators: Zac Manchester (Harvard/KickSat)

Alanson Sample (Disney Research Pittsburgh)