24
Live Code Update for IoT Devices Chi Zhang, Wonsun Ahn, Bruce Childers, Youtao Zhang Department of Computer Science University of Pittsburgh NVMSA, August 2016

Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Live Code Update for IoT Devices

Chi Zhang, Wonsun Ahn, Bruce Childers, Youtao Zhang Department of Computer Science

University of Pittsburgh

NVMSA, August 2016

Page 2: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Battery-Powered IoT Device

2 Live Code Update for IoT Devices

Page 3: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Energy-Harvesting IoT Device

3 Live Code Update for IoT Devices

Page 4: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Code Update for IoT Devices

•  Need for code update post-deployment –  Debugging: some bugs only manifest in the field –  Tuning: tune parameters based on feedback –  New / modified functionality: collect new data

•  Code update post-deployment is a challenge –  Difficult to collect devices scattered in the field –  Devices often at hard-to-reach locations

(e.g. embedded in body, hazardous locations …)

4 Live Code Update for IoT Devices

➜ Code update often needs to be performed wirelessly in the field, relying on local power source

Page 5: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Design Goals for Code Update

•  Minimize device down time –  Prolonged down time can impact QoS

(e.g. A sensor device deployment at Reventador Volcano could be up for only 69% of the time due to code updates)

–  Some applications have real-time deadlines •  Minimize reprogramming time

–  Consists of code transfer time + code write time –  For battery-powered devices: improves lifetime of device –  For energy-harvesting devices: reduces turn around time

•  Important in iterative debug / tuning scenarios •  Minimize extra memory required for code update

–  Larger memory means more costly devices

5 Live Code Update for IoT Devices

Page 6: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Previous Code Update Approaches

•  Naïve: Whole image transfer / whole image write 1.  Rebuild code image at base station 2.  Transfer image wirelessly to each IoT device 3.  Write image in separate area in code memory in device 4.  Reboot device from image

•  Incremental: Partial image transfer / whole image write 1.  Rebuild code image at base station 2.  Do a diff between old and new image and calculate delta 3.  Transfer delta wirelessly to each IoT device 4.  Write new image from delta and old image in device 5.  Reboot device from new image

6 Live Code Update for IoT Devices

Page 7: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Our Code Update Approach

•  In-place: Partial image transfer / partial image write 1.  Generate code patches at

base station 2.  Transfer patches wirelessly

to each IoT device 3.  Apply patches to live image

in device in-place

7 Live Code Update for IoT Devices

Old Image

New Image

patch1

patch2

(b)

Old Image

jump <patch1>

jump <patch2>

(a)

Code memory updates

Previous Approaches Our Approach

➜ Patches are deltas designed to be applied in-place

Page 8: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Code Update Approach Comparison

8 Live Code Update for IoT Devices

Device Down Time

Reprogramming Energy

Memory Requirement

Naïve ✗ (Need reboot)

✗ (Transfer and write

whole image)

✗ (Need twice

code image size) Incremental ✗

(Need reboot) Δ

(Transfer only delta but write whole image)

✗ (Need twice

code image size)

In-place ✓ (No reboot

needed due to live update)

✓ (Transfer only patches

and apply in-place)

✓ (Need only code

image size)

Page 9: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

return:

(c)

Deleted code

return:

Atomic code updates Non−atomic code updates

(a) (b)

patch:

Modified code

jump <return>

return:

patch:

Inserted code

jump <return>

patch:

jump <patch> jump <patch> jump <patch>

jump <return>

Live In-Place Code Patching Scenarios

9 Live Code Update for IoT Devices

[Insertion] [Deletion] [Modification]

Page 10: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Code Consistency with Live In-place Patching

•  Updating live code while it is executing presents new issues •  What if inconsistent code executes in the middle of update? •  Divide code update into two phases

–  Phase 1: Preparatory phase •  Write new or modified code to free space in memory without

touching the original image •  Never introduces any inconsistencies

–  Phase 2: Patching phase •  Write jumps into original image to new / modified code •  Potential for inconsistencies è patching must be atomic

(e.g. by turning off interrupts that may trigger execution) •  Call set of written jumps the atomic write set •  Size of atomic write set dictates device down time

10 Live Code Update for IoT Devices

Page 11: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

In-Place Code Patching Revisited

11 Live Code Update for IoT Devices

[Insertion] [Deletion] [Modification]

return:

(c)

Deleted code

return:

Atomic code updates Non−atomic code updates

(a) (b)

patch:

Modified code

jump <return>

return:

patch:

Inserted code

jump <return>

patch:

jump <patch> jump <patch> jump <patch>

jump <return>

Atomic Write Set

Page 12: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

In-Place Code Patching w/ Trampolines

•  Still some drawbacks with in-place code patching –  Still needs updated part of device to be temporarily disabled –  Also, what if the code update software itself needs to be patched?

•  No way to disable the already running code update software •  Reduce atomic write set to a single word using trampolines

–  Eliminates all device down time –  Allows even the code update software to be safely patched

•  Trampolines (a.k.a indirect jump vectors) –  Memory locations holding addresses pointing to pieces of code –  Used to implement interrupt service routines in OSes

12 Live Code Update for IoT Devices

➜  Idea: change functionality instantaneously by switching over to another set of jump vectors atomically

Page 13: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Phase 1: Trampoline Insertion

Code updates that need to be atomic

Deleted code

trampoline2:

Global Variable Memory

Code updates that do not need to be atomic

temp3:

base patch_targetsorig_targets

Inserted code

jump <return1>

patch1:

return1:

trampoline1:

patch2:

return2:

temp2:

Modified code

trampoline3:

jump <return3>

patch3:

return3:

load R2, [R1 + 0] load R2, [R1 + 1]jump R2

return1temp2

jump <return2>

orig_targets

jump R2

temp3

jump <trampoline2>

patch1

load R1, [<base>]

patch2

jump <trampoline1> jump <trampoline3>

patch3

load R1, [<base>]load R2, [R1 + 2]jump R2

load R1, [<base>]

13 Live Code Update for IoT Devices

Page 14: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Phase 2: Trampoline Switching

trampoline2:

Global Variable Memory

Code updates that do not need to be atomic

Code updates that need to be atomic

Deleted code

temp3:

base patch_targetsorig_targets

Inserted code

return1:

trampoline1:

jump <return1>

patch1: patch2:

return2:

temp2:

Modified code

trampoline3:

jump <return3>

patch3:

return3:

load R2, [R1 + 0]

temp2

jump <trampoline2>

patch_targets

jump <return2>

temp3

jump R2load R2, [R1 + 1]

patch1

load R1, [<base>]

patch2patch3

load R1, [<base>]

jump <trampoline3>

load R1, [<base>]load R2, [R1 + 2]jump R2

jump <trampoline1>

return1

jump R2

14 Live Code Update for IoT Devices

Atomic Write Set

Page 15: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Phase 3: Trampoline Removal

15 Live Code Update for IoT Devices

Code updates that need to be atomic

Deleted code

trampoline2:

Code updates that do not need to be atomic

temp3:

base patch_targetsorig_targets

Inserted code

return1:

trampoline1:

jump <return1>

patch1:

Global Variable Memory

patch2:

return2:

temp2:

Modified code

trampoline3:

jump <return3>

patch3:

return3:

load R1, [<base>]load R2, [R1 + 1]jump R2

return1

jump <patch1>

temp2

jump <return2>

patch_targets

jump <patch2>

temp3

patch1

load R2, [R1 + 0]

patch2

jump R2

jump <patch3>

patch3

load R1, [<base>]load R2, [R1 + 2]jump R2

load R1, [<base>]

➜ End result is exactly the same as original in-place patching ➜ Trampolines are recycled to implement next patch

Page 16: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Evaluation Setup

•  Tested 4 code update strategies: –  RSync: Baseline incremental update with diff deltas –  Zephyr: State-of-art incremental update that minimizes size of delta by

mitigating diffs due to “shifts” when inserting or deleting code. Uses function indirection tables.

–  In-place: Our baseline in-place update proposal –  Trampolines: In-place update w/ trampolines

•  Measured 3 metrics: –  # of bytes in delta or patch transferred over wireless network –  # of bytes in atomic write set written to code memory –  # of bytes in total written to code memory

•  Translated metrics to energy numbers for a typical IoT device –  Energy required to recover from device down time –  Total energy required for reprogramming

16 Live Code Update for IoT Devices

Page 17: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

A Typical IoT Device

Name TI-MSP430FR5739 CPU 24 MHz Execution Memory 1 KB SRAM Code Memory 16 KB FRAM (Ferroelectric) Write Energy / byte 4.25 nJ Transfer Energy / byte 104 nJ (ZL70250 RF Link) Avg Energy / cycle 330.12 pJ

17 Live Code Update for IoT Devices

Page 18: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

TinyOS Update Cases & Atomic Write Set

Case # Description # of Patches 1 In Blink: Change timer0 frequency 1 2 In Oscilloscope : insert one local variable and one use 17 3 n MultihopOscilloscope : insert one local var and use it

within a loop 17

4 In RadioCountToLeds : insert one local var and use it inside and outside loop

11

5 In RadioSenseToLeds : Change a + operator to - 1 6 In Blink: remove one function call in NesC code 2 7 In Blink: remove an entire timer 33 8 In RadioCountToLeds : add an else branch 5 9 In Oscilloscope : insert a global var, and use it in three

different functions 38

18 Live Code Update for IoT Devices

Atomic Write Set for In-place (Atomic Write Set for Trampolines is always 1)

Page 19: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Device Down Time Recovery Energy for In-Place

•  RSync / Zephyr: 89 uJ for reboot (more than 100 X difference) •  Trampolines: no device down time since atomic write set is 1

19 Live Code Update for IoT Devices

0

50

100

150

200

250

300

350

1 2 3 4 5 6 7 8 9

Rec

over

y En

ergy

(nJ)

Page 20: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Normalized # of Bytes Transferred over Wireless

0% 1% 2% 3% 4% 5%

Case-1 Case-2 Case-3 Case-4 Case-5 Case-6 Case-7 Case-8 Case-9

20% 40% 60% 80%

100%

Norm

alize

d Tr

ansm

it Co

st (%

) Rsync

ZephyrIn-place

Trampolines

20 Live Code Update for IoT Devices

➜  Drastic reduction for in-place due to no code shifting (Zephyr mitigates shifting but must send function indirection table)

Page 21: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Normalized # of Bytes Written to Code Memory

0% 0.2% 0.4% 0.6% 0.8%

1%

Case-1 Case-2 Case-3 Case-4 Case-5 Case-6 Case-7 Case-8 Case-9

20% 40% 60% 80%

100%

Norm

alize

d Up

date

Cos

t (%

) Rsync

ZephyrIn-place

Trampolines

21 Live Code Update for IoT Devices

➜  RSync & Zephyr both must write whole new image ➜  In-place & trampolines need only write patches into original image

Page 22: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Total Reprogramming Energy

•  Zephyr: Reduces transfer energy but does not reduce write energy •  In-place & Trampolines: Reduces both transfer and write energy

22 Live Code Update for IoT Devices

0

0.5

1

1.5

2

2.5

1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9

Rsync Zephyr In-place Trampolines

Tota

l En

ergy

(mJ)

write transfer reboot

Page 23: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Runtime Overhead

•  Potential for overhead due to jumps to and from patched code •  Our test cases showed negligible overhead

–  At worst 2.7% performance loss for just one case

23 Live Code Update for IoT Devices

Page 24: Live Code Update for IoT Devices - University of Pittsburghwahn/papers/PRES/present_nvmsa16.pdfLive Code Update for IoT Devices 8 Device Down Time Reprogramming Energy Memory Requirement

Summary

•  Proposed live code update scheme for IoT devices that can: –  Improve QoS by eliminating device down time –  Improve debug / tune / upgrade turn around time by

minimizing device reprogramming energy •  Future work

–  How to generate minimal patches from compiler –  How to deal with fragmentation

•  Frequent patching can leave “holes” in your code –  How to do updates to data

•  How to modify semantics of variables •  How to add / delete variables (changing layout) è May need to do “data migration”

24 Live Code Update for IoT Devices