Hardening an L4 Microkernel Against Soft Errors by Aspect-Oriented Programming and Whole-Program Analysis Christoph Borchert and Olaf Spinczyk http://ess.cs.tu-dortmund.de/ Embedded System Software Group Computer Science 12, TU Dortmund



Christoph Borchert – Hardening an L4 Microkernel Against Soft Errors

Memory Errors are Commonplace!

● DRAM fault rate: 10⁻⁸ FIT/bit [1,2]

– FIT: expected failures per 10⁹ hours

– Scales with Moore's Law

● Example: “Jaguar” supercomputer at Oak Ridge, Tennessee

– 300 terabytes → “one failure approximately every six hours” [2]

[1] V. Sridharan, J. Stearley, N. DeBardeleben, S. Blanchard, and S. Gurumurthi, “Feng shui of supercomputer memory: Positional effects in DRAM and SRAM faults,” in Int. Conf. for High Perf. Computing, Networking, Storage and Analysis (SC ’13)

[2] V. Sridharan and D. Liberty, “A study of DRAM failures in the field,” in Int. Conf. for High Perf. Computing, Networking, Storage and Analysis (SC ’12)
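As a rough plausibility check of these numbers, the following sketch converts between a per-bit FIT rate and a failure interval. The function and constant names are illustrative only, and counting 300 terabytes as 300 × 10¹² bytes is an assumption; the slides do not say how the capacity is measured.

```cpp
// Back-of-the-envelope FIT arithmetic (illustrative names, not from the slides).
// 1 FIT = one expected failure per 1e9 hours of operation.

// Expected failures per hour for a memory of `bits` bits at `fit_per_bit`.
constexpr double failures_per_hour(double bits, double fit_per_bit) {
    return bits * fit_per_bit / 1e9;
}

// Per-bit FIT rate implied by one failure every `hours_between_failures` hours.
constexpr double implied_fit_per_bit(double bits, double hours_between_failures) {
    return 1e9 / (hours_between_failures * bits);
}

// Assumption: 300 terabytes counted as 300e12 bytes of 8 bits each.
constexpr double kJaguarBits = 300e12 * 8.0;

// Per-bit rate that corresponds to "one failure approximately every six hours".
constexpr double kImpliedFit = implied_fit_per_bit(kJaguarBits, 6.0);
```

Under these assumptions `kImpliedFit` comes out at roughly 7 × 10⁻⁸ FIT/bit, which is on the order of 10⁻⁸ and therefore consistent with the quoted six-hour failure interval.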


Considering Integrity of OS Kernel Data

● Kernels are …

– … small (1 % RAM)

– … essential for all application programs

– … exposed to memory faults for the entire OS uptime

● Memory faults should be mitigated there!

– Need for software-based error correction

– Problem: Manual implementation in C/C++ → tedious, error-prone


Programming Language Support

● Memory-error correction as generic module

– “Pluggable” into various kernel data structures (C/C++ structs/objects)

● AspectC++ compiler support

– Aspect-Oriented Programming (AOP)

[Figure: the error-correction module plugged into the kernel]


Outline

● Motivation and Idea

● Generic Object Protection with AspectC++

● Whole-Program Optimization

● Evaluation


Idea: Generic Object Protection (GOP)

● Extend kernel objects by error-correcting code

● Check that code before …

– Invocation of a member function

– Field access (within non-member function)

… and update it afterwards

● When leaving the object's scope:

– Update the code, and check on return
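To make the check/update protocol concrete, here is a minimal hand-written C++ sketch of the steps that GOP is meant to generate automatically. The `Counter` struct, its XOR checksum, and all function names are hypothetical stand-ins. The toy checksum only detects errors, whereas the slides use an error-correcting code.

```cpp
#include <cstdint>

// Hypothetical kernel object with a redundant checksum field.
struct Counter {
    std::uint32_t value = 0;
    std::uint32_t limit = 0;
    std::uint32_t code  = 0;   // redundancy over value and limit
};

// Toy checksum: XOR of the fields with a constant (detects, does not correct).
constexpr std::uint32_t checksum(const Counter& c) {
    return c.value ^ c.limit ^ 0xA5A5A5A5u;
}

// "update": recompute the code after the object was modified.
constexpr void update(Counter& c) { c.code = checksum(c); }

// "check": verify the code before the object is used.
constexpr bool check(const Counter& c) { return c.code == checksum(c); }

// Access from outside the object: check first, modify, then update.
constexpr bool increment(Counter& c) {
    if (!check(c)) return false;   // corruption detected
    ++c.value;
    update(c);
    return true;
}

// Demonstration: an intact object passes, a flipped bit is detected.
constexpr bool demo_detects_bit_flip() {
    Counter c{};
    c.limit = 10;
    update(c);
    bool ok_before = increment(c);   // passes the check
    c.value ^= 1u << 7;              // simulated soft error
    bool ok_after = increment(c);    // fails the check
    return ok_before && !ok_after;
}
```

Writing this by hand for every access to every protected class is the tedious, error-prone work that the aspect-oriented approach aims to avoid.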


GOP (1/3): Class Extension

aspect GOP {
  pointcut critical() = "Cpu" || "Timeout_q";

  advice critical() : slice class {
    HammingCode<JoinPoint> code;
  };

  advice construction(critical()) : after() {
    tjp->target()->code.update();
  }
  …

● pointcut critical(): reusable alias for names (type signatures)

● slice class: introduce new members to classes

● JoinPoint: interface to a compile-time introspection API

● construction(): matches constructor execution

● after(): triggers after the actual constructor has finished

● tjp->target(): yields a pointer to the particular object
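The `HammingCode` template itself is not shown on the slides. As a minimal sketch of the underlying idea, single-bit error correction, here is a textbook Hamming(7,4) code over one 4-bit value in plain C++. It is not the authors' implementation, which covers the data members of a whole class; all function names are illustrative.

```cpp
#include <cstdint>

// Textbook Hamming(7,4): 4 data bits, 3 parity bits, corrects any single-bit error.
// Codeword positions 1..7 are stored in bits 0..6; parity sits at positions 1, 2, 4.
constexpr std::uint8_t hamming_encode(std::uint8_t data) {
    std::uint8_t d1 = (data >> 0) & 1, d2 = (data >> 1) & 1;
    std::uint8_t d3 = (data >> 2) & 1, d4 = (data >> 3) & 1;
    std::uint8_t p1 = d1 ^ d2 ^ d4;   // covers positions 1, 3, 5, 7
    std::uint8_t p2 = d1 ^ d3 ^ d4;   // covers positions 2, 3, 6, 7
    std::uint8_t p4 = d2 ^ d3 ^ d4;   // covers positions 4, 5, 6, 7
    return static_cast<std::uint8_t>(p1 | (p2 << 1) | (d1 << 2) | (p4 << 3) |
                                     (d2 << 4) | (d3 << 5) | (d4 << 6));
}

// Corrects up to one flipped bit and returns the 4 data bits.
constexpr std::uint8_t hamming_decode(std::uint8_t cw) {
    // The syndrome is the XOR of the 1-based positions of all set bits:
    // 0 for a valid codeword, otherwise the position of the flipped bit.
    std::uint8_t syndrome = 0;
    for (std::uint8_t pos = 1; pos <= 7; ++pos)
        if ((cw >> (pos - 1)) & 1) syndrome ^= pos;
    if (syndrome != 0) cw ^= static_cast<std::uint8_t>(1u << (syndrome - 1));
    return static_cast<std::uint8_t>(((cw >> 2) & 1) | (((cw >> 4) & 1) << 1) |
                                     (((cw >> 5) & 1) << 2) | (((cw >> 6) & 1) << 3));
}

// Exhaustive self-test: every single-bit flip of every codeword is corrected.
constexpr bool corrects_all_single_flips() {
    for (std::uint8_t d = 0; d < 16; ++d)
        for (int bit = 0; bit < 7; ++bit)
            if (hamming_decode(hamming_encode(d) ^ (1u << bit)) != d) return false;
    return true;
}
```

In GOP terms, `hamming_encode` plays the role of `update()` and the syndrome computation the role of `check()`, with the difference that a detected single-bit error can be repaired in place.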


GOP (2/3): Advice for Object Access

pointcut check() = call(member(critical())) ||
                   get(member(critical())) ||
                   set(member(critical()));
pointcut update() = /* only call and set */

advice check() : before() {
  if (tjp->that() != tjp->target()) {
    tjp->target()->code.check();
  }
}

advice update() : after() {
  if (tjp->that() != tjp->target()) {
    tjp->target()->code.update();
  }
}

● call(member(critical())): matches every invocation of a member function

● get(member(critical())), set(member(critical())): match every access to a member variable

● advice check() : before(): before call/get/set events, invoke check()

● advice update() : after(): after call/set events, invoke update()

● tjp->that() != tjp->target(): don't check when the caller and callee are identical (“recursion”)
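The effect of the `that`/`target` guard can be emulated in plain C++. The sketch below is a hand-written approximation of the advice around a single call, not actual AspectC++ weaver output; `CountingCode`, `Node`, and `advised_set` are hypothetical names, and the code member only counts how often it is invoked.

```cpp
// Hypothetical stand-in for the per-object code member: it only counts calls.
struct CountingCode {
    int checks = 0;
    int updates = 0;
    constexpr void check()  { ++checks; }
    constexpr void update() { ++updates; }
};

struct Node {
    CountingCode code;
    int value = 0;
    constexpr void set(int v) { value = v; }
};

// Hand-written equivalent of the before/after advice around `target->set(v)`,
// where `that` is the object whose member function performs the call.
constexpr void advised_set(Node* that, Node* target, int v) {
    const bool crosses_objects = (that != target);
    if (crosses_objects) target->code.check();    // "before" advice
    target->set(v);
    if (crosses_objects) target->code.update();   // "after" advice
}

// Number of check() invocations triggered by one advised call.
constexpr int checks_for(bool same_object) {
    Node a{};
    Node b{};
    advised_set(&a, same_object ? &a : &b, 42);
    return a.code.checks + b.code.checks;
}
```

A call from one object to another triggers one check and one update on the target, while a call that stays within the same object triggers none, because that object was already checked when control entered it.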


GOP (3/3): Leaving an Object's Scope

pointcut leave() = call("% ...::%(...)") && within(member(critical()));

advice leave() : before() {
  if (tjp->that() != tjp->target()) {
    tjp->that()->code.update();
  }
}

advice leave() : after() {
  if (tjp->that() != tjp->target()) {
    tjp->that()->code.check();
  }
}
};


More GOP Features

● Protection of

– Virtual-function pointers (vptr)

– Static data members

● Choice of Hamming code or CRC32 (SSE4 instructions)

● Optimizations for read-only (const) functions

● Inheritance and polymorphism

● Non-blocking synchronization
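The SSE4.2 `crc32` instruction computes CRC-32C (Castagnoli polynomial). As a portable sketch of what such a checksum computes, here is a bitwise software version. It is a stand-in for illustration, not the hardware-accelerated variant mentioned on the slide, and the function name `crc32c` is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C (Castagnoli), using the reflected polynomial 0x82F63B78.
constexpr std::uint32_t crc32c(const char* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;                    // initial value
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= static_cast<std::uint8_t>(data[i]);      // feed the next byte
        for (int bit = 0; bit < 8; ++bit)               // process it bit by bit
            crc = (crc >> 1) ^ ((crc & 1u) ? 0x82F63B78u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;                           // final inversion
}
```

Unlike a Hamming code, a CRC on its own only detects errors. For the standard test input `"123456789"` this function yields the well-known CRC-32C check value `0xE3069283`.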


Problem: There are Unneeded Checks!

● Short-running functions

– e.g., inline getters and setters

● Call sequences on the same object

Idea: Optimize-out unneeded checks at compile time!
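The saving for call sequences on the same object can be illustrated with a small counting model. This is only a sketch of the idea: the slides do not spell out exactly which checks the static optimization removes, and `Timer` with its counters is a hypothetical example class.

```cpp
// Hypothetical protected object; check()/update() only count their invocations.
struct Timer {
    int ticks = 0;
    int checks = 0;    // how often the (omitted) code was verified
    int updates = 0;   // how often the (omitted) code was recomputed
    constexpr void check()  { ++checks; }
    constexpr void update() { ++updates; }
    constexpr void tick()   { ++ticks; }
};

// Unoptimized: every call in the sequence is wrapped individually.
constexpr int per_call_cost(int calls) {
    Timer t{};
    for (int i = 0; i < calls; ++i) {
        t.check();
        t.tick();
        t.update();
    }
    return t.checks + t.updates;
}

// Optimized: one check before and one update after the whole sequence.
constexpr int per_sequence_cost(int calls) {
    Timer t{};
    t.check();
    for (int i = 0; i < calls; ++i) t.tick();
    t.update();
    return t.checks + t.updates;
}
```

For a sequence of five calls the per-call variant performs ten check/update operations and the per-sequence variant two. The trade-off is a slightly longer window during which a bit flip in the object goes unnoticed.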


Whole-Program Analysis/Optimization

[Figure: three-step tool chain (1, 2, 3) with the labels: .cc sources, static analysis, project repository (XML), XQuery, pointcuts, #include, GOP aspect, optimization, hardened OS kernel]


Case Study: L4/Fiasco.OC¹ µ-kernel

● Real-time kernel for x86/x64/ARM, open source (C++)

● 7 benchmark programs (shipped with L4/Fiasco.OC)

– Testing the µ-kernel essentials: thread scheduling, inter-process communication (ipc), interrupt requests (irq), shared-memory management, access control

● 4 kernel variants

– Baseline

– VPtr (virtual-function pointers)

– GOP (all data members + vptr)

– GOP-S (static optimization)

● Hardening 26 classes

¹ http://os.inf.tu-dresden.de/fiasco/


Assessment of Fault Tolerance

● Fault model: Single-bit errors in memory

– Uniformly distributed over the kernel address space

● Fault injection: One random bit flip per benchmark run

– 100,000 runs per kernel variant and benchmark program

– Extrapolate the counted number of failed program runs

● Fault injection tool: FAIL*, a modified Bochs x86 emulator

– Trace-based optimization: faults are injected only into live memory
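The fault model can be sketched in a few lines of C++: pick one bit uniformly at random from a memory region and flip it. This is not the FAIL* API, which injects faults inside the emulator; the function names and the toy buffer are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Flip the bit with index `bit_index` (0 = least significant bit of byte 0).
constexpr void flip_bit(std::uint8_t* memory, std::size_t bit_index) {
    memory[bit_index / 8] ^= static_cast<std::uint8_t>(1u << (bit_index % 8));
}

// Pick one bit uniformly at random among all bits of a region.
// Precondition: size_bytes > 0.
inline std::size_t pick_random_bit(std::size_t size_bytes, std::mt19937_64& rng) {
    std::uniform_int_distribution<std::size_t> dist(0, size_bytes * 8 - 1);
    return dist(rng);
}

// One injection on a toy buffer: counts how many bits differ afterwards.
constexpr int bits_changed_by_one_injection(std::size_t bit_index) {
    std::uint8_t before[4] = {0x12, 0x34, 0x56, 0x78};
    std::uint8_t after[4]  = {0x12, 0x34, 0x56, 0x78};
    flip_bit(after, bit_index);
    int changed = 0;
    for (std::size_t i = 0; i < 4; ++i)
        for (int b = 0; b < 8; ++b)
            changed += ((before[i] ^ after[i]) >> b) & 1;
    return changed;
}
```

In a campaign, each benchmark run would call `pick_random_bit` once, apply `flip_bit` at a chosen point in time, and record whether the run still produces the expected result.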


Fault Injection: Failed Program Runs

● Total reduction

– VPtr: -12 %

– GOP: -59 %

– GOP-S: -60 %


Overhead: Dynamic CPU Instructions

● Total overhead

– VPtr: 1.01x

– GOP: 3.5x

– GOP-S: 2.3x


This is only kernel time!

Runtime overhead <1% (kernel + application)


Summary and Future Work

● Generic Object Protection prevents 60% of kernel failures

– Only 26 classes protected, yet

● Whole-program analysis improves fault tolerance

– Dynamic instruction overhead: 3.5x → 2.3x

● Embed whole-program analysis into the AspectC++ language

– Query the call graph (“is function x reachable from here?”)

– Advice for call sequences via regular expressions (call?, call*, call+)
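The call-graph query named above (“is function x reachable from here?”) amounts to a graph search. The following is a minimal sketch in plain C++ over a hypothetical four-function call graph; it does not show any existing or planned AspectC++ syntax.

```cpp
#include <cstddef>

// Hypothetical call graph as an adjacency matrix:
// calls[a][b] is true if function a contains a call to function b.
constexpr std::size_t kNumFunctions = 4;

// Depth-first search: is `to` reachable from `from` via one or more calls?
constexpr bool reachable(const bool (&calls)[kNumFunctions][kNumFunctions],
                         std::size_t from, std::size_t to) {
    bool visited[kNumFunctions] = {};
    std::size_t stack[kNumFunctions] = {};
    std::size_t top = 0;
    stack[top++] = from;
    visited[from] = true;
    while (top > 0) {
        std::size_t f = stack[--top];
        for (std::size_t g = 0; g < kNumFunctions; ++g) {
            if (!calls[f][g]) continue;
            if (g == to) return true;
            if (!visited[g]) {          // each function is pushed at most once
                visited[g] = true;
                stack[top++] = g;
            }
        }
    }
    return false;
}

// Example graph: function 0 calls 1, function 1 calls 2, function 3 is never called.
constexpr bool kCalls[kNumFunctions][kNumFunctions] = {
    {false, true,  false, false},
    {false, false, true,  false},
    {false, false, false, false},
    {false, false, false, false},
};
```

With such a query available at weave time, advice could be restricted to call sites from which a protected object can actually be reached.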
