Support for Symmetric Shadow Memory in Multiprocessors Vijay Nagarajan Rajiv Gupta University of California, Riverside

Page 1: Support for Symmetric Shadow Memory in Multiprocessors

Support for Symmetric Shadow Memory in Multiprocessors

Vijay Nagarajan Rajiv Gupta

University of California, Riverside

Page 2: Support for Symmetric Shadow Memory in Multiprocessors

Runtime Monitoring

• Applications of monitoring
  – Security
    • DIFT
  – Debugging
    • Memcheck, Redux, OnTrac
  – Performance
    • Speculation

• Requirements of monitoring
  – Shadow memory (SM)
    • Meta-data associated with memory locations
  – Shadow memory instructions (SMIs)
    • Instructions for maintenance of meta-data

Page 3: Support for Symmetric Shadow Memory in Multiprocessors

DIFT: Example

• Each word/register is associated with a “taint” value
  – Data from input channels is considered tainted
  – Flow of tainted data is tracked
  – Usage of tainted data in a “malicious” fashion is detected

Original Instruction    Shadow Memory Operation
Ld reg, mem             Taint-val[reg] ← Taint-val[mem]
St reg, mem             Taint-val[mem] ← Taint-val[reg]
Add reg1, reg2          Taint-val[reg1] ← Taint-val[reg1] or Taint-val[reg2]
Jmp reg1                If Taint-val[reg1], raise exception
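
The propagation rules in the table can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the monitor used in the talk: taint is a single boolean, and the class name `TaintTracker` and the helper `input_to_mem` are hypothetical names introduced only for illustration.

```python
class TaintTracker:
    """Toy DIFT monitor: one boolean taint value per register and memory word."""

    def __init__(self):
        self.reg_taint = {}  # register name -> bool
        self.mem_taint = {}  # address -> bool

    def input_to_mem(self, addr):
        """Hypothetical helper: data arriving from an input channel is tainted."""
        self.mem_taint[addr] = True

    def ld(self, reg, addr):
        # Ld reg, mem: Taint-val[reg] <- Taint-val[mem]
        self.reg_taint[reg] = self.mem_taint.get(addr, False)

    def st(self, reg, addr):
        # St reg, mem: Taint-val[mem] <- Taint-val[reg]
        self.mem_taint[addr] = self.reg_taint.get(reg, False)

    def add(self, reg1, reg2):
        # Add reg1, reg2: Taint-val[reg1] <- Taint-val[reg1] or Taint-val[reg2]
        self.reg_taint[reg1] = (
            self.reg_taint.get(reg1, False) or self.reg_taint.get(reg2, False)
        )

    def jmp(self, reg):
        # Jmp reg1: raise an exception if the jump target is tainted
        if self.reg_taint.get(reg, False):
            raise RuntimeError(f"jump through tainted register {reg}")
```

Loading from an address marked by `input_to_mem`, adding the result into another register, and then jumping through that register raises the exception; a jump through an untainted register does not.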

Page 4: Support for Symmetric Shadow Memory in Multiprocessors

Shadow Memory Observations

• Single vs. multiple shadow values
  – DIFT associates one taint value
  – Other applications associate multiple shadow values
    • DDG computes the dynamic dependence graph on the fly
    • For each memory word, it maintains the (instruction, instance) pair that wrote to it last

• Symmetric SMIs
  – Original stores (loads) are associated with shadow stores (loads)

• Atomic SMIs
  – The original memory instruction (OMI) and its SMIs must be executed atomically
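
As an illustration of a monitor with multiple shadow values, the DDG bookkeeping described above can be sketched as follows. This is a minimal sketch, not the authors' tool: the class `DDGTracker` and its method names are hypothetical, and instruction identifiers are plain strings.

```python
from collections import defaultdict


class DDGTracker:
    """Toy DDG monitor: the shadow of each word is its last (instruction, instance) writer."""

    def __init__(self):
        self.last_writer = {}              # address -> (instruction, instance)
        self.instances = defaultdict(int)  # instruction -> executions so far
        self.edges = set()                 # (reader node, writer node) dependences

    def _next_instance(self, instruction):
        self.instances[instruction] += 1
        return (instruction, self.instances[instruction])

    def store(self, instruction, addr):
        """Original store plus shadow stores of the two shadow values."""
        node = self._next_instance(instruction)
        self.last_writer[addr] = node
        return node

    def load(self, instruction, addr):
        """Original load plus shadow loads; records a dependence on the last writer."""
        node = self._next_instance(instruction)
        writer = self.last_writer.get(addr)
        if writer is not None:
            self.edges.add((node, writer))
        return node
```

Note that the accesses stay symmetric: a store only writes shadow values and a load only reads them.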

Page 5: Support for Symmetric Shadow Memory in Multiprocessors

Atomic SMIs

[Figure: three interleavings of Proc A (St1, S St1, St2, S St2) and Proc B (Ld, S Ld), where "S" marks the shadow instruction. Panel labels: "Inconsistent View" and "Atomicity".]
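
The hazard on this slide can be reproduced with a small enumeration. The sketch below is an illustrative model only: it assumes a single memory word with a single shadow value, tags each of Proc A's stores with 1 or 2, and all function names are hypothetical. It enumerates every interleaving that preserves program order and checks whether Proc B's load and shadow load observe the same store.

```python
from itertools import combinations

# Proc A writes value/shadow tag 1, then tag 2; Proc B reads value, then shadow.
PROC_A = [("st", 1), ("s_st", 1), ("st", 2), ("s_st", 2)]
PROC_B = [("ld", None), ("s_ld", None)]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each program order."""
    n = len(a) + len(b)
    for b_slots in combinations(range(n), len(b)):
        ia, ib = iter(a), iter(b)
        yield [next(ib) if i in b_slots else next(ia) for i in range(n)]


def run(schedule):
    """Execute a schedule; return the (value, shadow) pair seen by Proc B."""
    mem, shadow = 0, 0
    seen_val = seen_shadow = None
    for kind, tag in schedule:
        if kind == "st":
            mem = tag
        elif kind == "s_st":
            shadow = tag
        elif kind == "ld":
            seen_val = mem
        else:  # "s_ld"
            seen_shadow = shadow
    return seen_val, seen_shadow


def is_atomic(schedule):
    """True if every original access is immediately followed by its shadow access."""
    kinds = [kind for kind, _ in schedule]
    return all(kinds[i + 1] == "s_" + kinds[i] for i in range(0, len(kinds), 2))
```

A schedule is inconsistent when `run` returns a pair whose two elements differ, for example a value from St1 paired with a shadow value from S St2. In this model, schedules that pass `is_atomic` never produce such a pair.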

Page 6: Support for Symmetric Shadow Memory in Multiprocessors

Robust & Efficient SM

• Each SM access involves
  – Calculating effective and shadow address
  – Accessing the shadow values

• Half-and-Half scheme
  – Reserve half of virtual space for shadow memory
  – Efficient SM access
  – Not robust [Nethercote and Seward VEE ’07]

• Valgrind’s s/w page table like scheme
  – Robust
  – Inefficient (Valgrind’s Memcheck causes 22x slowdown)

• Need to be efficient and robust!
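
The trade-off between the two addressing schemes can be made concrete with a sketch. Everything numeric here is an assumption chosen for illustration (a 32-bit address space, a shadow offset of 0x80000000, 64 KB chunks, one shadow byte per original byte), and `half_and_half` and `TwoLevelShadow` are hypothetical names, not Valgrind's API.

```python
SHADOW_OFFSET = 0x8000_0000  # assumed: upper half of a 32-bit space holds shadow
CHUNK_BITS = 16              # assumed: 64 KB second-level chunks
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def half_and_half(addr):
    """Shadow address is one addition, but the program loses half its address space."""
    if addr >= SHADOW_OFFSET:
        raise ValueError("original address falls inside the reserved shadow half")
    return addr + SHADOW_OFFSET


class TwoLevelShadow:
    """Software page-table-like shadow map: works for any address, costs a lookup."""

    def __init__(self):
        self.primary = {}  # chunk index -> bytearray of shadow bytes

    def store(self, addr, value):
        chunk = self.primary.get(addr >> CHUNK_BITS)
        if chunk is None:  # allocate shadow storage lazily
            chunk = self.primary[addr >> CHUNK_BITS] = bytearray(1 << CHUNK_BITS)
        chunk[addr & CHUNK_MASK] = value

    def load(self, addr):
        chunk = self.primary.get(addr >> CHUNK_BITS)
        return 0 if chunk is None else chunk[addr & CHUNK_MASK]
```

The first function shows why half-and-half is fast but fails for programs that use the upper half of the address space; the class shows why the table-based scheme handles any address at the price of an extra lookup on every shadow access.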

Page 7: Support for Symmetric Shadow Memory in Multiprocessors

Research Question

• Can we make SMIs and OMIs atomic?

• Can we make SM accesses efficient without sacrificing robustness?

• Can we do the above with minimal HW support?

Page 8: Support for Symmetric Shadow Memory in Multiprocessors

Our Approach

• Convey atomic block to the processor
  – Simple ISA support: shadow-start, shadow-end
  – SMIs implicitly identified

• Coupled Coherence
  – Coherence of SMIs and OMIs is coupled
  – Enforces the effect of atomicity

• OS Support
  – Couple allocation of original and shadow pages
  – Efficient addressing without sacrificing robustness

Page 9: Support for Symmetric Shadow Memory in Multiprocessors

ISA Support

• Shadow-start / Shadow-end instructions
  – OMIs and SMIs enclosed
  – Conveys atomic block to the processor
  – Guides actions of cache-coherence protocol

• Implicitly distinguishing SMIs
  – First instruction is an OMI
  – All others with same VA treated as SMIs
  – Multiple accesses implicitly assumed to access different shadow values

EXAMPLE

    shadow-start
    ld reg1, vaddr    // Original load
    ld reg2, vaddr    // 1st shadow load
    ld reg3, vaddr    // 2nd shadow load
    shadow-end
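
The implicit identification rule can be sketched as a small classifier over the instructions between shadow-start and shadow-end. This is a sketch of the rule as stated on the slide, not of the hardware decoder: the function `classify_block` is hypothetical, and the "OTHER" label for accesses to a different virtual address is an assumption, since the slide does not say how those are handled.

```python
def classify_block(block):
    """Label each memory access inside one shadow-start/shadow-end block.

    block: list of (opcode, register, vaddr) tuples in program order.
    Returns a list of labels: "OMI", "SMI1", "SMI2", ... or "OTHER".
    """
    labels = []
    omi_vaddr = None
    shadow_index = 0
    for _opcode, _reg, vaddr in block:
        if omi_vaddr is None:
            # The first access in the block is the original memory instruction.
            omi_vaddr = vaddr
            labels.append("OMI")
        elif vaddr == omi_vaddr:
            # Each further access to the same VA targets the next shadow value.
            shadow_index += 1
            labels.append(f"SMI{shadow_index}")
        else:
            # Assumption: not specified on the slide.
            labels.append("OTHER")
    return labels
```

Applied to the three loads of the example, the classifier labels them as the OMI, the first shadow load, and the second shadow load, in that order.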

Page 10: Support for Symmetric Shadow Memory in Multiprocessors

Coupled Coherence

• Dependence Mirroring
  – Dependences among SMIs mirror those of the OMIs
  – If OMI2 → OMI1 then SMI2 → SMI1
  – Coupled coherence enforces this

[Figure: Proc A executes St1, S St1, St2, S St2; Proc B executes Ld, S Ld.]

Page 11: Support for Symmetric Shadow Memory in Multiprocessors

Coupled Coherence

• Coupled Coherence involves
  – No explicit shadow coherence messages
    • SMIs do not trigger coherence messages
    • Shadow stores do not trigger invalidates
    • Shadow loads do not cause misses
  – Co-transfer
    • Data replies of original blocks are piggybacked with shadow blocks
  – Co-existence
    • Original blocks and shadow blocks co-exist in the cache
    • Brought in together
    • Replaced together

Page 12: Support for Symmetric Shadow Memory in Multiprocessors

Dependence Mirroring: RAW

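[Figure: cache states of Block ‘B’ and Shadow Block B’ in Proc A and Proc B. Both start Shared in both processors and become Exc in Proc A and Inv in Proc B. Proc A executes St, S St; Proc B executes Ld, S Ld.]

• Proc A sends invalidate for B and B’
• Proc B sends read miss for B and B’
• Proc A sends blocks B and B’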

Page 13: Support for Symmetric Shadow Memory in Multiprocessors

Dependence Mirroring: RAW

[Figure: Block ‘B’ is Exc in Proc A and Inv in Proc B. Proc A executes St, S St; Proc B executes Ld, S Ld. Labels: shadow-st, shadow-end; ready bit 0, then 1.]

• Proc A sends invalidate for B and B’
• Proc B sends read miss for B and B’
• Proc A waits until ready bit is set
• Proc A sends blocks B and B’

Page 14: Support for Symmetric Shadow Memory in Multiprocessors

Dependence Mirroring: WAR

[Figure: Proc A executes St1, S St1, St2, S St2; Proc B executes Ld, S Ld.]

• Proc A sends invalidates
• Proc B sends read miss for B and B’
• Proc A sends blocks B and B’

Page 15: Support for Symmetric Shadow Memory in Multiprocessors

Coupled Coherence

• On a cache miss
  – Original Ld / St
    • Place read miss for original, shadow block(s)
    • Write back dirty blocks
  – Shadow Ld / St
    • No coherence events

• Shadow-start
  – Set ready bit to 0

• Shadow-end
  – Set ready bit to 1
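
The protocol actions listed above can be exercised in a toy sequential model. This is a heavily simplified sketch, not the SESC implementation: it assumes two line states ("M" and "S"), one shadow value per block, and no evictions; the `Bus` and `Cache` classes are hypothetical; and deferring an invalidate while a sharer's ready bit is 0 is an assumption made by analogy with the read-miss case. A stalled request is modelled by returning `None` or `False`, after which the caller retries.

```python
class Bus:
    def __init__(self):
        self.memory = {}   # block -> (data, shadow)
        self.caches = []
        self.messages = 0  # coherence transactions placed on the bus


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # block -> {"state", "data", "shadow"}
        self.ready = True  # ready bit: False while inside a shadow block
        bus.caches.append(self)

    def shadow_start(self):
        self.ready = False

    def shadow_end(self):
        self.ready = True

    def _others(self):
        return [c for c in self.bus.caches if c is not self]

    def _fetch(self, block):
        """Read miss for the original block; the shadow block rides along."""
        self.bus.messages += 1
        for c in self._others():
            line = c.lines.get(block)
            if line and line["state"] == "M":
                if not c.ready:
                    return False  # owner is mid shadow block: retry later
                self.bus.memory[block] = (line["data"], line["shadow"])
                line["state"] = "S"
        data, shadow = self.bus.memory.get(block, (0, 0))
        self.lines[block] = {"state": "S", "data": data, "shadow": shadow}
        return True

    def load(self, block):
        if block not in self.lines and not self._fetch(block):
            return None  # stalled
        return self.lines[block]["data"]

    def store(self, block, value):
        if block not in self.lines and not self._fetch(block):
            return False  # stalled
        line = self.lines[block]
        if line["state"] != "M":
            holders = [c for c in self._others() if block in c.lines]
            if any(not c.ready for c in holders):
                return False  # assumption: invalidates also wait for the ready bit
            self.bus.messages += 1  # one invalidate covers B and B'
            for c in holders:
                del c.lines[block]
            line["state"] = "M"
        line["data"] = value
        return True

    def shadow_load(self, block):
        return self.lines[block]["shadow"]  # co-resident: no miss, no message

    def shadow_store(self, block, value):
        self.lines[block]["shadow"] = value  # no invalidate needed
```

In the RAW scenario, a load by Proc B issued between Proc A's store and shadow store stalls until Proc A executes shadow-end, after which Proc B receives the value and its shadow value together. The shadow accesses themselves add nothing to `bus.messages`.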

Page 16: Support for Symmetric Shadow Memory in Multiprocessors

Symmetric/General SM

• Symmetric SM
  – Original loads (stores) accompanied by shadow loads (stores)

• General SM
  – Original load can be accompanied by both shadow loads and stores
    • E.g., Eraser: online race detection
  – Need to enforce shadow coherence for RAR
    • Typically no coherence events for RAR
    • Future work

Page 17: Support for Symmetric Shadow Memory in Multiprocessors

Addressing Support

• Shadow pages allocated adjacent to original pages
  – Virtual memory space unaffected
  – Retains robustness
  – OS treats them as a single “superpage”
    • Swapped in and swapped out together

• Address Translation
  – During address translation, add offset to access shadow page
  – Provides efficiency
  – No separate TLB for shadow pages

[Figure: an OMI and an SMI both present a virtual page and offset to the TLB, which yields the physical page. Memory holds the original page followed by Shadow Page 1 and Shadow Page 2. Labels: V.Page, Off, OMI, SMI, TLB, Ph.page, Ori.Page, Shadow Page 1, Shadow Page 2, Memory, ShadowValue cnt.]
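
The coupled allocation and offset-based translation can be sketched as follows. This is a minimal model under stated assumptions: 4 KB pages, shadow pages the same size as original pages and placed directly after them in physical memory, and a dictionary standing in for the TLB. `CoupledAllocator` and its methods are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size


class CoupledAllocator:
    """Allocates each original page together with its shadow pages as one superpage."""

    def __init__(self, shadow_pages_per_page):
        self.n = shadow_pages_per_page
        self.next_free = 0  # next free physical address
        self.tlb = {}       # virtual page number -> physical address of original page

    def map_page(self, vpage):
        """Reserve 1 + n contiguous physical pages and record one TLB entry."""
        self.tlb[vpage] = self.next_free
        self.next_free += (1 + self.n) * PAGE_SIZE
        return self.tlb[vpage]

    def translate(self, vaddr, shadow_index=0):
        """shadow_index 0 is the OMI; k > 0 selects the k-th shadow page."""
        if not 0 <= shadow_index <= self.n:
            raise ValueError("no such shadow page")
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        return self.tlb[vpage] + shadow_index * PAGE_SIZE + offset
```

Because the shadow address is derived from the same TLB entry by adding a multiple of the page size, the virtual address space seen by the program is untouched and no second translation structure is consulted.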

Page 18: Support for Symmetric Shadow Memory in Multiprocessors

Experiments

• Implementation in SESC simulator
  – Cycle accurate, targets MIPS architecture
    • Shadow-start, Shadow-end instructions
  – Models cache coherence protocol
    • Coupled Coherence implementation
    • Bus-based protocol
  – Models basic OS services
    • Coupled page allocation

• Monitoring applications
  – DIFT: detection of security attacks
  – DDG: computes dynamic dependence graph online

• Benchmarks
  – SPLASH-2

Page 19: Support for Symmetric Shadow Memory in Multiprocessors

Efficiency of SM

• Three versions:
  – SM
    • Our SM implementation
    • ISA support
    • OS support for address translation
    • Coupled Coherence protocol for atomicity
  – VAL:serial
    • Valgrind’s SM support
    • Address translation: involves software page table accesses
    • Atomicity: enforced by thread serialization
  – VAL:lb
    • Valgrind’s SM support with no atomicity guarantees
    • Means of comparison of our address translation support

Page 20: Support for Symmetric Shadow Memory in Multiprocessors

Efficiency of SM: DIFT

[Figure: bar chart of normalized execution overhead (y-axis 0 to 60) for VAL:serial, VAL:lb and SM on barnes, fmm, ocean, radiosity, raytrace, water-nsq, water-sp, and the average.]

• VAL:serial causes 41 times overhead on average
  – Effect of serialization

• SM causes only 7 times overhead
  – Efficient address translation + coupled coherence

• Even without serialization, VAL:lb causes 12 times overhead
  – With coupled coherence this reduces to 7 times

Page 21: Support for Symmetric Shadow Memory in Multiprocessors

Efficiency of SM: DDG

[Figure: bar chart of normalized execution overhead (y-axis 0 to 120) for VAL:serial, VAL:lb and SM on barnes, fmm, ocean, radiosity, raytrace, water-nsq, water-sp, and the average.]

• VAL:serial causes 78 times overhead on average
  – Effect of serialization

• SM causes only 23 times overhead
  – Efficient address translation + coupled coherence

• Even without serialization, VAL:lb causes 27 times overhead
  – With coupled coherence this reduces to 23 times
  – Effect not as pronounced as in DIFT
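
To compare the two monitors on a common footing, the reported average overheads can be expressed relative to SM. The snippet below is plain arithmetic on the averages quoted on the DIFT and DDG slides; the names `overheads` and `relative_to_sm` are introduced here only for illustration.

```python
# Average normalized overheads as reported on the two result slides.
overheads = {
    "DIFT": {"VAL:serial": 41, "VAL:lb": 12, "SM": 7},
    "DDG": {"VAL:serial": 78, "VAL:lb": 27, "SM": 23},
}


def relative_to_sm(app):
    """Return each Valgrind variant's overhead divided by SM's, rounded to 2 places."""
    row = overheads[app]
    return {
        name: round(value / row["SM"], 2)
        for name, value in row.items()
        if name != "SM"
    }
```

For DIFT the ratios are about 5.86 (VAL:serial) and 1.71 (VAL:lb); for DDG they are about 3.39 and 1.17, which is consistent with the remark that the effect is less pronounced for DDG.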

Page 22: Support for Symmetric Shadow Memory in Multiprocessors

Effect of Coupled Coherence

[Figure: bar chart of percentage overhead (y-axis 0 to 1.2) for the series DIFT:1 and DDG:2 on barnes, fmm, ocean, radiosity, raytrace, water-nsq, water-sp, and the average.]

• Performance overhead < 0.6% for DIFT and DDG
  – Total amount of traffic is about the same
  – Coupled coherence sees more bursts in traffic

Page 23: Support for Symmetric Shadow Memory in Multiprocessors

Related Work

• Enforcing Atomicity
  – Valgrind [Nethercote et al. PLDI ’07] through thread serialization
    • Not efficient
  – TM [Chung et al. HPCA ’08] can be used
    • Requires additional HW changes
    • Support for rollback and re-execution

• Address Translation
  – Valgrind [Nethercote VEE ’07] software page table structure
    • Proposed application-specific optimizations
    • Still inefficient
  – Half-and-Half scheme [Qin et al. MICRO ’07]
    • Divides virtual address space
    • Not robust

Page 24: Support for Symmetric Shadow Memory in Multiprocessors

Conclusion

• SM used extensively for performing monitoring
  – Performance
  – Security
  – Debugging

• Support for improving SM performance
  – ISA support
  – Coupled coherence → atomicity
  – Coupled allocation → efficient addressing
  – Significant performance advantage

• Future Work
  – Extend the system beyond symmetric SMIs
  – Look at other techniques for providing atomicity without changes to the coherence protocol

Page 25: Support for Symmetric Shadow Memory in Multiprocessors

Questions?