34
Octet: Capturing and Controlling Cross-Thread Dependences Efficiently Michael Bond Milind Kulkarni Man Cao Minjia Zhang Meisam Fathi Salmi Swarnendu Biswas Aritra Sengupta Jipeng Huang Ohio State Purdue

Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

  • Upload
    dylan

  • View
    54

  • Download
    0

Embed Size (px)

DESCRIPTION

Michael Bond Milind Kulkarni Man Cao Minjia Zhang Meisam Fathi Salmi Swarnendu Biswas Aritra Sengupta Jipeng Huang. Purdue. Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. Ohio State. Parallel programming is mainstream. Shared memory with locks Challenge: - PowerPoint PPT Presentation

Citation preview

Page 1: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Michael BondMilind KulkarniMan CaoMinjia ZhangMeisam Fathi SalmiSwarnendu BiswasAritra SenguptaJipeng Huang

Ohio State

Purdue

Page 2: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Parallel programming is mainstream

Shared memory with locks

Challenge:performance & correctness

Page 3: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Help express parallelism better• Eliminate concurrency errors• Diagnose production bugs• Deal with nondeterminism

Need practical runtime support

Page 4: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Need practical runtime support

Page 5: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Track dependences Control dependences

Need practical runtime support

o.f = …… = o.f

Page 6: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Track dependences Control dependences

Need practical runtime support

o.f = …… = o.f

Page 7: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Track dependences Control dependences

Need practical runtime support

o.f = …… = o.f

Commodity (software-only) approachesslow programs by several times

Page 8: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Track dependences Control dependences

o.f = …check

… = o.fcheck

Need practical runtime support

Commodity (software-only) approachesslow programs by several times

Page 9: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Track dependences Control dependences

o.f = …check

… = o.fcheck

Need practical runtime support

Any access could race add synchronization at every access

Page 10: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Octet

Framework for runtime supportHB edges all dependencesAtomicity of analysis & access

Concurrency control mechanismSynchronization cross-thread dependence Qualitative performance improvement

Page 11: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Octet

Framework for runtime supportHB edges all dependencesAtomicity of analysis & access

Concurrency control mechanismSynchronization cross-thread dependence Qualitative performance improvement

Proofs!

Page 12: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Octet tracks ownership

Each object’s state Є { WrExT , RdExT , RdShc }

Page 13: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

write check

o’s state = WrExT1Ti

me

Page 14: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

read check

write check

o’s state = WrExT1Ti

me

Page 15: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

read check

write check

o’s state = WrExT1Ti

me

Page 16: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

safe point

write check

read check

o’s state = WrExT1Ti

me

Page 17: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

safe point

write check

read check

o’s state = WrExT1

Implicit safe pointTi

me

Page 18: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

safe point

write check

read check

o’s state = WrExT1Ti

me

Page 19: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

safe point

write check

read check

o’s state = RdExT2Ti

me

Page 20: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

safe point

write check

read check

o’s state = RdExT2Ti

me

Page 21: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

safe point

read check

write check

o’s state = RdExT2

T3 T4

Page 22: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

T3

safe point

T4

read check

write check

read check

o’s state = RdExT2

Page 23: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

rd o.f

T3

safe point

T4

read check

write check

read check

o’s state = RdShc

Page 24: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

rd o.f

T3

safe point

T4

read check

write check

read check

read check

o’s state = RdShc

Page 25: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

rd o.f

T3

safe point

rd o.f

T4

read check

write check

read check

read check

o’s state = RdShc

Page 26: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

rd o.f

T3

safe point

rd o.f

T4

read check

write check

read check

read check

o’s state = RdShc

Sharing detection[von Praun & Gross ’01]Comparison in our paper

Distributed shared memoryShasta [Scales et al. ’96]

Biased locking[Kawachiya et al. ’02][Russell & Detlefs ’06][Hindman & Grossman ’06]

Page 27: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

• Atomicity checking• Data race

detection• Record & replay

• Transactional memory• DRF/SC enforcement• Deterministic execution

Practical runtime support

Track dependences Control dependences

Framework for runtime supportConcurrency control mechanism O

ctet

Page 28: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

wr o.f

T1 T2

rd o.f

rd o.f

T3

safe point

rd o.f

T4

read check

write check

read check

read check

Dependence recorder records happens-before edges

Page 29: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Implementation in Jikes RVMPublicly availablehttp://jikesrvm.org/Research+Archive

Parallel programsDaCapo Benchmarks 2006 & 2009SPEC JBB 2000 & 2005

Parallel platform32 cores (AMD Opteron 6272)

Page 30: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

eclips

e6

hsqldb

6

lusea

...

xalan

6

avror

a9

jytho

n9

luind

ex9

lusea

...pm

d9

sunflo

w9xa

lan9

jbb20

00

jbb20

05 geo

0

100

200

300

400

500

600

700

800

900

1000

Pessimi...

Ove

rhea

d (%

)34,600% 3,000%

Page 31: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

eclips

e6

hsqldb

6

lusea

...

xalan

6

avror

a9

jytho

n9

luind

ex9

lusea

...pm

d9

sunflo

w9xa

lan9

jbb20

00

jbb20

05 geo

0

20

40

60

80

100

120 Octet w/o coord

Octet w/o coord

Ove

rhea

d (%

)

Page 32: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

eclips

e6

hsqldb

6

lusea

...

xalan

6

avror

a9

jytho

n9

luind

ex9

lusea

...pm

d9

sunflo

w9xa

lan9

jbb20

00

jbb20

05 geo

0

20

40

60

80

100

120

OctetOctet w/o coord

Ove

rhea

d (%

)

Page 33: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

eclips

e6

hsqldb

6

lusea

...

xalan

6

avror

a9

jytho

n9

luind

ex9

lusea

...pm

d9

sunflo

w9xa

lan9

jbb20

00

jbb20

05 geo

0

20

40

60

80

100

120

RecorderOctetOctet w/o coord

Ove

rhea

d (%

)

Page 34: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently

Octet helps enable practical runtime support for reliable, scalable concurrency

Framework for runtime supportHB edges all dependencesAtomicity of analysis & access

Concurrency control mechanismSynchronization cross-thread dependence Qualitative performance improvement