Evaluating the Impact of Thread Escape Analysis on Memory Consistency Optimizations

Chi-Leung Wong, Zehra Sura, Xing Fang, Kyungwoo Lee, Samuel P. Midkiff, Jaejin Lee and David Padua

University of Illinois at Urbana-Champaign

IBM T.J. Watson Research Center

Purdue University

Seoul National University

Outline

• Memory Models
• The Pensieve System
• Escape Analyses
• Qualitative Impact of Escape Analyses on Delay Set Analysis and Synchronization Analysis
• Experimental Results
• Conclusion

Memory Models

• Consider the following code segments:
  – Thread 1: data = 100; data_ready = true;
  – Thread 2: while (!data_ready); t = data;
• Can t == 0?
  – Yes if reordering happens
    • Thread 1: data_ready = true; data = 100;
    • Can be done by compiler and hardware
  – Memory models tell us the answer
    • Sequential Consistency says no
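
As a hedged illustration of the example above, the following Java sketch runs the two threads with the flag declared volatile. The class name DataReadyDemo and the method runOnce are illustrative and do not come from the talk. Under the Java memory model, the volatile flag orders the write to data before the write to the flag, so the reader cannot observe t == 0. With a plain (non-volatile) flag, the reordering discussed above would be permitted.

```java
final class DataReadyDemo {
    // Plain field: its visibility is ordered only through the volatile flag.
    static int data = 0;
    // volatile: the write to data may not be reordered after this write.
    static volatile boolean dataReady = false;

    /** Runs Thread 1 and Thread 2 from the slide once and returns t. */
    static int runOnce() {
        data = 0;
        dataReady = false;
        final int[] t = new int[1];

        Thread writer = new Thread(() -> {   // Thread 1
            data = 100;
            dataReady = true;
        });
        Thread reader = new Thread(() -> {   // Thread 2
            while (!dataReady) {
                // spin until Thread 1 publishes the flag
            }
            t[0] = data;
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();   // after join, the reader's write to t[0] is visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return t[0];
    }

    public static void main(String[] args) {
        System.out.println("t = " + runOnce());
    }
}
```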

Objective of the Pensieve Project

• Sequential consistency (SC) on top of Intel x86 memory models
  – Implementation based on Jikes RVM
• All analyses done at JIT time
• Need to minimize both analysis time and application execution time

Enforcing SC

• Done by enforcing memory access orders
  – not all orderings need to be enforced
  – only enforce orders really needed
• Delay Set Analysis (DSA) [SS88] computes such orders
• Our approach: Approximation of DSA
  – Orders enforced by inserting fences in generated code

Original DSA

• Program edge
  – x executes before y in the same thread
• Conflict edge
  – x and x’ are conflicting accesses
    • Order of access affects program outcome
    • In this paper:
      – both access the same memory location
      – one of them is a write
[Figure: example nodes x, y, x’, y’ illustrating program edges and conflict edges]

Original DSA (Cont’d)

• Critical cycle
  – Minimal
    • Cannot form a smaller cycle using a subset of the nodes
  – Mixed
    • Contains both program edges and conflict edges
• Enforce program edges on a critical cycle
[Figure: example cycles over x, y, x’, y’ (and z), labeled Minimal, Not minimal, Mixed, and Not mixed]

Approximate DSA

• Approximation of critical cycle
  – x precedes y
  – Conflicting accesses for
    • x and x’
    • y and y’
  – y’ precedes x’
• Enforce program edges on the approximate critical cycle
[Figure: approximate critical cycle over x, y, y’, x’]
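
The four conditions above can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the Pensieve implementation: the Access class, the straight-line per-thread program order (an index), and the restriction of conflicts to accesses in different threads are all assumptions made for illustration, as are the names ApproxDelaySketch, conflicts, precedes, and delays.

```java
import java.util.ArrayList;
import java.util.List;

final class ApproxDelaySketch {
    /** One memory access in a straight-line thread body (hypothetical model). */
    static final class Access {
        final int thread;       // id of the executing thread
        final int index;        // position in that thread's program order
        final String location;  // name of the memory location
        final boolean isWrite;

        Access(int thread, int index, String location, boolean isWrite) {
            this.thread = thread;
            this.index = index;
            this.location = location;
            this.isWrite = isWrite;
        }
    }

    /** Conflict edge: same location, at least one write (assumed: different threads). */
    static boolean conflicts(Access a, Access b) {
        return a.thread != b.thread
            && a.location.equals(b.location)
            && (a.isWrite || b.isWrite);
    }

    /** Program edge: same thread, a earlier than b. */
    static boolean precedes(Access a, Access b) {
        return a.thread == b.thread && a.index < b.index;
    }

    /** Returns the pairs {x, y} that lie on an approximate critical cycle. */
    static List<Access[]> delays(List<Access> all) {
        List<Access[]> result = new ArrayList<>();
        for (Access x : all) {
            for (Access y : all) {
                if (precedes(x, y) && closesCycle(x, y, all)) {
                    result.add(new Access[] {x, y});
                }
            }
        }
        return result;
    }

    /** True if some y' precedes some x' where x conflicts with x' and y with y'. */
    private static boolean closesCycle(Access x, Access y, List<Access> all) {
        for (Access yPrime : all) {
            if (!conflicts(y, yPrime)) {
                continue;
            }
            for (Access xPrime : all) {
                if (conflicts(x, xPrime) && precedes(yPrime, xPrime)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Applied to the data / data_ready example from the Memory Models slide, this sketch reports two delays: the two writes in Thread 1, and the two reads in Thread 2.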

The Pensieve System

[Figure: block diagram of the Pensieve system. Labels: Source Program, Program Analyses, Thread Escape Analysis, Synchronization Analysis, Delay Set Analysis, Orders to Enforce, Code Optimizations, Fence Insertion & Optimization, Target Program]

Escape Analyses

• Identify objects that may be accessed by two or more threads
• Output: set of variables
  – {v | v points to an object that may be accessed by >= 2 threads}

Impact on Delay Set Analysis

• x, y, y’, x’ must be escaping accesses
  – Cannot form a cycle if one of them is not an escaping access
• Fewer escaping accesses implies fewer possible pairs (x, y)
  – Fewer checks to be done
  – Fewer delays
[Figure: cycle over x, y, y’, x’]
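
To make the counting argument concrete, here is a small Java sketch. It assumes a single thread whose accesses are listed in program order, each flagged as escaping or not. The boolean-array representation and the names EscapeFilterSketch and candidatePairs are illustrative assumptions, not how Pensieve stores accesses.

```java
final class EscapeFilterSketch {
    /**
     * Counts ordered pairs (x, y) with x before y in program order,
     * keeping only pairs in which both accesses are escaping.
     */
    static int candidatePairs(boolean[] escaping) {
        int count = 0;
        for (int x = 0; x < escaping.length; x++) {
            if (!escaping[x]) {
                continue;                 // a non-escaping x cannot lie on a cycle
            }
            for (int y = x + 1; y < escaping.length; y++) {
                if (escaping[y]) {
                    count++;              // (x, y) still needs a delay check
                }
            }
        }
        return count;
    }
}
```

With four accesses that all escape, six pairs have to be checked; if the escape analysis proves the middle two thread-local, only one pair remains.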

Impact on Synchronization Analysis

• Synchronization analysis reduces the number of conflict edges considered by DSA
  – Considers synchronized constructs
  – Considers calls to start() and join()
• Our system only considers t1.join()
  – if it can be matched with some t2.start() call
  – if t1 and t2 are not escaping
• More precise escape info means more join() calls matched, which means a more precise DSA result
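
The ordering that start() and join() provide can be seen in plain Java. In the sketch below (class and method names are illustrative), the child thread writes a plain field and the parent reads it only after join(). The Java Language Specification guarantees that the child's actions happen-before the return of join(), so the two accesses cannot race. This is the kind of pair for which a matched start()/join() lets an analysis drop the conflict edge.

```java
final class StartJoinDemo {
    static int result = 0;   // plain field: no volatile, no lock

    /** The child writes result; the parent reads it only after join(). */
    static int compute() {
        Thread child = new Thread(() -> result = 6 * 7);
        child.start();       // actions before start() are visible to the child
        try {
            child.join();    // the child's actions are visible after join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;       // ordered after the child's write
    }
}
```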

Escape Analyses Comparison

• In this study, we compare 4 algorithms:
  – Connectivity Analysis (Pensieve)
  – Field Base Analysis (Pensieve)
    • For comparison purposes
  – Bogda’s Analysis
    • Removing Unnecessary Synchronization in Java. (OOPSLA 1999)
  – Ruf’s Analysis
    • Effective Synchronization Removal for Java. (PLDI 2000)

Connectivity Escape Analysis

• An object is escaping if both
  – Reachable by more than one thread due to two possible cases:
    • Reachable from a static field
    • Passed from a thread constructor
  – Accessed by more than one thread
• Does not assume that this is escaping in run() by default
• Field insensitive for most memory accesses
  – I.e. does not distinguish x.f vs x.g
  – Except accesses to Runnable objects
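
The following Java sketch shows the kinds of objects these conditions distinguish. It is only an illustration with hypothetical names (EscapeExamples, Worker, demo), and it takes one reading of "passed from a thread constructor", namely an object handed to the Runnable that a Thread is constructed with. The comments describe the slide's conditions, not the output of any particular analysis.

```java
import java.util.ArrayList;
import java.util.List;

final class EscapeExamples {
    // Reachable from a static field (first condition). Under the slide's rule it
    // counts as escaping only if more than one thread also accesses it.
    static final List<Integer> shared = new ArrayList<>();

    static final class Worker implements Runnable {
        private final int[] box;               // handed in through the constructor
        Worker(int[] box) { this.box = box; }
        @Override public void run() { box[0] = 1; }
    }

    static int demo() {
        int[] passed = new int[1];             // reachable and accessed by two threads
        StringBuilder local = new StringBuilder("local"); // never leaves this thread
        Thread t = new Thread(new Worker(passed));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        local.append(passed[0]);               // second thread to access 'passed'
        return local.length();
    }
}
```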

Field Base Escape Analysis

• An object is escaping if
  – Reachable from a static field
  – Passed from a thread constructor
• Does not assume that this is escaping in run() by default
  – Similar to connectivity base analysis
• Field sensitive
  – Suppose O1, O2 are of the same type
    • O1.f is different from O1.g
    • O1.f is the same as O2.f
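
One way to picture this abstraction is a key made of the declaring type and the field name, with the object identity dropped. The sketch below is an assumption-laden illustration (FieldBaseSketch, FieldKey, and mayAlias are hypothetical names), not the data structure used in Pensieve.

```java
import java.util.Objects;

final class FieldBaseSketch {
    /** Abstract location: declaring type plus field name, object identity ignored. */
    static final class FieldKey {
        final String type;
        final String field;

        FieldKey(String type, String field) {
            this.type = type;
            this.field = field;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof FieldKey)) {
                return false;
            }
            FieldKey k = (FieldKey) o;
            return type.equals(k.type) && field.equals(k.field);
        }

        @Override public int hashCode() {
            return Objects.hash(type, field);
        }
    }

    /** Two accesses are treated as touching the same location iff their keys match. */
    static boolean mayAlias(FieldKey a, FieldKey b) {
        return a.equals(b);
    }
}
```

With this key, O1.f and O2.f map to the same abstract location when O1 and O2 have the same type, while O1.f and O1.g stay distinct.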

Bogda’s Escape Analysis

• An object is escaping if it is reachable:
  – By a static field
  – By a Runnable object
  – Via more than 1 field reference

Ruf’s Escape Analysis

• An object is escaping if both
  – Reachable from either
    • A static field or
    • A Runnable object
  – Synchronized by more than one thread
• Adapted for our own use
  – “synchronized” replaced by “accessed”

Experimental Settings (Machine)

• Intel (Dell PowerEdge 6600 SMP)
  – 4 Intel hyperthreaded 1.5 GHz Xeon processors
  – with 1 MB cache each
  – 6 GB system memory

Experimental Settings (Software)

• Original
  – default Jikes RVM implementation
  – base case for performance comparison
• Enforcing SC
  – Empty
  – Arg Escaping
  – Connectivity analysis
  – Field-base analysis
  – Bogda’s analysis (bogda)
  – Ruf’s analysis

Measurements

• Escape Analysis Time
• Impact on Delay Set Analysis Time
• Impact on Synchronization Analysis Time
• Slowdown due to fence insertion
  – Delay Set Analysis only
  – Delay Set Analysis with Synchronization Analysis

Escape Analysis Time

[Chart: Escape Analysis Time in ms, logarithmic axis from 1 to 1,000,000. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: empty, argEscape, connect, ruf5, field-base, bogda.]

Impact on Delay Set Analysis Time

[Chart: Delay Set Analysis Time in ms, axis from 0 to 350. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, bogda, field-base, argEscape, empty.]

Impact on Synchronization Analysis Time

[Chart: Synchronization Time in ms, logarithmic axis from 1 to 1,000,000. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: field-base, bogda, empty, argEscape, connect, ruf5.]

Escape + DSA + Synchronization Analysis Time / Compilation Time

[Chart: Analysis Time / Compilation Time, axis from 0 to 1. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: empty, argEscape, connect, ruf5, field-base, bogda.]

Slowdown (DSA Only)

[Chart: Slowdown (DSA only), axis from 0 to 14. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, bogda, field-base, argEscape, empty.]

Slowdown (DSA+Sync Analysis)

[Chart: Slowdown (DSA+Synchronization Analysis), axis from 0 to 14. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, bogda, field-base, argEscape, empty.]

Slowdown of connect (DSA+Sync Analysis)

[Chart: Slowdown of connect (DSA+Synchronization Analysis), axis from 0 to 4. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect.]

Conclusions

• Evaluated the interaction between escape analysis and synchronization/delay set analysis
• Montecarlo and jbb motivate enabling field sensitivity for connectivity base analysis

Backup Slides Follow

Number of Delay Checks Performed

[Chart: Number of Delay Checks Performed, logarithmic axis from 1 to 10,000,000. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, bogda, field-base, argEscape, empty.]

Total Compilation Time

[Chart: Total Compilation Time in ms, logarithmic axis from 1 to 1,000,000. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, argEscape, empty, field-base, bogda.]

Number of Delays Found (DSA Only)

[Chart: Number of Delays Found (DSA Only), logarithmic axis from 1 to 10,000,000. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, bogda, field-base, argEscape, empty.]

Number of Delays Found (DSA + Sync Analysis)

[Chart: Number of Delays Found (DSA+Sync Analysis), logarithmic axis from 1 to 10,000,000. Benchmarks: mtrt, moldyn, montecarlo, raytracer, boundedbuf, disksched, geneticalgo, hashmap, seive, jbb, AVG. Series: connect, ruf5, bogda, field-base, argEscape, empty.]