September 11, 2003 Beltway: Getting Around GC Gridlock Steve Blackburn, Kathryn McKinley Richard Jones, Eliot Moss Modified by: Weiming Zhao Oct. 2008

September 11, 2003

Beltway: Getting Around GC Gridlock

Steve Blackburn, Kathryn McKinley

Richard Jones, Eliot Moss

Modified by: Weiming Zhao, Oct. 2008

Outline

• Motivations & Background
• Beltway GC framework
• Subsumes existing copying GC
• New collectors: Beltway X.X & Beltway X.X.100
• Novel mechanisms to make them efficient
• Experimental Results
• Conclusion

Where is GC?

• Renewed interest in GC
  – Object-oriented languages with automatic memory management (Java, C#)
• Major outstanding issues
  – GC throughput: 10 to 70% penalty
    • Beltway: high throughput with new copying framework
  – Combining high throughput and low pause times
    • Ulterior Reference Counting
  – Understanding & exploiting locality with GC
  – Dynamically adaptive GC

Object Demographic Observations

• Lifetimes
  – Young objects: most very short lived
    • Infant mortality: ~90% die young (within 4MB of alloc)
  – Old objects: most very long lived (bimodal)
    • Mature mortality: ~5% die each 4MB of new alloc
• Pointer mutations
  – Older to younger pointers across many objects are rare
    • Less than 1%
  – Most mutations among young objects
    • 92 to 98% of pointer mutations

Key Ideas in Copying GC

• Generational hypothesis (Ungar, Lieberman/Hewitt)
  – Young objects: most die, so collect frequently
  – Old objects: most live, so collect infrequently
• Older-first principle (Stefanovic/McKinley/Moss)
  – Give objects as much time to die as possible
• Incrementality improves responsiveness
• Copying GC can improve locality

Can one GC incorporate them all?

Beltway

• A new framework for GC algorithms
  – Subsumes existing copying GCs
  – Includes new, faster copying GCs
• Novelty
  – Generality
  – New algorithms
  – New GC mechanisms

Two Organizational Principles

• Increments
  – Independently collectable regions
• Belts
  – FIFO groupings of increments
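
The two principles can be sketched as a small data structure. This is only an illustrative model, not the Jikes RVM/GCTk implementation: the class names `Increment` and `Belt`, the byte-count fields, and the method names are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Increment:
    """An independently collectable region (hypothetical model)."""
    capacity: int   # bytes the increment can hold
    used: int = 0   # bytes currently allocated in it

    def is_full(self) -> bool:
        return self.used >= self.capacity


@dataclass
class Belt:
    """A FIFO grouping of increments: oldest on the left, newest on the right."""
    increments: deque = field(default_factory=deque)

    def add_newest(self, inc: Increment) -> None:
        # New increments always join at the right-hand end of the belt.
        self.increments.append(inc)

    def oldest(self) -> Increment:
        return self.increments[0]

    def remove_oldest(self) -> Increment:
        # FIFO order: the increment that entered first leaves first.
        return self.increments.popleft()
```

A heap in this model is simply a list of `Belt` objects, with each belt handing out its leftmost increment for collection.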

Configurations of Beltway

A Simple Example

• Notice:
  – Need for copy reserve (rightmost increment)
  – Incompleteness (w.r.t. cross-increment cycles)
  – Corresponds to “OF-Mix” GC [Stefanovic 99]
  – Semi-Space (degenerate form, 2 increments)

[Figure: a belt of increments shown in successive states, with increments numbered 0 to 7, then 1 to 8, then 2 to 9.]

A More Interesting Example

• Rules:
  – Only N increments available (N=8 here)
  – Must reserve one for copying
  – Always collect leftmost, lowest full increment
• Notice:
  – Generational & older-first principles
  – Still incomplete
• This heap organization is Beltway X.X
  – X is the increment size as a percentage of available memory (e.g. Beltway 20.20 or Beltway 14.14)
• Beltway 100.100 = Appel-style generational

[Figure: two belts of increments shown in successive states; increment numbers advance from 0 to 7 and 1 to 8 through to 33 to 40 and 34 to 41.]
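
The rule "always collect leftmost, lowest full increment" can be sketched as a selection function. This is one reading of the rule, not the authors' code: belts are scanned from lowest to highest and, within a belt, from the leftmost increment rightwards. The `(used, capacity)` tuple representation and the name `choose_increment` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Each increment is modelled as (used_bytes, capacity_bytes); each belt is a
# list ordered oldest (leftmost) to newest. These shapes are assumptions.
Increment = Tuple[int, int]


def choose_increment(belts: List[List[Increment]]) -> Optional[Tuple[int, int]]:
    """Return (belt_index, increment_index) of the increment to collect.

    Scans belts from lowest to highest and, within a belt, from the leftmost
    increment rightwards, returning the first full increment found.
    """
    for belt_index, belt in enumerate(belts):
        for inc_index, (used, capacity) in enumerate(belt):
            if used >= capacity:
                return (belt_index, inc_index)
    return None  # nothing is full, so no collection is needed yet
```

For instance, with a partly filled lower belt and a full leftmost increment on the higher belt, the function selects the higher belt's leftmost increment; once the lower belt fills, it takes priority.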

Completeness

• Add a third belt (“Beltway X.X.100”)
  – Large increment (100% of available)
  – Only collect third belt when it is full
• Reduce incrementality, gain completeness

[Figure: heap with a third belt added; the belts are labeled with increments 3 to 10, 33 to 40, and 0 to 1.]

Efficient Implementation

• Crucial issues:
  – Write barrier cost (consequence of incrementality)
  – Knowing when to collect
  – Minimizing ‘copy reserve’ overhead

New Mechanisms

• Frames
  – Incrementality with cheap write barrier (w/b)
  – 2^n-aligned contiguous virtual memory space
  – Accommodates an increment*
  – Allows fast inter-increment w/b

[Figure: consecutive frames of size 2^f starting at addresses 7×2^f, 8×2^f, 9×2^f and 10×2^f; an increment sits inside a frame.]

• Dynamic, conservative copy reserve
  – Only reserve as much as necessary
• GC triggers
  – Don’t necessarily collect only when full
• Remembered sets
  – One for each <source, target> frame pair
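
Because frames are power-of-two aligned, a shift turns an address into a frame number, which is what makes the inter-increment write barrier cheap. The sketch below illustrates that idea together with one remembered set per <source, target> frame pair. It is a simplified model under stated assumptions: `FRAME_BITS`, `frame_of` and `RememberedSets` are hypothetical names, and the barrier here records every cross-frame pointer, whereas a real collector would likely filter further.

```python
from collections import defaultdict

FRAME_BITS = 20  # assumed frame size of 2**20 bytes, chosen only for the example


def frame_of(address: int) -> int:
    """Frames are power-of-two aligned, so a shift yields the frame number."""
    return address >> FRAME_BITS


class RememberedSets:
    """One set of slot addresses per (source frame, target frame) pair."""

    def __init__(self) -> None:
        self.sets = defaultdict(set)

    def write_barrier(self, slot_address: int, target_address: int) -> None:
        # Called when a pointer to target_address is stored at slot_address.
        src, tgt = frame_of(slot_address), frame_of(target_address)
        if src != tgt:
            # Cross-frame pointer: remember the slot that holds it.
            self.sets[(src, tgt)].add(slot_address)
```

Keying the sets by frame pair means that when a frame is collected, only the sets whose target is that frame need to be scanned for roots, and sets whose source is that frame can be discarded.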

Results

• Implemented in Jikes RVM with GCTk
• Compare to state of the art copying GC
  – Appel-style generational
• Measure over 33 heap sizes
  – 1.0 through to 3.0 x minimum heap size
  – GC time
  – Total execution time
• Normalize results to best
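
The normalization step, and the geometric mean used to summarize across benchmarks, can be sketched as follows. The function names and the dictionary layout are assumptions for illustration; the slides do not specify the exact order in which normalization and averaging were applied.

```python
import math
from typing import Dict, List


def normalize_to_best(times: Dict[str, float]) -> Dict[str, float]:
    """Divide each collector's time by the best (smallest) time, so best = 1.0."""
    best = min(times.values())
    return {name: t / best for name, t in times.items()}


def geometric_mean(values: List[float]) -> float:
    """Geometric mean, the usual way to average normalized ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

With this convention a value of 1.0 marks the best collector at a given heap size, and larger values show how much slower the others are.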

Fixed nursery size vs. Appel (geometric mean of 6 benchmarks)

Impact of increment size of Beltway X.X.100

(geometric mean of 6 benchmarks)

Use 25.25.100 configuration in the remainder of the results

Beltway X.X vs. X.X.100 vs. Appel

Beltway: Conclusions

• Beltway: a new framework
  – New family of GC algorithms
  – Subsumes existing copying GCs
  – Opens up new GC possibilities
• Generality + efficiency
  – Better, faster GCs

Remaining Problem: Combining Throughput & Responsiveness

[Figure: utilization (0 to 1) against window size in msec (log scale, 1.00E+01 to 1.00E+04) for _228_jack; series: FG-MS and RC.]

MMU Plots for javac at Two Heap Sizes
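
An MMU (minimum mutator utilization) plot shows, for each window length, the smallest fraction of any window of that length in which the program itself ran rather than the collector. The sketch below computes one point of such a curve from a list of GC pauses. It is an illustration only: the function name `mmu` and the pause representation are assumptions, and it relies on the observation that the worst window either starts where a pause starts or ends where a pause ends.

```python
from typing import List, Tuple


def mmu(pauses: List[Tuple[float, float]], total_time: float, window: float) -> float:
    """Minimum mutator utilization for one window length.

    pauses: non-overlapping (start, end) GC pauses within [0, total_time].
    """
    if window <= 0 or window > total_time:
        raise ValueError("window must be in (0, total_time]")
    # Candidate window starts: run boundaries, pause starts, and positions
    # where the window ends exactly at a pause end.
    candidates = {0.0, total_time - window}
    for start, end in pauses:
        candidates.add(start)
        candidates.add(end - window)
    worst_paused = 0.0
    for w_start in candidates:
        w_start = min(max(w_start, 0.0), total_time - window)
        w_end = w_start + window
        # Total pause time overlapping the window [w_start, w_end].
        paused = sum(max(0.0, min(e, w_end) - max(s, w_start)) for s, e in pauses)
        worst_paused = max(worst_paused, paused)
    return 1.0 - worst_paused / window
```

Evaluating `mmu` over a range of window lengths gives the curve plotted against "Window (msec)"; a window no longer than the longest pause always yields a utilization of 0.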

Discussion Questions

• Worst case space overhead?
• Pause time in a 100 belt increment?
• Fixed-size increments?
• What happens if the remsets get big?
• Locality?