A Mostly Non-Copying Real-Time Collector with Low Overhead and Consistent Utilization

David Bacon, Perry Cheng (presenting), V.T. Rajan

IBM T.J. Watson Research

Roadmap

What is Real-time Garbage Collection?
– Pause Time, CPU Utilization (MMU), and Space Usage
Heap Architecture
– Types of Fragmentation
– Incremental Compaction
– Read Barriers
– Barrier Performance
Scheduling: Time-Based vs. Work-Based
Empirical Results
– Pause Time Distribution
– Minimum Mutator Utilization (MMU)
– Pause Times
Summary and Conclusion

Problem Domain

Real-time embedded systems
Memory usage is important
Uniprocessor

3 Styles of Uniprocessor Garbage Collection: Stop-the-World vs. Incremental vs. Real-Time

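Pause Times (Average and Maximum)

[Figure: pause times for the three collector styles. Values labeled on the figure: 1.5 s, 1.7 s; 0.5 s, 0.7 s, 0.3 s, 0.5 s, 0.9 s, 0.3 s; 0.15 - 0.19 s]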

Coarse-Grained Utilization vs. Time

[Figure: utilization vs. Time (s), 0 to 8 s, measured over a 2.0 s window]

Fine-Grained Utilization vs. Time

[Figure: utilization vs. Time (s), 0 to 8 s, measured over a 0.4 s window]

Minimum Mutator Utilization (MMU)

[Figure: MMU vs. Window Size (s), logarithmic scale]
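MMU for a window size w is the smallest fraction of time the mutator gets in any window of length w during the run. The following is a minimal Python sketch of that definition, assuming collector pauses are given as non-overlapping (start, end) intervals in seconds. The function names and the example pause list are illustrative and are not taken from the talk.

```python
def mutator_time(pauses, lo, hi):
    """Time in [lo, hi] that is not covered by any collector pause."""
    paused = sum(max(0.0, min(e, hi) - max(s, lo)) for s, e in pauses)
    return (hi - lo) - paused


def mmu(pauses, total, window):
    """Minimum mutator utilization over all windows of the given size.

    Mutator time is piecewise linear in the window position, so it is
    enough to test windows whose edge touches a pause boundary, plus the
    first and last possible windows.
    """
    candidates = {0.0, total - window}
    for s, e in pauses:
        for t in (s, e, s - window, e - window):
            if 0.0 <= t <= total - window:
                candidates.add(t)
    return min(mutator_time(pauses, t, t + window) / window
               for t in candidates)
```

With two hypothetical 0.5 s pauses, `mmu([(1.0, 1.5), (3.0, 3.5)], 8.0, 2.0)` evaluates to 0.75, while a 0.4 s window can fall entirely inside a pause and yields 0. This is why MMU is plotted against window size rather than reported as one number.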

Space Usage over Time

[Figure: space usage vs. Time (s), 0 to 8 s; labels: RT, max live, trigger, 2 X max live]

Problems with Existing RT Collectors

[Figure: Non-moving Collector, space usage vs. Time (s), 0 to 8 s; reference levels: max live, 2 X max live, 3 X max live, 4 X max live]

[Figure: Replicating Collector, space usage vs. Time (s), 0 to 8 s; reference levels: max live, 2 X max live, 3 X max live, 4 X max live]

Not fully incremental
Tight coupling
Work-based scheduling

Our Collector

Goals                         Results
Real-Time                     ~10 ms
Low Space Overhead            ~2X
Good Utilization during GC    ~40%

Solution

Incremental Mark-Sweep Collector
– Write barrier: snapshot-at-the-beginning [Yuasa]
– Segregated free list heap architecture
– Read barrier: to support defragmentation [Brooks]
– Incremental defragmentation
– Segmented arrays: to bound fragmentation


Fragmentation and Compaction

Intuitively: available but unusable memory
– avoidance and coalescing give no guarantees
– compaction is needed

Heap Architecture

Segregated Free Lists
– heap divided into pages
– each page has equally-sized blocks (1 object per block)
– large arrays are segmented

[Figure: page layout showing used and free blocks, with external, internal, and page-internal fragmentation marked]
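Segmenting large arrays means no object needs a contiguous region larger than the biggest block size. The sketch below shows the indexing idea only. The class name `SegmentedArray`, the spine-of-segments layout, and the tiny segment size are assumptions made for illustration, not the collector's actual representation.

```python
class SegmentedArray:
    """Array stored as fixed-size segments plus a spine of segment references."""

    def __init__(self, length, segment_elems=4):
        self.length = length
        self.segment_elems = segment_elems
        n_segments = -(-length // segment_elems)  # ceiling division
        # Each segment fits in one block; only the spine refers to them all.
        self.spine = [[0] * segment_elems for _ in range(n_segments)]

    def _locate(self, index):
        """Map a logical index to (segment number, offset in segment)."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        return divmod(index, self.segment_elems)

    def __getitem__(self, index):
        seg, off = self._locate(index)
        return self.spine[seg][off]

    def __setitem__(self, index, value):
        seg, off = self._locate(index)
        self.spine[seg][off] = value
```

Every element access pays for one extra indirection through the spine, which is the kind of cost the segmented array optimizations listed under future work aim to reduce.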

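Controlling Internal and Page-Internal Fragmentation

Choose page size ($\text{page}$) and block sizes ($s_k$)

If $s_k = s_{k-1}(1 + \rho)$, internal fragmentation $\le \rho$

Page-internal fragmentation $\le s_{\max} / \text{page}$, since the unusable tail of a page is smaller than one block

E.g., if $\text{page} = 16\text{K}$, $\rho = 1/8$, and $s_{\max} = 2\text{K}$, maximum non-external fragmentation is limited to 12.5%.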
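The bounds above can be checked numerically. This is a minimal Python sketch, assuming size classes are rounded up to whole bytes and start at a hypothetical minimum block size of 16 bytes. The helper names are illustrative and do not come from the talk.

```python
import math


def size_classes(s_min, s_max, rho):
    """Block sizes with s_k = ceil(s_{k-1} * (1 + rho)), capped at s_max."""
    sizes = [s_min]
    while sizes[-1] < s_max:
        sizes.append(min(s_max, math.ceil(sizes[-1] * (1 + rho))))
    return sizes


def internal_waste(obj_size, sizes):
    """Fraction of the chosen block left unused by an object of obj_size bytes."""
    block = next(s for s in sizes if s >= obj_size)
    return (block - obj_size) / block


def page_internal_waste(page, block):
    """Fraction of a page left over after carving it into equal blocks."""
    return (page % block) / page


sizes = size_classes(16, 2048, 1 / 8)
worst_internal = max(internal_waste(n, sizes) for n in range(16, 2049))
worst_page = max(page_internal_waste(16 * 1024, s) for s in sizes)
```

For page = 16K, rho = 1/8, and s_max = 2K, both `worst_internal` and `worst_page` stay at or below 1/8, matching the 12.5% figure in the example.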

Fragmentation - small heap ($\rho = 1/8$ vs. $\rho = 1/2$)

[Figure: heap breakdown for $\rho = 1/8$ and $\rho = 1/2$; categories: Internal, Page-Internal, External, Recently Dead, Live]

Incremental Compaction

Compact only a part of the heap
– Requires knowing what to compact ahead of time

Key Problems
– Popular objects
– Determining references to moved objects

Incremental Compaction: Redirection

Access all objects via per-object redirection pointers

Redirection is initially self-referential

Move an object by updating ONE redirection pointer

[Figure: an original object and its replica]

Consistency via Read Barrier [Brooks]

Correctness requires always using the replica

E.g. field selection must be modified

normal access: x[offset]

read barrier access: x[redirect][offset]
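The redirection scheme can be modeled in a few lines. This is a sketch of the idea only: the class `Obj` and the helpers `read_field`, `write_field`, and `move` are hypothetical names, and a real collector works on raw object headers rather than Python objects.

```python
class Obj:
    """Heap object with a per-object redirection pointer (Brooks-style)."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.redirect = self  # initially self-referential


def read_field(x, offset):
    """Read barrier access: x[redirect][offset]."""
    return x.redirect.fields[offset]


def write_field(x, offset, value):
    """Writes also go through the redirection pointer, so they hit the replica."""
    x.redirect.fields[offset] = value


def move(obj):
    """Relocate obj by copying it and updating ONE redirection pointer."""
    replica = Obj(obj.fields)
    obj.redirect = replica
    return replica
```

After `move(a)`, any stale reference to `a` still works: `read_field(a, 0)` returns the replica's value, so references to the moved object can be fixed up later and incrementally.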

Some Important Details

Our read barrier is decoupled from collection
Complication: in Java, any reference might be null
– the actual read barrier for GetField(x,offset) must be augmented:

    tmp = x[offset];
    return (tmp == null) ? null : tmp[redirect]

Optimizations: CSE, code motion (LICM and sinking), null-check combining

Barrier variants (when to redirect)
– lazy: easier for collector
– eager: better for optimization
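The null-checked barrier above redirects a reference at the moment it is loaded, which matches the eager variant. The sketch below models it in Python under that reading. `Node` and `get_field` are hypothetical names, and `None` stands in for Java's null.

```python
class Node:
    """Object whose fields hold references to other Node objects or None."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.redirect = self  # self-referential until the object is moved


def get_field(x, offset):
    """Eager barrier: forward the loaded reference, unless it is null.

    x is assumed to be forwarded already, because every reference the
    mutator holds was produced by an earlier get_field.
    """
    tmp = x.fields[offset]
    return None if tmp is None else tmp.redirect
```

Since each loaded reference is already forwarded, later accesses through it need no further redirection. That gives the compiler room to apply CSE and code motion to the barrier.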

Barrier Overhead to Mutator

Conventional wisdom says read barriers are too expensive
– Studies found overhead of 20-40% (Zorn, Nielsen)
Our barrier has 4-6% overhead with optimizations

[Figure: barrier overhead by benchmark; surviving labels: jess, db, Geo. M]

[Figure sequence: heap (one size only) and stack through two collections. Captions and block states shown in each frame:]

Program Start
Program is allocating (allocated)
GC starts (unmarked)
Program allocating and GC marking (unmarked; marked or allocated)
Sweeping away blocks (unmarked; marked or allocated)
GC moving objects and installing redirection (allocated; evacuated)
2nd GC starts tracing and redirection fixup (unmarked; evacuated; marked or allocated)
2nd GC complete (allocated)


Scheduling the Collector

Scheduling Issues
– bad CPU utilization and space usage
– loose program and collector coupling

Time-Based
– Trigger the collector to run for C_T seconds whenever the program runs for Q_T seconds

Work-Based
– Trigger the collector to collect C_W bytes whenever the program allocates Q_W bytes
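The difference between the two triggers shows up when allocation is bursty. The following toy simulation is a sketch under simplifying assumptions: time advances in 1 ms ticks, each collector increment takes a fixed number of ticks, and the allocation trace is invented. None of the names or numbers come from the talk.

```python
def time_based(mutator_ms, q_t, c_t):
    """Timeline of 'M' (mutator) and 'G' (collector) ticks, one per ms.

    The collector runs for c_t ms after every q_t ms of mutator time.
    """
    timeline = []
    for tick in range(1, mutator_ms + 1):
        timeline.append("M")
        if tick % q_t == 0:
            timeline.extend("G" * c_t)
    return "".join(timeline)


def work_based(alloc_per_ms, q_w, gc_ms_per_increment):
    """Collector increment runs after every q_w bytes allocated.

    alloc_per_ms lists the bytes allocated in each ms of mutator time.
    """
    timeline, owed = [], 0
    for alloc in alloc_per_ms:
        timeline.append("M")
        owed += alloc
        while owed >= q_w:
            owed -= q_w
            timeline.extend("G" * gc_ms_per_increment)
    return "".join(timeline)


def worst_utilization(timeline, window):
    """Lowest mutator share over all windows of `window` ticks."""
    return min(timeline[i:i + window].count("M") / window
               for i in range(len(timeline) - window + 1))
```

In this model, `time_based(6, 3, 2)` gives a steady utilization of Q_T / (Q_T + C_T) = 0.6 in every 5 ms window regardless of allocation. With the bursty trace `[10, 10, 40, 10, 10, 10]` and `q_w=20`, the work-based schedule runs collector increments back to back after the burst, and the worst 5 ms window drops to 0.2.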

Time-Based Scheduling

Trigger the collector to run for C_T seconds whenever the program runs for Q_T seconds

[Figure: utilization vs. Time (s) and MMU vs. Window Size (s) for three workloads: Smooth Alloc, Uneven Alloc, High Alloc]

Work-Based Scheduling

Trigger the collector to collect C_W bytes whenever the program allocates Q_W bytes

[Figure: utilization vs. Time (s) and MMU vs. Window Size (s) for three workloads: Smooth Alloc, Uneven Alloc, High Alloc]


Pause Time Distribution for javac (Time-Based vs. Work-Based)

[Figure: two pause time histograms, each labeled 12 ms]

Utilization vs. Time for javac

[Figure: two panels of utilization vs. Time (s)]

Minimum Mutator Utilization for javac

Space Usage for javac (Time-Based vs. Work-Based)

Intrinsic Tradeoff

3 inter-related factors:
– Space Bound (tradeoff)
– Utilization (tradeoff)
– Allocation Rate (lower is better)

Other factors:
– Collection rate (higher is better)
– Pointer density (lower is better)

Summary: Mostly Non-moving RT GC

Read Barriers
– Permits incremental defragmentation
– Overhead is 4-6% with compiler optimizations

Low Space Overhead
– Space usage is only about 2 X max live data
– Fragmentation still bounded

Consistent Utilization
– Always at least 45% at 12 ms resolution

Conclusions

Real-time GC is real
– There are tradeoffs just like in traditional GC

Scheduling should be primarily time-based
– Fallback to work-based due to user’s incorrect parameter estimations

Incremental defragmentation is possible
– Compiler support is important!

Future Work

Lowering the real-time resolution
– Sub-millisecond worst-case pause
– Main issue: breaking up stack scan

Segmented array optimizations
– Reduce segmented array cost below ~2%
– Opportunistic contiguous layout
– Type-based specialization with invalidation
– Strip-mining
