65
© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University of Kent at Canterbury BCS Advanced Programming SG Thursday 14 October 1999 ©Richard Jones, 1999. All rights reserved.

© Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

Embed Size (px)

Citation preview

Page 1: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

1

Garbage CollectionGarbage Collection

Richard JonesComputing Laboratory

University of Kent at Canterbury

Richard JonesComputing Laboratory

University of Kent at Canterbury

BCS Advanced Programming SGThursday 14 October 1999

©Richard Jones, 1999. All rights reserved.

Page 2: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

2

Why garbage collect?Why garbage collect?

Language requirement• many languages assume GC, e.g. allocated objects may survive

much longer than the method that created them

Problem requirement• the nature of the problem may make it very hard/impossible to

determine when something is garbage

Abstraction and Modularity

Not a silver bullet• some memory management problems cannot be solved using

automatic GC, e.g. if you forget to drop references to objects that

you no longer need.

Page 3: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

3

What is garbage?What is garbage?

Almost all garbage collectors assume the following definition of live objects called liveness by reachability: if you can get to an object, then it is live.

More formally: An object is live if and only if:• it is referenced in a predefined variable called a root, or

• it is referenced in a variable contained in a live object.

Garage example: I need my dinghy, & so I need its sails, & so I need their bag.

Roots include words in the static area, registers and words on the execution stack that point into the heap.

Page 4: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

4

The basic algorithmsThe basic algorithmsThink of clearing out your garage:

• Reference counting: Keep a note on each object in your garage, indicating the number of live references to the object. If an object’s reference count goes to zero, throw the object out (it’s dead).

• Mark-Sweep: Put a note on objects you need (roots). Then recursively put a note on anything needed by a live object. Afterwards, check all objects and throw out objects without notes.

• Mark-Compact: Put notes on objects you need (as above). Move anything with a note on it to the back of the garage. Burn everything at the front of the garage (it’s all dead).

• Copying: Move objects you need to a new garage. Then recursively move anything needed by an object in the new garage. Afterwards, burn down the old garage (any objects in it are dead)!

Page 5: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

5

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 6: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

6

Reference countingReference countingThe simplest form of garbage collection is reference counting.

Basic idea: count the number of references from live objects.

Each object has a reference count (RC) • when a reference is copied, the referent’s RC is incremented

• when a reference is deleted, the referent’s RC is decremented

• an object can be reclaimed when its RC = 0

Common implementation: smart pointers

Page 7: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

7

Advantages of reference counting

Advantages of reference counting

Simple to implement

Costs distributed throughout program

Good locality of reference: only touch old and new targets' RCs

Works well because few objects are shared and many are short-lived — space can be reused

Minimal zombie time from when an object becomes garbage until it is collected

Immediate finalisation is possible (due to near zero zombie time)

Page 8: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

8

Disadvantages of reference countingDisadvantages of reference counting

Not comprehensive (does not collect all garbage):

cannot reclaim cyclic data structures

High cost of manipulating RCs: cost is ever-present even if no garbage is collected

Bad for concurrency: RC manipulations must be atomic

Tightly coupled interface to mutator. smart pointers raw pointers

High space overheads for some applications

Recursive freeing cascade is only bounded by heap size

Page 9: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

9

Mark-SweepMark-Sweep

Mark-sweep is a tracing algorithm — it follows (traces) references from live objects to find other live objects.

Implementation:Each object has a mark-bit associated with it.

There are two phases:• Mark phase: starting from the roots, the graph is traced and

the mark-bit is set in each unmarked object encountered.

At the end of the mark phase, unmarked objects are garbage.

• Sweep phase: starting from the bottom, the heap is swept – mark-bit not set: the object is reclaimed– mark-bit set: the mark-bit is cleared

Page 10: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

10

Comprehensive: cyclic garbage collected naturally

No run-time overhead on pointer manipulations

Loosely coupled to mutator

Does not move objects• does not break any mutator invariants• optimiser-friendly• requires only one reference to each live object to be

discovered (rather than having to find every reference)

Advantages of mark-sweepAdvantages of mark-sweep

Page 11: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

11

Disadvantages of mark-sweepDisadvantages of mark-sweep

Stop/start nature leads to disruptive pauses and long zombie times.

Complexity is O(heap) rather than O(live) • every live object is visited in mark phase• every object, alive or dead, is visited in sweep phase

Degrades with residency (heap occupancy)• the collector needs headroom in the heap to avoid thrashing

Fragmentation and mark-stack overflow are issues

Tracing collectors must be able to find roots (unlike reference counting)

Page 12: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

12

Cheney copying GCCheney copying GC

Divide heap into 2 halves (semi-spaces) named Fromspace and Tospace

Allocate objects in Tospace

When Tospace is full• flip the roles of the semi-spaces• pick out all live data in Fromspace and copy them to

Tospace• preserve sharing by leaving a forwarding address in the

Fromspace replica• use Tospace objects as a work queue

Page 13: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

13

Registers

freescan

Tospace

From space

C

F GD E

B

A

Registers

freescan

Tospace

From space

C

F GD E

B

A

A'

A'

copy root and update pointer,

leaving forwarding address

Registers

freescan

Tospace

From space

C

F GD E

B

A

B'

A'

A'

B' C'

C'

scan A'copy B and C,

leaving forwarding addresses

Registers

freescan

Tospace

From space

C

F GD E

B

A

D' E'B' C'

B' C'

A'

A'

D' E'

scan B'copy D and E,

leaving forwarding addresses

Registers

freescan

Tospace

From space

C

F GD E

B

A

D' E'B' C'

B' C'

A'

A' F'

G'

D' E' F' G'

scan C'copy F and G,

leaving forwarding addresses

Registers

freescan

Tospace

From space

C

F GD E

B

A

F' G'

D' E'

D' E'

B' C'

B' C'

A'

A' F'

G'

scan D' and E'nothing to do

Registers

freescan

Tospace

From space

C

F GD E

B

A

F' G'

D' E'

D' E'

B' C'

B' C'

A'

A' F'

G'

scan F'use A's forwarding address

Registers

freescan

Tospace

From space

C

F GD E

B

A

F' G'

D' E'

D' E'

B' C'

B' C'

A'

A' F'

G'

scan G'nothing to do

Registers

freescan

Tospace

From space

C

F GD E

B

A

F' G'

D' E'

D' E'

B' C'

B' C'

A'

A' F'

G'

scan=freeso collection is complete

Co

pyi

ng

GC

Exa

mp

leC

op

yin

g G

C E

xam

ple

Page 14: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

14

Advantages of copying GCAdvantages of copying GC

Compaction for free

Allocation is very cheap for all object sizes• out-of-space check is pointer comparison• simply increment free pointer to allocate

Only live data is processed (commonly a small fraction of the heap)

Fixed space overheads• free and scan pointers• forwarding addresses can be written over user data

Comprehensive: cyclic garbage collected naturally

Simple to implement a reasonably efficient copying GC

Page 15: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

15

Disadvantages of copying GCDisadvantages of copying GC

Stop-and-copy may be disruptive Degrades with residency

Requires twice the address space of other simple collectors• touch twice as many pages• trade-off against fragmentation

Cost of copying large objectsLong-lived data may be repeatedly copied

All references must be updatedMoving objects may break mutator invariants

Breadth-first copying may disturb locality patterns

Page 16: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

16

Mark-compact collectionMark-compact collection

Mark-compact collectors make at least two passes over the heap after marking

• to relocate objects

• to update references (not necessarily in this order)

Issues• how many passes?

• compaction style

– sliding: preserve the original order of objects

– linearising: objects that reference each other are placed adjacently (as far as possible)

– arbitrary: objects moved without regard for original order or referential locality

Page 17: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

17

Complexity: caveat emptorComplexity: caveat emptorClaim: “Copying GC is always better than mark-sweep GC”

Simple mark-sweep GC is O(heap-size)• every live object is visited in mark phase

• every object, alive or dead, is visited in sweep phase

Simple copying GC is O(live)• only live objects are copied

But...• Copying is more expensive than setting a bit

• Efficient implementations of mark-sweep are dominated by cost of mark phase

– linear scanning less expensive than tracing

– cost of sweep can be reduced further

Simple asymptotic analyses are misleading

Page 18: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

18

GC MetricsGC Metrics

Execution time• total execution time

• distribution of GC execution time

• time to allocate a new object

Delay time• length of disruptive pauses

• zombie times

Memory usage• additional memory overhead

• fragmentation

• virtual memory and cache performance

Other important metrics• comprehensiveness

• implementation simplicity and robustness

Page 19: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

19

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 20: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

20

Improving Mark-sweep GCImproving Mark-sweep GC

Problem: mark-sweep collectors touch all memory both dead and alive — not good in a VM environment.

Mark-sweep actions• mark phase: mark-bits of all live objects modified• sweep phase: all objects — live or dead — modified

Reducing the cost of the sweep• want to avoid sweep's O(heap) cost• manipulate mark-bits more efficiently• want to minimise paging

Page 21: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

21

Bitmap markingBitmap marking

Store mark-bits in a separate bitmap table• if smallest object is two 32-bit words, bitmap is 1.5% of heap

Advantages of bitmapsmark-bits held in RAM can be read or written without page faults

no object/page is dirtied during marking

– no page need be written back to swap disk

live objects need not be touched during sweep

– atomic objects need never be touched at all

objects live and die in clusters, so test 32 bits at a time

increased safety: less chance of program messing with bits

Access to mark-bits is more expensive

Page 22: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

22

Lazy sweepingLazy sweeping

Reduce pauses by sweeping in parallel with mutator• sweeper only modifies mark-bits and (maybe) garbage

• both are invisible to mutator

• thus sweeper and mutator cannot interfere

Do a small amount of sweeping at each allocation• sweep until sufficient free memory is found

• save reclaimed objects in a free-list or fixed-size vector

Lazy sweeping and bitmaps• deal with every bit in a bitmap word at the same time.

• since objects live and die in clusters,bitmaps of clusters can be test and cleared in groups of 32

Page 23: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

23

hb_next

hb_marks

Free_list

hb_next

hb_marks

hb_next

hb_marks

blockheaders

blocks

Page 24: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

24

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 25: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

25

The generational hypothesisThe generational hypothesis

Weak generational hypothesis “Most objects die young” [Ungar, 1984]

It is common for 80-95% objects to die before a further megabyte has been allocated

• 50-90% of CL and 75-95% of Haskell objects die before they are 10kb old

• SML/NJ reclaims 98% of any generation at each collection

• Only 1% Cedar objects survived beyond 721kb of allocation

• 95% of Java objects are ‘short-lived’

Strong generational hypothesis• “The older an object is, the more likely it is to die?”

does not appear to hold

Page 26: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

26

Generational GCGenerational GC

Strategy: Segregate objects by age into generations

Collect different generations at different frequencies

Concentrate on the nursery generation

By concentrating on a small part of the heappause times can be reduced

overall effort can also be reduced

Page 27: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

27

ExampleExample

c

b

a

root set

Old generation New generation

S

c

b

a

root set

Old generation New generation

S

Update with pointer to a; request new object

c

b

a

R

root set

Old generation New generation

S

allocation request fails; perform minor collection

c

b

a

R

root set

Old generation New generation

S

further updates, allocation, etc...where are the roots for the new generation?

Page 28: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

28

Issues raised by generational garbage collection

Issues raised by generational garbage collection

Old-young pointers give rise to roots for the young generation: how can these roots be discovered?

Tenured garbage in older generations cannot be reclaimed by minor collections: how can the volume of older garbage be minimised?

When should surviving objects be promoted to the next generation?

How do we record ages?

How should generations be organised?

Page 29: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

29

Problem:Intergenerational pointers

Problem:Intergenerational pointers

We can collect the young generation on its own (minor collection)

Old-young pointers give rise to roots for the young generation

• such pointers are comparatively rare

• they arise from destructive pointer writes

• these assignments can be trapped with a write barrier

• implementations: remembered sets, card tables

Young-old pointers are common• but not a problem if we collect younger generation whenever we

collect an older one (major collection)

Page 30: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

30

Problem:Tenured garbage

Problem:Tenured garbage

Garbage in older generations cannot be reclaimed by minor collections. Tenured garbage also causes the retention of young objects it references (nepotism)

Pause time depends on size of the youngest generation• how large should this be?

• how early should object be promoted to next generation?

Too large?• pause time will be too long

Too small? • young objects do not have sufficient time to die

• older generations will fill too fast

Page 31: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

31

SolutionsSolutions

Multiple generations• allows youngest generation to be kept small • allows older data more time to die

Controlling promotion rate• how many collections before promotion?• GC when youngest generation is full or

when allocation threshold is reached?• adaptive techniques

Techniques for age recording• creation and aging spaces, buckets

Other organisations of heap regions• large object area• mature object space

Page 32: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

32

Train algorithmTrain algorithmThe ‘Train algorithm’ can collect a region incrementally

Divide this mature object space into cars• each car has a remembered set

• collect one car at a time

But data structures may span cars• so group cars into trains

• idea is to accumulate linked data structure into a single train

To collect a From-car• move objects referenced from outside MOS into a fresh train

• copy their descendents into this train

• copy objects referenced from other trains to those trains

Page 33: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

33

Y

C

P

XA

BY

C

P

X

A

B

reference to A is external socopy A and its descendents in car 1 to a new car

Y

C

A

B

P

X

move object P referenced from other trains to that train;move object X referenced from car in this train to the last car in this train.

ExampleExample

Page 34: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

34

Benefits and costsBenefits and costs

Benefitsonce a structure is held in a single train,

it can be reclaimed if there are no external references to it

volume of data copied is bounded at each collection: one car-full

objects are clustered and compacted as they are copied

Costscomplex

space required by remembered sets is large

Page 35: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

35

Generational GC:a summary

Generational GC:a summary

Highly successful for a range of applicationsreduces pause time to a level acceptable for interactive

applications

improves paging and cache behaviour

reduces the overall cost of garbage collection

requires a low survival rate, infrequent major collections, low overall cost of write barrier

But generational GC is not a universal panacea. It attempts to improve expected pause time at expense of the worst case

objects may not die sufficiently fast

applications may thrash the write barrier

too many old-young pointers, or very deep execution stacks, may increase pause times

Page 36: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

36

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 37: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

37

Incremental/concurrent garbage collection

Incremental/concurrent garbage collection

Incremental/concurrent garbage collection• runs collector in parallel with mutator

• attempts to bound pause time

• many soft real-time solutions

• but no general hard real-time solutions yet

Sequential GC can be made incremental by interleaving collection with allocation. Alternatively, run GC concurrently in a separate thread.

• at each allocation, do a small amount of GC work.

• tune the rate of collection to the rate of allocation to prevent mutator running out of memory before collection is complete

Page 38: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

38

A B

C

root

A B

C

root

A B

C

root

Asynchronous execution of mutator and collector introduces a coherency problem. For example,

SynchronisationSynchronisation

Update(right(B), right(A))right(A) = nil Update(right(A), right(B))right(B) = nil

Page 39: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

39

Tricolour abstractionTricolour abstraction

Black• object and its immediate descendants have been visited

• GC has finished with black objects and need not visit again.

Grey • object has been visited but its components may not have been

scanned.

• or, for an incremental/concurrent GC, the mutator has rearranged connectivity of the graph.

• in either case, the collector must visit them again.

White • object is unvisited and, at the end of the phase, garbage.

A collection terminates when no grey objects remain, i.e. all live objects have been blackened.

Page 40: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

40

Two ways to prevent disruption

Two ways to prevent disruption

There are two ways to prevent the mutator from interfering with a collection by writing white pointers into black objects.

1) Ensure the mutator never sees a white object• when mutator attempts to access a white object,

the object is visited by the collector

• protect white objects with a read-barrier

2) Record where mutator writes black-white pointers, so that the GC can (re)visit objects

• protect objects with a write-barrier

Page 41: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

41

Write-barrier methodsWrite-barrier methods

To falsely reclaim an object, two conditions must hold: a pointer to the white object is written into a black object

and furthermore, this must be the only reference to the white object the original reference to the white object is destroyed

If does not hold, there will be at least one path to each reachable white object that passes through a grey object.

If does not hold, the white object will still be reachable through the original reference.

Write barrier methods• incremental update methods catch changes to connectivity

• snapshot-at-the-beginning methods prevent the loss of the original reference

Page 42: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

42

IssuesIssues

Barriers lead to floating garbage.How conservative is an algorithm?

• how much garbage is left floating in the heap until the next collection cycle?

• what is the policy towards new (and often short-lived) objects? what colour are they allocated?

How expensive is the barrier?

How is initialisation achieved? How is termination achieved? In particular

• how are large root sets handled?

• how are threads synchronised?

Page 43: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

43

B C

A

Best known method was introduced by Dijkstra et al

Update (A,C) { *A = C shade(C)}

shade(P) { if white(P) colour(P) = grey}

The barrier

• traps attempts to install a pointer to a white object into a black object

• incrementally records changes to the shape of the graph

• prevents condition arising

• but no special action is required when a pointer is deleted

B C

A

Incremental update methodsIncremental update methods

Page 44: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

44

Snapshot at the beginningSnapshot at the beginning

Update(A, C) { shade(*A) *A = C}

The barrier

• remembers old references

• prevents condition arising

More conservative than IU

Simpler termination than IU

B C

A

B C

A

Page 45: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

45

Read barrier methodsRead barrier methods

Idea: don't let the mutator see white objects• so it cannot disrupt the collector

Best known is Baker's copying collector. During collection• each mutator read from Fromspace is trapped by read barrier

• writes are OK — they are not trapped

• objects are copied to Tospace, at B

• allocation is made at top of Tospace, at T

Tospace

TBscan

newallocation

copiedobjects

Page 46: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

46

Limitations of simple read-barrier methodsLimitations of simple read-barrier methods

Pointer reads are very common (15% of instructions?)

read barrier is expensive (30% overhead?)

pauses may be unpredictable and tightly clustered

work done by read barrier is unbounded

– depends on the size of the object

the root set must be scanned atomically at the flip

– pause depends on depth of execution stack

Page 47: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

47

Tospace

scan

T

B

Mutator

Tospace

scan

T

B

Mutator

VM methodsVM methods

Concurrency without a fine-grain barrier

Appel-Ellis-Li is based on Baker• uses OS memory protection

• pagewise black-only read barrier

• more conservative than Baker

Implementation• lock grey pages

• mutator access springs trap

– scan (blacken) grey page, copying its children

– unlock page

• collector also scavenges in the background

Page 48: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

48

Appel-Ellis-Li issuesAppel-Ellis-Li issues

Trap and scanner threads must be able to access protected Tospace pages

Page traps are expensive, so not a real-time algorithm

However, cost is application dependent. • trap sprung only once per page per collection:

• suppose every element of a large array was updated at each iteration

– only the first update would spring the OS trap

– but every update would pay for a software write barrier

Page 49: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

49

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 50: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

50

Conservative collectors for C and C++

Conservative collectors for C and C++

Conservative collectors try hard to find garbage but if in doubt, they must be conservative and declare questionable objects live.

Conservative collectors have little or no knowledge of• where roots are to be found

• stack frame layout

• which words are pointers and which are not

Further constraints:• values of words cannot be changed unless it is safe to do so

• compiler optimisations may compromise reachability invariants

• memory manager must be library-safe

Solutions• Conservative mark-sweep collectors

• ‘Mostly Copying’ collectors

Page 51: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

51

BDW collectorBDW collector

C interfaceTo make a C program a garbage collected program, just add: #define malloc(sz) GC_malloc(sz) #define realloc(p,sz) GC_realloc(p,sz) #define free(p)

C++ interfaceObjects derived from class gc are collectable. class A : public gc {...};

A* a = new A; // a is collectable.

Objects allocated with ::operator new are uncollectableBoth uncollectable and collectable objects can be explicitly deleted• delete invokes an object's destructors and frees its storage

immediately.

Page 52: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

52

BDW Execution timeBDW Execution time

0.00

0.20

0.40

0.60

0.80

1.00

1.20

1.40

1.60

1.80

Re

lati

ve

to

tal e

xe

cu

tio

n t

ime

(U

ltri

x=

1)

Ultrix=1

BDW2.6

Gnu'

G++

QF

Page 53: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

53

BDW Maximum memory usageBDW Maximum memory usage

Note: with Ultrix allocator, gawk and cfrac used only 79 and 64kb respectively

0.00

0.50

1.00

1.50

2.00

2.50

3.00

Re

lati

ve

Ma

xim

um

He

ap

Siz

e (

Ult

rix

=1

)

Ultrix=1

BDW2.6

Gnu'

G++

QF

4.43 7.87

Page 54: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

54

Conservative GC vs. explicit deallocation

Conservative GC vs. explicit deallocation

BDW is comparable with explicit deallocation• mean execution time overhead 19% above the best

• performs best with programs that allocate small objects

But BDW has substantial space overhead • mean overhead is 58% (excluding programs that allocate little)

• fixed overheads account for large space overheads for these programs

Some caveats about these figures• No attempts were made optimise for GC

• Programming style was ignored

At best, these surveys provide an upper bound on the cost of this conservative collector.

Page 55: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

55

Pointer finding algorithmPointer finding algorithm

No cooperation from the compiler• no knowledge of heap or stack layout

• objects do not have headers (that the collector can use)

Does a pointer p refer to the heap?• compare with highest and lowest plausible addresses

Has that heap block been allocated?• obtain from p through a two-level search tree

Is the offset of the object from the start of the block a multiple of the object size for that block?

• use an object_map for each object size

• is the entry in this map for (this block, this object) valid?

Page 56: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

56

Pointer misidentificationPointer misidentification

Problem: wrongly identifying a bit-pattern as a pointer would cause a leak

• most (small) integers cannot be valid heap addresses

• classes of data such as compressed bitmaps: use GC_malloc_atomic to avoid scanning them

• uninitialised stack frame slots: clear a few stack frames before GC

Blacklisting• if reference points into the heap but fails validity tests

• black list that address — do not use it for allocation

• call the GC before any allocation to detect false references from static data!

In practice• leaks are relatively uncommon

• defensive programming techniques prevent common scenarios

Page 57: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

57

Conservative GC & optimising compilersConservative GC &

optimising compilersProblem: garbage collectors assume liveness = pointer reachability but programming practices may disguise pointers, e.g.

• arithmetic on pointers

• reversing pointers with exclusive-or operator.

Valid compiler optimisations may render objects temporarily invisible. For example, the following optimisation destroys even interior pointers

sum = 0;for(i=0; i<SIZE; i++) sum += x[i]+y[i]

sum = 0;diff = y - x;xend = x + SIZE;for(; x<xend; x++) sum += *x + *(x+diff);x -= SIZE;y = x +diff

Page 58: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

58

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 59: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

59

Problems of a distributed world

Problems of a distributed world

Concurrency everywhere• must avoid dead-locks, live-locks

Communication is costly• changing the reference count of a remote object may cost

10,000 times as much as changing the count of a local object

Not easy to get complete knowledge of object graph• synchronisation is expensive

Faults everywhere• communications, processes

Page 60: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

60

Ideal distributed collectorIdeal distributed collector

Safe

Complete• reclaim all garbage including cycles

Concurrent• mutator-collector and collector-collector

Efficient • garbage should be reclaimed promptly

Expedient• make progress despite unavailability of parts of system

Scalable• to networks of many processes

Fault tolerant• against message delay, duplication or loss, and process failure

Page 61: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

61

Distributed GC SolutionsDistributed GC Solutions

Two (or more) levels of GC• run a collector local to each machine

• run a distributed collector

Distributed Reference Counting• reference counting is a scalable solution

• reference listing also an alternative

• race conditions must not cause premature reclamation

• cannot reclaim garbage cycles

Leases• objects are reclaimed when all remote leases on them have

expired

Page 62: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

62

Distributed tracingDistributed tracing

Comprehensive collectors require tracing• but this requires global synchronisation: must conservatively

identify all live objects

Tracing with timestamps• propagate timestamps to 'live' objects

• weaker synchronisation: any object with timestamp less than global threshold is garbage

Weaken 'comprehensiveness'• tracing within groups

• direct tracing: trace garbage rather than live objects

– hybrid reference counting / partial tracing algorithms

– back tracing

Page 63: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

63

Your choice!Your choice!

Basic algorithms

Improving mark-sweep GC

Generational GC

Incremental GC

Conservative GC

Distributed GC

Page 64: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

64

In conclusionIn conclusion

Garbage collection is a relatively mature technology.

But hard problems remain.

Commercial deployment of collector technology is still at an early stage. There are few players, and they use a small set of solutions.

There are no magic solutions to all problems: know your application!

Resources• Garbage Collection page www.cs.ukc.ac.uk/people/staff/rej/gc.html

• this talk www.cs.ukc.ac.uk/people/staff/rej/gc/bcs.ppt

Page 65: © Richard Jones, 1999BCS Advanced Programming SG: Garbage Collection 14 October 1999 1 Garbage Collection Richard Jones Computing Laboratory University

© Richard Jones, 1999 BCS Advanced Programming SG: Garbage Collection14 October 1999

65