Reducing Pause Time of Conservative Collectors Toshio Endo (National Institute of Informatics) Kenjiro Taura (Univ. of Tokyo)

Page 1: Reducing Pause Time of Conservative Collectors

Reducing Pause Time of Conservative Collectors

Toshio Endo (National Institute of Informatics)

Kenjiro Taura (Univ. of Tokyo)

Page 2: Reducing Pause Time of Conservative Collectors

Incremental GC for soft-realtime applications [Steele 75] [Yuasa 90] [Doligez 93]

Target: multimedia, games, etc.
– Pauses should be <10ms

Collection tasks are divided into small pieces

Success: pauses of <5ms [Cheng 01]
– They assume compiler cooperation

Pause reduction for ‘conservative’ GCs is still insufficient

Page 3: Reducing Pause Time of Conservative Collectors

Conservative GC [Boehm et al. 88]

Mark-sweep GC for C/C++ programs

No compiler cooperation (e.g., write barriers)

Mostly parallel GC [Boehm et al. 91]
– Incremental, conservative
– Pauses >100ms are fairly common

Page 4: Reducing Pause Time of Conservative Collectors

Write barriers in conservative GCs

No fine-grained write barrier inserted by the compiler

Use the VM’s (virtual memory) write protection instead

Coarse-grained
– Page level
– Detects only the first update after protection

These properties restrict the collector design
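As a rough illustration of why a page-level barrier is coarse, the following toy model simulates write protection in plain Python. It is only a sketch: `PageWriteBarrier`, `PAGE_SIZE` and the addresses are hypothetical names and values chosen for the example, and no real memory protection is involved.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (illustrative only)


class PageWriteBarrier:
    """Toy model of a page-level write barrier built on write protection."""

    def __init__(self):
        self.protected = set()  # page numbers currently write-protected
        self.dirty = set()      # page numbers updated since protection
        self.traps = 0          # how many writes actually faulted

    def protect(self, pages):
        self.protected |= set(pages)

    def write(self, addr):
        page = addr // PAGE_SIZE
        if page in self.protected:
            # First write after protection: the "trap handler" runs,
            # records the page as dirty, and unprotects it.
            self.traps += 1
            self.dirty.add(page)
            self.protected.discard(page)
        # Later writes to the same page do not fault and go unnoticed.


barrier = PageWriteBarrier()
barrier.protect(range(4))
barrier.write(100)    # page 0: trapped
barrier.write(200)    # page 0 again: not trapped
barrier.write(5000)   # page 1: trapped
print(sorted(barrier.dirty), barrier.traps)
```

The collector learns only which pages were touched, not which objects or how often, which is the limitation the rest of the talk works around.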

Page 5: Reducing Pause Time of Conservative Collectors

Incremental mark sweep algorithms

Snapshot-at-beginning & DLG [Yuasa 90] [Doligez 93]
– Make a (conceptual) heap snapshot before marking
– Promises short pauses
– Large space overhead with a VM write barrier

Incremental update [Steele 75] [Dijkstra 78]
– Maintain consistency after marking
– Needs a final marking phase before finishing, which can be unboundedly long!
– The only choice with a VM write barrier

Page 6: Reducing Pause Time of Conservative Collectors

Contributions

Analyze why previous algorithms fail

Propose techniques to bound pauses & guarantee progress

Show a ‘stress-test’ benchmark: iukiller

Demonstrate experimental results
– <5ms in applications
– <12ms in the stress-test benchmark (constant across all heap sizes)

(This talk omits parallel issues)

Page 7: Reducing Pause Time of Conservative Collectors

Overview of presentation

Mostly parallel GC
Techniques to reduce pause time
Experimental results
Related work
Summary

Page 8: Reducing Pause Time of Conservative Collectors

Mostly parallel garbage collector (1)

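1. Start GC
2. Write-protect the heap
3. Incremental marking, interleaved with the user program
   – On a write fault, the trap handler remembers the address of the dirty (= updated) page and unprotects it
4. Final marking
5. Incremental sweep, interleaved with the user program
6. End GC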

Page 9: Reducing Pause Time of Conservative Collectors

Mostly parallel garbage collector (2)

The second update to a page is not trapped
– r must be marked in the final phase, so final marking is needed

[Figure: heap snapshots showing a writer and objects p, q and r]

Page 10: Reducing Pause Time of Conservative Collectors

Final marking

[Figure: heap and root]

1. Scan all dirty pages + roots

2. Mark all unmarked objects reachable from the scanned region

The amount of work is unbounded
– # of dirty pages
– Objects reachable from a dirty page

This makes pauses >100ms
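The two steps above can be sketched as follows. This is a simplified model, not the collector's actual code: the heap is a Python dictionary, and `final_marking`, `page_objects` and `children` are hypothetical names. It assumes that scanning a dirty page means re-examining the already marked objects on it.

```python
def final_marking(roots, dirty_pages, page_objects, children, marked):
    """Scan roots and dirty pages, then mark everything reachable.

    Returns the number of newly marked objects, a rough proxy for pause length.
    """
    # Step 1: scan all dirty pages + roots. Marked objects on a dirty page
    # may have gained new pointers, so their children are pushed again.
    stack = list(roots)
    for page in dirty_pages:
        for obj in page_objects[page]:
            if obj in marked:
                stack.extend(children.get(obj, ()))
    # Step 2: mark all unmarked objects reachable from the scanned region.
    work = 0
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        work += 1
        stack.extend(children.get(obj, ()))
    return work


# p was marked earlier; a later write hung a long unmarked chain off it.
n = 1000
children = {"p": ["c0"]}
for i in range(n - 1):
    children[f"c{i}"] = [f"c{i + 1}"]
marked = {"p"}
work = final_marking(roots=[], dirty_pages=[0], page_objects={0: ["p"]},
                     children=children, marked=marked)
print(work)
```

A single dirty page is enough to force work proportional to the size of the newly reachable structure, which is why the pause has no bound in the basic collector.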

Page 11: Reducing Pause Time of Conservative Collectors

Overview of presentation

Mostly parallel garbage collector
Techniques to reduce pause time
Experimental results
Related work
Summary

Page 12: Reducing Pause Time of Conservative Collectors

Goal of our collector

Bound pause time (< constant)
– Mutator utilization is important, but we focus on pauses

Guarantee progress of collection

Combine two techniques:
– Bound dirty pages (BD)
– Retry incremental marking (RI)

Page 13: Reducing Pause Time of Conservative Collectors

Bounding dirty pages (1)

Basic collector produces many dirty pages

Keep # of dirty pages < a given limit
– If the count exceeds the limit, choose a dirty page
– Re-protect, scan, and clean it
– Good: reduces the work in final marking
– Bad: more protection cost
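A minimal sketch of the bounding policy, under stated assumptions: the slide only says to "choose a dirty page", so picking the oldest one is an arbitrary choice made here, and `BoundedDirtyPages` and the `scan_page` callback are hypothetical names.

```python
from collections import deque


class BoundedDirtyPages:
    """Toy model: keep the number of dirty pages below a limit."""

    def __init__(self, limit, scan_page):
        self.limit = limit          # dirty-page count must stay < limit
        self.scan_page = scan_page  # hypothetical callback that scans a page
        self.dirty = deque()        # dirty pages, oldest first
        self.reprotections = 0      # extra protection cost paid for the bound

    def on_write_fault(self, page):
        """Called on the first write to a protected page."""
        self.dirty.append(page)
        while len(self.dirty) >= self.limit:
            # Choose a dirty page (the oldest one: an assumption).
            victim = self.dirty.popleft()
            # Re-protect it so later writes fault again, then scan it;
            # after that the page counts as clean.
            self.reprotections += 1
            self.scan_page(victim)


scanned = []
bd = BoundedDirtyPages(limit=16, scan_page=scanned.append)
for page in range(100):
    bd.on_write_fault(page)
print(len(bd.dirty), bd.reprotections)
```

The counter makes the trade-off visible: final marking never sees more than `limit - 1` dirty pages, but every evicted page costs one more protection operation during incremental marking.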

Page 14: Reducing Pause Time of Conservative Collectors

Bounding dirty pages (2)

Is the pause now bounded?

… No! The unmarked objects reachable from a dirty page are not bounded

[Figure: heap and root]

Page 15: Reducing Pause Time of Conservative Collectors

Retrying incremental marking (1)

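Keep the work of final marking < a given limit

1. Start GC
2. Write-protect the heap
3. Incremental marking, interleaved with the user program (the trap handler records writes)
4. Final marking. Finished before the limit?
   – Yes: continue with the incremental sweep
   – No: retry! Go back to incremental marking
5. Incremental sweep, interleaved with the user program
6. End GC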
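The retry loop can be sketched as below. This is a simplified model that leaves out mutator writes and dirty pages, so it shows only how the work limit caps each pause, not the starvation risk. `mark_some`, `mark_with_retries`, `final_limit` and `incr_budget` are hypothetical names.

```python
def mark_some(stack, children, marked, budget):
    """Mark at most `budget` objects; return True if the stack was emptied."""
    work = 0
    while stack:
        if work >= budget:
            return False  # out of budget: stop here, leave the rest on the stack
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            work += 1
            stack.extend(children.get(obj, ()))
    return True


def mark_with_retries(roots, children, final_limit, incr_budget):
    """Alternate incremental marking and bounded final marking until done."""
    marked, stack = set(), list(roots)
    pauses = []  # objects marked inside each final-marking pause
    while True:
        # Incremental marking: runs interleaved with the user program
        # (modeled here as one budgeted chunk per round).
        mark_some(stack, children, marked, incr_budget)
        # Final marking with a work limit.
        before = len(marked)
        finished = mark_some(stack, children, marked, final_limit)
        pauses.append(len(marked) - before)
        if finished:
            return marked, pauses
        # Not finished before the limit: retry incremental marking.


# A chain of 1000 objects reachable from one root.
children = {f"c{i}": [f"c{i + 1}"] for i in range(999)}
marked, pauses = mark_with_retries(["c0"], children,
                                   final_limit=10, incr_budget=100)
print(len(marked), max(pauses), len(pauses))
```

No single final marking does more than `final_limit` units of work, at the price of extra rounds.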

Page 16: Reducing Pause Time of Conservative Collectors

Retrying incremental marking (2)

Good: bounds the length of a single final marking

Bad: risk of starvation (no progress)
– Final marking may abort before it finishes scanning the (unbounded) dirty pages
– Unmarked objects may ‘escape’ from the collector

Page 17: Reducing Pause Time of Conservative Collectors

The worst case

A final marking may abort with no progress

[Figure: timeline in which incremental marking finishes, a write occurs, and final marking aborts, twice over]

Page 18: Reducing Pause Time of Conservative Collectors

Ensuring bounded pause and progress

Either technique alone is insufficient… Both are needed:
– Bounding dirty pages (BD)
– Retrying incremental marking (RI)

With BD, every final marking can scan all dirty pages, so it finds some unmarked objects, if any

Page 19: Reducing Pause Time of Conservative Collectors

Overview of presentation

Mostly parallel garbage collector
Techniques to reduce pause time
Experimental results
Related work
Summary

Page 20: Reducing Pause Time of Conservative Collectors

Experimental Environments

400MHz UltraSPARC, Solaris 8

Four GCs
– Stop: stop-the-world GC
– Basic: basic incremental GC
– BD: uses bounding dirty pages
– BD+R: uses bounding dirty pages + retrying incremental marking

Basic/BD/BD+R: GC starts when heap usage > 75%

BD/BD+R: # of dirty pages < 16

Page 21: Reducing Pause Time of Conservative Collectors

The iukiller synthetic benchmark

‘Stress-test’ benchmark for mostly parallel GC
– Trees tend to escape from the collector
– Final marking tends to be long

[Figure: two roots and large binary trees; the step is repeated]

Page 22: Reducing Pause Time of Conservative Collectors

Results of iukiller benchmark: the maximum pause time

Previous collectors fail
– > 1.8 seconds
– The larger the heap, the longer the pause

BD+R achieves <12ms pause
– Independent of heap size

heap size (MB)   live data (MB)   GC kind   max. pause time (ms)
100              64               Stop      4122
100              64               Basic     2085
100              64               BD        1802
100              64               BD+R      11.7
200              128              Stop      8607
200              128              Basic     4071
200              128              BD        3753
200              128              BD+R      11.7
400              256              Stop      17039
400              256              Basic     8205
400              256              BD        7166
400              256              BD+R      11.2
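To make the scaling claim concrete, the short script below compares each collector's maximum pause at the 400MB heap with its pause at the 100MB heap. The numbers are transcribed from the table above; `growth_factor` is a hypothetical helper written for this example.

```python
# Maximum pause times (ms) from the iukiller results, keyed by heap size (MB).
max_pause_ms = {
    100: {"Stop": 4122, "Basic": 2085, "BD": 1802, "BD+R": 11.7},
    200: {"Stop": 8607, "Basic": 4071, "BD": 3753, "BD+R": 11.7},
    400: {"Stop": 17039, "Basic": 8205, "BD": 7166, "BD+R": 11.2},
}


def growth_factor(kind, small=100, large=400):
    """Ratio of the max pause at the large heap to that at the small heap."""
    return max_pause_ms[large][kind] / max_pause_ms[small][kind]


for kind in ("Stop", "Basic", "BD", "BD+R"):
    factor = growth_factor(kind)
    print(f"{kind:6s} max pause changes by x{factor:.2f} for a 4x larger heap")
```

For Stop, Basic and BD the factor is close to 4, so their pauses grow roughly in proportion to the heap, while BD+R stays essentially flat.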

Page 23: Reducing Pause Time of Conservative Collectors

Application benchmarks

Programs written in C/C++
– deltablue: an incremental constraint solver (25MB)
– espresso: a logic optimizer for PLA (10MB)
– N-Body: an N-Body solver with Barnes-Hut (15MB)
– CKY: a context-free grammar parser (40MB)
– Cube: a Rubik’s cube puzzle solver (8MB)

Page 24: Reducing Pause Time of Conservative Collectors

Results of application benchmarks: the maximum pause time

BD+R achieves <5ms pause in five applications

BD is also OK (< 16ms)

[Figure: bar charts of max. pause (msec) for CKY, deltablue, espresso, N-Body and Cube; two off-scale bars are labeled 215ms and 283ms]

Page 25: Reducing Pause Time of Conservative Collectors

Results of application benchmarks: overhead

BD/BD+R is <9% slower than Basic
– More protection

All incr. GCs are 1 to 53% slower than Stop
– VM write barrier
– Floating garbage
– More GC cycles

[Figure: bar charts of total execution time (‘Stop’=1) for deltablue, espresso, N-Body, CKY and Cube]

Page 26: Reducing Pause Time of Conservative Collectors

Related work

[Appel et al. 88]
– Copying GC with a VM read barrier. Slower than a write barrier

[Furuso et al. 91]
– Snapshot-at-beginning on VM. Large space overhead

Recent version of [Boehm et al. 91]
– Time limit on final marking. Risk of starvation

[Printezis et al. 00] [Ossia et al. 02]
– Keep # of dirty cards small. Final marking is still unbounded

Page 27: Reducing Pause Time of Conservative Collectors

Summary

An incremental conservative GC
– Short pauses (<5ms in 5 applications)
– Guaranteed GC progress

Uses both techniques:
– Bounding dirty pages
– Retrying incremental marking

Page 28: Reducing Pause Time of Conservative Collectors

Future direction

Reducing the overhead of BD
– Strategy for choosing a proper limit on dirty pages

Bounding the roots to be scanned
– Protect stacks partially