Message Analysis-Guided Allocation and Low-Pause Incremental Garbage Collection in a Concurrent Language Konstantinos Sagonas Jesper Wilhelmsson Uppsala University, Sweden

Message Analysis-Guided Allocation and Low-Pause Incremental Garbage Collection in a Concurrent Language

Konstantinos Sagonas
Jesper Wilhelmsson

Uppsala University, Sweden

Goals of this work

Efficiently implement concurrency through asynchronous message-passing

Memory management with real-time characteristics
o Short stop-times
o High mutator utilization

Design for multithreading

Our context: Erlang

Designed for highly concurrent applications

Soft Real-Time

Light-weight processes

No destructive updates

Data types: atoms, numbers, PIDs, tuples, cons cells (lists), binaries

[Label on slide: heap data]

Our context: the Erlang/OTP system

Industrial-strength implementation

Used in embedded applications

Three memory architectures: [ISMM’02]

o Private

o Shared

o Hybrid

Private heaps

[Figure: two processes (P), each with its own stack and heap]

Message passing: O(|message|) copy

Garbage collection is a private business

Fast memory reclamation of terminated processes

Shared heap

[Figure: two processes (P) and one shared heap]

Message passing: O(1)

Global synchronization

Longer stop-times

No fast reclamation of process-local data

Hybrid architecture

[Figure: two processes (P) with process-local heaps, a message area, and a big objects area]

Allocating messages in the message area

Several possible methods
o User annotations
o Dynamic monitoring [Petrank et al ISMM’02]
o Static analysis guided allocation

Static message analysis [SAS’03]

Similar to escape analysis

Allocation is process-local by default
o Possible messages allocated on message area
o Copy on demand

Analysis is quite precise
o Typically finds 99% of all messages
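
The allocation policy on this slide can be sketched roughly as follows. This is a minimal illustration, not the Erlang/OTP implementation: `Term`, `Process`, and `send` are hypothetical names, and a Python list stands in for heap data. Data that the analysis flagged as a possible message is already in the message area and is passed by reference; anything the analysis missed is copied on demand at send time.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    payload: list
    in_message_area: bool  # decided at allocation time, guided by the analysis


@dataclass
class Process:
    mailbox: list = field(default_factory=list)


def send(receiver: Process, term: Term) -> Term:
    """Deliver a term, copying it only if it is not yet in the message area."""
    if not term.in_message_area:
        # Copy on demand: cost proportional to the size of the message.
        term = Term(list(term.payload), in_message_area=True)
    # Otherwise only a reference is passed: constant cost.
    receiver.mailbox.append(term)
    return term
```

With a precise analysis, the copying branch should be taken only for the small share of messages that the analysis fails to identify.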

Garbage Collection in Hybrid Architecture

Process-local heaps
o Private business: No synchronization required

Message area
o Two generations
o Copying collector in young generation
  o Fast allocation
o Mark-and-sweep in old generation
  o Prevents repeated copying of old objects

GC of the message area is a bottleneck

The root-set for the message area consists of all stacks and process-local heaps
1. Generational process scanning
2. Remembered set in local heaps

This is not enough... We need an incremental collector in the Message Area!

Properties of incremental collector

No overhead on mutator

No space overhead on heap objects

Short stop-times

High mutator utilization

Organization of the Message Area

Young generation: nursery and from-space
o Nursery and from-space always have a constant size (= 100k words)

Fwd: storage area for forwarding pointers
o Size bound by (currently = )

Old generation
o List of arbitrary sized areas
o Free-list, first-fit allocation

Black-map: bit-array used to mark objects in mark-and-sweep
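
As a rough sketch of the old-generation bookkeeping named on this slide, the following shows first-fit allocation from a free list and a side bit-array for mark bits. The class and method names are hypothetical, the old generation is modelled as a single area addressed in words, and one mark bit per word is an assumption of this sketch. Keeping mark bits in a separate bitmap is one way to mark objects without adding space to the objects themselves.

```python
class OldGeneration:
    def __init__(self, size_words: int):
        # Free list of (start, length) blocks, initially one big block.
        self.free_list = [(0, size_words)]
        # Black-map: one bit per word, packed eight to a byte.
        self.black_map = bytearray((size_words + 7) // 8)

    def allocate(self, need: int):
        """First fit: use the first free block that is large enough."""
        for i, (start, length) in enumerate(self.free_list):
            if length >= need:
                if length == need:
                    del self.free_list[i]
                else:
                    self.free_list[i] = (start + need, length - need)
                return start
        return None  # no block fits

    def mark(self, addr: int) -> None:
        self.black_map[addr // 8] |= 1 << (addr % 8)

    def is_marked(self, addr: int) -> bool:
        return bool(self.black_map[addr // 8] & (1 << (addr % 8)))
```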

Organization of the Message Area

[Figure: the nursery, with pointers Ntop, allocation limit, and Nlimit]

Incremental collector

Two approaches to choose from:

Work-based

Reclaim n live words each step

Time-based

A step takes no more than t ms

n and t are user-specified

Work-based collection

The mutator wants to allocate need words

reclaim = max(n, need)

Allocation limit = Ntop + reclaim

[Figure: the nursery, with pointers Ntop, allocation limit, and Nlimit]
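
The work-based rule can be sketched as a bump allocator whose allocation limit is advanced by each collector step. This is an illustrative model under stated assumptions: the `Nursery` class is hypothetical, `n_top` and `n_limit` mirror Ntop and Nlimit from the slide, the collector's actual work is elided, and clamping the limit to `n_limit` and raising `MemoryError` on exhaustion are simplifications of this sketch.

```python
class Nursery:
    def __init__(self, size_words: int, n: int):
        self.n_top = 0                # next free word
        self.n_limit = size_words     # end of the nursery
        self.n = n                    # user-specified work per step
        self.allocation_limit = 0     # mutator may allocate up to here
        self.gc_steps = 0

    def collect_step(self, need: int) -> None:
        reclaim = max(self.n, need)
        # A real collector would process `reclaim` live words here.
        self.gc_steps += 1
        self.allocation_limit = min(self.n_top + reclaim, self.n_limit)

    def allocate(self, need: int) -> int:
        if self.n_top + need > self.allocation_limit:
            self.collect_step(need)
        if self.n_top + need > self.allocation_limit:
            raise MemoryError("nursery exhausted")  # placeholder behaviour
        addr = self.n_top
        self.n_top += need
        return addr
```

In this model small allocations proceed without any collector work until the limit is reached, and a request larger than n forces a correspondingly larger step.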

Time-based collection

1. User annotations (as in Metronome)

2. Dynamic worst-case calculation

How much can the mutator allocate?

How much live data is there?

Time-based collection

ΔGC = reclaimed after GC – reclaimed before GC

GCsteps = (live data – reclaimed after GC) / ΔGC

wM = Nfree / GCsteps

Allocation limit = Ntop + wM

[Figure: the nursery, with pointers Ntop, allocation limit, and Nlimit]
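
A worked sketch of the time-based calculation follows. It takes the minuend in GCsteps to be an estimate of the live data still to be processed, which is an assumption of this sketch (named `live_estimate`), suggested by the question "How much live data is there?" on the preceding slide. The function name and the sample numbers are hypothetical.

```python
def mutator_allowance(n_free, live_estimate, reclaimed_before, reclaimed_after):
    """Words the mutator may allocate before the next collector step (wM)."""
    delta_gc = reclaimed_after - reclaimed_before   # progress of the last step
    if delta_gc <= 0:
        raise ValueError("the last step made no progress")
    remaining = live_estimate - reclaimed_after
    if remaining <= 0:
        return n_free                               # nothing left to process
    gc_steps = remaining / delta_gc                 # estimated steps still needed
    return int(n_free / gc_steps)


# Hypothetical numbers: 60,000 free words, 10,000 words of live data,
# and a last step that advanced from 1,000 to 2,000 words.
w_m = mutator_allowance(60_000, 10_000, 1_000, 2_000)
allocation_limit_offset = w_m   # added to Ntop to get the allocation limit
```

Here the last step handled 1,000 words, 8,000 remain, so 8 more steps are expected and the free space is divided evenly among them.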

Collecting the Message Area

[Figure sequence, pages 21 to 37: processes P1, P2, P3 and the process queue; message area with Fwd, from-space, nursery, and the allocation limit]

Cheap write barrier
o Link receiver to a list in the send operation
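
The write barrier mentioned in this sequence ("link receiver to a list in the send operation") might look roughly like the sketch below. The slides give only that one-line description, so the queue discipline, the `queued` flag, and all names here are assumptions made for illustration.

```python
from collections import deque


class Process:
    def __init__(self, name):
        self.name = name
        self.mailbox = []
        self.queued = False  # already linked into the collector's list?


class MessageAreaCollector:
    def __init__(self):
        self.cycle_active = False
        self.process_queue = deque()  # processes whose roots still need scanning

    def _link(self, proc):
        if not proc.queued:
            proc.queued = True
            self.process_queue.append(proc)

    def start_cycle(self, processes):
        self.cycle_active = True
        for proc in processes:
            self._link(proc)

    def next_to_scan(self):
        proc = self.process_queue.popleft()
        proc.queued = False
        return proc

    def send(self, receiver, message):
        receiver.mailbox.append(message)
        if self.cycle_active:
            # Write barrier: a flag test and, at most, one list link.
            self._link(receiver)
```

In this model a process that was already scanned and then receives a message is simply put back on the queue, so the barrier costs the sender a constant amount of work.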

Performance evaluation: Settings

Intel Xeon 2.4 GHz, 1GB RAM, Linux

Start with small process-local heaps (233 words, grow when needed)

Measure active CPU time
o using hardware performance monitors

Performance evaluation: Benchmarks

Mnesia – Distributed database system
o 1,109 processes, 2,892,855 messages

Yaws – HTTP Web server
o 420 processes, 2,275,467 messages

Adhoc – Data mining application
o 137 processes, 246,021 messages

Stop-times – Time-based (t = 1 ms)

[Figure: stop-time plots for Mnesia and Yaws]

Stop-times – Work-based (n = 2 words)

[Figure: stop-time plots for Adhoc and Yaws]
o Adhoc: Mean: 3, Geo. Mean: 2
o Yaws: Mean: 9, Geo. Mean: 1

Stop-times – Work-based (n = 100 words)

[Figure: stop-time plots for Adhoc and Yaws; x-axis: Time (s)]
o Adhoc: Mean: 53, Geo. Mean: 46
o Yaws: Mean: 268, Geo. Mean: 36

Message area total GC times: incremental vs. non-incremental

Benchmark   n = 2   n = 100   n = 1000   Non-Inc.
            MA GC   MA GC     MA GC      MA GC
Mnesia      182     164       156        88
Yaws        373     374       242        153
Adhoc       244     203       78         27

Times in ms
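
To make the comparison concrete, the short script below computes how many times longer the incremental message area (MA) collector takes than the non-incremental one, using only the figures on this slide. The dictionary layout and the `slowdown` helper are illustrative names, not part of the original work.

```python
# Message area GC times in ms, copied from the slide.
ma_gc_ms = {
    "Mnesia": {"n=2": 182, "n=100": 164, "n=1000": 156, "non-inc": 88},
    "Yaws":   {"n=2": 373, "n=100": 374, "n=1000": 242, "non-inc": 153},
    "Adhoc":  {"n=2": 244, "n=100": 203, "n=1000": 78,  "non-inc": 27},
}


def slowdown(benchmark: str, setting: str) -> float:
    """Incremental MA GC time divided by the non-incremental time."""
    row = ma_gc_ms[benchmark]
    return row[setting] / row["non-inc"]


for name in ma_gc_ms:
    factors = ", ".join(
        f"{s}: {slowdown(name, s):.2f}x" for s in ("n=2", "n=100", "n=1000")
    )
    print(f"{name}: {factors}")
```

The ratios shrink as n grows, which matches the trend in the table: larger steps mean fewer interruptions and less total collector time.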

Runtimes – Incremental

Benchmark   Mutator   Local GC   MA        MA        MA
                                 n = 2     n = 100   n = 1000
Mnesia      52,906    4,439      182       164       156
Yaws        237,629   11,728     373       374       242
Adhoc       61,045    8,194      244       203       78

Times in ms

Minimum Mutator Utilization

The fraction of time that the mutator executes in any time window [Cheng & Blelloch PLDI 2001]
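
A small sketch of how minimum mutator utilization could be computed from a log of collector pauses: for a given window size, it finds the window containing the most pause time and reports the fraction left to the mutator. The function names and the input format (non-overlapping `(start, end)` pause intervals within a run of length `total`) are assumptions of this sketch. It relies on the observation that a worst-case window can always be chosen to start at a pause start or end at a pause end.

```python
def pause_in_window(pauses, lo, hi):
    """Total pause time that falls inside the window [lo, hi]."""
    return sum(max(0.0, min(end, hi) - max(start, lo)) for start, end in pauses)


def mmu(pauses, total, window):
    """Minimum mutator utilization for one window size."""
    if not 0 < window <= total:
        raise ValueError("window must be positive and no longer than the run")
    # Candidate window starts: run boundaries, pause starts, and
    # positions where the window ends exactly at a pause end.
    candidates = {0.0, total - window}
    for start, end in pauses:
        candidates.add(start)
        candidates.add(end - window)
    worst = 0.0
    for lo in candidates:
        lo = min(max(lo, 0.0), total - window)  # keep the window inside the run
        worst = max(worst, pause_in_window(pauses, lo, lo + window))
    return (window - worst) / window
```

For example, with pauses at 10 to 12 and 13 to 14 in a run of length 100, a window of length 5 can contain 3 units of pause, giving a utilization of 0.4 for that window size.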

Mutator Utilization – Work-based (n = 100 words)

[Figure: mutator utilization plots for Adhoc and Yaws]

Concluding Remarks

Memory allocator is guided by the intended use of data

Incremental Garbage Collector
o High mutator utilization
o Small overhead on total runtime
o No mutator overhead
o Small space overhead

Really short stop-times!

Runtimes: incremental vs. non-incremental

Benchmark   Inc.      Non-Inc.
            Mutator   Mutator
Mnesia      52,906    53,276
Yaws        237,629   240,985
Adhoc       61,045    61,578

Times in ms

Total GC times: incremental vs. non-incremental

Benchmark   Inc.       Non-Inc.
            Local GC   Local GC
Mnesia      4,439      4,487
Yaws        11,728     11,359
Adhoc       8,194      7,848

Times in ms
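
As a rough cross-check of the "small overhead on total runtime" remark, the script below adds up the mutator, local GC, and message area GC times reported on the slides for each configuration. Treating the total as the plain sum of these three columns, and using the n = 2 figures for the incremental collector, are assumptions of this sketch; the helper names are illustrative.

```python
# (mutator, local GC, message area GC) in ms, copied from the slides.
# Incremental message area GC figures are the n = 2 column.
times_ms = {
    "Mnesia": {"inc": (52_906, 4_439, 182), "non_inc": (53_276, 4_487, 88)},
    "Yaws":   {"inc": (237_629, 11_728, 373), "non_inc": (240_985, 11_359, 153)},
    "Adhoc":  {"inc": (61_045, 8_194, 244), "non_inc": (61_578, 7_848, 27)},
}


def total(benchmark: str, variant: str) -> int:
    return sum(times_ms[benchmark][variant])


def relative_overhead(benchmark: str) -> float:
    """Incremental total relative to non-incremental total, minus one."""
    return total(benchmark, "inc") / total(benchmark, "non_inc") - 1


for name in times_ms:
    print(f"{name}: {total(name, 'inc')} ms vs {total(name, 'non_inc')} ms "
          f"({relative_overhead(name):+.2%})")
```

Under these assumptions the summed times differ by about one percent or less in each benchmark.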

Mutator Utilization – Time-based (t = 1 ms)

[Figure: mutator utilization plots for Mnesia and Yaws]