ceg860 (Prasad) L9GC 1

Memory Management

Garbage Collection

Modes of Object Management

• Static
  * An entity may become attached to at most one run-time object during the entire execution.
  • E.g., FORTRAN variables.
  + Simple and efficient.
  – Precludes recursion.
  – Precludes dynamic data structures.
  – Note: FORTRAN 90 supports recursion and pointers.

• Stack-based
  * An entity may at run time become attached to several objects in succession, and these objects are allocated and deallocated in last-in, first-out order.
  • E.g., Pascal, C/C++, Java, etc.
  + Well matched with block structuring.
  + Allocation and deallocation are automatic.
  + Supports recursion.
  + Supports Ada “unconstrained” array types, etc.
  – Does not support flexible data structures.

• Heap-based
  * Objects are created dynamically through explicit requests.
  • E.g., C++, LISP, Java/C#, etc.
  + Enables construction of complex dynamic data structures (with “unpredictable” lifetimes).
  – Requires techniques for memory reclamation.
    • Objects may become unreachable as a result of an assignment or a method return.
    • Even systems with large virtual memory can thrash if memory is not recycled.

Programmer Controlled Deallocation

• Using language primitives
  • E.g., Pascal’s dispose, C’s free, C++’s delete, C++ destructors, etc.
  – Reliability issue
    • Dangling reference problem (“premature freeing”)
    • Memory leakage (“incomplete recycling”)
  – Ease of software development issue
• Component-level approach

Automatic Memory Management

Language implementation techniques (run-time system)

• Reference counting
  – Restricted to acyclic data structures.
• Garbage collection
  + Applicable to general dynamic data structures.
  – Unsuitable in certain areas such as hard real-time systems.

Reference counting

• Keep a count of the number of references to each object.
• Update the count in response to operations.
  – initialize: create/clone
  – increment: assignment (for the newly referenced object), method call
  – decrement: assignment (for the previously referenced object), method return
• Reclaim an object when its count drops to zero.

Reference counting (cont.)

• Limitations
  • Space/time overhead to maintain the count.
  • Memory leakage when the data contains cycles.
• Advantage
  • Incremental algorithm
• Applications
  • UNIX file system: hard links (inode link counts)
  • Java RMI, Strings, COM/DCOM
  • Pure functional languages
  • Scripting languages: Python, Perl, etc.
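
The cycle limitation listed above can be illustrated with a toy model. This is only a sketch: the names RcObject, retain, release, and link are hypothetical, and the counts are maintained by hand, whereas a real runtime would update them in compiler- or runtime-inserted code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy object with a hand-maintained reference count (hypothetical model).
class RcObject {
    final String name;
    int refCount = 0;          // number of references to this object
    boolean freed = false;     // set when the count drops to zero
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) { this.name = name; }
}

class RefCountDemo {
    // increment: a new reference to o is created
    static void retain(RcObject o) { o.refCount++; }

    // decrement: a reference to o is dropped; reclaim at zero
    static void release(RcObject o) {
        if (--o.refCount == 0) {
            o.freed = true;
            for (RcObject child : o.fields) release(child);  // drop outgoing references
            o.fields.clear();
        }
    }

    // Store a reference to target in one of owner's fields.
    static void link(RcObject owner, RcObject target) {
        retain(target);
        owner.fields.add(target);
    }

    // Acyclic case: a -> b. Dropping the root reference frees both.
    static boolean acyclicFreed() {
        RcObject a = new RcObject("a"), b = new RcObject("b");
        retain(a);        // root reference to a
        link(a, b);
        release(a);       // root reference goes away
        return a.freed && b.freed;
    }

    // Cyclic case: a -> b -> a. The counts never reach zero, so both leak.
    static boolean cycleLeaks() {
        RcObject a = new RcObject("a"), b = new RcObject("b");
        retain(a);        // root reference to a
        link(a, b);
        link(b, a);
        release(a);       // a.refCount goes from 2 to 1, not 0
        return !a.freed && !b.freed;
    }
}
```

Both methods return true: the acyclic pair is reclaimed as soon as the root reference is released, while the cyclic pair keeps a count of 1 each and is never freed.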

Garbage Collection

• Detecting and reclaiming unreachable objects automatically.
• Soundness / Safety
  • Every collected object is unreachable.
• Completeness
  • Every unreachable object is eventually collected.

Reachability is an Approximation

• Consider the program:
    x <- new A;
    y <- new B;
    x <- y;
    if alwaysTrue() then x <- new A else x.foo() fi
• After x <- y (assuming y becomes dead there)
  • the initial object A is not reachable anymore
  • the object B is reachable (through x)
  • thus B is not garbage and is not collected
  • but object B is never going to be used

Soundness issue : C++ Problem

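#include <cstdint>   // std::intptr_t
// Point is assumed to be a class with print() and printxy(), defined elsewhere.

int main() {
    Point *aptr = new Point();

    // casting an object reference to an integer
    // (intptr_t is wide enough to hold a pointer, unlike int on 64-bit targets)
    std::intptr_t i = (std::intptr_t) aptr;

    // normal access to an object and its fields
    aptr->print();
    aptr->printxy();

    // freeing an object and nulling a reference to it
    delete aptr;
    aptr = NULL;
    aptr->print();       // undefined behavior, but typically no fault if no fields are touched
    // segmentation fault only when the object fields are accessed
    // aptr->printxy();

    // casting the integer back to an object reference
    aptr = (Point*) i;

    // object resurrected !! (a dangling pointer; the behavior is undefined)
    aptr->print();
    aptr->printxy();
}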

Mark and Sweep Algorithm

• Garbage Detection
  • Depth-first search to mark live data cells (cells in the heap reachable from variables on the run-time stack).
• Garbage Reclamation
  • Sweep through the entire memory, putting unmarked nodes on the freelist. Sweep also unmarks the marked nodes.

Mark phase: Using an explicit stack

function DFS(r)
  if r is a reference and object r not marked
  then {
    mark object r;
    t <- 1; stack[t] <- r;
    while (t > 0) {
      p <- stack[t--];
      foreach field p.fi do {
        if p.fi is a reference and object p.fi not marked
        then {
          mark object p.fi;
          stack[++t] <- p.fi
        }
      }
    }
  }
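
The DFS pseudocode above might be written in Java roughly as follows. This is a sketch over a toy object graph, not a real collector: the Node class, the six-node heap, and its edges (A->C, C->F, F->A, B->D) are assumptions made for illustration, and java.util.ArrayDeque stands in for the stack[] array with index t.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy heap object: a mark bit plus a list of outgoing references (hypothetical model).
class Node {
    final String name;
    boolean marked = false;
    final List<Node> fields = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class MarkPhase {
    // Marks every node reachable from r, using an explicit stack instead of recursion.
    static void dfs(Node r) {
        if (r == null || r.marked) return;
        r.marked = true;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(r);
        while (!stack.isEmpty()) {
            Node p = stack.pop();
            for (Node f : p.fields) {
                if (f != null && !f.marked) {
                    f.marked = true;   // mark before pushing, so no node is pushed twice
                    stack.push(f);
                }
            }
        }
    }

    // Builds a six-node heap with assumed edges and returns the marked names in heap order.
    static String markedAfterDfs() {
        String[] names = {"A", "B", "C", "D", "E", "F"};
        List<Node> heap = new ArrayList<>();
        for (String n : names) heap.add(new Node(n));
        Node a = heap.get(0), c = heap.get(2), f = heap.get(5);
        a.fields.add(c);
        c.fields.add(f);
        f.fields.add(a);                       // cycle back to A
        heap.get(1).fields.add(heap.get(3));   // B -> D: garbage pointing at garbage
        dfs(a);                                // the root is assumed to point to A
        StringBuilder sb = new StringBuilder();
        for (Node n : heap) if (n.marked) sb.append(n.name);
        return sb.toString();
    }
}
```

MarkPhase.markedAfterDfs() returns "ACF". Marking a node before it is pushed is what keeps the cycle F->A from looping forever and bounds the stack depth by the number of objects.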

Sweep phase

p <- first address in heap;
while (p < last address in heap) {
  if marked object p
  then unmark object p
  else {
    let f be the first field in object p;
    p.f <- freelist;
    freelist <- p;
  }
  p <- p + (size of object p)
}
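
A Java rendering of the sweep loop, again as a sketch under simplifying assumptions: the heap is modeled as an array of equally sized Cell objects (so iterating the array replaces p <- p + size), and the hypothetical field first plays the role of the first field f that is reused as the freelist link.

```java
// Toy heap cell (hypothetical model): a mark bit and one pointer field.
class Cell {
    final String name;
    boolean marked;
    Cell first;            // first field; reused as the freelist link once the cell is freed
    Cell(String name, boolean marked) { this.name = name; this.marked = marked; }
}

class SweepPhase {
    // Walks the whole heap: unmarks live cells and threads dead ones onto the freelist.
    static Cell sweep(Cell[] heap) {
        Cell freelist = null;
        for (Cell p : heap) {
            if (p.marked) {
                p.marked = false;      // reset for the next collection
            } else {
                p.first = freelist;    // p.f <- freelist
                freelist = p;          // freelist <- p
            }
        }
        return freelist;
    }

    // Heap A..F with A, C, F marked; returns the freelist contents, head first.
    static String freedAfterSweep() {
        String names = "ABCDEF";
        Cell[] heap = new Cell[names.length()];
        for (int i = 0; i < heap.length; i++) {
            char ch = names.charAt(i);
            heap[i] = new Cell(String.valueOf(ch), ch == 'A' || ch == 'C' || ch == 'F');
        }
        StringBuilder sb = new StringBuilder();
        for (Cell p = sweep(heap); p != null; p = p.first) sb.append(p.name);
        return sb.toString();
    }
}
```

SweepPhase.freedAfterSweep() returns "EDB": each freed cell is pushed on the front of the freelist, so the cells appear in reverse heap order.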

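Mark and Sweep Example

[Figure: six heap objects A to F, each with a mark bit; root points into the heap and free is the head of the freelist.]

               A B C D E F
Initially:     0 0 0 0 0 0
After mark:    1 0 1 0 0 1
After sweep:   0 0 0 0 0 0   (the unmarked objects B, D, E are on the freelist)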

• Problems
  – Memory fragmentation.
  – Sweep work is proportional to the size of the whole heap, not just to the live data.
  – Potential lack of locality of reference for newly allocated objects (fragmentation).
• Solution
  – Compaction
  – Contiguous live objects, contiguous free space

Copying Collection

• Divide heap into two “semispaces”.
• Allocate from one space (fromspace) till full.
• Copy live data into other space (tospace).
• Switch roles of the spaces.
• Requires fixing pointers to moved data (forwarding).
• Eliminates fragmentation.
• DFS improves locality, while BFS does not require any extra storage.

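Stop and Copy GC: Example

[Figure: before collection, the old space holds A, B, C, D, E, F with root pointing into it, and the new space is empty. After collection, the new space holds A, C, F contiguously, root points into the new space, and the heap pointer marks the start of the free area.]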

Implementation of Stop and Copy

• We find and copy all the reachable objects into the new space, and fix ALL pointers pointing to moved objects!
• As we copy an object, we store in the old copy a forwarding pointer to the new copy
  – when we later reach an object with a forwarding pointer, we know it was already copied

Implementation of Stop and Copy (Cont.)

• We still have the issue of how to implement the traversal without using extra space
• The following trick solves the problem:
  – partition the new space into three contiguous regions

  start                     scan               alloc
  |  copied and scanned     |  copied          |  empty

  – copied and scanned: copied objects whose pointer fields were followed
  – copied: copied objects whose pointer fields were NOT followed
  – empty: free space from alloc onward

Stop and Copy. Example (1)

• Before garbage collection

[Figure: the old space holds A, B, C, D, E, F, with root pointing into it; the new space is empty.]

Stop and Copy. Example (2)

• Step 1: Copy the objects pointed to by roots and set forwarding pointers

[Figure: the new space holds A; scan is at A and alloc is just past A.]

Stop and Copy. Example (3)

• Step 2: Follow the pointer in the next unscanned object (A)
  – copy the pointed-to objects (just C in this case)
  – fix the pointer (to C) in A
  – set forwarding pointer in C

[Figure: the new space holds A, C; scan is just past A and alloc is just past C.]

Stop and Copy. Example (4)

• Follow the pointer in the next unscanned object (C)
  – copy the pointed-to objects (F in this case)

[Figure: the new space holds A, C, F; scan is just past C and alloc is just past F.]

Stop and Copy. Example (5)

• Follow the pointer in the next unscanned object (F)
  – the pointed-to object (A) was already copied. Set the pointer to the value of A's forwarding pointer

[Figure: the new space holds A, C, F; scan has moved past F.]

Stop and Copy. Example (6)

• Since scan caught up with alloc, we are done
• Swap the roles of the spaces and resume the program

[Figure: root points into the new space, which holds A, C, F; scan and alloc coincide.]
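
The scan/alloc walk-through in examples (1) to (6) can be sketched in Java as below. This is an illustrative model, not a real collector: Obj, CopyingCollector, and the edges among B, D, E are assumptions, the new space is a list whose size plays the role of alloc, and copying is simulated by creating a fresh Obj.

```java
import java.util.ArrayList;
import java.util.List;

// Toy heap object (hypothetical model) with pointer fields and a forwarding pointer.
class Obj {
    final String name;
    final Obj[] fields;    // pointer fields
    Obj forward;           // set in the old copy once the object has been moved
    Obj(String name, int nFields) { this.name = name; this.fields = new Obj[nFields]; }
}

class CopyingCollector {
    final List<Obj> toSpace = new ArrayList<>();   // alloc corresponds to toSpace.size()

    // Returns the new-space copy of o, copying it first if it has no forwarding pointer yet.
    Obj forward(Obj o) {
        if (o == null) return null;
        if (o.forward == null) {
            Obj copy = new Obj(o.name, o.fields.length);
            // fields still hold old-space pointers until the copy is scanned
            System.arraycopy(o.fields, 0, copy.fields, 0, o.fields.length);
            o.forward = copy;
            toSpace.add(copy);             // bump alloc
        }
        return o.forward;
    }

    // Copies everything reachable from root and returns the new root.
    Obj collect(Obj root) {
        Obj newRoot = forward(root);
        int scan = 0;
        while (scan < toSpace.size()) {    // until scan catches up with alloc
            Obj p = toSpace.get(scan++);
            for (int i = 0; i < p.fields.length; i++) {
                p.fields[i] = forward(p.fields[i]);   // copy the target or reuse its forwarding pointer
            }
        }
        return newRoot;
    }

    static String demo() {
        Obj a = new Obj("A", 1), b = new Obj("B", 1), c = new Obj("C", 1),
            d = new Obj("D", 1), e = new Obj("E", 0), f = new Obj("F", 1);
        a.fields[0] = c; c.fields[0] = f; f.fields[0] = a;   // live cycle A -> C -> F -> A
        b.fields[0] = d; d.fields[0] = e;                    // unreachable (assumed edges)
        CopyingCollector gc = new CopyingCollector();
        Obj newA = gc.collect(a);
        StringBuilder sb = new StringBuilder();
        for (Obj o : gc.toSpace) sb.append(o.name);
        // the cycle must close inside the new space
        boolean closed = newA.fields[0].fields[0].fields[0] == newA;
        return sb + ":" + closed;
    }
}
```

CopyingCollector.demo() returns "ACF:true": the live objects land contiguously in the new space in breadth-first order, and the pointer from F back to A is redirected through A's forwarding pointer. The list between index scan and its end acts as the work queue, which is why no separate stack is needed.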

Generational Collectors

• Observation
  • Most objects are short-lived; a small percentage are long-lived.
    – 80% to 98% of new objects die very quickly.
  • An object that has survived several collections is more likely to be long-lived.
    – It is inefficient to copy long-lived objects over and over again.
• Strategy
  • Partition objects based on age, collecting areas containing “younger” objects more frequently than “older” ones.
• Advantages
  • Less effort on each collection (reduced pause time).
  • Avoids unnecessary copying of long-lived objects (efficiency).
  • Improves locality (temporal and spatial locality are correlated).

Generational Garbage Collection

Segregate objects into multiple areas (2 to 7) by age, and collect older areas less often than younger ones.
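
A two-generation toy model of this policy, as a rough sketch only. GenObj, GenerationalHeap, and the tenuring threshold PROMOTE_AGE are hypothetical; the live set passed to minorCollect stands in for a real reachability trace, and references from old to young objects (which a real collector must track) are ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class GenObj {
    final String name;
    int age = 0;                       // number of minor collections survived
    GenObj(String name) { this.name = name; }
}

class GenerationalHeap {
    static final int PROMOTE_AGE = 2;  // hypothetical tenuring threshold
    final List<GenObj> young = new ArrayList<>();
    final List<GenObj> old = new ArrayList<>();

    GenObj allocate(String name) {
        GenObj o = new GenObj(name);
        young.add(o);                  // all objects start in the young generation
        return o;
    }

    // Minor collection: examines only the young generation.
    // 'live' stands in for the result of a reachability trace.
    void minorCollect(Set<GenObj> live) {
        Iterator<GenObj> it = young.iterator();
        while (it.hasNext()) {
            GenObj o = it.next();
            if (!live.contains(o)) {
                it.remove();           // short-lived object reclaimed
            } else if (++o.age >= PROMOTE_AGE) {
                it.remove();
                old.add(o);            // survivor promoted; minor collections no longer visit it
            }
        }
    }

    static String demo() {
        GenerationalHeap h = new GenerationalHeap();
        GenObj cache = h.allocate("cache");
        h.allocate("tmp1");
        h.minorCollect(Collections.singleton(cache));   // tmp1 dies, cache survives (age 1)
        h.allocate("tmp2");
        h.minorCollect(Collections.singleton(cache));   // tmp2 dies, cache is promoted (age 2)
        return h.young.size() + "/" + h.old.size() + "/" + h.old.get(0).name;
    }
}
```

GenerationalHeap.demo() returns "0/1/cache": the temporaries are reclaimed by cheap minor collections, and the long-lived object is promoted once and not examined again until the old area is collected.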

Practical Issues

– Unpredictability
  • Major concern in critical applications.
  + Support collect_on, collect_off, collect_now.
– Efficiency
  • Generation scavenging; clustering
– Incrementality
  • Separate GC thread running concurrently.
– Finalization
  • Recycling non-memory resources.
– External Calls
  • Interfacing with other languages
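
On the finalization point: because the collector decides when (or whether) an unreachable object is reclaimed, Java code commonly releases non-memory resources explicitly rather than relying on finalizers. The sketch below uses the standard AutoCloseable interface with try-with-resources, which is a different technique from GC-driven finalization; TrackedResource and its open counter are hypothetical stand-ins for a file handle or socket.

```java
// Hypothetical resource that counts how many instances are currently held.
class TrackedResource implements AutoCloseable {
    static int open = 0;               // number of resources currently held

    TrackedResource() { open++; }      // acquire (e.g. a file handle or socket)

    @Override
    public void close() { open--; }    // release deterministically, without waiting for the GC
}

class FinalizationDemo {
    static int openAfterUse() {
        try (TrackedResource r = new TrackedResource()) {
            // use r here; close() runs when the block exits, even if an exception is thrown
        }
        return TrackedResource.open;
    }
}
```

FinalizationDemo.openAfterUse() returns 0, since the resource is released at the end of the try block regardless of when the object itself is collected.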

Memory Leakage in Java (Unintended References)

public class Stack {
    private static final int MAXLEN = 10;
    private Object stk[] = new Object[MAXLEN];
    private int stkp = -1;

    public void push(Object p) {
        stk[++stkp] = p;
    }

    // The popped reference stays in stk[], so the object remains reachable.
    public Object pop() {
        return stk[stkp--];
    }
}

• In order to ensure that the popped object is promptly available to the garbage collector, it must be made explicitly “unreachable” by setting the object reference in stk[stkp] to null.

public Object pop() {
    Object p = stk[stkp];
    stk[stkp--] = null;   // clear the slot so the stack no longer references p
    return p;
}

Other Java Details

• Class representations stay in memory as long as the corresponding class loader is reachable, because the contents of the class (static) variables may be needed later. Unintended references from class variables to instances can therefore cause memory leakage.

• See Reference Objects and the java.lang.ref package in Java 2 for GC-related issues.
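
As a small illustration of the reference objects mentioned above, the following sketch uses java.lang.ref.WeakReference. The class WeakRefDemo and its method names are hypothetical. Note that System.gc() is only a hint, so whether the referent has actually been cleared after the call is not guaranteed.

```java
import java.lang.ref.WeakReference;

class WeakRefDemo {
    // While a strong reference exists, the weak reference still yields the referent.
    static boolean visibleWhileStronglyHeld() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        return weak.get() == payload;
    }

    // After the last strong reference is dropped, the collector MAY clear the weak reference.
    static boolean clearedAfterGcHint() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        payload = null;        // drop the only strong reference
        System.gc();           // a request, not a command; the outcome is JVM-dependent
        return weak.get() == null;
    }
}
```

A weak reference does not keep its referent alive, which makes it one way to hold cache or registry entries without creating the kind of unintended strong reference described in the stack example.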