History
- Run-time management of dynamic memory is a necessary activity for modern programming languages
- Lisp, in the 1960s, was one of the first languages to incorporate automatic memory management
- Key for functional and logic programming languages
- Many different approaches: tombstones, locks and keys, mark-sweep, generational, etc.
Slide 3
Heap
- The run-time stack clarifies our understanding of how memory is organized to implement subprograms
- The heap helps us understand the run-time behavior of dynamic objects
- Memory regions:
  - Data segment: static and/or global variables
  - Run-time stack: local space
  - Heap: dynamic space
Slide 4
Static and Run-time Stack
- Static memory contains values whose storage requirements:
  - Are known before run time
  - Remain constant throughout the life of the program
- Run-time stack
  - Center of control for dispatching active functions, local variables, and parameter-argument linkage
Slide 5
Heap
- Contains values that are dynamically allocated and structured while the program is running
  - E.g. strings, dynamic arrays, objects, linked lists
- By its very nature, becomes fragmented as it is used for dynamic allocation and deallocation of storage blocks of different sizes
Slide 6
Values in the heap
- For simplicity, assume each memory word in the heap can have one of three states:
  - Unused: not allocated to the program
  - Undef: allocated, but not yet assigned a value
  - Value: e.g. int or float
[Figure: heap words between addresses h and n, each holding a value, undef, or unused]
Slide 7
Heap management
- New and delete allow the program to obtain and release a contiguous block of memory words
- New returns the address of the first word in a contiguous block of k unused words and marks them undef
  - E.g. new(5)
[Figure: the heap between h and n after new(5); the figure labels address h+10]
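The behavior of new and delete described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name ToyHeap, the first-fit search, and the block_size table used to remember block lengths are hypothetical and do not come from the slides.

```python
UNUSED = "unused"   # word not allocated to the program
UNDEF = "undef"     # word allocated but not yet assigned a value


class ToyHeap:
    """Hypothetical word-addressed heap; addresses are list indices."""

    def __init__(self, size):
        self.words = [UNUSED] * size
        self.block_size = {}  # start address -> block length (assumed bookkeeping)

    def new(self, k):
        """Return the address of the first run of k unused words, marking them undef."""
        run = 0
        for addr, word in enumerate(self.words):
            run = run + 1 if word == UNUSED else 0
            if run == k:
                start = addr - k + 1
                self.words[start:start + k] = [UNDEF] * k
                self.block_size[start] = k
                return start
        raise MemoryError("no contiguous run of %d unused words" % k)

    def delete(self, addr):
        """Release the block that starts at addr, marking its words unused."""
        k = self.block_size.pop(addr)
        self.words[addr:addr + k] = [UNUSED] * k


heap = ToyHeap(8)
a = heap.new(3)      # occupies words 0..2
b = heap.new(2)      # occupies words 3..4
heap.words[a] = 7    # assigning a value moves a word from undef to value
heap.delete(a)       # words 0..2 become unused again
c = heap.new(3)      # first fit reuses the hole at address 0
```

In this model a request such as heap.new(4) would now fail even though the heap is not completely allocated, because no run of four contiguous unused words remains. That is the fragmentation effect noted earlier.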
Slide 8
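Garbage and Dangling Pointers
- Garbage: any block of heap memory that cannot be accessed by the program
- Dangling pointer: a pointer that still refers to a heap block after that block has been released
- Easily created:
  class node { int value; node next; }
  ...
  node p, q;
  p = new node();
  q = new node();
  delete(p);   // p's block is released, so p is now a dangling pointer
  q = p;       // q's original node becomes garbage; q now dangles as well
[Figure: p and q and the nodes they reference at each step; the final state shows a pointer to released storage, marked "?"]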
Slide 9
Garbage collection
- Ideally, we'd never have garbage in the heap
- However, pinpointing the moment a block is no longer needed by the program can be complex
- Instead, when the heap becomes full (or some other indicator is triggered), we reclaim all blocks that are garbage
- Three major strategies:
  - Reference counting
  - Mark-sweep
  - Copy collection
Slide 10
Reference Counting
- Assumes that the initial heap is a continuous chain of nodes called the free list
- Each node has an additional integer field that contains a count of the number of pointers referencing that node
- As the program runs, nodes are taken from the free list and connected to each other via pointers
- Eager approach
Algorithm for assignment q = p
1. Reference count for p's node is increased by 1
2. Reference count for q's node is decreased by 1
3. If q's count is zero, the reference count for each of its descendants is decreased by 1 and q's node is returned to the free_list
   a) Repeat for each of q's descendants
4. Pointer q is assigned the (reference) value of p
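The four steps for an assignment that makes q refer to p's node can be sketched as follows. This is a minimal illustration, not a real collector: Node, release, and assign are hypothetical names, each node is assumed to have a single next pointer (as in the node class shown earlier), and free_list is modeled as a Python list.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.count = 0        # number of pointers referencing this node


free_list = []                # reclaimed nodes; stands in for the heap's free list


def release(node):
    """Drop one reference to node; at zero, reclaim it and repeat for its descendant."""
    while node is not None:
        node.count -= 1
        if node.count > 0:
            return
        child, node.next = node.next, None
        free_list.append(node)
        node = child          # the reclaimed node's pointer to child is gone too


def assign(p_target, q_target):
    """Model q = p; returns the node that q refers to afterwards."""
    if p_target is not None:
        p_target.count += 1   # step 1: one more pointer to p's node
    release(q_target)         # steps 2 and 3: one fewer pointer to q's old node
    return p_target           # step 4: q takes p's reference value


p = assign(Node(1), None)         # p = new node
q = assign(Node(2), None)         # q = new node
q.next = assign(Node(3), None)    # q's node gets one descendant
q = assign(p, q)                  # q = p: nodes 2 and 3 are reclaimed
```

Incrementing before decrementing keeps the sketch safe for a self-assignment such as q = q, where the count would otherwise touch zero briefly.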
Slide 13
Fundamental Flaw?
- p.next = null;
[Figure: nodes reachable from p and q, labeled with their reference counts, after p.next = null]
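One way to see the flaw is a chain that ends in a cycle. The exact shape of the slide's figure is not recoverable here, so the layout below (p's node followed by two nodes that point at each other) is an assumption, and Node, point_to, and drop are hypothetical helper names.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.count = 0


def point_to(node):
    """Record one more pointer to node and return it."""
    node.count += 1
    return node


def drop(node, free_list):
    """Remove one reference to node; reclaim it if the count reaches zero."""
    node.count -= 1
    if node.count == 0:
        if node.next is not None:
            drop(node.next, free_list)
            node.next = None
        free_list.append(node.name)


free_list = []
head, a, b = Node("head"), Node("a"), Node("b")
p = point_to(head)         # p -> head
head.next = point_to(a)    # head -> a
a.next = point_to(b)       # a -> b
b.next = point_to(a)       # b -> a closes the cycle, so a.count is 2

# p.next = null
drop(head.next, free_list)
head.next = None
```

After the assignment, a and b can no longer be reached from p, yet each still has a count of 1 because they reference each other. Nothing is returned to free_list, so the cycle is garbage that reference counting never reclaims.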
Slide 14
Reference Counting
- Advantage:
  - Occurs dynamically (overhead is distributed over the run-time life of the program)
- Disadvantages:
  - Failure to detect inaccessible circular chains
  - Storage overhead created by appending integer reference counts to every node
  - Performance overhead (incurred on each pointer allocation/deallocation)
Slide 15
Mark-Sweep
- Called into action only when the heap is full
- Results in two passes through the heap
- First pass: mark pass
  - Every heap block that can be reached by following a chain of pointers originating in the run-time stack is marked accessible (mark bit set to 1)
- Second pass: sweep pass
  - Returns all unmarked nodes to the free_list
  - Unmarks all nodes that had been marked in the mark pass
- Lazy approach
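The two passes can be sketched as below. This is a simplified model under stated assumptions: the heap is a Python list of nodes, each node has a single next pointer, the roots list stands in for the pointers held in the run-time stack, and the names Node, mark, and mark_sweep are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.marked = False    # the mark bit


def mark(root):
    """Mark pass for one root: follow the pointer chain, setting mark bits."""
    node = root
    while node is not None and not node.marked:
        node.marked = True
        node = node.next


def mark_sweep(heap, roots):
    """Run both passes and return the list of reclaimed nodes."""
    for root in roots:
        mark(root)
    free_list = []
    for node in heap:          # sweep pass visits every heap node once
        if node.marked:
            node.marked = False        # unmark for the next collection
        else:
            node.next = None
            free_list.append(node)     # unmarked means unreachable
    return free_list


a, b, c, d, e = (Node(n) for n in "abcde")
heap = [a, b, c, d, e]
a.next = b                 # p -> a -> b
d.next, e.next = e, d      # d and e form an unreachable cycle
p, q = a, c
free_list = mark_sweep(heap, [p, q])
```

In this usage the cycle formed by d and e is reclaimed, which is the case that reference counting misses.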
Slide 16
Mark-Sweep (initial configuration)
[Figure: heap nodes with pointers p and q, each node shown with its mark bit set to 0; free_list]
Slide 17
Mark-Sweep (after first pass)
[Figure: the same heap after the mark pass; nodes reachable from p and q have mark bit 1, the rest remain 0]
Slide 18
Mark-Sweep (after second pass)
[Figure: the same heap after the sweep pass; all mark bits are 0 and the unmarked nodes are on free_list]
Slide 19
Mark-Sweep vs. Reference Counting
- Advantages of mark-sweep:
  - Reclaims all garbage in the heap
  - Only invoked when the heap is full
- Disadvantages:
  - When invoked, takes more time
  - May only result in a small number of cells that can be placed on the free_list
- Variations:
  - Incremental mark-sweep: only process pieces of the heap at a time
Slide 20
Copy Collection
- A time-space compromise compared to mark-sweep
- Only called when the heap is full
- Only makes one pass through the heap
- Requires much more memory
  - Only half the entire heap space is actively available for allocating new memory blocks
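A copying pass can be sketched as below. The slides do not say how the copy is performed, so this uses a breadth-first scan in the style of Cheney's algorithm as an assumed technique. Node, copy_collect, and the forwarding table are hypothetical names, and to_space is modeled as a Python list instead of a raw memory region.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def copy_collect(roots):
    """Copy every node reachable from roots into a fresh to_space."""
    to_space = []
    forwarding = {}            # id(old node) -> its copy in to_space

    def copy(node):
        if node is None:
            return None
        if id(node) not in forwarding:
            clone = Node(node.value, node.next)   # next still points into from_space
            forwarding[id(node)] = clone
            to_space.append(clone)
        return forwarding[id(node)]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):    # single pass over the copied nodes
        to_space[scan].next = copy(to_space[scan].next)
        scan += 1
    return new_roots, to_space


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.next, c.next = c, a          # a and c reference each other
b.next = a                     # b is garbage even though it points at a live node
from_space = [a, b, c, d]
(p, q), to_space = copy_collect([a, c])
```

Only live nodes are ever touched, so b and d cost nothing to discard. The price is that to_space must be held in reserve, which is why only half of the heap is available for allocation.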
Slide 21
Copy Collection (initial configuration)
[Figure: from_space and to_space, with pointers p, q, and free]
Slide 22
Copy Collection (post gc)
[Figure: from_space and to_space after collection, with pointers p, q, and free]
Slide 23
Copy Collection vs. Mark-Sweep
- Let
  - R: the number of active heap blocks
  - hs: the heap size
  - r: the ratio of R to hs
  - Efficiency: the amount of memory reclaimed per unit time
- Then:
  - If R is much less than hs/2 (that is, r is much less than 1/2), copy collection is more efficient
  - As R approaches hs/2, mark-sweep becomes more efficient
Slide 24
Dangling Pointers
- Tombstone: an extra heap cell that is a pointer to the heap-dynamic variable
  - The actual pointer variable points only at tombstones
  - When the heap-dynamic variable is deallocated, the tombstone remains but is set to nil
  - Costly in time and space
- Locks-and-keys: pointer values are represented as (key, address) pairs
  - Heap-dynamic variables are represented as the variable plus a cell for an integer lock value
  - When a heap-dynamic variable is allocated, a lock value is created and placed in both the lock cell and the key cell of the pointer
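Both schemes can be sketched side by side. All names here (Tombstone, TombstonePointer, LockedCell, KeyPointer, and the ts_ and lk_ helpers) are hypothetical. The slide does not say what happens to the lock on deallocation, so resetting it to 0, a value never handed out as a key, is an assumption of this sketch.

```python
import itertools


class DanglingPointerError(Exception):
    pass


# Tombstones: every pointer goes through one extra cell.
class Tombstone:
    def __init__(self, cell):
        self.target = cell         # set to None (nil) on deallocation


class TombstonePointer:
    def __init__(self, tombstone):
        self.tombstone = tombstone

    def deref(self):
        if self.tombstone.target is None:
            raise DanglingPointerError("tombstone is nil")
        return self.tombstone.target[0]


def ts_new(value):
    return TombstonePointer(Tombstone([value]))


def ts_delete(ptr):
    ptr.tombstone.target = None    # the tombstone itself is never reclaimed


# Locks and keys: a pointer is a (key, address) pair.
_next_lock = itertools.count(1)


class LockedCell:
    def __init__(self, value):
        self.lock = next(_next_lock)
        self.value = value


class KeyPointer:
    def __init__(self, key, cell):
        self.key = key
        self.cell = cell           # stands in for the address

    def deref(self):
        if self.key != self.cell.lock:
            raise DanglingPointerError("key does not match lock")
        return self.cell.value


def lk_new(value):
    cell = LockedCell(value)
    return KeyPointer(cell.lock, cell)   # lock value copied into the key


def lk_delete(ptr):
    ptr.cell.lock = 0              # assumed: 0 is never issued as a key


p = ts_new(42)
q = TombstonePointer(p.tombstone)  # q = p: both share one tombstone
ts_delete(p)

r = lk_new(42)
s = KeyPointer(r.key, r.cell)      # s = r: key and address are both copied
lk_delete(r)
```

After the two deletions, q.deref() and s.deref() raise DanglingPointerError instead of silently reading released storage. The extra indirection and the check on every dereference are where the time and space costs come from.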
Slide 25
Many different algorithms
- Hybrid systems that select between mark-sweep and copy collection depending on r
- Generational collection:
  - Recently created regions contain a high percentage of garbage while older regions do not (Lieberman and Hewitt 85)
  - Like copy collection, divide the heap into multiple regions
  - New objects are allocated in the nursery
  - As an object ages, it is promoted out of the nursery