Compiler Optimizations for Nondeferred Reference-Counting Garbage Collection
Pramod G. Joisha
Microsoft Research, Redmond
ISMM’06 2
Classic Reference-Counting (RC) Garbage Collection
• All references (stack, statics, heap) tallied
• Based on the nondeferred RC invariant
  – Nonzero means at least one incident reference and zero means garbage
• High processing costs
  – Counts need to be updated on every mutation
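
The cost named on this slide can be made concrete with a minimal sketch of nondeferred RC, assuming a toy heap model. The names `Obj`, `inc`, `dec`, `write_field`, and `freed` are hypothetical and not from the talk; the point is only that every reference mutation pays for a count update, and that a zero count reclaims the object immediately.

```python
class Obj:
    """Toy heap object with a reference count and named reference fields."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = {}

freed = []  # names of reclaimed objects, in reclamation order

def inc(obj):
    # One RC update for every newly created reference.
    if obj is not None:
        obj.rc += 1

def dec(obj):
    # One RC update for every destroyed reference.
    if obj is None:
        return
    obj.rc -= 1
    if obj.rc == 0:
        # Nondeferred invariant: a zero count means garbage, so reclaim now
        # and release everything the object referenced.
        freed.append(obj.name)
        for child in list(obj.fields.values()):
            dec(child)
        obj.fields.clear()

def write_field(holder, field, new):
    # Write barrier: every heap mutation updates two counts.
    old = holder.fields.get(field)
    inc(new)  # increment first so that rewriting the same value is safe
    holder.fields[field] = new
    dec(old)

a, b = Obj("a"), Obj("b")
inc(a)                   # a stack reference to a is created
write_field(a, "f", b)   # heap reference a.f -> b
dec(a)                   # the stack reference dies; a and then b are reclaimed
```

In this sketch stack references are counted exactly like heap references, which is what distinguishes the classic scheme from the deferred variants.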
Past Solution to High Overhead
• Count only a subset of references
  – Deferred RC collection (1976)
  – Ulterior RC collection (2003)
• Based on the deferred RC invariant
  – Nonzero means at least one incident reference but zero means maybe garbage
• Faster, but
  – more “floating” garbage
  – longer pauses
Our Solution
• Program analyses
  – Idea: Eliminate redundant RC updates
    • Redundancy with respect to RC invariant
  – Advantages
    • Reclamation characteristics unchanged
    • Pause time no worse than unoptimized case
Talk Outline
• Optimizations (and related analyses)
  – RC subsumption
  – Acyclic object RC update specialization
• Experimental results
  – Impact on execution times
  – Comparison with deferred RC collection
• Conclusions
Optimizations
• Fall into three categories
  – Data-centric (immortal RC update elision, acyclic object RC update specialization)
  – Program-centric (RC subsumption, RC update coalescing, null-check omission)
  – RC update-centric (RC update inlining)
RC Subsumption: Intuition
Flow-Insensitive RC Subsumption
• y is always RC subsumed by x if
  1. All live ranges of y are contained in x
  2. The variable y is never live through a redefinition of either y or x
  3. Everything reachable from y is also reachable from x
Live Range Webs
x := ...
y := x
... y ...
... x ...
... y ...
x := ...
y := x
Provision 1: Live-Range Subsumption Graph
• Directed graph GL
  – Nodes represent local references
  – Edges denote live-range containment
  – (y, x) means “y is always contained in x”
• Quadratic algorithm
  – Start with G = (V,E)
  – Add (u, v) if u is live and v dead at point P
  – Complement of G is GL
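
The quadratic algorithm above can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: liveness is taken as given (one set of live variables per program point), G is assumed to start with no edges, and the function name `live_range_subsumption_graph` is hypothetical.

```python
from itertools import product

def live_range_subsumption_graph(variables, live_at_points):
    """Return GL as a set of (y, x) edges meaning 'y is always contained in x'.

    variables      -- the local references (the node set V)
    live_at_points -- one collection of live variables per program point
                      (assumed to come from a prior liveness analysis)
    """
    variables = set(variables)
    g = set()  # (u, v) is added when u is live and v is dead at some point
    for live in live_at_points:
        live = set(live) & variables
        dead = variables - live
        g.update(product(live, dead))
    # Complement of G over V x V: the pairs for which containment never failed.
    return set(product(variables, repeat=2)) - g

# y is live only while x is live, but x is also live on its own.
gl = live_range_subsumption_graph(
    {"x", "y"},
    [{"x"}, {"x", "y"}, {"x", "y"}, {"x"}],
)
```

Here `gl` contains `("y", "x")` but not `("x", "y")`, because x is live at a point where y is dead. Self-loops such as `("x", "x")` survive the complement, since a variable is never both live and dead at one point.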
A Contingent Opportunity
Provision 2: Uncut Live-Range Subsumption Graph
• Handles redefinition provision
• Directed graph GE
  – Start with GL
  – Find livethru(s) and defsmay(s)
  – Then liverdef(s) = livethru(s) ∩ defsmay(s)
  – Delete (u, x) if u ∈ liverdef(s)
  – Delete (y, u) if y ∈ livethru(s) and u ∈ liverdef(s)
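
A minimal sketch of the edge-deletion steps above, assuming the set operators lost from the slide text are intersection and membership, that livethru(s) and defsmay(s) are already computed per statement, and that "Delete (u, x)" removes every edge whose source is u. The function name is hypothetical.

```python
def uncut_live_range_subsumption_graph(gl, statements):
    """Return GE, derived from GL by removing edges cut by redefinitions.

    gl         -- set of (y, x) edges from the live-range subsumption graph
    statements -- iterable of (livethru, defsmay) pairs, one per statement s
    """
    ge = set(gl)
    for live_thru, defs_may in statements:
        live_thru = set(live_thru)
        # Variables that are live through a statement that may redefine them.
        live_rdef = live_thru & set(defs_may)
        ge = {
            (a, b)
            for (a, b) in ge
            # Delete (u, x) if u is in liverdef(s).
            if a not in live_rdef
            # Delete (y, u) if y is in livethru(s) and u is in liverdef(s).
            and not (a in live_thru and b in live_rdef)
        }
    return ge

gl = {("x", "x"), ("y", "y"), ("y", "x")}
# One statement through which both x and y are live and which may redefine x.
ge = uncut_live_range_subsumption_graph(gl, [({"x", "y"}, {"x"})])
```

In this example only `("y", "y")` remains: x is live through its own redefinition, and y is live through that same redefinition, so y can no longer rely on x.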
Overlooking Roots
[Figure: stack references v and u; objects A and B]
u := v
u := v.g (g is a read-only field)
u := v[e] (v is thread local and v[e] isn’t written into before v dies)
u := v.f (v is thread local and v.f isn’t written into before v dies)
Provision 3: RC Subsumption Graph
• Start with GE
• Delete (u, v), where u ≠ v
  – nothing overlooks u at its definition
  – u is overlooked by w and (w, v) ∉ GR
• Delete until fixed point is reached
• Approximate overlooking roots’ set used
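
The fixed-point deletion can be sketched as below. Several things here are assumptions rather than details confirmed by the slides: the lost symbols are read as "u ≠ v" and "(w, v) not in GR", the two sub-conditions are treated as alternatives, the overlooking roots of each variable are supplied as a precomputed (approximate) set, and an edge is dropped as soon as any one overlooking root fails the test. The names `rc_subsumption_graph` and `overlookers` are hypothetical.

```python
def rc_subsumption_graph(ge, overlookers):
    """Return GR by deleting edges of GE until nothing changes.

    ge          -- set of (u, v) edges of the uncut live-range subsumption graph
    overlookers -- dict mapping u to the set of roots that overlook u
                   at its definition (missing key means no overlooking root)
    """
    gr = set(ge)
    changed = True
    while changed:
        changed = False
        for (u, v) in sorted(gr):  # sorted() copies, so deleting is safe
            if u == v:
                continue
            roots = overlookers.get(u, set())
            # Delete if nothing overlooks u, or if some overlooking root w
            # is itself not (or no longer) related to v in GR.
            if not roots or any((w, v) not in gr for w in roots):
                gr.discard((u, v))
                changed = True
    return gr

ge = {("u", "u"), ("v", "v"), ("w", "w"), ("u", "v"), ("u", "w"), ("w", "v")}
# u is defined from w (so w overlooks u); nothing is known to overlook w.
gr = rc_subsumption_graph(ge, {"u": {"w"}})
```

In this run `("w", "v")` is deleted first because nothing overlooks w, which in the next pass also removes `("u", "v")`; `("u", "w")` survives because the self-loop `("w", "w")` is in the graph. The cascade is why the deletion has to iterate to a fixed point.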
Talk Outline
• Optimizations (and related analyses)
  – RC subsumption
  – Acyclic object RC update specialization
• Experimental results
  – Impact on execution times
  – Comparison with deferred RC collection
• Conclusions
The Problem of Garbage Cycles
• Reference counting can’t capture cycles
• Three solutions:
  – Programming paradigms
  – Back-up tracing collector
  – Local tracing solution: trial deletion
Background on Trial Deletion
• Decremented references buffered
• Trial deletion adds overheads
  – Bookkeeping memory (PLC buffer, PLC link)
  – Extra processing in RC updates
• Idea: Statically identify acyclic objects
Acyclic Type Analysis
• Determine types that are always acyclic
• Type hierarchy and field information
  – Type connectivity (TC) graph
• SCC decomposition of TC graph
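
A minimal sketch of the SCC step, assuming the TC graph has an edge from type T to type U when an instance of T may directly reference an instance of U, and assuming a type counts as acyclic when its strongly connected component is a single node with no self-edge. That criterion, the type names, and the function names are illustrative assumptions, not details from the talk. The decomposition uses Tarjan's algorithm.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to an iterable of successors."""
    index, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(component)

    nodes = set(graph) | {w for succs in graph.values() for w in succs}
    for v in sorted(nodes):
        if v not in index:
            visit(v)
    return sccs

def acyclic_types(tc_graph):
    """Types whose SCC is a single node without a self-edge."""
    result = set()
    for component in strongly_connected_components(tc_graph):
        if len(component) == 1:
            (t,) = component
            if t not in tc_graph.get(t, ()):
                result.add(t)
    return result

# Hypothetical types: Node refers to itself, A and B refer to each other.
tc = {"List": ["Node"], "Node": ["Node", "Point"], "Point": [], "A": ["B"], "B": ["A"]}
acyclic = acyclic_types(tc)
```

In this example `List` and `Point` come out acyclic, so RC updates on their instances could skip the trial-deletion bookkeeping, while `Node`, `A`, and `B` would keep it.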
Building the TC Graph
• Separate compilation
• Immortal object optimization
• Array subtyping issues
Other Optimizations
• RC updates on immortal objects
  – vtables, string literals, GC tables
• Coalescing of RC updates
• Non-null operand RC update specialization
• RC update inlining
Talk Outline
• Optimizations (and related analyses)
  – RC subsumption
  – Acyclic object RC update specialization
• Experimental results
  – Impact on execution times
  – Comparison with deferred RC collection
• Conclusions
Benchmarks
Optimization Effects
Overlooking Roots’ Set Effects
RC Update Distributions
Summary
• High overheads can be drastically reduced without compromising on benefits!
  – Key: a new analysis called RC subsumption
    • Improvements due to it alone often significant
  – Execution times on a par with deferred RC collection on a number of programs
  – Challenges wisdom on classic RC efficiency
• Scope for further improvement exists
• Future Work: Multithreading