1
Data layouts for object-oriented programs
Martin Hirzel, IBM Research
SIGMETRICS 6/16/2007
2
Problem
• Object-oriented programs put data in objects.
• Caches and TLBs put data in blocks.
• Scattering objects over blocks causes cache/TLB misses.
• Misses cost time.
[Figure: objects o1 to o11 scattered across cache lines and TLB pages]
3
Solution
• Most object-oriented languages use garbage collection.
• Garbage collection can move objects.
• To avoid misses, move objects to the right cache/TLB blocks.
• Simple, right?
[Figure: objects o1 to o11, first scattered across blocks, then regrouped into adjacent blocks after being moved]
4
Cheney Copying GC
[Figure: animation of objects being copied from from-space to to-space. The scan and free pointers split to-space into three regions: copied and scanned, copied but not yet scanned, and free. Copying ends when scan = free.]
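The scan/free scheme named on this slide can be sketched in a few lines. This is a minimal illustration, not the collector used in the talk: objects are plain ids, the `children` dictionary stands in for pointer fields, and to-space is a Python list whose length plays the role of the free pointer. All names here are assumptions made for the example.

```python
def cheney_copy(roots, children):
    """Return the to-space order of all objects reachable from roots.

    roots:    list of object ids referenced from outside the heap
    children: dict mapping an object id to the ids it points to
    """
    to_space = []    # objects in the order they were copied
    forwarded = {}   # from-space id -> index in to-space

    def copy(obj):
        # Copy each object at most once; len(to_space) acts as "free".
        if obj not in forwarded:
            forwarded[obj] = len(to_space)
            to_space.append(obj)

    for root in roots:
        copy(root)

    scan = 0
    while scan < len(to_space):          # finished when scan == free
        for child in children.get(to_space[scan], []):
            copy(child)                  # copied, not yet scanned
        scan += 1                        # object is now copied and scanned
    return to_space


heap = {"o1": ["o2", "o3"], "o2": ["o4"], "o3": [], "o4": [], "o5": []}
print(cheney_copy(["o1"], heap))  # ['o1', 'o2', 'o3', 'o4']
```

Because the region between scan and free behaves as a queue, the resulting order is breadth-first; the unreachable object o5 is never copied.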
5
BF: Breadth-first layout
1
2 3
4 5 6 7
8 9 10 11 12 13 14 15
Why? Siblings
How? Queue-based traversal
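As a small sketch of the queue-based traversal, the following code lays out the 15-node tree from the slide in breadth-first order. The `children` helper, which assumes node n has children 2n and 2n+1, is a hypothetical stand-in for following pointer fields.

```python
from collections import deque


def children(node, size=15):
    # Assumed numbering: node n points to 2n and 2n+1, if they exist.
    return [c for c in (2 * node, 2 * node + 1) if c <= size]


def bf_order(root=1, size=15):
    """Breadth-first layout order using a FIFO queue."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children(node, size))
    return order


print(bf_order())  # 1, 2, 3, ..., 15: siblings end up adjacent
```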
6
DF: Depth-first layout
Why? Child-parent
How? Stack-based traversal
1
2 3
4 5 6 7
8 9 10 11 12 13 14 15
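The stack-based traversal can be sketched the same way on the 15-node tree. As before, the `children` helper (node n points to 2n and 2n+1) is an assumption made for illustration only.

```python
def children(node, size=15):
    # Assumed numbering: node n points to 2n and 2n+1, if they exist.
    return [c for c in (2 * node, 2 * node + 1) if c <= size]


def df_order(root=1, size=15):
    """Depth-first layout order using an explicit LIFO stack."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push the right child first so the left child is placed next.
        stack.extend(reversed(children(node, size)))
    return order


print(df_order())
# [1, 2, 4, 8, 9, 5, 10, 11, 3, 6, 12, 13, 7, 14, 15]
```

Each node is placed directly after its parent or after a sibling's subtree, so a parent and its first child are always adjacent.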
7
HI: Hierarchical layout
1
2 3
4 5 6 7
8 9 10 11 12 13 14 15
Why? Both siblings and child-parent
How? Block-bounded breadth first
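One plausible reading of "block-bounded breadth first" is sketched below: run a breadth-first traversal until a block is full, then restart from each node left on the frontier. The block size of 3 objects, the `children` helper, and the function name are all assumptions; the exact algorithm used in the talk may differ. The sketch also assumes a tree, so it keeps no visited set.

```python
from collections import deque


def children(node, size=15):
    # Assumed numbering: node n points to 2n and 2n+1, if they exist.
    return [c for c in (2 * node, 2 * node + 1) if c <= size]


def hi_order(root=1, size=15, block=3):
    """Hierarchical layout: breadth-first, restarted every `block` objects."""
    order = []
    pending = deque([root])            # roots of blocks not yet laid out
    while pending:
        queue = deque([pending.popleft()])
        placed = 0
        while queue and placed < block:
            node = queue.popleft()
            order.append(node)
            placed += 1
            queue.extend(children(node, size))
        pending.extend(queue)          # leftover frontier seeds later blocks
    return order


print(hi_order())
# [1, 2, 3, 4, 8, 9, 5, 10, 11, 6, 12, 13, 7, 14, 15]
```

With a block of 3, every block holds a parent together with both of its children, which serves the sibling and the child-parent case at once.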
8
AO: Allocation order layout
[Figure: heap in allocation order, 1 2d 3 4d 5d 6 7 8d 9; after compaction, 1c 3c 6c 7c 9c]
Why? Creation order matches usage
How? Sliding compaction
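Sliding compaction keeps survivors in their original address order, which is why it preserves allocation order. The sketch below assumes unit-sized objects and a precomputed live set; both are simplifications, and the names are hypothetical.

```python
def slide_compact(heap, live):
    """Slide live objects toward the start of the heap, keeping their order.

    heap: list of object ids in address (allocation) order
    live: set of object ids that survive the collection
    Returns the compacted heap and the forwarding table old id -> new address.
    """
    # Pass 1: compute forwarding addresses in address order.
    forward = {}
    free = 0
    for obj in heap:
        if obj in live:
            forward[obj] = free
            free += 1
    # Pass 2: move every live object to its forwarding address.
    compacted = [None] * free
    for obj, addr in forward.items():
        compacted[addr] = obj
    return compacted, forward


heap = [1, 2, 3, 4, 5, 6, 7, 8, 9]
compacted, forward = slide_compact(heap, live={1, 3, 6, 7, 9})
print(compacted)  # [1, 3, 6, 7, 9]
```

The example uses the same dead objects (2, 4, 5, 8) as the slide's figure.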
9
SZ: Size segregation layout
[Figure: heap objects 1 to 9, with 3, 5, and 8 marked d; after collection, 1c 6c 2c 7c 9c 4c]
Why? Efficiently finding allocation holes
How? Segregated free lists
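A segregated free list keeps one list of holes per size class, so finding a hole is a lookup rather than a search, and objects of the same size class end up next to each other. The sketch below is a toy allocator under assumed size classes and a fixed number of slots per class; the class and method names are hypothetical.

```python
from collections import deque


class SegregatedFreeLists:
    """Toy allocator with one free list per size class."""

    def __init__(self, size_classes, slots_per_class):
        self.size_classes = sorted(size_classes)
        self.free = {}
        addr = 0
        for sc in self.size_classes:
            # Carve a contiguous region into equally sized slots.
            self.free[sc] = deque(addr + i * sc for i in range(slots_per_class))
            addr += sc * slots_per_class

    def allocate(self, size):
        # Take the first hole from the smallest class that fits.
        for sc in self.size_classes:
            if sc >= size and self.free[sc]:
                return sc, self.free[sc].popleft()
        raise MemoryError("no free slot large enough")

    def release(self, size_class, addr):
        # A freed slot goes back to the list of its own size class.
        self.free[size_class].append(addr)


alloc = SegregatedFreeLists([8, 16], slots_per_class=2)
print(alloc.allocate(5))   # (8, 0)
print(alloc.allocate(12))  # (16, 16)
```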
10
TH: Thread local layout
[Figure: object graph with objects 1 to 12, grouped under the bit patterns 001, 010, 100, 101, 110, and 111]
Why? Disjoint working sets
How? Reachability from call stacks
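A hedged sketch of the idea: trace from each thread's call stack, record for every object the set of threads that reach it as a bit mask, and place objects with equal masks together. Interpreting the slide's bit patterns as such per-thread masks is an assumption, as are the function names and the dictionary-based heap.

```python
def thread_masks(stacks, children):
    """Map each reachable object to a bit mask of the threads reaching it.

    stacks:   list with one list of root object ids per thread
    children: dict mapping an object id to the ids it points to
    """
    masks = {}
    for thread, roots in enumerate(stacks):
        bit = 1 << thread
        work = list(roots)
        while work:
            obj = work.pop()
            if masks.get(obj, 0) & bit:
                continue                  # already reached by this thread
            masks[obj] = masks.get(obj, 0) | bit
            work.extend(children.get(obj, []))
    return masks


def th_order(stacks, children):
    """Layout order that groups objects by the set of threads reaching them."""
    masks = thread_masks(stacks, children)
    return sorted(masks, key=lambda obj: (masks[obj], obj))


heap = {"a": ["b"], "c": ["b", "d"]}
print(th_order([["a"], ["c"]], heap))  # ['a', 'c', 'd', 'b']
```

In the example, a is private to thread 0 (mask 01), c and d are private to thread 1 (mask 10), and b is shared (mask 11), so the shared object is placed apart from both private working sets.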
11
Problem
AO Allocation order
AS Allocation site
BF Breadth-first
DF Depth-first
HI Hierarchical
PO Popularity
RA Random
SZ Size
TH Thread
TY Type
• Which layout is best, and which is worst?
• How much does it matter in practice?
• How similar are the layouts?
• How much does it matter in the limit?
12
Solutions?
• Appeal to intuition
– They can’t all be right!
• Formal
– Petrank/Rawitz showed hardness
• Simulation
– Who would believe those numbers?
• Brute-force
– Do you have a few person-years to spare?
13
Avoiding Heisenberg Effects
[Figure: timeline alternating between garbage collector and application phases]
• Exclude garbage collector performance
• Measure real effect of layout on application
• Implement the layouts with simple algorithms
14
Object sorting garbage collection
Phases: populate, sort, copy, fixup
Sort keys: AO, AS, PO, RA, SZ, TH, TY
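The four phase names suggest a pipeline along the following lines. This is only a sketch of one way such a collector could work, not the implementation from the talk: the heap is a dictionary of object ids, addresses are list indices, and the `key` function stands in for any of the sort keys listed above.

```python
def sorting_gc(heap, roots, key):
    """Lay out live objects in the order given by `key`.

    heap:  dict mapping an object id to the list of ids it references
    roots: list of object ids referenced from outside the heap
    key:   function from object id to its sort key (hypothetical stand-in)
    Returns the new heap as a list of (id, referenced addresses) pairs
    and the map from object id to new address.
    """
    # Populate: collect all reachable objects.
    live, seen, work = [], set(), list(roots)
    while work:
        obj = work.pop()
        if obj in seen:
            continue
        seen.add(obj)
        live.append(obj)
        work.extend(heap[obj])
    # Sort: order the live objects by the chosen layout key.
    live.sort(key=key)
    # Copy: assign each object its new address in sorted order.
    address = {obj: i for i, obj in enumerate(live)}
    # Fixup: rewrite every reference to the target's new address.
    new_heap = [(obj, [address[ref] for ref in heap[obj]]) for obj in live]
    return new_heap, address


heap = {"x": ["z"], "y": [], "z": ["x"], "w": []}
new_heap, address = sorting_gc(heap, roots=["z"], key=lambda obj: obj)
print(new_heap)  # [('x', [1]), ('z', [0])]
```

Keeping the layout decision inside a single sort key is what makes it cheap to try many layouts with the same collector.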
15
32 Benchmarks
16
% Mutator time overhead
• Performance impact increases with SMP
• Conclusions for AO, DF, TH, RA still hold
• All layouts sometimes best, sometimes worst
• RA is worst, as expected
• Low averages, but beware of worst cases!
• AO has best average (but not by much)
• DF has most best cases
• TH has most benign worst case
17
% Mutator miss rate increases
• Layouts have large impact on miss rates
• Miss rates confirm overhead conclusions
• Miss rates cannot replace time measurements
18
Layout similarities and differences
• AO, PO, and TH are quite similar
• As expected, RA is far out
19
Estimated limit mutator time
• Benchmarks: largest avg. overhead
• Baseline: best observed
• Linear regression:
Limit time (no misses)
+ cache misses × cache latency
+ TLB misses × TLB latency
= Total time (as measured)
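The regression model on this slide can be fitted with ordinary least squares: each run contributes one (cache misses, TLB misses, total time) sample, and the intercept of the fit is the estimated limit time. The sketch below solves the normal equations in pure Python; the function name and the sample values are made up for illustration and are not measurements from the talk.

```python
def fit_limit_time(samples):
    """Least-squares fit of time = limit + cache * c_lat + tlb * t_lat.

    samples: list of (cache_misses, tlb_misses, total_time) tuples
    Returns (limit_time, cache_latency, tlb_latency).
    """
    rows = [(1.0, float(c), float(t)) for c, t, _ in samples]
    times = [float(y) for _, _, y in samples]
    n = 3
    # Augmented normal equations [A^T A | A^T y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, times))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * p for a, p in zip(m[r], m[col])]
    return tuple(m[i][n] / m[i][i] for i in range(n))


# Synthetic samples generated from limit = 10, latencies 0.5 and 2.
samples = [(0, 0, 10), (10, 0, 15), (0, 5, 20), (10, 5, 25), (4, 1, 14)]
limit, cache_latency, tlb_latency = fit_limit_time(samples)
```

The fit needs at least three samples whose miss counts are not collinear; otherwise the normal equations are singular and the division fails.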
20
Related work
21
Conclusions
• Layouts matter little on average, but:
– Beware of the worst cases!
– Layout importance increases with SMP
• All layouts are sometimes best, sometimes worst
– AO has best average
– DF has most best-cases
– TH has best worst-cases