Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
ROOT and x32-ABI
Nathalie Rauschmayr
On behalf of LHCb collaboration
March 13, 2013
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 1 / 20
Outline
1 History of application binary interfacesWhy is a new binary interface necessary?x32-ABI - A new 32-bit ABI for x86-64Preconditions
2 How ROOT can profitResults within other CERN applicationsResults within ROOT-BenchmarksReasons for performance differences
3 Support of x32-ABI
4 Conclusion
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 1 / 20
History of application binary interfaces
x86:
32-bit word size → 4GB addressable
x86 64 as an extension of x86:
runs 32-bit as well as 64-bit applications
64-bit mode supports new hardware features
Number of CPU registers
Floating-point performance
Function parameters passed via registers
Syscall instruction ...
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 2 / 20
Why is a new binary interface necessary?
x86-64 (64-bit mode):
new hardware features
no memory limit
slowdown due to memory issues
contention
page faults
paging ...
x86:
very low memory footprint
performance issues
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 3 / 20
x32-ABI - A new 32-bit ABI for x86-64
Application Binary Interface based on 64-bit x86 architecture
Reduces size of pointers and C-datatype long to 32-bit
Takes advantage of all x64-features
Avoids memory overhead
advantages of 64-bit instruction set+
memory footprint of a 32-bit application
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 4 / 20
x32-ABI - A new 32-bit ABI for x86-64
Developed by H.J. Lu (Intel)
Introduced in Linux-Kernel 3.4 (released in summer 2012)
http://www.linuxplumbersconf.org/2011/ocw/sessions/531
Opinions in Linux-community differ quite a lot
32-bit time values will overflow
adressable memory
...
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 5 / 20
x32-ABI - A new 32-bit ABI for x86-64
Impressive results from x32-developers:
181.mcf from SPEC CPU 2000 (memory bound):
Intel Core i7∼ 40% faster than x86-64∼ 2% slower than ia32
Intel Atom∼ 40% faster than x86-64∼ 1% faster than ia32
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 6 / 20
x32-ABI - A new 32-bit ABI for x86-64
186.crafty from SPEC CPU 2000 (64bit integer):
Intel Core i7∼ 3 % faster than x86-64∼ 40% faster than ia32
Intel Atom∼ 4 % faster than x86-64∼ 26% faster than ia32
=⇒ CERN applications will definitely reduce memory footprint and verylikely CPU-time
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 7 / 20
Preconditions
x32-ABI requires:
Linux Kernel 3.4 compiled with CONFIG X86 X32=y
Gcc 4.7
Binutils 2.22
Glibc 2.16
Recompiling all system libraries, required by an application, with gcc-mx32
ELF 32-bit LSB shared object, x86-64, version 1 (SYSV)
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 8 / 20
How ROOT can profit
CERN applications use millions of pointers
GAUDI framework stores all events as pointers
Many virtual functions (virtual tables: function pointers and hiddenpointers)
ROOT (histogram as a tree of pointers ...)
...
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 9 / 20
Results within other CERN applications
namd dealII soplex povray omnetpp astar xalancbmk
−30
−20
−10
0
10
20
30
Mem
ory
red
uct
ion
in%
Memory
Resident Size (32 vs x32) Resident Size (64 vs x32)
namd dealII soplex povray omnetpp astar xalancbmk
−5
0
5
10
15
20
25
30
Red
uct
ion
ofti
me
in%
Time
CPU Time (32 vs x32) CPU Time (64 vs x32)
Figure : Comparison of memory consumption and CPU-time within theHEPSPEC-benchmarks
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 10 / 20
Results within other CERN applications
LHCb Reconstruction with Gaudiv23r3 and ROOT 5.34.00Reconstruction of 1000 Events:
Physical Memory: -20 %
Total elapsed: -2 % reduction
LHCb Analysis with Gaudiv23r3 and ROOT 5.34.00Analysis of 10000 Events:
Physical Memory: -21 %
Total elapsed: 1.5 % increase
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 11 / 20
Results within ROOT-Benchmarks
Following ROOT-benchmarks:
stressHistogram
stress 1000
stressHepix
IO
linear algebra
vector
sparse matrix
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 12 / 20
Results within ROOT-Benchmarks
Histogram Events Spectrum Linear Fit0
10
20
30
40
50
60
Mem
ory
red
uct
ion
in%
Memory
Resident Size (64 vs x32)
Histogram Events Spectrum Linear Fit
0
5
10
15
20
25
30
Red
uct
ion
ofti
me
in%
Time
Real Time (64 vs x32) CPU Time (64 vs x32)
Figure : Comparison of memory consumption and CPU-time within theROOT-benchmarks
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 13 / 20
Results within ROOT-Benchmarks
stress 1000:
0 200 400 600 800 1,000 1,2000
0.2
0.4
0.6
0.8
1
·106
Snapshots
Mem
ory
inkB
RSS m64 VSS m64 RSS x32 VSS x32
Histogram Events Spectrum Linear Fit0
20
40
60
80
100
Mem
ory
red
uct
ion
in%
Memory
heap - profiling (64 vs x32) heap - maps (64 vs x32)
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 14 / 20
Reasons for performance differences
Memory seen by system and by process are different
Default thresholds of malloc:
t = 4 · 1024 · 1024 · sizeof (long)
if x > t: malloc uses mmap
if x < t: malloc uses sbrk
mmap:
memory blocks can be independently given back
anonymous memory blocks as an extension of the heap
sbrk :
memory blocks cannot be returned to the operating system
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 15 / 20
Reasons for performance differences
Looking into dealII, omnetpp, ROOT stress:
dealII : page faults reduced by 10%
omnetpp: page faults reduced by 38%
stress: difference in CPU-time likely caused by memory issues
Further Issues:
code and data alignment
x32 loop unrolling needs some tuning
zero-extensions for pointer conversion
=⇒ x32-ABI still under development and there are possiblities foroptimization
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 16 / 20
Outline
1 History of application binary interfacesWhy is a new binary interface necessary?x32-ABI - A new 32-bit ABI for x86-64Preconditions
2 How ROOT can profitResults within other CERN applicationsResults within ROOT-BenchmarksReasons for performance differences
3 Support of x32-ABI
4 Conclusion
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 16 / 20
Support of x32-ABI
Ubuntu: Ongoing work, support planned for 13.04:https://blueprints.launchpad.net/ubuntu/+spec/
foundations-r-x32-planning
Gentoo: Release candidate available since summer 2012:http://www.gentoo.org/news/20120608-x32_abi.xml
Red Hat: No plans according to: http://comments.gmane.org/
gmane.linux.redhat.fedora.devel/164019
LLVM-compiler: supports x32-ABI since summer 2012 http:
//www.phoronix.com/scan.php?page=news_item&px=MTI3NTk
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 17 / 20
Outline
1 History of application binary interfacesWhy is a new binary interface necessary?x32-ABI - A new 32-bit ABI for x86-64Preconditions
2 How ROOT can profitResults within other CERN applicationsResults within ROOT-BenchmarksReasons for performance differences
3 Support of x32-ABI
4 Conclusion
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 17 / 20
Conclusion
New room for improvement, ( CERN ) applications can profitsubstantially
Computing Grid limits memory anyway to 4 GB per process
Gain in performance is for FREE
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 18 / 20
Conclusion
Recompilation:
Cast from time values to long (...) will produce wrong results (xrootd)
New pointer size required modifications in CINT (function and datapattern)
Getting a working environment:
does not work out of the box
building gcc fails due to missing glibc x32
building glibc x32 with a partly built gcc
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 19 / 20
Any questions?
Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 20 / 20