anna-leonard
Code optimization
What do we want to optimize?
Code A exhibits 90% of theoretical peak performance (FLOPS).
Code B exhibits 90% strong-scaling efficiency.
Code C exhibits 10% of theoretical peak performance and is poorly balanced, but solves the problem 2 times faster than code A or B.
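The comparison above can be checked with a little arithmetic. This is a minimal sketch, assuming run time is simply the operation count divided by the achieved FLOP rate. The helper names `time_to_solution` and `required_work_reduction` are hypothetical, and the model ignores communication and memory effects.

```c
#include <assert.h>

/* Run time in seconds under a simple model (an assumption, not a measurement):
 * time = work / (fraction_of_peak * peak_rate).
 * work_flop is in floating-point operations, peak_flops in FLOP/s. */
double time_to_solution(double work_flop, double fraction_of_peak,
                        double peak_flops)
{
    assert(fraction_of_peak > 0.0 && peak_flops > 0.0);
    return work_flop / (fraction_of_peak * peak_flops);
}

/* How much less work must a code running at frac_c of peak perform in order
 * to finish `speedup` times sooner than a code running at frac_a of peak?
 * Returns the ratio work_a / work_c. */
double required_work_reduction(double frac_a, double frac_c, double speedup)
{
    assert(frac_c > 0.0);
    return speedup * frac_a / frac_c;
}
```

With the numbers on the slide, `required_work_reduction(0.9, 0.1, 2.0)` evaluates to about 18. Under this model Code C must perform roughly 18 times fewer operations than Code A, which suggests a better algorithm rather than better hardware utilization.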
Code optimization
What do we want to optimize?
Code A uses a slow algorithm but has great weak and strong scaling.
Code B uses a fast algorithm but has poor scaling.
How to profile a code?
1. Make profiling the code as simple as you can.
2. Always expect surprises: do not assume anything about your code's performance.
3. Look for bottlenecks and for the most time-consuming parts of the code.
4. Keep the reference version of the code.
5. Document every modification you make!
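Point 4 pays off when every optimized build is checked against the reference output. This is a minimal sketch, assuming both versions store their results in arrays of doubles. The name `matches_reference` and both tolerances are hypothetical choices, and a suitable tolerance depends on the application.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every element of `result` agrees with `reference` to within
 * a relative tolerance `rel_tol` (plus `abs_tol` as a floor near zero),
 * otherwise 0. */
int matches_reference(const double *result, const double *reference,
                      size_t n, double rel_tol, double abs_tol)
{
    assert(result != NULL && reference != NULL);
    for (size_t i = 0; i < n; ++i) {
        double diff = result[i] - reference[i];
        if (diff < 0.0)
            diff = -diff;
        double scale = reference[i] < 0.0 ? -reference[i] : reference[i];
        /* Written as !(a <= b) so that a NaN result also counts as a mismatch. */
        if (!(diff <= abs_tol + rel_tol * scale))
            return 0;   /* optimized version drifted from the reference */
    }
    return 1;
}
```

Running such a check after each modification makes it easy to tell which change altered the results, which also helps with point 5.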
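Code profiling: example
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    /* Pseudocode: sync() is a barrier across ranks, e.g. MPI_Barrier(MPI_COMM_WORLD);
       time() is a wall-clock timer, e.g. MPI_Wtime(). */
    sync(); t0 = time(); functionA(); sync(); t1 = time();
    sync(); t2 = time(); functionB(); sync(); t3 = time();
    MPI_Finalize();
}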
# cores          t1-t0   t3-t2
16       Min     1       15
         Max     10      16
32       Min     0.8     10
         Max     5.4     11
64       Min     0.6     7
         Max     3.2     8.5
Which function should we optimize first?
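One way to read the table: the slowest rank sets the wall-clock time of each phase, while the gap between the fastest and slowest rank indicates load imbalance. This is a minimal sketch, assuming the Min and Max rows are the timings of the fastest and slowest rank for each function. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Min/max time of one code section across all ranks. */
struct section_timing {
    double t_min;   /* fastest rank */
    double t_max;   /* slowest rank: all others wait for it at the next sync */
};

/* Fraction of the section's wall-clock time that the fastest rank spends
 * idle at the sync point; 0 means perfectly balanced. */
double idle_fraction(struct section_timing s)
{
    assert(s.t_max > 0.0 && s.t_min <= s.t_max);
    return (s.t_max - s.t_min) / s.t_max;
}

/* Strong-scaling efficiency going from p0 to p cores, based on t_max. */
double scaling_efficiency(int p0, struct section_timing at_p0,
                          int p, struct section_timing at_p)
{
    assert(p0 > 0 && p > 0 && at_p.t_max > 0.0);
    return (at_p0.t_max * p0) / (at_p.t_max * p);
}
```

For the 16-core rows, functionA (1 to 10) has an idle fraction of 0.9, while functionB (15 to 16) has about 0.06 yet takes longer in absolute terms. Going from 16 to 64 cores, the slowest-rank time of functionB drops from 16 to 8.5, an efficiency of roughly 0.47 under this definition. The two metrics point at different problems, so the answer depends on what is being optimized.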
Code optimization through code profiling
Difference: 40 sec 25 sec
Performance analysis tools
pgprof
Vampir
CrayPat
IMP
Sample report:
 100.0% | 100.0% | 512 | Total
------------------------------------
|  59.8% |  59.8% | 306 | stepfx_
|  17.6% |  77.3% |  90 | getrusage
|   8.0% |  85.4% |  41 | stepfy_
|   6.2% |  91.6% |  32 | integr_
|   2.0% |  93.6% |  10 | gradco_
|   1.0% |  94.5% |   5 | __write
|   0.8% |  95.3% |   4 | filerx_
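A sampling profile like the one above also bounds what optimizing a single routine can buy. This is a minimal sketch using Amdahl's law (a standard estimate, not something the report itself prints), assuming a routine's share of samples approximates its share of run time. The helper names are hypothetical.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a routine accounting for `fraction`
 * of the run time is made `factor` times faster. */
double overall_speedup(double fraction, double factor)
{
    assert(fraction >= 0.0 && fraction <= 1.0 && factor > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}

/* Upper bound on the overall speedup if the routine cost nothing at all. */
double max_speedup(double fraction)
{
    assert(fraction >= 0.0 && fraction < 1.0);
    return 1.0 / (1.0 - fraction);
}
```

For stepfx_ at 59.8% of samples, doubling its speed gives an overall speedup of about 1.43, and removing it entirely could give at most about 2.5. For integr_ at 6.2%, the bound is only about 1.07, so under these assumptions it is a much less rewarding target.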
CrayPat
http://www.nersc.gov/nusers/systems/franklin/tools.php
http://docs.cray.com/cgi-bin/craydoc.cgi?this_sort=title;mode=Search;sq=%20product%3D%22CrayPat%22
% module load xt-craypat
% man pat_build
% man pat_report
EXAMPLE:
pat_build -f -g blas,io,blacs -D trace-max=1600 -u a.out
Debugging parallel code
DDT
TotalView
TotalView
• http://www.totalviewtech.com/index.html
• https://computing.llnl.gov/tutorials/totalview
• https://computing.llnl.gov/tutorials/totalview/exercise.html