Parallel Debugging Tools
New User Training 2017
Woo-Sun Yang, User Engagement Group, NERSC
February 23, 2017



Debugging
•  Why debugging?
  –  Your program crashes for an unknown reason
  –  Your program gives wrong results
•  How to find coding errors?
  –  Using print statements
    •  Insert print statements in strategic locations
    •  Can be difficult to know where the code fails and whether variables have incorrect values
    •  Recompile whenever you make a change: tedious and time-consuming
  –  Using debuggers
    •  You compile only once (generally)
    •  Can point to where the code fails
    •  They let you control the execution pace of your program and examine variables
    •  Useful tools can aid your detective work greatly
      –  Visualization and statistics
      –  Memory debugging
      –  MPI message queue


Parallel debuggers on Cori and Edison

•  Parallel debuggers with a graphical user interface
  –  DDT (Distributed Debugging Tool)
  –  TotalView
•  Specialized debuggers on Cori and Edison
  –  STAT (Stack Trace Analysis Tool)
    •  Collect stack backtraces from all (MPI) tasks
  –  ATP (Abnormal Termination Processing)
    •  Collect stack backtraces from all (MPI) tasks when an application fails
•  Valgrind
  –  Suite of debugging and profiling tools


DDT and TotalView
•  GUI-based traditional parallel debuggers
  –  Intuitive and simple to use; many useful tools
  –  Allow control of the program’s execution pace and, sometimes, execution path
    •  Set breakpoints, watchpoints and tracepoints
  –  Display the values of variables and expressions, and visualize arrays
    •  Check whether the program is executing as expected
  –  Memory debugging
  –  Message queue feature NOT working with Cray MPI
•  Work for C, C++ and Fortran programs with MPI, OpenMP, pthreads
  –  DDT supports CAF (Coarray Fortran) and UPC (Unified Parallel C), too
•  Maximum application size for the debuggers at NERSC
  –  DDT: up to 4096 MPI tasks on Cori (Haswell and KNL) and Edison
  –  TotalView: up to 512 MPI tasks on Cori (Haswell) and Edison
  –  Licenses shared among users and machines
•  For info
  –  https://www.allinea.com/products/ddt
  –  http://www.nersc.gov/users/software/debugging-and-profiling/ddt/
  –  http://www.roguewave.com/products/totalview
  –  http://www.nersc.gov/users/software/debugging-and-profiling/totalview/


How to build and run with DDT

•  Compile with -g to have debugging symbols; include -O0 for the Intel compiler
$ ftn -g -O0 -o jacobi_mpi jacobi_mpi.f90

•  Start an interactive batch session
$ salloc -N 1 -t 30:00 -p debug -C knl,quad,cache

•  Load the allineatools module to use DDT (the module name will change to ‘forge’ for future versions)
$ module load allineatools

•  Start DDT
$ ddt ./jacobi_mpi


If you are far away from NERSC
•  Remote X window application (GUI) over network: slow response
•  Two solutions
  –  Use NX to improve the speed
    •  Works with any X window application
    •  https://www.nersc.gov/users/network-connections/using-nx/ (general)
    •  http://portal.nersc.gov/project/mpccc/nx/NX_Tutorial/Start_Over.html (installation and quick user guide)
  –  Use Allinea Forge remote client
    •  Runs on your desktop/laptop
    •  Submit a debugging batch job from a NERSC machine and make the client reverse connect to the job
    •  Displays results in real time
    •  No license file required on your local desktop/laptop
    •  http://www.allinea.com/products/forge/download (downloading remote clients)


Using NX


Using Allinea remote client

(1) Select ‘Configure’ to create a configuration for a NERSC machine
(2) Create a configuration
  –  2nd entry for a MOM node
    •  Cori: cmom02 or cmom06
    •  Edison: edimom01, …, or edimom06
  –  Note that the paths will change for future versions


Using Allinea remote client (Cont’d)

(3) Select a machine
(4) Enter the NIM password


Using Allinea remote client (Cont’d)

(5) Submit a batch job on a NERSC machine and start DDT
$ salloc -N 1 -t 30:00 -p debug -C knl...
$ module load allineatools
$ ddt --connect ./jacobi_mpiomp
(6) Accept the request
(7) Set parameters and run


DDT window

•  For navigation
•  Parallel stack frame view is helpful in quickly finding out where each process is executing
•  To check the value of a variable, right-click on a variable or check the pane on the right
•  Sparklines to quickly show variation over MPI tasks
•  Processing entity to control


Navigation

•  Play/Continue
•  Pause
•  Add Breakpoint
•  Step Into
  –  To next line; if it’s a function call, enter the function
•  Step Over
  –  To next line in the current stack frame even if it’s a function call
•  Step Out
  –  Return to the caller function
•  Run To Line


Breakpoints, watchpoints and tracepoints

•  Breakpoint
  –  Stops execution when a selected line (breakpoint) is reached
  –  Double click on a line to create one; there are other ways, too
•  Watchpoints for variables or expressions
  –  Stops when a variable or an expression changes its value
•  Tracepoints
  –  When reached, prints which line of code is being executed and the listed variables
•  Can add a condition for an action point
  –  Useful inside a loop
•  Can be active or inactive


Many ways to check variables
•  Right click on a variable for a quick summary
•  Variable pane
•  Evaluate pane
•  Display variable values over processes (Compare across processes) or threads (Compare across threads)
•  MDA (Multi-dimensional Array) Viewer
  –  Visualization
  –  Statistics


Memory debugging
•  Why?
  –  To detect memory leaks
  –  To catch out-of-bound array references
  –  To catch other memory errors (“double free”, etc.)
  –  To see memory usage
•  For a statically-linked executable
  –  For non-threaded code
$ ftn -c -g -O0 myprog.f
$ static_linking_ddt_md ftn -o myprog myprog.o    # instead of ftn -o myprog myprog.o
  –  static_linking_ddt_md_th for threaded program
  –  Similarly for C and C++ codes
  –  static_linking_ddt_md and static_linking_ddt_md_th are utility scripts provided by NERSC
•  For a dynamically-linked executable, build as usual


Enabling memory debugging

•  For a dynamically-linked binary only
  –  Check ‘Preload the memory debugging library’
  –  Select the appropriate one from the ‘Language’ pull-down menu
•  Adding guard pages (default: 4 KB) before or after memory blocks for detecting out-of-bound heap array references
•  When you click ‘Details…’


Memory debugging – Overall Memory Stats

•  Tools > Overall Memory Stats
•  Memory leaks of 120 MB
•  memory_leaks.f from NERSC DDT web page


KNL MCDRAM usage on Cori

•  Memory blocks allocated in MCDRAM with memkind’s hbw_malloc calls and Fortran’s fastmem directives are annotated accordingly in DDT/7.0.


KNL MCDRAM usage on Cori (Cont’d)
•  With numactl
  –  In an interactive batch job:
    1.  Run ddt in background
        $ ddt &
    2.  Select ‘MANUAL LAUNCH (ADVANCED)’
    3.  Set run parameters and check ‘Memory Debugging’
    4.  Click ‘Listen’
    5.  Run an srun command:
        $ srun -n … numactl \
            --preferred=1 \
            allinea-client ./a.out
  –  --mem_bind=…: simply use srun’s --mem_bind=map_mem:… instead
  –  MCDRAM usage is not properly annotated in version 7.0. Reported to Allinea.


TotalView

$ salloc -N 1 -t 30:00 -p debug
$ module load totalview
$ export OMP_NUM_THREADS=6
$ totalview srun -a -n 4 ./jacobi_mpiomp

Then,
•  Click OK in the ‘Startup Parameters - srun’ window
•  Click ‘Go’ button in the main window
•  Click ‘Yes’ to the question ‘Process srun is a parallel job. Do you want to stop the job now?’


TotalView (cont’d)

•  Root window
  –  State of MPI tasks and threads; members denoted roughly as ‘rank.thread’
•  Process window
  –  For navigation
  –  For selecting MPI task and thread
  –  To see the value of a variable, right-click on a variable to “dive” on it or just hover mouse over it
  –  Breakpoints, etc.


Viewing variables

•  Variable window
•  Visualization and stats
  –  Tools > Visualize
  –  Tools > Statistics


Memory debugging with MemoryScape
•  MemoryScape integrated into TotalView for memory debugging
  –  Memory leaks
  –  Memory usage
  –  Memory corruption
  –  …
•  A statically-linked executable
$ module load totalview
$ CC -g -O0 -o memry_leaks memory_leaks.o ${TVMEMDEBUG_POST_OPTS}
•  A dynamically-linked executable, build as usual
$ CC -dynamic -g -O0 -o memry_leaks memory_leaks.o


Memory debugging with MemoryScape
•  Start TotalView and enable memory debugging in the ‘Startup Parameters’ window
•  Proceed to use TotalView as usual
•  For memory-related issues, open MemoryScape from the Debug pull-down menu


Memory debugging examples

•  Corrupted guard blocks


STAT (Stack Trace Analysis Tool)
•  Gathers stack backtraces (showing the function calling sequences leading up to the ones in the current stack frames) from all (MPI) processes and merges them into a single file (*.dot)
  –  Results displayed graphically as a call tree showing the location in the code that each process is executing and how it got there
  –  Can be useful for debugging a hung application
  –  With the info learned from STAT, can investigate further with DDT or TotalView
•  Works for MPI, CAF and UPC, but not OpenMP
•  STAT commands (after loading the ‘stat’ module)
  –  stat-cl: invokes STAT to gather stack backtraces
  –  stat-view: a GUI to view the results
  –  stat-gui: a GUI to run STAT or view results
•  For more info:
  –  ‘intro_stat’, ‘stat-cl’, ‘stat-view’ and ‘stat-gui’ man pages
  –  https://computing.llnl.gov/code/STAT/stat_userguide.pdf
  –  http://www.nersc.gov/users/software/debugging-and-profiling/stat-2/


Hung application with STAT
•  If your code hangs in a consistent manner, you can use STAT to see if and where some MPI ranks are stuck.
•  Currently, one known way to use STAT is as follows.

$ ftn -g -o jacobi_mpi jacobi_mpi.f90      # with usual optimization flags, if any
$ salloc -N 1 -t 30:00 -p debug -C knl,quad,cache
...
$ srun -n 4 ./jacobi_mpi &
[1] 93834
$ module load stat
$ stat-cl -i 93834                         # -i to get source line numbers
…
Attaching to application...
Attached!
Application already paused... ignoring request to pause
Sampling traces...
Traces sampled!
…
Resuming the application...
Resumed!
Merging traces...
Traces merged!
Detaching from application...
Detached!

Results written to /global/cscratch1/sd/wyang/debugging/stat_results/jacobi_mpi.0001

$ ls -l stat_results/jacobi_mpi.0001/*.dot
-rw-r----- 1 wyang wyang 2768 Feb 20 21:24 stat_results/jacobi_mpi.0001/00_jacobi_mpi.0001.3D.dot
$ stat-view stat_results/jacobi_mpi.0001/00_jacobi_mpi.0001.3D.dot

•  STAT samples stack backtraces a few times
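When STAT is run several times on a hung job, it can be handy to open the most recent merged trace without typing the full path. The following is a minimal sketch, assuming each stat-cl run writes a new numbered subdirectory under stat_results (as the jacobi_mpi.0001 name suggests). `newest_dot` is a hypothetical helper name, not a STAT command, and the second result file in the demo is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of STAT): print the most recently modified
# .dot file found one level below a results directory (default: stat_results).
newest_dot() {
    dir=${1:-stat_results}
    newest=
    for f in "$dir"/*/*.dot; do
        [ -e "$f" ] || continue                 # glob matched nothing
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f                           # keep the newer file
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Demo on a throwaway directory tree; file names here are illustrative only.
tmp=$(mktemp -d)
mkdir -p "$tmp/jacobi_mpi.0001" "$tmp/jacobi_mpi.0002"
touch -t 201702202124 "$tmp/jacobi_mpi.0001/00_jacobi_mpi.0001.3D.dot"
touch -t 201702202130 "$tmp/jacobi_mpi.0002/00_jacobi_mpi.0002.3D.dot"
latest=$(newest_dot "$tmp")
echo "${latest##*/}"
rm -rf "$tmp"
```

On a real system the result could be passed straight to the viewer, for example `stat-view "$(newest_dot)"`.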


Hung application with STAT (Cont’d)

•  Rank 3 is here
•  Ranks 1 & 2 are here
•  Rank 0 is here


ATP (Abnormal Termination Processing)
•  ATP gathers stack backtraces from all processes if an application fails
  –  Invokes STAT underneath
  –  Output in atpMergedBT.dot and atpMergedBT_line.dot (which shows source code line numbers), which are to be viewed with stat-view
•  By default, the atp module is loaded on Cori and Edison, but ATP is not enabled; to enable:
export ATP_ENABLED=1        # sh/bash/ksh
setenv ATP_ENABLED 1        # csh/tcsh
•  Can get core dumps (core.atp.jobid.rank), too, by setting coredumpsize unlimited:
ulimit -c unlimited         # sh/bash/ksh
unlimit coredumpsize        # csh/tcsh
   but they do not represent the exact same moment in time (therefore the location of a failure can be inaccurate)
•  For more info
  –  ‘intro_atp’ man page
  –  http://www.nersc.gov/users/software/debugging-and-profiling/stat-and-atp/


Hung application with ATP
•  Force the generation of backtraces from a hung application
•  For the following to work, must have used
  –  ‘export ATP_ENABLED=1’ in batch script
  –  ‘export FOR_IGNORE_EXCEPTIONS=true’ in batch script for Intel Fortran
  –  ‘-fno-backtrace’ at compile/link time for GNU Fortran

•  Find the job step ID
$ sacct -j 4097861
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
...
4097861.0    jacobi_mp+                nstaff          4    RUNNING      0:0
...

•  Kill the application on a MOM node
$ ssh edimom02
$ scancel -s ABRT 4097861.0
$ exit
$ cat slurm-4097861.out
Application 4097861 is crashing. ATP analysis proceeding...
...
Process died with signal 6: 'Aborted'
View application merged backtrace tree with: stat-view atpMergedBT.dot
...
$ module load stat
$ stat-view atpMergedBT.dot      # or stat-view atpMergedBT_line.dot
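Picking the job step ID out of the sacct listing by eye works for a single job; in a script the same lookup can be done on sacct's machine-readable output. The sketch below assumes sacct is invoked with its standard `--parsable2 --noheader --format=JobID,State` options, which give one `JobID|State` line per entry, and that application steps have the form `jobid.N` as in the listing above. `running_steps` is a hypothetical helper name, and the canned input is illustrative rather than real output.

```shell
#!/bin/bash
# Hypothetical helper: read "JobID|State" lines on stdin and print the IDs
# of job steps (jobid.N) that are still in the RUNNING state.
running_steps() {
    awk -F'|' '$2 == "RUNNING" && $1 ~ /^[0-9]+\.[0-9]+$/ { print $1 }'
}

# Canned input shaped like the output of:
#   sacct -j 4097861 --parsable2 --noheader --format=JobID,State
# (illustrative values, not captured from a real job)
sample='4097861|RUNNING
4097861.extern|RUNNING
4097861.0|RUNNING
4097861.1|COMPLETED'

steps=$(printf '%s\n' "$sample" | running_steps)
echo "$steps"
```

The printed step ID is what would be handed to `scancel -s ABRT` on the MOM node.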


Valgrind

•  Suite of debugging and profiler tools
•  Tools include
  –  memcheck: memory error and memory leaks detection
  –  massif, dhat (exp-dhat): heap profilers
  –  cachegrind: a cache and branch-prediction profiler
  –  callgrind: a call-graph generating cache and branch prediction profiler
  –  helgrind, drd: pthreads error detectors
•  For info:
  –  http://valgrind.org/docs/manual/manual.html


Valgrind’s memcheck

$ module load valgrind
$ ftn -dynamic -g -O0 memory_leaks.f $VALGRIND_MPI_LINK
$ salloc -N 1 -t 30:00 -p debug -C knl
$ srun -n 2 valgrind --leak-check=full --log-file=%p ./a.out
$ ls -l
...
-rw-r--r-- 1 wyang wyang 7550 Feb 21 23:36 91835
-rw-r--r-- 1 wyang wyang 7550 Feb 21 23:36 91836

•  Could have explicitly added ‘--tool=memcheck’
•  Let’s look at the report for process 91835

$ more 91835
...
==91835== LEAK SUMMARY:
==91835==    definitely lost: 83,886,880 bytes in 20 blocks
==91835==    indirectly lost: 0 bytes in 0 blocks
==91835==      possibly lost: 41,943,440 bytes in 10 blocks
==91835==    still reachable: 103,903 bytes in 74 blocks
==91835==         suppressed: 0 bytes in 0 blocks
...

•  Can suppress spurious error messages by using a suppression file (--suppressions=/path/to/directory/file)
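Because --log-file=%p produces one report per MPI process, a quick per-file summary helps to see whether every rank leaks the same amount. This is a minimal sketch, assuming each log contains a "definitely lost:" summary line in the format shown above. `definitely_lost` is a hypothetical helper name, not a Valgrind tool.

```shell
#!/bin/bash
# Hypothetical helper: for each memcheck log given as an argument, print
# "<log file> <bytes definitely lost>" using the LEAK SUMMARY line.
definitely_lost() {
    for log in "$@"; do
        # Field 4 of "==PID== definitely lost: N bytes in M blocks" is N;
        # strip the thousands separators so the value is a plain integer.
        bytes=$(awk '/definitely lost:/ { gsub(",", "", $4); print $4; exit }' "$log")
        printf '%s %s\n' "$log" "${bytes:-0}"
    done
}

# Demo on a copy of the summary lines from the report above.
tmp=$(mktemp -d)
cat > "$tmp/91835" <<'EOF'
==91835== LEAK SUMMARY:
==91835==    definitely lost: 83,886,880 bytes in 20 blocks
==91835==    indirectly lost: 0 bytes in 0 blocks
EOF
result=$(definitely_lost "$tmp/91835")
echo "$result"
rm -rf "$tmp"
```

In the run shown above this would be called as `definitely_lost 91835 91836`.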


Valgrind’s massif

•  For profiling heap memory usage

$ ftn -g -O2 memory_leaks.f
$ srun -n 2 -c 128 valgrind --tool=massif ./a.out
$ ls -lrt
…
-rw------- 1 wyang wyang 50233 Feb 21 23:55 massif.out.92841
-rw------- 1 wyang wyang 81113 Feb 21 23:55 massif.out.92842
$ ms_print massif.out.92841
...
[ms_print ASCII graph: heap usage in MB (vertical axis, 0 to 120.4) versus time in instructions executed (horizontal axis, 0 to 628.0 Mi); usage rises in steps up to the peak snapshot]

Number of snapshots: 95
 Detailed snapshots: [14, 29, 44, 48, 50, 51, 61, 71, 81, 91 (peak)]
...

•  Symbols in the graph
  –  ‘:’: normal snapshot; basic info provided
  –  ‘@’: detailed snapshot where detailed info is provided
  –  ‘#’: peak snapshot where the peak heap usage is
•  Horizontal axis: time (instructions executed)
•  This example strongly suggests memory leaks
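The peak can also be read directly from the raw massif.out.<pid> file without going through the graph. The sketch below assumes the file stores each snapshot as key=value lines such as `snapshot=` and `mem_heap_B=` (the layout written by Valgrind releases I am aware of; check your own file before relying on it). `massif_peak` is a hypothetical helper name, and the demo file is a hand-made excerpt built from the snapshot numbers that ms_print reports for this run.

```shell
#!/bin/bash
# Hypothetical helper: print the snapshot with the largest mem_heap_B value
# found in a massif output file (assumed to hold key=value lines).
massif_peak() {
    awk -F= '
        $1 == "snapshot" { snap = $2 }
        $1 == "mem_heap_B" && (peak == "" || $2 + 0 > max) {
            max = $2 + 0; bytes = $2; peak = snap
        }
        END { printf "snapshot %s: %s bytes\n", peak, bytes }
    ' "$1"
}

# Hand-made excerpt for illustration (a real file has more snapshots and
# additional header and heap-tree lines).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
snapshot=82
time=531809757
mem_heap_B=96761707
mem_heap_extra_B=101149
mem_stacks_B=0
heap_tree=empty
snapshot=91
time=658233924
mem_heap_B=126130750
mem_heap_extra_B=129226
mem_stacks_B=0
heap_tree=peak
snapshot=94
time=658456870
mem_heap_B=125935407
mem_heap_extra_B=121233
mem_stacks_B=0
heap_tree=empty
EOF
peak=$(massif_peak "$tmp")
echo "$peak"
rm -f "$tmp"
```

Heap usage that is still close to the peak in the last snapshot, as here, is consistent with memory that was never freed.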


Valgrind’s massif (Cont’d)

...
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 82    531,809,757       96,862,856       96,761,707       101,149            0
...
 91    658,233,924      126,259,976      126,130,750       129,226            0
99.90% (126,130,750B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.66% (125,830,320B) 0x4E3FF6A: _mm_malloc (in /opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64_lin/libintlc.so.5)
| ->99.66% (125,830,320B) 0x40AF1F: for_allocate (in /global/cscratch1/sd/wyang/debugging/memory_leaks)
| | ->33.22% (41,943,440B) 0x4033AF: MAIN__ (memory_leaks.f:41)
| | | ->33.22% (41,943,440B) 0x402FDC: main (in /global/cscratch1/sd/wyang/debugging/memory_leaks)
| | |
| | ->33.22% (41,943,440B) 0x403621: MAIN__ (memory_leaks.f:51)
| | | ->33.22% (41,943,440B) 0x402FDC: main (in /global/cscratch1/sd/wyang/debugging/memory_leaks)
| | |
| | ->33.22% (41,943,440B) 0x403898: MAIN__ (memory_leaks.f:54)
| |   ->33.22% (41,943,440B) 0x402FDC: main (in /global/cscratch1/sd/wyang/debugging/memory_leaks)
| |
| ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
|
->00.24% (300,430B) in 1+ places, all below ms_print's threshold (01.00%)

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
...
 94    658,456,870      126,056,640      125,935,407       121,233            0


National Energy Research Scientific Computing Center
