January 19, 2005 SGI® Altix™ Using The Intel VTune Performance Analyzer Reiner Vogelsang SGI GmbH [email protected]


January 19, 2005

SGI® Altix™ Using The Intel VTune Performance Analyzer

Reiner Vogelsang, SGI GmbH

reiner@sgi.com


Module Objectives

After completing this module you will be able to:
• profile an application using VTune
• run an experiment with multiple performance counters
• generate a callgraph of your application


VTune – Purpose

Helps to identify and characterize performance issues by:
• Collecting performance data
  – CPU cycles (time)
  – Micro-architectural events of the processor
  – Platform resource utilization
• Organizing and displaying the data
• Identifying performance 'hotspots'
• Suggesting improvements (currently Windows only!)


VTune: Status

• Native: VTune for Linux 3.0
  – Runs on any IA-32 or Itanium® system with a recent Linux version
    – Some kernel and GLIBC dependencies
  – Full Eclipse-based GUI only for IA-32 today
    – Due to Eclipse issues with 64-bit
    – Simple GUIs for IA-64 available
  – Command-line version for Itanium® and EM64T
    – But graphical viewers for results
    – Eclipse-based release for 64-bit systems later in 2005
• Remote Data Collection
  – Allows the full Windows GUI to be used for Linux too


VTune: Features

• Sampling of Execution Addresses
  – Profiling based on processor event counters
• Call Graph Profiling: instrumented analysis
  – Call tree, number of calls, timing information
  – Executing instrumented code
• Tracking of System Performance Counters
  – Performance Monitor (perfmon) style counters
  – Extended Performance DLL APIs; SDK available!
• Intel® Tuning Assistant: interprets the results (Windows or Remote Data Collection only)


VTune: Example 1

• General status commands

    vtl query -lc
  – Lists all collectors (sampling and callgraph for 2.0)

    vtl -help -c sampling
  – Lists all events available for EBS (event-based sampling)

• Compile the code with -g

• Create and run a Sampling activity

    setenv OMP_NUM_THREADS 4
    vtl activity -c sampling -app dplace,"-c 4-7 ./untrim_elbe" run
  – Creates and runs a single Sampling collector activity with the application 'dplace -x 2 -c 4-7 ./untrim_elbe' and default settings (Instructions Retired and Cycles)

• Invoke the viewer to display the collected data

    vtl view -gui
  – Displays the last activity by default
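The slide uses csh syntax (setenv). For Bourne-style shells, the same sequence can be sketched as below. This is a minimal dry-run sketch, not part of the original slides: the `run` helper and the `DRY_RUN`, `cpus`, and `app` variables are hypothetical names introduced for illustration, and with the default `DRY_RUN=1` the vtl commands are only printed. Because the shell strips the inner double quotes, vtl receives `dplace,-c 4-7 ./untrim_elbe` as a single argument, just as in the csh form.

```shell
#!/bin/sh
# Dry-run sketch of the Example 1 workflow for sh/bash.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# sh equivalent of the csh line "setenv OMP_NUM_THREADS 4"
OMP_NUM_THREADS=4
export OMP_NUM_THREADS

cpus="4-7"            # CPU range handed to dplace (value from the slide)
app="./untrim_elbe"   # application binary, compiled with -g

# Create and run the sampling activity, then open the viewer.
run vtl activity -c sampling -app "dplace,-c $cpus $app" run
run vtl view -gui
```

Setting `DRY_RUN=0` in the environment would execute the commands on a system where vtl and dplace are installed.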


VTune: Process Display


VTune: Module Display


VTune: Hotspots


VTune: Source Code Display


VTune: Multiple Performance Counter Events

• VTune lets you sample customized subsets of performance counter events.

• Important for stall cycle analysis and DEAR analysis (DEAR = Data Event Address Registers)

    vtl activity -d 600 -c sampling \
      -o "-cpu_mask 8-15 -ec en='L3_READS-ALL-MISS', \
      en='LOADS_RETIRED',en='STORES_RETIRED', \
      en='FP_OPS_RETIRED'" \
      -app dplace,"-x2 -c8-15 ./untrim_elbe" run

  – The example collects all loads, stores, floating-point operations, and L3 misses due to reads.

  – The application is executed once per event, that is, as many times as there are events.
  – The viewer lets you sort hotspots by each individual event.
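When the event list changes often, the -ec option can be assembled from a list of event names. The following is a minimal sketch for a Bourne-style shell and not part of the original slides: the variable names are hypothetical, the event names and all other options are taken from the command above, and the sketch assumes the en='...' entries may be joined with commas and no whitespace in between. It only prints the resulting command.

```shell
#!/bin/sh
# Sketch: build the multi-event sampling command from a list of event names.
# Dry run only: the command is printed, never executed.

# Event names as they appear on the slide, separated by spaces.
events="L3_READS-ALL-MISS LOADS_RETIRED STORES_RETIRED FP_OPS_RETIRED"

# Join the events as en='EVENT' entries separated by commas.
ec=""
for ev in $events; do
    ec="${ec:+$ec,}en='$ev'"
done

# Remaining options (duration, CPU mask, dplace arguments) copied from the slide.
cmd="vtl activity -d 600 -c sampling -o \"-cpu_mask 8-15 -ec $ec\" -app dplace,\"-x2 -c8-15 ./untrim_elbe\" run"

printf '%s\n' "$cmd"
```

Adding or removing a name in `events` changes the number of entries in the -ec list and therefore, per the note above, the number of application runs.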


VTune: Module Display, Multiple Events


VTune: Hotspots, Multiple Events


VTune: Hotspots As Charts, Multiple Events


VTune: Source of L3_MISS Hotspot


VTune: Example 3, Callgraph

• Shows the caller-callee relationship graphically

• Highlights the hot path of an application

    vtl activity -d 600 -c callgraph -app ../src/adi \
      -moi ../src/adi run

  – It is important to specify the path to the application and the module of interest (-moi) in a unique manner.
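One way to keep the two paths unique is to resolve the binary to an absolute path once and pass the same string to both -app and -moi. The sketch below is an assumption about how to meet that requirement, not something the slides prescribe: `abs_path` is a hypothetical helper, and the script only prints the vtl command.

```shell
#!/bin/sh
# Sketch: use one absolute path for both -app and -moi.
# Dry run only: the vtl command is printed, never executed.

# Turn a possibly relative file path into an absolute one.
abs_path() {
    dir=$(cd "$(dirname "$1")" 2>/dev/null && pwd) || return 1
    echo "$dir/$(basename "$1")"
}

target=${1:-../src/adi}   # path from the slide; override with the first argument

# If the directory cannot be resolved, keep the path exactly as given.
app=$(abs_path "$target") || app=$target

echo vtl activity -d 600 -c callgraph -app "$app" -moi "$app" run
```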


VTune: Callgraph + Hot Path


VTune: Summary

• VTune is particularly useful for:

  – Hot path detection within a caller-callee relationship

  – Collecting and displaying multiple performance counter events for stall cycle analysis or DEAR analysis
