TraceRep IWSG'15
TRACEREP: GATEWAY FOR SHARING AND COLLECTING TRACES IN HPC SYSTEMS
Iván Pérez
Enrique Vallejo
José Luis Bosque
University of Cantabria
Overview
• HPC Traces - Introduction
  • Traces for Application Developers
  • Traces for Computer Architects
  • Traces - Objections
  • Goals
• BSC Trace Tools
  • Extrae
  • Paraver
• TraceRep
  • Architecture
  • Design
  • Implementation
  • Limitations
  • Snapshots
• Conclusions and Future Work
1. HPC Traces – Introduction
• HPC traces are sequences of events and messages recorded during the execution of a parallel HPC program.
1.1. Traces for Application Developers
[Figure: trace view annotated with: Computation, Synchronization Waits, Point-to-Point Messages, Load Imbalance]
Evaluation, tuning and optimization of applications
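As an illustration of how a developer might quantify the load imbalance visible in such a trace, the following minimal sketch sums the compute time of each rank and reports the ratio of average to maximum. The `Interval` record and its state names are hypothetical simplifications for this example, not the trace format used by Extrae or Paraver.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One state interval of a rank (hypothetical, simplified trace record)."""
    rank: int
    state: str    # "compute" or "wait" in this sketch
    start: float  # seconds
    end: float    # seconds


def compute_time_per_rank(trace):
    """Sum the time each rank spends in the "compute" state."""
    totals = defaultdict(float)
    for interval in trace:
        if interval.state == "compute":
            totals[interval.rank] += interval.end - interval.start
    return dict(totals)


def load_balance(trace):
    """Average compute time divided by the maximum; 1.0 means balanced."""
    totals = compute_time_per_rank(trace)
    if not totals or max(totals.values()) == 0:
        raise ValueError("trace has no compute time")
    average = sum(totals.values()) / len(totals)
    return average / max(totals.values())


# Toy trace: rank 1 computes twice as long as rank 0, which then waits.
toy_trace = [
    Interval(rank=0, state="compute", start=0.0, end=2.0),
    Interval(rank=0, state="wait", start=2.0, end=4.0),
    Interval(rank=1, state="compute", start=0.0, end=4.0),
]
```

A value well below 1.0 suggests that some ranks sit idle in synchronization waits while others are still computing.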
1.2. Traces for Computer Architects
• Evaluate computer architectures.
• Workloads for feeding simulators.
[Figure: Application Binaries and Application Execution feed an Extraction Tool; a Simulator is run with Hardware models 1, 2 and 3, producing Stats 1, 2 and 3]
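The workflow in the figure, one trace replayed against several hardware models, can be sketched as follows. This is only an illustration under stated assumptions: the `Message` record, the `HardwareModel` parameters and the latency-plus-bandwidth cost formula are hypothetical and far simpler than a real architecture simulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical trace record: one point-to-point message."""
    src: int
    dst: int
    size_bytes: int


@dataclass(frozen=True)
class HardwareModel:
    """Hypothetical network model: fixed latency plus a bandwidth term."""
    name: str
    latency_s: float
    bandwidth_bytes_per_s: float


def simulate(trace, model):
    """Replay every message through the model and return simple statistics."""
    total = sum(
        model.latency_s + msg.size_bytes / model.bandwidth_bytes_per_s
        for msg in trace
    )
    return {"model": model.name, "messages": len(trace), "total_transfer_s": total}


# The same trace is fed to every model, so the statistics are comparable.
trace = [Message(0, 1, 1_000_000), Message(1, 0, 2_000_000)]
models = [
    HardwareModel("model-1", latency_s=1e-6, bandwidth_bytes_per_s=1e9),
    HardwareModel("model-2", latency_s=1e-6, bandwidth_bytes_per_s=2e9),
]
stats = [simulate(trace, model) for model in models]
```

Because the workload is fixed by the trace, differences between the resulting statistics can be attributed to the hardware models alone.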
1.3. Traces - Objections
• Complexity of tools and environment.
• Limited access to HPC clusters.
• Traces can reach very large sizes.
• Traces are often not shared between researchers.
  • Traces are hard to obtain and distribute.
  • The tracing effort is not recognized.
1.4. TraceRep - Goals
• User-friendly interface to collect traces.
  • Support for multiple clusters.
  • Easy to incorporate new clusters.
• Public trace repository.
  • Computer architects can access traces of parallel applications for their experiments.
  • Users can upload their own traces for the community.
• Author encouragement:
  • Authorship: Users can set Creative Commons licenses which protect the authorship of their traces.
  • Citation of related work: Users can add a citation (.bib file) of a paper which studied the traced application, so it can be cited when the trace is used.
2.1. Extrae
• Collects information during the program execution and generates traces:
  • Runtime entries and exits, hardware counters, user functions, periodic samples…
• Supported programming models:
  • MPI, OpenMP, CUDA, OpenCL, pthreads, OmpSs, Java, Python.
• Supported platforms:
  • Linux clusters, BlueGene/Q, Cray, nVidia GPUs, Intel Xeon Phi, ARM, Android.
[Figure: Extrae configuration file]
2.2. BSC Tools - Paraver
• Very flexible visualization tool for trace files.
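3.1. TraceRep - Architecture
[Figure: architecture diagram. Users reach the Gateway from a Web Browser over the Internet, with User / Password. Gateway: Drupal 7 (Core Modules, Third Party Modules, TraceRep Module) on Apache 2, PHP and MySQL. The Gateway reaches the HPC clusters over the Internet via SSH/SFTP, with a TraceRep User / Password. HPC clusters: TraceRep Scripts, Extrae, Resource Manager.]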
3.2. TraceRep - Design
[Figure: design diagram. Gateway: Authentication, Source Code Upload, Create Experiment, Trace Upload, Trace Repository (Anonymous, Registered). Cluster: Environment setup, Compilation, Experiment launch.]
3.3. TraceRep - Implementation
• Drupal’s modules covered most of the features.
• The trace extraction service has implementations on both sides:
  • Gateway side: new Drupal module.
  • Cluster side: Python scripts adapted to the specific cluster.
[Figure: implementation diagram. Drupal: Trace Extraction, Experiment, Periodic Task ("Is the experiment over?"). Cluster: Cluster Filesystem with the TraceRep directory (Makefile, Scripts), Compilation Tools, Extrae, Resource Manager.]
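The periodic task that asks "Is the experiment over?" can be sketched as a single polling pass. This is a minimal illustration in Python rather than the actual Drupal module; `JobState`, `periodic_task` and the two injected callables are hypothetical names, and in a real deployment the callables would query the cluster's resource manager and fetch the trace over SSH/SFTP.

```python
from enum import Enum


class JobState(Enum):
    """Hypothetical experiment states as seen from the gateway."""
    PENDING = "pending"
    RUNNING = "running"
    FINISHED = "finished"


def periodic_task(experiments, query_state, collect_trace):
    """Run one polling pass over the set of open experiments.

    query_state(exp_id) returns a JobState and collect_trace(exp_id)
    retrieves the trace; both are injected so the loop stays testable.
    Returns the ids whose traces were collected in this pass.
    """
    collected = []
    for exp_id in sorted(experiments):
        if query_state(exp_id) is JobState.FINISHED:
            collect_trace(exp_id)
            collected.append(exp_id)
    # Finished experiments are dropped so the next pass skips them.
    for exp_id in collected:
        experiments.discard(exp_id)
    return collected


# Demo with in-memory stand-ins for the cluster.
fake_states = {"exp-1": JobState.FINISHED, "exp-2": JobState.RUNNING}
fetched = []
open_experiments = {"exp-1", "exp-2"}
done = periodic_task(open_experiments, fake_states.__getitem__, fetched.append)
```

Injecting the cluster access as callables keeps the gateway logic independent of any particular cluster, which matches the split between a generic gateway module and cluster-specific scripts.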
3.4. TraceRep – Current prototype limitations
• Security:
  • TraceRep users upload code to the HPC clusters.
  • Alternatives:
    • Restricted privileges for the TraceRep user account.
    • Require a per-user cluster account to extract traces.
• Compilation:
  • Paths to compilers and libraries can vary from cluster to cluster.
  • Compilation constraints: a generic Makefile is currently used for all source codes. Applications that use complex build tools are currently not supported.
  • Alternative: provide a unified environment for compilation.
• Storage:
  • Storage in the gateway server is limited (limitation of the service used).
  • Alternative: $$$
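One way to cope with compiler and library paths that differ between clusters while keeping a single generic Makefile is to pass the cluster-specific values as variable overrides on the `make` command line. The sketch below only illustrates that idea; the cluster names, paths and variable names are hypothetical and are not taken from the TraceRep scripts.

```python
# Hypothetical per-cluster settings; a real deployment would load these
# from the configuration of each registered cluster.
CLUSTERS = {
    "cluster-a": {"CC": "/opt/mpi/bin/mpicc", "EXTRAE_HOME": "/opt/extrae"},
    "cluster-b": {"CC": "/usr/local/bin/mpicc", "EXTRAE_HOME": "/apps/extrae"},
}


def make_command(cluster, target="all"):
    """Build the make invocation for one cluster as an argument list.

    Variables given as NAME=value on the make command line override the
    assignments inside the Makefile, so the Makefile itself stays generic.
    """
    settings = CLUSTERS[cluster]  # KeyError for an unregistered cluster
    overrides = [f"{name}={value}" for name, value in sorted(settings.items())]
    return ["make", *overrides, target]


command = make_command("cluster-a")
```

The returned list can be handed to `subprocess.run` on the cluster; returning an argument list instead of a shell string avoids quoting problems with unusual paths.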
3.5. Snapshots
http://tracerep.unican.es
4. TraceRep – Conclusions
• Traces are very useful for HPC parallel application developers and computer architects.
• TraceRep provides a user-friendly interface to collect and share traces.
• It encourages trace sharing through trace licensing and citations.
• Some limitations remain to be addressed, regarding security, compilation and storage.
4. TraceRep – Future work
• Alternative frameworks to replace the Drupal prototype:
  • Liferay [1]
  • Apache Airavata [2]
• Improve the compilation toolchain to present a consistent view on different clusters and allow for more complex codes.
• Exploiting the advanced features of Paraver is complex. We are looking for a way to integrate Paraver into TraceRep.
[1] “Liferay,” 2015. Available: http://www.liferay.com/
[2] “Apache Airavata architecture overview,” 2015. Available: http://airavata.apache.org/architecture/overview.html
Thank you for your attention