Intel® Cluster Tools
Stephen Blair-Chappell
Technical Consulting Engineer
Intel Compiler Labs
1/11/2010
Copyright © 2007, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners
Intel® Software Development Products Overview
Agenda
• Introduction
• Intel® Software Development Products overview
• Cluster Toolkit and components
What Are the Biggest Bottlenecks Today in Creating Parallel Applications?
Source: Developing Custom Parallel Computing Applications, Simon Management Group, September 2006
Cluster Market Rapidly Growing
Source: *IDC HPC Technical Computing And Cluster Market Update May, 2006
Clusters are now the majority of the HPC market
Why? Less expensive hardware. Easier implementation.
[Chart: Cluster Market Penetration. Share of Clusters vs. Non-Clustered systems, 0% to 100%, by quarter from 03Q1 to 06Q2]
Definition of Clusters
• Distributed computing systems which communicate with each other over an interconnect
• Examples of interconnect:
– Gigabit Ethernet
– InfiniBand*
– Myrinet*
– Quadrics*
New releases of Intel® Cluster Tools
Intel software tools make clusters easier to program and optimize
– Intel® Cluster Toolkit 3.0
• Bundle with single installer and license
– Intel® MPI Library 3.0
• Automated fabric selection and performance optimizations
– Intel® Trace Analyzer and Collector 7.0
• Trace file comparison and analysis of the effects of code changes on MPI performance
– Intel® Math Kernel Library 9.0 Cluster Edition
• Optimizations for the latest dual- and quad-core processors
– Cluster OpenMP* for Intel® C++ and Fortran compilers
• The first commercially available OpenMP for clusters
• Licensing and pricing for use by a wider range of cluster users
Intel® MPI Library 3.0
A high performance universal MPI solution enabling applications to run across multiple network fabrics
• Features
– Easy to install and configure
– Save development resources and improve application quality
– Job scheduler support: PBS Pro*, Torque*, LSF*, etc.
– Debugger support: IDB, DDT*, gdb, TotalView*
– Based on the widely used ANL MPICH2
• What’s New
– Automated fabric selection
– Enhanced process pinning
– Performance optimizations and tuning options
– Full thread support (MPI_THREAD_MULTIPLE)
RIKEN
“Intel’s MPI and Cluster Tools provide us the best cluster development environment.”
Dr. Takahiro Koichi
Computational Astro Physics Laboratory
RIKEN, Japan
Intel® Trace Analyzer and Collector 7.0
The world’s best analysis tool for MPI applications
• Features
– Increase productivity and cluster application performance
– Very low impact
– Excellent scalability on time and processors
– GUI on Linux* and Windows*
• What’s New
– Comparison of multiple trace files
– Timeline display for performance counters
– Powerful new aggregation and filtering functions
– Better and faster GUI
– MPI Checking - correctness checking library
EM Software Systems
“Intel Trace Analyzer and Collector have proven to be very valuable tools to help understand FEKO parallel communication patterns and consequently in optimizing the message passing call that result in an extremely well performing electromagnetic ISV cluster application”
Dr. Ing. Ulrich Jakobus, Technical Director
MPI Message Checking Case Study
LSTC LS-DYNA* Transient Finite Element Analysis Application
Images and Logo copyright Livermore Software Technology Corporation
Overview
LSTC LS-DYNA* is a general-purpose transient dynamic finite element program capable of simulating complex real world problems in Automobile Design, Aerospace, Manufacturing, and Bioengineering.
Solution
Trace Analyzer and Collector helped LSTC to debug MPI code and create optimized code.
"At LSTC we know how difficult MPI programming can be and invest considerable effort into making LS-Dyna robust. Message Checking with Intel Trace Analyzer and Collector identified a very subtle issue before it became a problem, saving us a significant amount of potential future debugging. No other tool of which I am aware has this capability or could have detected this problem."
Brian Wainscott, Developer, LSTC/LS-Dyna*
Intel® Math Kernel Library Cluster Edition 9.0
A highly optimized math library for maximum performance
• What’s New
– Optimizations for the new multi-core Intel® Xeon® 5100 and 5300 series processors
– New VML Functions
• floor, ceil, round, trunc, hypot, etc.
– New FGMRES iterative sparse solver
– FFTW Interface in Fortran & C
– New User’s Guide and Linux man pages
ABAQUS
“By adopting the Intel MKL DGEMM libraries, our standard timing improved between 43% and 71%, which is very impressive”
Matt Dunbar, Software Developer
Cluster OpenMP* for Intel® C++ and Fortran Compilers
INNOVATION for OpenMP: The first commercially available OpenMP for Clusters
RWTH Aachen University
“RWTH Aachen has used OpenMP to parallelize many of our scientific applications because it is easier to use than MPI and provides comparable performance on large shared-memory machines. We are in the process of evaluating Intel's Cluster OpenMP. We believe that Cluster OpenMP will allow some of our OpenMP applications to run on clustered Intel processors at lower cost and with less effort than either rewriting in MPI or buying additional large SMP machines.”
Dieter an Mey
RWTH Aachen University
Available as add-on to Intel Compilers!
• Features
– Bringing the ease of OpenMP to cluster systems
– Run (slightly modified) OpenMP code on a commodity cluster
– Exploit existing SMP OpenMP codes on cheaper clusters
– Equivalent OpenMP performance compared to SMP machine with the same number of CPUs
• Suitable Programs
– Scale with OpenMP on SMP
– Have good data locality
– Use synchronization sparingly
Applies to majority of OpenMP* codes easily
• Only one new statement “sharable” is required
– Used at the declaration (or allocation) point of variables which are shared between threads
• In many cases the compiler can deduce the need for a sharable qualification and introduce it automatically
– As with OpenMP you still have a valid serial code after porting
– As an example, internally we took the SPECOMPM benchmarks (Spec OpenMP) and “ported” them to use Cluster OpenMP.
• 9 out of 11 ported easily and showed good results in scaling. The other 2 would need non-trivial work to scale well. We think this is typical – most OpenMP will port easily and be able to harness small clusters well. With some effort, OpenMP can be used in a manner where this will work.
• Only about 2% of source lines needed to be changed.
• The largest code (FMA3D, ~60,000 lines) needed no source code changes at all (a global switch on the compiler was sufficient to port all the code to clusters)
Intel® MPI Library
• Linux
• Windows
• Itanium® 2
• Xeon™/EM64T
• Pentium® 4
What is MPI?
• MPI is a de facto standard for communication among the processes modeling a parallel program on a distributed memory system. Often these programs are mapped to clusters and distributed memory supercomputers (from Wikipedia)
• Features
– Explicit communication and synchronization
– Explicit distribution of data
– Collective operations
– One-sided communication
– Parallel I/O
• Bindings
– C / C++ / Fortran
Intel MPI Library?
[Diagram: Applications (A to F) run on Intel® MPI atop an abstract fabric layer. Fabrics: Shared Memory, TCP/IP, InfiniBand, Myrinet, Quadrics, other networks]
• ISVs see & support single interconnect
• Customers select interconnect at runtime
• IHVs create DAPL providers and fabric drivers
Intel MPI Library 3.0 enhancements
• Increased Application Performance
– Fine tuning by environment variables
– Faster start-up
• Optimized Collective Operations
• Improved Stability and Correctness
• Increased Interoperability
– Thread-safe libraries at the MPI_THREAD_MULTIPLE level
– Support for Etnus* TotalView*, DDT*, and Intel debuggers
– Simplified process management by integration with leading job schedulers (LSF, PBS Pro, Torque)
• Enhanced Operating System and Compiler Support
• Improved Support of OpenFabrics stack
Advantages for Developers
– Reduce development and testing costs
– Increase productivity and functionality
– Simplify maintenance
Eliminate the need to develop, maintain, and test an application on each supported fabric, saving resources and improving product quality
Intel Trace Analyzer and Collector
• Linux
• Windows (for Trace Analyzer GUI)
• Itanium® 2
• Xeon™/EM64T
• Pentium® 4
Trace Universe
[Diagram: Application, Intel Trace Collector, Tracefile, Intel Trace Analyzer]
Components and Interaction
[Diagram: Application, Compiler, Linker, Intel® Trace Collector Lib (API), Executable, itcinstrument, Instrumented Executable, Traces (STF)]
Intel® Trace Collector Overview
– Event based approach
• Event = time stamp + thread ID + description
• Function entry/exit
• Messages
• Collective operations
• Counter samples
– Low impact on application performance
– Provides API to instrument user code
– Trace optimized program runs
– Analyzes communication layer (default)
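As a rough illustration of the event-based approach, the sketch below models an event as a time stamp, a thread ID, and a description, and logs function entry and exit around a call. The names `Event`, `Tracer`, and `traced` are hypothetical and are not part of the Intel Trace Collector API.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Event:
    """One trace record: time stamp + thread ID + description."""
    timestamp: float
    thread_id: int
    description: str


class Tracer:
    """Hypothetical in-memory event log (not the Intel Trace Collector API)."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self.clock = clock
        self.events: List[Event] = []

    def record(self, thread_id: int, description: str) -> None:
        self.events.append(Event(self.clock(), thread_id, description))

    def traced(self, thread_id: int, func: Callable, *args):
        """Run func, logging function entry and exit events around it."""
        self.record(thread_id, f"enter {func.__name__}")
        try:
            return func(*args)
        finally:
            self.record(thread_id, f"exit {func.__name__}")


def solve(n: int) -> int:
    return sum(range(n))


tracer = Tracer()
result = tracer.traced(0, solve, 1000)
```

A real collector would also buffer events per process and write them to a trace file; this sketch only shows the shape of the records.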
Key Features
• Catch all MPI events
• Strong configuration mechanism
– Filters, settings, features
• Automatic source-code references
• Instrumentation
– Rich API
– Binary instrumentation (itcinstrument)
– Compiler based (beta)
• Fail-safe version
• Comparison feature
Intel Trace Analyzer Overview
• Enables the user to quickly focus at the appropriate level of detail to find performance hotspots and bottlenecks.
• Use of hierarchical displays to address scalability in time and processor-space
• High-performance graphics, excellent zooming and filtering
• Windows version of the Graphical User Interface
Chart
A Chart is a numerical or graphical diagram
Timelines: Event Timeline
• Get an impression of program structure
• Display functions, messages and collective operations for each process/thread along time-axis
• Retrieval of detailed event information
Timelines: Qualitative Timeline
• Find patterns and irregularities
• Display attributes of functions, messages or collective operations as they occur for any process/thread
• Retrieval of detailed event information
Timelines: Quantitative Timeline
• Get an impression of parallelism and load balance
• Show for every function how many threads/processes are currently executing it
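The quantity plotted by a quantitative timeline can be sketched as a count, per function, of the processes executing it at a given time. This is an illustrative model only: the interval format and the name `active_counts` are assumptions, and nested calls are ignored (each process is in exactly one function at a time).

```python
from collections import Counter
from typing import Dict, List, Tuple

# (process, function, start_time, end_time); end is exclusive
Interval = Tuple[int, str, float, float]


def active_counts(intervals: List[Interval], t: float) -> Dict[str, int]:
    """Count, per function, the processes executing it at time t."""
    counts: Counter = Counter()
    for _process, function, start, end in intervals:
        if start <= t < end:
            counts[function] += 1
    return dict(counts)


# Made-up trace: three processes compute, then wait in MPI_Barrier.
trace = [
    (0, "compute", 0.0, 4.0), (0, "MPI_Barrier", 4.0, 6.0),
    (1, "compute", 0.0, 5.0), (1, "MPI_Barrier", 5.0, 6.0),
    (2, "compute", 0.0, 6.0),
]
```

In this made-up trace, the count for `compute` drops from 3 to 1 while the count for `MPI_Barrier` rises, which is the signature of a load imbalance.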
Profiles: Flat Function Profile
• Statistics about functions
Profiles: Call-Tree and Call-Graph
• Function statistics including calling hierarchy
– Tree: call-stack
– Graph: calling dependencies
Communication Profiles
• Statistics about point-to-point or collective communication
• Generic matrix supports grouping by several attributes in each dimension: Sender, Receiver, Data volume per msg, Tag, Communicator, Type
• Available attributes: Count, Bytes transferred, Time, Transfer rate
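A minimal sketch of such a matrix, assuming messages are grouped by sender and receiver and each cell aggregates count, bytes transferred, time, and transfer rate. The `Message` record and `communication_matrix` function are hypothetical names for illustration, not the tool's data model.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Message(NamedTuple):
    sender: int
    receiver: int
    nbytes: int
    seconds: float  # transfer time of this message


def communication_matrix(
    messages: List[Message],
) -> Dict[Tuple[int, int], Dict[str, float]]:
    """Group messages by (sender, receiver) and aggregate the attributes."""
    cells: Dict[Tuple[int, int], Dict[str, float]] = defaultdict(
        lambda: {"count": 0, "bytes": 0, "time": 0.0}
    )
    for m in messages:
        cell = cells[(m.sender, m.receiver)]
        cell["count"] += 1
        cell["bytes"] += m.nbytes
        cell["time"] += m.seconds
    for cell in cells.values():
        # Transfer rate in bytes per second for the whole cell.
        cell["rate"] = cell["bytes"] / cell["time"] if cell["time"] > 0 else 0.0
    return dict(cells)


messages = [
    Message(0, 1, 1024, 0.5),
    Message(0, 1, 3072, 1.5),
    Message(1, 0, 2048, 1.0),
]
matrix = communication_matrix(messages)
```

Grouping by another attribute (tag, communicator, message size) would only change the key used for each cell.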
View
• Helps navigate through the trace data and keep orientation
• Every View can contain several Charts
• A View on a file is defined by a triplet of
– time-span
– set of threads
– set of functions
• All Charts follow changes to View (e.g. zooming)
• Timelines are correctly aligned along time
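The triplet definition can be modelled as a small filter over trace events. The sketch below is an assumption-laden illustration (the `View` class, its `select` and `zoom` methods, and the event tuple format are all hypothetical); it shows why every chart driven by the same View stays consistent after zooming.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# (timestamp, thread, function): a simplified trace event
Event = Tuple[float, int, str]


@dataclass(frozen=True)
class View:
    """Hypothetical model of a View: time-span, threads, functions."""
    time_span: Tuple[float, float]
    threads: FrozenSet[int]
    functions: FrozenSet[str]

    def select(self, events: List[Event]) -> List[Event]:
        """Keep only the events that fall inside all three parts of the View."""
        start, end = self.time_span
        return [
            (t, thread, func) for t, thread, func in events
            if start <= t <= end
            and thread in self.threads
            and func in self.functions
        ]

    def zoom(self, start: float, end: float) -> "View":
        """Return a new View with a different time-span; other parts unchanged."""
        return View((start, end), self.threads, self.functions)


events = [
    (0.5, 0, "MPI_Send"), (1.0, 1, "MPI_Recv"),
    (2.5, 0, "compute"), (3.0, 1, "MPI_Send"),
]
full = View((0.0, 4.0), frozenset({0, 1}), frozenset({"MPI_Send", "MPI_Recv"}))
zoomed = full.zoom(0.0, 2.0)
```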
View - zooming
Flexibility of Views
• Several Views can be opened (on the same or on different files)
• Location, orientation and size of charts can easily be changed
• Entire Views and individual charts can be cloned and closed
• Individual charts can be cloned into their own View
Understanding your code
• Parallel Poisson
• Example of intuitive parallelization with disadvantageous communication pattern
Understanding the problem
• Blocking border exchange
• Pn blocks until communication between Pn+1 and Pn+2 has completed
• Solution: Non-blocking communication
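The serialization can be illustrated with a simple timing model rather than real MPI code. The assumptions are loud ones: processes sit in a line, every process except the last issues a synchronous blocking send to its right neighbour before receiving from its left neighbour, each transfer takes the same time, and non-blocking transfers overlap perfectly. The function names are hypothetical.

```python
from typing import List


def blocking_finish_times(num_procs: int, transfer_time: float) -> List[float]:
    """Time at which each process finishes the exchange in the blocking model.

    Assumed model: a send can start only once the receiver is ready to
    receive, and a process is ready to receive only after its own send is done.
    """
    send_done = [0.0] * num_procs  # the last process has nothing to send
    finish = [0.0] * num_procs
    # Walk from the right end: P(n) can send once P(n+1) has finished its send.
    for n in range(num_procs - 2, -1, -1):
        send_done[n] = send_done[n + 1] + transfer_time
    for n in range(num_procs):
        # Done after its own send and after the left neighbour's send to it.
        recv_done = send_done[n - 1] if n > 0 else 0.0
        finish[n] = max(send_done[n], recv_done)
    return finish


def nonblocking_finish_times(num_procs: int, transfer_time: float) -> List[float]:
    """With sends and receives posted up front, all transfers overlap."""
    if num_procs < 2:
        return [0.0] * num_procs
    return [transfer_time] * num_procs
```

Under these assumptions the blocking exchange takes time proportional to the number of processes, while the non-blocking exchange takes one transfer time regardless of process count. In MPI terms the fix typically means posting `MPI_Irecv` and `MPI_Isend` and completing them with `MPI_Waitall`.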
Detecting Load Imbalance
• Mandelbrot set (MPI-tutorial)
– mpitutorial.tar.gz
• Example of intuitive parallelization with huge load imbalance
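Why the intuitive decomposition is imbalanced can be shown without MPI. The sketch below is not the tutorial code: it assumes the image rows are split into equal contiguous blocks, one per rank, and uses the iteration count as a stand-in for compute time. Grid size, iteration cap, and function names are illustrative choices.

```python
from typing import List


def escape_iterations(c: complex, max_iter: int) -> int:
    """Number of iterations of z = z*z + c before |z| exceeds 2 (capped)."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def work_per_rank(num_ranks: int, width: int = 60, height: int = 40,
                  max_iter: int = 50) -> List[int]:
    """Total iterations per rank when rows are split into contiguous blocks."""
    rows_per_rank = height // num_ranks  # assumes height divisible by num_ranks
    work = [0] * num_ranks
    for row in range(height):
        y = -1.5 + 3.0 * row / (height - 1)
        row_work = sum(
            escape_iterations(complex(-2.0 + 3.0 * col / (width - 1), y), max_iter)
            for col in range(width)
        )
        work[row // rows_per_rank] += row_work
    return work


work = work_per_rank(4)
imbalance = max(work) / (sum(work) / len(work))  # 1.0 would be perfectly balanced
```

Ranks that own the middle rows cover most of the set and hit the iteration cap far more often than ranks at the top and bottom, so they finish last. Cyclic row assignment or a master-worker scheme are common ways to even this out.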
Understanding Load Imbalance
Intel® Trace Analyzer and Collector 7.0
New Feature: Comparison of two program runs
• Timeline of initial application run
• Comparison of function and process profile data
• Network usage profile data for MPI messages
• Shorter red bars mean less MPI traffic and increased performance
• Timeline of optimized application run
Works on systems from 2 processes to more than a thousand processes
Intel Message Checking with Intel Trace Analyzer and Collector
• A novel MPI correctness tool
– Detects errors with data types, buffers, communicators, point-to-point & collective ops, deadlocks and hangs.
• Online-based
– MPI correctness checking using IMC library for Intel® Trace Collector
– All error-checking done at runtime
• Offline analysis
– Interactive debugging using a traditional debugger
– Text error output for analysis
– Automates detection of errors
• Platforms
– Intel MPI Library on Linux*, IA32, Intel® EM64T, IPF
Intel® Math Kernel Library Cluster Edition
• Linux
• Windows
• Itanium® 2
• Xeon™/EM64T
• Pentium® 4
Intel® Math Kernel Library Cluster Edition
• All the functionality of Intel® MKL plus …
ScaLAPACK
– ScaLAPACK is for solving dense linear systems and computing eigenvalues for dense matrices
– Optimized version for Intel® processors
ScaLAPACK
• “Scalable LAPACK” or LAPACK for distributed memory computer systems
• The standard for Linear Algebra problem solutions for clusters
• Netlib*
– Standard publicly available implementation of ScaLAPACK
• Performance (PDGETRF function)
– Intel Cluster MKL significantly outperforms Netlib* implementation
• >20% faster for block sizes of 64-128
• >50% faster for block sizes 256 or greater
– Intel Cluster MKL is much less sensitive to block size differences
• Intel Cluster MKL performs well on a wide range of block sizes
Configuration Info:
• Cluster of four 4-way Intel Itanium® 2, 1.4 GHz, 16 GB memory
• Red Hat Linux* Advanced Server release 2.1AS
[Chart: Linear Algebra]
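For context on what "block size" means here: ScaLAPACK distributes matrices over the process grid in a block-cyclic layout, and the block size is the unit of that distribution. The sketch below shows the one-dimensional version of the mapping with 0-based indices and the first block on process 0; ScaLAPACK applies the same idea to rows and columns. The function name is hypothetical.

```python
from typing import Tuple


def block_cyclic_owner(global_index: int, block_size: int,
                       num_procs: int) -> Tuple[int, int]:
    """Map a 0-based global index to (owning process, local index)."""
    block = global_index // block_size
    owner = block % num_procs
    local_index = (block // num_procs) * block_size + global_index % block_size
    return owner, local_index


# Ten rows, block size 2, over 2 processes: blocks alternate between processes.
owners = [block_cyclic_owner(i, 2, 2)[0] for i in range(10)]
```

Small blocks spread work evenly but increase communication; large blocks favour local compute kernels, which is why performance can vary with the chosen block size.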
Intel® MPI Benchmarks (open source)
– Successor of the well-known Pallas MPI benchmarks (PMB)
– Comprehensive set of MPI kernels that provide performance measurements for:
• Point-to-point message-passing
• Global data movement and computation routines
• One-sided communications
• File I/O
– Intel® MPI Benchmarks help compare the performance of various computing platforms, MPI implementations, and interconnection fabrics
Conclusion and Next steps
• Intel® Software Development tools help make software faster and developers more productive
– Gain competitive advantage
– Reduce development and deployment investment
– Increase productivity with profiling tools and libraries
Learn more and download evals at: www.intel.com/software/products