The PROOF Benchmark Suite: Measuring PROOF performance
Sangsu Ryu (KISTI, Korea), Gerardo Ganis (CERN)
ACAT 2011, Brunel Univ. 2
Measuring PROOF performance
• PROOF aims at speeding up analysis by using N ROOT sessions in parallel
• Scalability vs. N is a natural metric to study PROOF performance and to understand the main bottlenecks under different conditions
• The new benchmark suite is a framework to perform scalability measurements on a PROOF cluster in a standard way
06-09-2011
Who's supposed to use it?
• PROOF site administrators and private users
– Check the installation
– Find bottlenecks
– Find optimal configuration parameters
• PROOF developers
– Understand and improve PROOF
Design requirements
• Ease of use
– The default case must be straightforward to run
– Fine control is also supported
• Flexibility
– It must be possible to run {user, experiment}-specific cases
Address both PROOF modes
• Data-driven
– The unit of processing is entries of a TTree fetched from distributed files
– Typically I/O intensive
• Can also be network, RAM, or CPU intensive
• Cycle-driven
– The unit of processing is independent tasks
• E.g. generation of MC events
– Typically CPU-intensive
• Can also be I/O, network, or RAM intensive
The new proofbench module
• The suite is made of a set of client-side classes
• New module proofbench under $ROOTSYS/proof
– Set of default selectors
– Set of default PAR packages
– Steering class TProofBench
• TProofBench
– Test initialization (open PROOF, set up a file for results)
– Interface to run the tests
– Tools to display the results
– Interface to customize the tests
proofbench features (cont'd)
• Statistical treatment
– 4 measurements (default) for each point
– {Value, error} from {Average, RMS}
• Two types of scan
– Worker scan: 1 … Nwrk (the usual one)
– Core scan: 1 wrk/node … Nwrk/node
• Studies scalability inside a node
• Possibility to save the performance TTree
• All relevant parameters are configurable
Ease of Use
root [] TProofBench pb("<master>")
root [] pb.RunCPU()
[Plots: Cycles/s vs Nwrk, fitted with a + b×Nwrk; Cycles/s per worker (Cycles/s / Nwrk), fitted with a/Nwrk + b]
Ease of Use (2)
root [] TProofBench pb("<master>")
root [] pb.MakeDataSet()
root [] pb.RunDataSetx()
[Plots: Events/s and MB/s vs number of workers, plus normalized plots]
Default tasks
• Cycle-driven
– Intensive random number generation to test the CPU scalability
– Merging TH3D: study the impact of big outputs
• Data-driven
– Based on $ROOTSYS/test/Event.h,.cxx
– Study the impact of {file size, event size, …} on I/O device scalability
User-defined tasks
• Change the TSelector to be processed
• Change/add PAR files, if required
• Use an existing dataset
→ Can benchmark any user-specific case

TProofBench::SetCPUSel(const char *selector)
TProofBench::SetDataSel(const char *selector)
TProofBench::SetCPUPar(const char *par)
TProofBench::SetDataPar(const char *par)
TProofBench::RunDataSet(const char *dataset)
TProofBench::RunDataSetx(const char *dataset)
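Putting these calls together, a customized benchmark session might look like the following. This sketch is assembled only from the interface listed on this slide; `<master>` and the selector/dataset names are placeholders, not real examples:

```
root [] TProofBench pb("<master>")
root [] pb.SetDataSel("MySelector")      // user TSelector for the data test
root [] pb.SetDataPar("MySelector.par")  // PAR file providing it, if needed
root [] pb.RunDataSet("MyDataSet")       // run on an existing dataset
```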
Examples
• To illustrate the tool we show some results obtained from runs on
– ALICE CERN Analysis Facility (CAF)
– ALICE KISTI Analysis Facility (KIAF)
– PoD clusters on clouds
• Courtesy of A. Manafov, GSI
– Using a non-default data task
• For ALICE ESDs (courtesy of ALICE)
• Also shows the sort of issues that can be spotted
Example 1: CPU task on ALICE CAF
• 58 nodes, 8 cores/node, max 2 workers/node, 2 GB memory/core
• 3 different types of CPU (lxbsq, lxfssi, lxfssl)
• The transitions between CPU types are visible in the plot
Example 2: CPU task on a cloud
[Plots: scalability measured up to 500 and 971 workers]
• Breakdown of scalability between 200 and 300 workers, likely due to single-master (packetizer) scalability issues (under study)
Courtesy of A. Manafov, GSI
Hardware configuration
• KIAF (kiaf.sdfarm.kr, homogeneous)
– Nodes: 1 master + 4 workers
– CPU/node: 2 CPUs × 6 cores (Intel Xeon X5650, 2.67 GHz)
– Memory: 24 GB/node
– Storage: 5 TB/node (NAS, Ibrix X9000); 300 GB/node (SAS, 2 × IBM MB-D2147RC, max 6 Gb/s)
– NICs: 2 × 1 Gb/node (storage, bonded) + 2 × 1 Gb/node (bonded)
Example 3: Default data task on KIAF
Example 4: ALICE data task on KIAF
• Non-default data task
• Full read of the esdTree
• Courtesy of ALICE
Availability / Doc
• ROOT versions
– From ROOT 5.29/02 on
– Can be imported into previous versions (see doc)
• Web page: http://root.cern.ch/drupal/content/new-benchmark-framework-tproofbench
Future plans
• Analysis of the results
– Modeling of dependencies and fits to the relevant parameters
• Tools to analyse the performance tree(s)
– Better problem digging using per-packet information
• More graphical options
Summary
• The new PROOF benchmark suite provides a standardized framework for performance tests
• It allows measuring scalability under different conditions
– Cross-check the installation
– Spot unusual / unexpected behavior
– Identify places for improvement
– …
• Distributed with ROOT 5.30