FAX PERFORMANCE
TIM, Tokyo, May 2013. Ilija Vukotic




PERFORMANCE

Metrics:
  Data coverage: better than 97%, more than 2 replicas
  Number of users: mostly UofC, Prague users
  Percentage of successful jobs: latest HC tests >99%
  Total amount of data delivered: ~2 PB/week
  Bandwidth usage

Sources: Ganglia plots, MonaLisa, FAX Dashboard, HC tests, CostMatrix tests, special tests using dedicated resources

COST MATRIX

[Table: rate in MB/s from each source (rows) to each destination (columns)]
Destinations: BNL-ATLAS, CERN-PROD, DESY-HH, INFN-ROMA1, LRZ-LMU, MWT2, RAL-LCG2, SWT2_CPB, UKI-LT2-QMUL, UKI-SCOTGRID-GLASGOW
Sources: AGLT2, BNL-ATLAS, CERN-PROD, DESY-HH, IllinoisHEP, INFN-FRASCATI, INFN-NAPOLI-ATLAS, INFN-ROMA1, LRZ-LMU, MPPMU, MWT2, OU_OCHEP_SWT, praguelcg, RAL-LCG2, RU-Protvino-IHEP, SWT2_CPB, UKI-LT2-QMUL, UKI-SCOTGRID-ECDF, UKI-SCOTGRID-GLASGOW, UKI-SOUTHGRID-OX-HEP, WT

A place to get an idea of the rate a single job can expect to see.
Are our pipes really this full? Let's look at other sources of information.

COST MATRIX VS. PERFSONAR

Comparison of just one link in one direction: source AGLT2, destination MWT2.
PerfSONAR info at 4 h intervals.
Could the worker-node links be saturating?

CLOGGING THE PIPES

[Diagram: MWT2, SLAC, AGLT2, BNL, CERN]

HC-submitted jobs sent to 4 ANALY queues: AGLT2, BNL, MWT2, SLAC.
Each site runs 300 jobs of two types, 50 in parallel:
  Copy: xrdcp of 3 files randomly chosen from the SMWZ datasets prepared for FDR at the other sites
  Read: reads 10% of events from 3 files randomly chosen from the FDR SMWZ datasets at the other sites
Each job uploads its time to finish, events/s, MB/s, and pandaid so that jobs can be investigated.
All jobs were submitted through the FDR web interface.
All ran in parallel with other HC stress tests.

TESTS

0.17% failure rate!
COPY

Clearly not limited by WN links.
Assuming just 30 simultaneous jobs, the worst-case delivery rates are:
  BNL to CERN: 75 MB/s
  CERN to AGLT2: 170 MB/s
  MWT2 to AGLT2: 100 MB/s
  AGLT2 to CERN: 90 MB/s
  SLAC to BNL: 300 MB/s
Average WAN access: ~300 MB/s

[Table: rate in MB/s from each source (rows: BNL-ATLAS, CERN-PROD, MWT2, AGLT2, SLAC) to each destination (columns: BNL-ATLAS, CERN-PROD, MWT2, AGLT2, SLAC)]

READ

Jobs were reading 10% of events using a 30 MB TTreeCache (TTC).
100% of the data are transferred and decompressed.
ROOT can decompress our D3PD at ~20 MB/s.
Rates are the same as for xrdcp, except for local access.
Over WAN one should expect at least 50% of the CPU efficiency of local access.
Fewer than 100 simultaneous standard analysis jobs will saturate a 10 Gb WAN link.
FAX needs to be used judiciously; it can easily overwhelm weaker links.

[Table: read rate in events/s from each source (rows: BNL-ATLAS, CERN-PROD, MWT2, AGLT2, SLAC) to each destination (columns: BNL-ATLAS, CERN-PROD, MWT2, AGLT2, SLAC)]

MONA LISA

[Figure: MonaLisa monitoring plots]

WAYS AHEAD

Enlargement: increase coverage, add redundancy, increase total bandwidth
Caching: increases performance, reduces bandwidth needs
Cost matrix, smart FAX
Smart network: bandwidth requests, QoS assurance
Improve adoption rate: presenting, teaching, preaching; new services
Improve satisfaction: FAX tuning, application tuning, new services
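
The bandwidth figures quoted in the COPY and READ slides can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the original tests. It makes three assumptions: that the worst-case delivery rates are aggregates shared equally by 30 jobs, that a 10 Gb link carries 1250 MB/s (decimal units, no protocol overhead), and that ROOT's ~20 MB/s decompression rate is the rate at which a single job pulls data over the network. All function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the WAN figures quoted in the slides.
# Helper names are hypothetical; only the quoted numbers come from the slides.

def link_capacity_mb_per_s(link_gbit_per_s: float) -> float:
    """Convert a link speed in Gb/s to MB/s (decimal units, 8 bits per byte)."""
    return link_gbit_per_s * 1000.0 / 8.0


def per_job_rate(aggregate_mb_per_s: float, n_jobs: int) -> float:
    """Average rate per job, assuming n_jobs share the aggregate rate equally."""
    return aggregate_mb_per_s / n_jobs


def jobs_to_saturate(link_gbit_per_s: float, job_mb_per_s: float) -> float:
    """Number of jobs, each pulling job_mb_per_s, needed to fill the link."""
    return link_capacity_mb_per_s(link_gbit_per_s) / job_mb_per_s


# Two of the worst-case delivery rates from the COPY slide (MB/s),
# read here as aggregates over 30 simultaneous jobs (assumption).
N_JOBS = 30
copy_rates = {"BNL to CERN": 75.0, "SLAC to BNL": 300.0}
per_job = {link: per_job_rate(rate, N_JOBS) for link, rate in copy_rates.items()}

# READ slide: ROOT decompresses D3PD at ~20 MB/s. Treating that as the
# per-job network rate is an assumption made for this estimate.
ROOT_RATE_MB_PER_S = 20.0
capacity = link_capacity_mb_per_s(10.0)
n_saturating = jobs_to_saturate(10.0, ROOT_RATE_MB_PER_S)

# Per-job rate at which exactly 100 jobs would fill a 10 Gb link.
rate_for_100_jobs = capacity / 100

if __name__ == "__main__":
    for link, rate in per_job.items():
        print(f"{link}: {rate:.1f} MB/s per job")
    print(f"10 Gb link capacity: {capacity:.0f} MB/s")
    print(f"Jobs at {ROOT_RATE_MB_PER_S:.0f} MB/s to saturate it: {n_saturating:.1f}")
    print(f"Per-job rate for saturation by 100 jobs: {rate_for_100_jobs:.1f} MB/s")
```

Under these assumptions, any job that sustains more than 12.5 MB/s puts the saturation point below 100 jobs, which is in line with the statement on the READ slide.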