EGEE-III
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE and gLite are registered trademarks
CREAM CE Certification and Testing
Di Qing
SA3
Academia Sinica & CERN
Geneva, 2008
Introduction
• Goals
  – Verify installation and configuration
  – Pass the normal certification procedures
  – Run a 5-day unattended continuous stress test with 50 users and less than 0.5% failures
• Patches
  – 1755, CREAM server
  – 1790, CREAM client
• Testing started at the end of May
• The test scripts were provided by INFN
• A wiki page was set up for the test results
  – https://twiki.cern.ch/twiki/bin/view/EGEE/CREAMTest
• CE checklist
  – https://twiki.cern.ch/twiki/bin/view/EGEE/CECheckList
Testbed setup
• One separate Torque server
  – BLParser server installed there by hand
• 11 WNs
  – 110 virtual CPUs
• One UI
• One physical CREAM CE for stress testing
  – 4 cores at 2.2 GHz
  – 4 GB memory
• One CREAM CE for installation and configuration tests
Tests performed
• Installation and configuration
  – Followed the formal installation procedure, checking the package dependencies in particular
  – Tested different installation scenarios
  – Configured with YAIM
• Basic functionality
  – Submission through the CLI, job status checks, proxy delegation, etc.
• Stress testing
  – Submission through the CLI
  – 9800 jobs per day with 49 users
  – Jobs accumulated in the queues as fast as possible
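To put the stress-test figures in perspective, the short sketch below works out the average submission load implied by 9800 jobs per day from 49 users. The function name and the assumption that submissions are spread evenly over 24 hours are illustrative only; the actual test submitted jobs in bursts, as fast as the queues would take them.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def submission_load(jobs_per_day: int, users: int) -> dict:
    """Average load figures, assuming submissions are spread evenly over a day."""
    jobs_per_user = jobs_per_day / users
    return {
        "jobs_per_user_per_day": jobs_per_user,
        # Average gap between two submissions reaching the CE from any user.
        "seconds_between_jobs_overall": SECONDS_PER_DAY / jobs_per_day,
        # Average gap between two submissions from the same user.
        "seconds_between_jobs_per_user": SECONDS_PER_DAY / jobs_per_user,
    }


# Figures taken from the slide: 9800 jobs per day with 49 users.
load = submission_load(jobs_per_day=9800, users=49)
for name, value in load.items():
    print(f"{name}: {value:.1f}")
```

On average this is 200 jobs per user per day, or roughly one submission to the CE every 9 seconds.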
Test results
• Basic functionality tests passed
• Missing dependencies on some packages
  – tomcat, mysql-server and mysql-connector-java
• Configuration issues
  – Configuration of the BLParser server and BLAH
  – LCAS and LCMAPS generate too many logs
  – Issue with the glexec upgrade (fixed)
• Other issues
  – Tomcat must be restarted whenever the CAs are updated (problems in trustmanager)
  – Too many files are left under the home directories of the pool accounts
  – Authentication fails with a new type of VO attribute in the VOMS proxy (fixed)
Stress test results
• System load
  – CPU load is quite low, below 1 most of the time; it reaches 9 only when large numbers of jobs are submitted to the CREAM CE
  – Memory usage is also low, less than 2 GB
  – Disk usage can grow; this can be solved by purging jobs and limiting the log level of the services
• Job submission
  – Job submission to the CREAM CE can fail: in the last two days of testing, more than 3% of jobs could not be submitted
  – Job success rate: more than 99.5% of jobs succeeded on 6 of the 10 test days; the worst failure rate was about 9.5%
• Most failures give only the error message “blah error”
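The certification goal (5 consecutive days with less than 0.5% failures) can be checked mechanically from per-day failure rates. The sketch below is one way to do that; the function names are hypothetical and the daily rates are invented placeholders, not the measured results. They are chosen only to mirror the shape reported above: 6 passing days out of 10, a worst rate of 9.5%, and no run of 5 passing days.

```python
from typing import Sequence


def longest_passing_run(failure_rates_pct: Sequence[float],
                        threshold_pct: float = 0.5) -> int:
    """Length of the longest run of consecutive days below the failure threshold."""
    best = current = 0
    for rate in failure_rates_pct:
        if rate < threshold_pct:
            current += 1
            best = max(best, current)
        else:
            current = 0  # a failing day breaks the run
    return best


def meets_goal(failure_rates_pct: Sequence[float],
               required_days: int = 5,
               threshold_pct: float = 0.5) -> bool:
    """True if some window of `required_days` consecutive days stays below the threshold."""
    return longest_passing_run(failure_rates_pct, threshold_pct) >= required_days


# Placeholder daily failure rates in percent (illustrative, NOT the real data).
example_rates = [0.2, 0.3, 9.5, 0.1, 0.4, 0.3, 3.2, 0.2, 1.0, 0.6]

passing_days = sum(rate < 0.5 for rate in example_rates)
print(passing_days, longest_passing_run(example_rates), meets_goal(example_rates))
```

With these placeholder values, 6 days pass individually but the longest consecutive run is 3 days, so the 5-day goal is not met even though most days are within the limit.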
Conclusion
• More work is still needed to reach production quality
  – On the certification testbed, we never completed a 5-day unattended continuous stress test with a failure rate below 0.5%
• It can now be released to PPS so that users can test it and gain experience, once the installation and configuration issues are sorted out
  – In principle, this can be done today
• The tests have been run only through the CREAM CLI
  – Tests through the WMS will be done when ICE is ready
  – A recipe for setting up a WMS plus ICE is available
• A new CREAM CE patch is being prepared for certification