F. Rademakers - CERN/EP - Linux Certification - FOCUS

Linux Certification

Fons Rademakers

Foreword

• An up-to-date version of RedHat Linux is a key issue for almost all computing activities and experiments

• Certification process must be discussed in FOCUS

• Experiment input collected by e-mail
  – 4 LHC experiments
  – Most LEP experiments
  – Compass, NA49

• DAQ, HLT and Offline projects

Linux Ubiquitous

• Linux ubiquitous:
  – All experiments are using it; most as main platform
    • Windows mentioned only by LHCb as DAQ development platform
    • Sun Solaris alternative platform (CMS)
  – Desktop:
    • Standalone, dual-boot, VMware
    • Code (DAQ, HLT, Offline) development
    • TRG/DAQ simulation
  – Large farms (50 to 200 nodes):
    • Either private or in the computing centre
    • Data taking periods (Compass)
    • Data Challenges (LHC exp.)
    • Physics simulation, reconstruction and analysis
• RISC is now minimal
  – Transition of experiments to Linux completed

Reasons to Stick to RH 6.1

• Mainly compatibility with CERN version
• Ease system management until RH 7.2 validated
• Objectivity (Compass), GPHIGS (Delphi, OPAL)

• CERN RH 7.2 validation platform not yet ready

• Each experiment developed its own migration strategy due to the strong need to move to RH 7.2 (next slide)
  – RH 6.1 + kernel 2.4
  – RH 7.1 + gcc 2.95
  – RH 7.2

Motivation to Convert to RH 7.2

• RH 6 becoming obsolete
  – Recent packages do not support it any longer
  – Need to use latest desktop environments (KDE, Gnome)
• Improved and more robust security
• Collaboration home institutes have already converted (Compass, Delphi, ALICE, ATLAS, LHCb)
  – Often home labs member of several collaborations
• New hardware needs it (ALICE, ATLAS, CMS)
• Latest version of gcc (CMS)
• Kernel 2.4 required for better I/O (TCP/IP stack, disk I/O), better VM system and support for recent motherboards
• All experiments request to go to RH 7.2 ASAP

Linux Environments at CERN

• IT computer centre public farms:
  – LXPLUS, LXBATCH, LXSHARE
• Private experiment machines and farms
  – CMS DAQ test benches, e.g. to test switch fabrics (70 nodes)
  – Development and test of DAQ framework
  – To run DAQ simulation (based on Ptolemy)
  – Many smaller, 4-5 machine, clusters

Proposal for Certification Process

• Step 1:
  – RH Linux is certified by RH, no need to redo it
    • In case of problems RH promptly releases fixes, which can be installed via a SUE mechanism or RH’s up2date
• Step 2:
  – IT only needs to certify the packages it is responsible for
    1. Special drivers (OpenAFS, etc.), mostly kernel and not RH specific
    2. What used to be in ASIS (mostly obsolete with new RH releases), SUE, hepix, cernlib
    3. Commercial packages used by (some) experiments (NAG, Objy)
• Step 3:
  – Ask external package providers to certify their programs
  – Ask the experiments to certify their codes
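The three steps and their responsible parties can be written down as a small data structure. The sketch below is only an illustration of the proposal, not an existing CERN tool: the `Step` class, the `STEPS` list and the `steps_required` helper are hypothetical names, and the rule that only users of commercial products must wait for step 2.3 is taken from the slides that follow.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    ident: str        # step number as used on the slides
    responsible: str  # who certifies this step
    scope: str        # what is being certified


# Hypothetical encoding of the proposed certification process.
STEPS = [
    Step("1", "Red Hat", "Base RH Linux distribution"),
    Step("2.1", "IT", "Special drivers (OpenAFS, etc.)"),
    Step("2.2", "IT", "Former ASIS content, SUE, hepix, cernlib"),
    Step("2.3", "IT", "Commercial packages (NAG, Objy)"),
    Step("3", "External providers and experiments",
         "External packages and experiment codes"),
]


def steps_required(uses_commercial: bool) -> List[str]:
    """Return the step numbers an experiment has to wait for.

    Assumption: step 2.3 only matters to experiments that rely on
    commercial products.
    """
    return [s.ident for s in STEPS if uses_commercial or s.ident != "2.3"]


if __name__ == "__main__":
    print("No commercial products:", steps_required(False))
    print("With commercial products:", steps_required(True))
```

Keeping step 2.3 as a separate entry makes it easy to see which experiments are blocked by vendor certification and which are not.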

Proposal for Certification Process

• Steps 2.1 and 2.2 can be done within a few days, and certainly within 2-3 weeks of the release of a new RH
• At this stage the new RH can be certified for users not relying on commercial products
• Step 2.3 depends on vendors. Avoid vendors that do not commit to releasing new versions promptly in response to major Linux releases
  – Should take no more than two months, otherwise renegotiate the contract or dump the vendor

Proposal for Certification Process

• Step 3 can be completed within two weeks after finishing 2.2 for experiments like ALICE and after 2.3 for other experiments
• Certification of the non-commercial version should not take more than 4-5 weeks from the release of a new RH version
• Full certification can take considerably longer due to 2.3
• Therefore there should be a two-stage certification process
• IT should be able to switch entire clusters between two RH versions if certification step 2.3 cannot be achieved within a couple of months
  – E.g. ALICE will need RH7.2 for its data challenges on LXSHARE, while CMS will still need RH6.1
  – If not then, in extremis, an experiment like COMPASS can hold ALICE hostage for the coming years
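The timing argument for a two-stage process can be checked with a little date arithmetic. The sketch below uses the upper bounds quoted on the slides (3 weeks for steps 2.1 and 2.2, two weeks for step 3). It makes two assumptions that are not in the slides: "two months" for step 2.3 is taken as 8 weeks, and step 2.3 is assumed to start at the release date, in parallel with 2.1 and 2.2. The function name and the release date are hypothetical.

```python
from datetime import date, timedelta

IT_PACKAGES_WEEKS = 3  # steps 2.1 and 2.2: "surely within 2-3 weeks"
COMMERCIAL_WEEKS = 8   # step 2.3: "two months", taken as 8 weeks (assumption)
EXPERIMENT_WEEKS = 2   # step 3: "within two weeks"


def certification_deadlines(release: date) -> dict:
    """Latest expected dates for both certification stages.

    Assumes step 2.3 runs in parallel with steps 2.1 and 2.2.
    """
    it_done = release + timedelta(weeks=IT_PACKAGES_WEEKS)
    commercial_done = release + timedelta(weeks=COMMERCIAL_WEEKS)
    # Stage 1: experiments without commercial products (e.g. ALICE).
    non_commercial = it_done + timedelta(weeks=EXPERIMENT_WEEKS)
    # Stage 2: experiments that must also wait for step 2.3.
    full = max(it_done, commercial_done) + timedelta(weeks=EXPERIMENT_WEEKS)
    return {"non_commercial": non_commercial, "full": full}


if __name__ == "__main__":
    release = date(2002, 1, 7)  # hypothetical release date
    for stage, deadline in certification_deadlines(release).items():
        weeks = (deadline - release).days // 7
        print(f"{stage}: {deadline.isoformat()} ({weeks} weeks after release)")
```

Under these assumptions the first stage lands at 5 weeks, matching the 4-5 week figure above, while the full certification takes roughly twice as long even when vendors meet the two-month target.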

More Comments on RH 7.2

• CERN should distribute its additions as RPMs (all)
• Software needed with RH 7.2:
  – Bigphys area patch (ATLAS DAQ, CMS DAQ)
  – CERN environment: AFS, SUE
  – CMT, GEANT4, CLHEP, ANAPHE, AIDA, ROOT
  – Doxygen, Ptolemy, Python version 2, REXX, Xemacs, Xerces XML
• Work on a network booting service for diskless VMEbus SBCs, based on a server with CERN RH 7.2.1beta2 and version 3.0 of the LTSP (Linux Terminal Server Project) software, which also uses kernel 2.4.9 (ATLAS DAQ)
• CERN store should stock machines with both Win2K and Linux installed (dual boot or VMware)

What Next?

• Finish the RH 7.2 certification as soon as possible
  – RH 7.3 is imminent
• Review certification process along the lines proposed
• Involve one person from each experiment in the certification process to improve coordination and to ensure timely feedback on the certification of experiment codes
• Evaluate in parallel a second Linux distribution like SuSE
  – Installing SuSE 7.2 + AFS by a non-experienced person and running the complete CMS code took only one day