PanDAMon Integration in CMS

Workshop on Analysis Tools Development, May 16th 2013
Nicolò Magini, CERN IT-SDC-OL


Page 1: PanDAMon Integration in CMS

PanDAMon Integration in CMS

Workshop on Analysis Tools Development

May 16th 2013

Nicolò Magini

CERN IT-SDC-OL

Support for Distributed Computing, CERN IT Department, CH-1211 Geneva 23, Switzerland

www.cern.ch/it

Page 2: PanDAMon Integration in CMS


Outline

• Status after the prototype
• Current status of the testbed deployment
• Plans for the integration testbed
• Next steps

Workshop on Analysis Tools Nicolò Magini CERN IT-SDC-OL

Page 3: PanDAMon Integration in CMS


Monitoring of PanDA jobs

• Reminder: “Monitoring of jobs in PanDA” is more than “PanDA Monitor”

• ATLAS ops and users take advantage of Dashboard (populated from PanDA DB) to complement PanDA Monitor, especially for
  – Task monitoring
  – Historical view

• Here I’m going to look only at the “PanDA Monitor” itself, in particular for job debugging


Page 4: PanDAMon Integration in CMS


PanDAMon for the prototype

• Using ATLAS PanDA Monitor as-is, with minimal updates by V. Fine (ATLAS PanDAMon developer) to make it functional for CMS jobs

• Already used successfully by CMS power users in the proof-of-concept phase


Page 5: PanDAMon Integration in CMS


PanDAMon for the prototype

• viewlogfiles: perform LFN2PFN conversion with PhEDEx datasvc to find log file location (instead of looking up in central ATLAS catalog)

– Recently had an issue with logfile retrieval, now fixed by V. Fine
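
The LFN2PFN lookup described on this slide can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the PanDAMon code: the base URL, the default protocol, and the helper names `lfn2pfn_url` and `parse_lfn2pfn` are hypothetical, and the reply is assumed to carry a `phedex.mapping` list whose entries have `lfn` and `pfn` fields.

```python
from urllib.parse import urlencode

# Assumed base URL of the PhEDEx data service (not stated in the slides).
DATASVC_BASE = "https://cmsweb.cern.ch/phedex/datasvc/json/prod"


def lfn2pfn_url(node, lfn, protocol="srmv2"):
    """Build the query URL asking how `lfn` maps to a PFN at `node`."""
    query = urlencode({"node": node, "lfn": lfn, "protocol": protocol})
    return f"{DATASVC_BASE}/lfn2pfn?{query}"


def parse_lfn2pfn(payload):
    """Return {lfn: pfn} from a decoded JSON reply.

    Assumes the reply looks like {"phedex": {"mapping": [{"lfn": ..., "pfn": ...}]}}.
    Entries without a PFN are skipped.
    """
    mapping = payload.get("phedex", {}).get("mapping", [])
    return {entry["lfn"]: entry["pfn"] for entry in mapping if entry.get("pfn")}
```

A caller would fetch the URL with any HTTP client, decode the JSON body, and pass the result to `parse_lfn2pfn` to obtain the physical location of the log file.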


Page 6: PanDAMon Integration in CMS


Testbed deployment

• An additional 2-core, 8 GB VM could be useful as a PanDA Mon “development instance” to test deployment and new modules

vocms09  | Panda Mon (varnish) SLC6 LB | Preslav | VM | 2 cores, 8 GB mem, 500 GB disk | prototype

vocms35  | Panda Mon (varnish) SLC6 LB | Preslav | VM | 2 cores, 8 GB mem, 500 GB disk | prototype

vocms33  | Panda Mon SLC6 - power node LB | Preslav | 23-JAN-14 | 24 cores, 32 GB mem, 2x750 GB disk | prototype

vocms100 | Panda Mon SLC6 LB (temporary node, this is the ASO spare) | Preslav | 27-JAN-14 | 8 cores, 24 GB mem, 3x1TB disk | spare


Page 7: PanDAMon Integration in CMS


Testbed status

• Basic quattor configuration performed by VOC on all machines following ATLAS templates

• Now in contact with ATLAS Distributed Computing operators for software deployment and configuration procedures


Page 8: PanDAMon Integration in CMS


PanDAMon testbed goals

• During testbed phase
  – Reproduce the working PanDA Monitor setup from the prototype phase in the CMS instance
  – Identify “ATLAS” assumptions in monitoring, assess usability for CMS
    • Some examples found by developers in job debugging views are reported in the following
    • More will surely be found by CMS ops and users; feedback will be gathered
  – Produce new PanDAMon custom modules for CMS integration for items not covered by current PanDAMon or Dashboard


Page 9: PanDAMon Integration in CMS


Navigation

• A lot of information on the website is aggregated by cloud

• For CMS, more useful to look at sites rather than clouds?
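
The difference between the two views can be illustrated with a short Python sketch. It is only an illustration: the job records, their `site` and `cloud` fields, and the helper `count_jobs` are hypothetical and do not reflect the PanDA DB schema or the PanDAMon code.

```python
from collections import Counter


def count_jobs(jobs, key):
    """Count job records grouped by the given field (e.g. "cloud" or "site")."""
    return Counter(job[key] for job in jobs)


# Hypothetical job records; field names and values are assumptions.
jobs = [
    {"site": "SITE_A", "cloud": "CLOUD_1"},
    {"site": "SITE_B", "cloud": "CLOUD_1"},
    {"site": "SITE_B", "cloud": "CLOUD_1"},
    {"site": "SITE_C", "cloud": "CLOUD_2"},
]

by_cloud = count_jobs(jobs, "cloud")  # coarse view, one entry per cloud
by_site = count_jobs(jobs, "site")    # finer view, one entry per site
```

The per-site grouping keeps the distinction between `SITE_A` and `SITE_B`, which the per-cloud grouping merges into a single entry.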


Page 10: PanDAMon Integration in CMS


Dataset info

• Dataset info linking to DQ2


Page 11: PanDAMon Integration in CMS


Dataset info


• Need to update to link to DAS/DBS

Page 12: PanDAMon Integration in CMS


Task monitoring


• Linked to ATLAS Task Monitoring
• Integrate with CMS Task Monitoring

Page 13: PanDAMon Integration in CMS


Output file links

• Links to log and output file locations work in the “viewlogfile” page, but still need to be fixed in “findfile”

• (do we want to update output location in PanDA DB from /store/temp/user to /store/user after ASO?)
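
If the answer to the open question above is yes, the path change could look like the following Python sketch. The helper name `final_output_lfn` is hypothetical, and the sketch assumes that only the leading `/store/temp/user/` prefix changes after ASO; how the PanDA DB would actually be updated is not covered.

```python
TEMP_PREFIX = "/store/temp/user/"
FINAL_PREFIX = "/store/user/"


def final_output_lfn(lfn):
    """Map a temporary output LFN to its assumed final location after ASO.

    LFNs that do not start with the temporary prefix are returned unchanged.
    """
    if lfn.startswith(TEMP_PREFIX):
        return FINAL_PREFIX + lfn[len(TEMP_PREFIX):]
    return lfn
```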


Page 14: PanDAMon Integration in CMS


Error reporting

• ASO failures reported to DB and visible in monitoring but not in “Error details”

• CMS transformation (job wrapper) exit code is visible in PanDAMon, but not the detailed error message, which includes cmsRun messages

• Update links to support mail…


Page 15: PanDAMon Integration in CMS


Next steps

• Next week: deploy PanDAMon as-is on dev server in testbed setup

• When testbed setup is ready, start looking into reported issues

• Interact with PanDAMon developers to learn how to integrate new modules if needed by CMS
  – First session already done

• Reproduce deployment on prod server
