LCG Information and Monitoring System Jason Shih WLCG T2 Asia Workshop Dec 2, 2006: TIFR

LCG Information and Monitoring System

Jason Shih

WLCG T2 Asia Workshop

Dec 2, 2006: TIFR

Outline
• Components of IS
  • Introduction to MDS, GRIS, GIIS and BDII
• Monitoring tools
  • GridICE & GIIS monitor
• Practicals:
  • Job Requirement, ranking and FCR
• High level tools
  • lcg-infosites, lcg-info, glite-sd-query, lcg-ManageVOTag
• Data management
• Nice tool: LDAP browser
• APEL accounting system

Information System

• Provides information about the grid resources and their status
• The published data conforms to the GLUE (Grid Laboratory for Uniform Environment) Schema, a common conceptual data model
• Main components described by the GLUE schema:
  • CE (computing element)
  • SE (storage element)
  • Binding information for SE and CE
• From the users' point of view:
  • Where can I find available resources to run my job (requirement, ranking etc.)?
  • Where can I register/replicate/copy files to/from (SE attributes)?
  • Where can I find certain applications to run my job (VO Tag)?
  • Where are the nearest resources (computing, storage fabric etc.)?

MDS, GRIS, GIIS and BDII

LDAP Protocol: the data model

dn: <distinguished name>
objectclass: <objectclassname>
<attributetype>: <attributevalue>
<attributetype>: <attributevalue>

dn: <distinguished name>
objectclass: <objectclassname>
<attributetype>: <attributevalue>
<attributetype>: <attributevalue>

Each block above is an entry: a collection of attributes, identified by a unique DN (Distinguished Name).

Objectclass: special attributes
a) Defines the tree structure of a certain entry
b) Filters the entries of this objectclass

White space (a blank line) separates entries from each other.

The attribute types and objectclass names should follow a schema (the Glue Schema on LCG).

The information is imported and exported through LDIF (LDAP Data Interchange Format) files, which follow this structure.
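The entry layout described here can be illustrated with a small parser. This is only a sketch: it assumes simple LDIF without line continuations or base64 ("::") values, and the host names in the sample are made up for illustration.

```python
def parse_ldif(text):
    """Split LDIF text into entries; each entry maps attribute -> list of values."""
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:                      # a blank line closes the current entry
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):          # LDIF comment line
            continue
        attr, _, value = line.partition(":")
        # attribute names are case-insensitive in LDAP, so normalise them
        current.setdefault(attr.strip().lower(), []).append(value.strip())
    if current:
        entries.append(current)
    return entries


# Hypothetical sample data, not taken from a real site
sample = """\
dn: GlueSEUniqueID=se.example.org,mds-vo-name=local,o=grid
objectclass: GlueSE
GlueSEType: disk

dn: GlueCEUniqueID=ce.example.org:2119/jobmanager-lcgpbs-dteam,mds-vo-name=local,o=grid
objectclass: GlueCE
objectclass: GlueTop
GlueCEStateFreeCPUs: 123
"""

entries = parse_ldif(sample)
# "Filtering by objectclass": keep only the entries declared as GlueCE
glue_ce = [e for e in entries if "GlueCE" in e.get("objectclass", [])]
print(len(entries), glue_ce[0]["gluecestatefreecpus"])
```

Storing every attribute as a list keeps multi-valued attributes such as objectclass intact.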

MDS (monitoring and discovery services)

• Information providers (IP) running on computing and storage elements generate the relevant information about the resources, which is then published by the grid resource information server.

Globus.conf: MDS

Content of the Globus configuration file /etc/globus.conf:

….
[mds]
globus_flavor_name=gcc32dbgpthr
user=edginfo

[mds/gris/provider/edg]

[mds/gris/registration/site]
regname=Taiwan-LCG2
reghn=lcg00125.grid.sinica.edu.tw
….
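As a rough illustration of how the registration settings can be read back, the sketch below feeds the fragment shown on this slide to Python's standard configparser. It assumes the file is plain INI syntax; the elided parts of the real file ("….") are left out and may not parse this way.

```python
import configparser

# Only the fragment visible on the slide; the rest of globus.conf is not shown.
GLOBUS_CONF = """\
[mds]
globus_flavor_name=gcc32dbgpthr
user=edginfo

[mds/gris/provider/edg]

[mds/gris/registration/site]
regname=Taiwan-LCG2
reghn=lcg00125.grid.sinica.edu.tw
"""

parser = configparser.ConfigParser()
parser.read_string(GLOBUS_CONF)

# The site registration section tells the GRIS which site GIIS to register to
site = parser["mds/gris/registration/site"]
print(site["regname"], site["reghn"])
```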

GRIS (Grid Resource Information Service)

• Local GRISes run on computing/storage elements and publish characteristic information and the status of the services, which include:
  • Static information
    • For SE: service endpoint, SE unique ID, schema version, AccessProtocolType, SAPath and SARoot
    • For CE: SiteWeb, SiteName, SiteUniqueID, SysAdminContact etc.
  • Dynamic information
    • For SE: AvailableSpace, UsedSpace with respect to all supported VOs
    • For CE: FreeCPUs, RunningJobs, TotalJobs, WaitingJobs, EstimatedResponseTime

Local GRIS (Cont’)

• ldapsearch -x -H ldap://`hostname`:2135 -b mds-vo-name=local,o=grid
• To query a remote host, give the target host with "-h" and the proper port number with "-p"
• -x indicates that simple authentication will be used
• -b specifies the initial entry from which the search starts in the LDAP tree

GIIS (Grid Index Information Service)

• The role of the GIIS is to collect information from all GRISes, and from sources published by other GIISes, but it has shown its scalability limits as the number of sites grows.

• Because of this, the BDII (Berkeley DB Information Index) was introduced.

• The GIIS has been kept at site level to collect information from the GRISes running on each resource.

BDII (Berkeley DB Information Index)

• The BDII queries the GIISes and acts as a cache, storing information about the grid status in its database.

• Each BDII contains information from the site GIISes defined in a configuration file. The BDII forks processes to query them simultaneously and cuts off unanswered queries after a timeout.

• Users and other grid services (such as the RB) can query the BDII to get information about the grid status.

• Up-to-date information can be obtained directly from the site GIISes or from the local GRISes that run on specific resources.

BDII configuration file (Site GIIS)

# more bdii.conf
BDII_PORT_READ=2170
BDII_PORTS_WRITE="2171 2172 2173"
BDII_USER=edguser
BDII_BIND=mds-vo-name=Taiwan-LCG2,o=grid
BDII_PASSWD=
BDII_SEARCH_FILTER='*'
BDII_SEARCH_TIMEOUT=30
BDII_BREATHE_TIME=60
BDII_AUTO_UPDATE=no
BDII_AUTO_MODIFY=no
BDII_DIR=/opt/bdii/
BDII_UPDATE_URL=http://grid-deployment.web.cern.ch/grid-deployment/gis/lcg2-bdii/dteam/lcg2-all-sites.conf
BDII_UPDATE_LDIF=http://
SLAPD=/usr/sbin/slapd
SLAPADD=/usr/sbin/slapadd

BDII update configuration (Site GIIS)

# more bdii-update.conf
CE ldap://lcg00125.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
SE ldap://lcg00123.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
SE2 ldap://lcg00129.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
RB ldap://lcg00124.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
PX ldap://lcg00127.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
LFC ldap://lfc.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
CASTORSC ldap://lcg00115.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
DPM ldap://dpm01.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
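Each line of the update file pairs a label with an LDAP URL. The sketch below shows one way to split such lines into host, port and search base; the function name parse_update_conf is hypothetical and the BDII itself may read the file differently.

```python
from urllib.parse import urlparse


def parse_update_conf(text):
    """Map each source label to (host, port, base DN) taken from its LDAP URL."""
    sources = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and comment lines
            continue
        name, url = line.split(None, 1)
        parts = urlparse(url)
        # the URL path carries the LDAP search base, without the leading "/"
        sources[name] = (parts.hostname, parts.port, parts.path.lstrip("/"))
    return sources


sample = """\
CE ldap://lcg00125.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
LFC ldap://lfc.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid
#TW-ASGC-WS
TW-ASGC-WS ldap://ws45.twgrid.org:2170/mds-vo-name=TW-ASGC-WS,o=grid
"""

sources = parse_update_conf(sample)
for name, (host, port, base) in sources.items():
    print(name, host, port, base)
```

The same function handles the site-level file (port 2135, base mds-vo-name=local) and the generated top-level file (port 2170, one base per site).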

Top BDII (global) config file:

BDII_PORT_READ=2170
BDII_PORTS_WRITE="2171 2172 2173"
BDII_USER=edguser
BDII_BIND=mds-vo-name=local,o=grid
BDII_PASSWD=
BDII_SEARCH_FILTER='(|(objectClass=GlueSchemaVersion)(objectClass=GlueTop))'
BDII_SEARCH_TIMEOUT=30
BDII_BREATHE_TIME=60
BDII_AUTO_UPDATE=yes
BDII_AUTO_MODIFY=yes
BDII_DIR=/opt/bdii/
BDII_UPDATE_URL=http://grid-deployment.web.cern.ch/grid-deployment/gis/lcg2-bdii/dteam/lcg2-all-sites.conf
BDII_UPDATE_LDIF=https://goc.grid-support.ac.uk/gridsite/bdii/BDII/www/bdii-update.ldif
SLAPD=/usr/sbin/slapd
SLAPADD=/usr/sbin/slapadd

Top BDII update config file:

#
# Top Level BDII configuration file
# ---------------------------------
# This file is generated, DO NOT EDIT it directly

#TW-ASGC-WS
TW-ASGC-WS ldap://ws45.twgrid.org:2170/mds-vo-name=TW-ASGC-WS,o=grid

#TW-ASGC-WS1
TW-ASGC-WS1 ldap://ws52.twgrid.org:2170/mds-vo-name=TW-ASGC-WS1,o=grid

#TW-ASGC-WS2
TW-ASGC-WS2 ldap://ws62.twgrid.org:2170/mds-vo-name=TW-ASGC-WS2,o=grid

#TW-ASGC-WS3
TW-ASGC-WS3 ldap://ws72.twgrid.org:2170/mds-vo-name=TW-ASGC-WS3,o=grid

#TW-ASGC-WS4
TW-ASGC-WS4 ldap://ws82.twgrid.org:2170/mds-vo-name=TW-ASGC-WS4,o=grid

LCG Information System

[Diagram: building blocks of the LCG Information System: GIP (Generic Information Provider), MDS (Monitoring and Discovery System), site BDII, top-level BDII, LDAP protocol]

Components of IS: GRIS, GIIS & BDII

Each site can run a BDII. It collects the information coming from the GIISes:
% ldapsearch -x -h <BDII name> -p 2170 -b "o=grid"

At each site, a site GIIS collects the information given by the GRISes:
% ldapsearch -x -h <site GIIS> -p 2135 -b "mds-vo-name=<name>,o=grid"

Local GRISes run on CEs and SEs at each site and report dynamic and static information:
% ldapsearch -x -h <node name> -p 2135 -b "mds-vo-name=local,o=grid"

Abbreviations:
BDII: Berkeley DataBase Information Index
GIIS: Grid Index Information Server
GRIS: Grid Resource Information Server

From LCG 2.3.0 the site GIIS has been replaced by a "local" BDII:
% ldapsearch -x -h <local BDII> -p 2170 -b "o=grid"
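The three query levels differ only in host, port and search base. The following sketch builds the corresponding command lines without running them. The host names and the site name "Example-Site" are placeholders, and the -H URL form from the local GRIS slide is used instead of -h/-p.

```python
import shlex

GRIS_PORT = 2135   # local GRIS and site GIIS
BDII_PORT = 2170   # site ("local") BDII and top-level BDII


def ldapsearch_argv(host, port, base):
    """Build an anonymous (-x) ldapsearch command for one level of the IS."""
    return ["ldapsearch", "-x", "-H", f"ldap://{host}:{port}", "-b", base]


# Placeholder hosts; replace with real node names before running anything
commands = {
    "local GRIS": ldapsearch_argv("ce.example.org", GRIS_PORT, "mds-vo-name=local,o=grid"),
    "site GIIS": ldapsearch_argv("giis.example.org", GRIS_PORT, "mds-vo-name=Example-Site,o=grid"),
    "BDII": ldapsearch_argv("bdii.example.org", BDII_PORT, "o=grid"),
}

for level, argv in commands.items():
    print(f"{level}: {shlex.join(argv)}")
```

Keeping the command as an argument list makes it safe to hand to subprocess.run later without shell quoting issues.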

LCG Information System (cont’)

We try to figure out how the information is generated and transported among the nodes. Let's have a look at the files…

After a while we will have a basic picture like this in mind:

[Diagram: files and callouts scattered around the node: globus-mds, /opt/var, lcg-static.conf, lcg-dynamic, libexec, ldapadd, ldapsearch, "Templates too!!", "Template II", "Files generated by the system", "Files generated by a daemon", "Even more config files", "This file is new!!!!", "This file was not here yesterday!", "Where does path come from?!"]

Let's try to understand it, element by element:

1. Not paying so much attention to the code
2. Just giving the fundamental FILES

How is the information generated?

globus-mds: daemon script

It is a daemon, started at boot time. It runs the info-providers and has a 60 s cache.

grid-info-resource-ldif.conf

a) Created by the startup script of globus-mds
b) It begins the chain of scripts generating the information of the node

/opt/globus/etc/grid-info-resource-ldif.conf:

dn: Mds-Vo-name=local,o=grid
objectclass: GlobusTop
objectclass: GlobusActiveObject
objectclass: GlobusActiveSearch
type: exec
path: /opt/lcg/libexec
base: lcg-info-wrapper
args:
cachetime: 60
timelimit: 20
sizelimit: 250

Info generation: lcg-info-wrapper

lcg-info-wrapper: interface between the information system and all providers. It is executed by the previous file (grid-info-resource-ldif.conf):

#!/bin/sh
/opt/lcg/libexec/lcg-info-generic \
    /opt/lcg/var/gip/lcg-info-generic.conf

lcg-info-generic (the GIP script):

1. Reads the file containing the static information. Where does this file come from?
2. Executes the dynamic plug-ins which produce the dynamic information. Where is the script name included?
3. The GIP (Generic Information Provider) uses the dynamic information built in the cache. Where is the path name defined?

All of this is included in its argument, the GIP config file lcg-info-generic.conf.
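The three steps can be pictured as a merge of static and dynamic LDIF. The sketch below is not the real lcg-info-generic script: it assumes that dynamic values simply override static ones for the same DN, and the SE name and numbers are invented for illustration.

```python
def parse_entries(ldif_text):
    """Return {dn: {attribute: value}} for simple, single-valued LDIF text."""
    entries, dn = {}, None
    for line in ldif_text.splitlines():
        if not line.strip():              # blank line ends the current entry
            dn = None
            continue
        attr, _, value = line.partition(":")
        attr, value = attr.strip(), value.strip()
        if attr.lower() == "dn":
            dn = value
            entries[dn] = {}
        elif dn is not None:
            entries[dn][attr] = value
    return entries


def merge(static_ldif, dynamic_ldifs):
    """Step 1: load static data. Steps 2-3: overlay each dynamic plug-in output."""
    merged = parse_entries(static_ldif)
    for fragment in dynamic_ldifs:
        for dn, attrs in parse_entries(fragment).items():
            merged.setdefault(dn, {}).update(attrs)   # dynamic wins (assumption)
    return merged


# Hypothetical static file content and plug-in output
DN = "GlueSEUniqueID=se.example.org,mds-vo-name=local,o=grid"
static = f"dn: {DN}\nGlueSEType: disk\nGlueSESizeFree: 0\n"
dynamic = [f"dn: {DN}\nGlueSESizeFree: 757116936\n"]

published = merge(static, dynamic)
print(published[DN])
```

In this picture the static file supplies every attribute with a default, and a plug-in only needs to print the attributes whose values change.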

Configuration file for SE

ldif_file=/opt/lcg/var/gip/lcg-info-static.ldif
generic_script=/opt/lcg/libexec/lcg-info-generic
wrapper_script=/opt/lcg/libexec/lcg-info-wrapper
temp_path=/opt/lcg/var/gip/tmp
template=/opt/lcg/etc/GlueSite.template
template=/opt/lcg/etc/GlueCE.template
template=/opt/lcg/etc/GlueCESEBind.template
template=/opt/lcg/etc/GlueSE.template
template=/opt/lcg/etc/GlueService.template

# Common for all
GlueInformationServiceURL: ldap://lcg00123.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid

dynamic_script=/opt/lcg/libexec/lcg-info-dynamic-se

GlueSEType: disk
GlueSEPort: 2811
GlueSESizeTotal: 0
GlueSESizeFree: 0
GlueSEArchitecture: disk
GlueSAType: permanent
GlueSAPolicyFileLifeTime: permanent
GlueSAPolicyMaxFileSize: 10000

ldif_file: contains the static information in LDIF format
template: GIP templates, used to create the information in LDIF format
dynamic_script: the dynamic information (used and available space) is obtained with df -k /flatfiles/SE00/VO and is visible in $temp_path/lcg-info-synamic-classic.ldif

Configuration file for CE

ldif_file=/opt/lcg/var/gip/lcg-info-static.ldif
generic_script=/opt/lcg/libexec/lcg-info-generic
wrapper_script=/opt/lcg/libexec/lcg-info-wrapper
temp_path=/opt/lcg/var/gip/tmp
template=/opt/lcg/etc/GlueSite.template
template=/opt/lcg/etc/GlueCE.template
template=/opt/lcg/etc/GlueCESEBind.template
template=/opt/lcg/etc/GlueSE.template
template=/opt/lcg/etc/GlueService.template

# Common for all
GlueInformationServiceURL: ldap://lcg00125.grid.sinica.edu.tw:2135/mds-vo-name=local,o=grid

dn: GlueSiteUniqueID=Taiwan-LCG2,mds-vo-name=local,o=grid
GlueSiteName: Taiwan-LCG2
GlueSiteDescription: LCG Site
….

dynamic_script=/opt/lcg/libexec/lcg-info-dynamic-ce
dynamic_script=/opt/lcg/libexec/lcg-info-dynamic-software /opt/lcg/var/gip/lcg-info-generic.conf

ldif_file: contains the static information in LDIF format
template: GIP templates, used to create the information in LDIF format

Dynamic information of CE

• /opt/lcg/libexec/lcg-info-dynamic-pbs

GlueCEInfoLRMSVersion, GlueCEInfoTotal(Free)CPUs, GlueCEPolicyMaxCPUTime(RunningJobs)(WallClockTime), GlueCEStateTotal(Waiting)(Running)Jobs, GlueCEStateWorst(Estimate)ResponseTime

Quantities calculated basically using:
qstat -B -f pbsHost
pbsnodes -a -s pbsHost
where pbsHost is obtained from the script's argument, lcg-info-generic.conf

• /opt/lcg/libexec/lcg-info-dynamic-software

GlueHostApplicationSoftwareRunTimeEnvironment

Obtained by reading the *list files stored under /opt/edg/var/info/VO_name. These files are created by running the script lcg-ManageVOTag.

Configuration files of RB

lcg-info-generic.conf:

ldif_file = /opt/lcg/var/gip/lcg-info-static.ldif
generic_script = /opt/lcg/libexec/lcg-info-generic
wrapper_script = /opt/lcg/libexec/lcg-info-wrapper
temp_path = /opt/lcg/var/gip/tmp (EMPTY)
template = /opt/lcg/etc/GlueService.template

(information relative to the node, written following the Glue Schema)

All RB information is static

Interoperability of LCG IS

[Diagram: Site 1, Site 2 and Site 3, each with local GRISes on its nodes (RB, SE, CE, MSS, PX) and a site GIIS on the CE, together with BDII-A, BDII-B and BDII-C]

Monitoring Tools

GridICE (Gris View)

http://gridice2.cnaf.infn.it:50080/gridice/

Gstat

[Diagram: the GSTAT server (Taiwan) with the GOCDB list of sites and the production BDII as inputs; gstat runs every 15 minutes through a cron job; Agent 1 (ldapsearch), Agent 2 (R-GMA) and Agent 3 (BDII) collect data from Site A (under testing) and Site B; the data pass through filters to the web page]

GIIS Monitoring (global view)

sites  countries  totalCPU  freeCPU  runJob  waitJob  seAvail TB  seUsed TB  maxCPU  avgCPU
208    48         28156     12715    20759   170708   13086.9     8114.9     39755   31301

http://goc.grid.sinica.edu.tw/gstat/

GIIS Monitoring (II): regional view

GIIS (III): Site level profile

[Screenshots: panels (I), (II), (III) and (IV)]

Practical considerations using the LCG Information System

Job description lang.: requirement and ranking

[
Executable = "myexe";
StdOutput = "std.out";
StdError = "std.err";
InputSandBox = {"myexe"};
OutputSandBox = {"std.out","std.err"};
rank = -other.GlueCEPolicyMaxCPUTime;
DefaultRank = -other.GlueCEStateEstimatedResponseTime;
VirtualOrganisation = "dteam";
requirements = ( other.GlueCEUniqueID == "lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-dteam" )
    && ( other.GlueCEStateStatus == "Production" )
    && ( other.RunTimeEnvironment == "my_own_program_version_minor_version" );
]

Example (I): Job ranking and requirement

• Atlas Production Monitoring Page:
  • Requirement:
    • ( MaxCPUTime * CINT2000 >= 333333 ) &&
    • ( Memory > 600 ) &&
    • ( OutboundConnectivity == True ) && ( CEStatus == Production )
  • Ranking:
    • if ( Waiting == 0 ) { if ( Running == 0 ) { rank = FreeCPUs * 100 } else { rank = (FreeCPUs * 100) / ( Running + 1 ) } } else { rank = - Waiting / ( Running + 1 ) }
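The requirement and ranking expressions above translate directly into a few lines of Python. This is a sketch for trying out values, not the code used by the production system. It assumes real-valued division, and the third set of numbers in the usage lines is invented.

```python
def atlas_rank(free_cpus, running, waiting):
    """Rank a CE: empty queues rank by free CPUs, busy queues get a negative rank."""
    if waiting == 0:
        if running == 0:
            return free_cpus * 100
        return (free_cpus * 100) / (running + 1)
    return -waiting / (running + 1)


def meets_requirements(ce):
    """Apply the requirement expression to a dict of CE attributes."""
    return (
        ce["MaxCPUTime"] * ce["CINT2000"] >= 333333
        and ce["Memory"] > 600
        and ce["OutboundConnectivity"] is True
        and ce["CEStatus"] == "Production"
    )


# An idle CE, a CE with one running job and no free CPU, and a CE with a queue
print(atlas_rank(free_cpus=123, running=0, waiting=0))
print(atlas_rank(free_cpus=0, running=1, waiting=0))
print(atlas_rank(free_cpus=10, running=4, waiting=5))
```

Any CE with waiting jobs gets a negative rank, so it always sorts below a CE whose queue is empty.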

Example (II): FCR (Freedom of Choice for Resources)

https://goc.grid-support.ac.uk/gridsite/bdii/site-apps/FCR-cgi/fcr.cgi

Example (III): High level cmd: lcg-infosites (I)

$ lcg-infosites --vo dteam ce | grep sinica

**************************************************************
These are the related data for dteam: (in terms of CEs)
**************************************************************
#CPU  Free  Total Jobs  Running  Waiting  ComputingElement
-------------------------------------------------------------------------------------------------
 124   123           0        0        0  lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-dteam
   0     0           1        1        0  testbed001.phys.sinica.edu.tw:2119/jobmanager-lcgpbs-dteam

$ lcg-infosites --vo dteam ce -v 2 | head -10

**************************************************************
These are the related data for dteam: (in terms of CEs)
**************************************************************
RAMMemory  Operating System  System Version  Processor  CE Name
----------------------------------------------------------------------------------------------------------------------------------------------
2000       ScientificLinux   3               PIV        lcg00125.grid.sinica.edu.tw
1025       ScientificLinux   3               PIII       testbed001.phys.sinica.edu.tw

High level cmd: lcg-infosites (II)

• Retrieving information of SE:

$ lcg-infosites --vo twgrid se

**************************************************************
These are the related data for twgrid: (in terms of SE)
**************************************************************
Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
814664           13534300              se.cc.ncu.edu.tw
33479944         29051084              melon035.ngpp.ngp.org.sg
757116936        1200483004            lcg00123.grid.sinica.edu.tw
3960000000       12180000000           castor.grid.sinica.edu.tw
6740000000       580000000             castorsc.grid.sinica.edu.tw

• Where are the closeSE?

$ lcg-infosites --vo twgrid closeSE
Name of the CE: ce.cc.ncu.edu.tw:2119/jobmanager-lcgpbs-twgrid
Name of the close SE: se.cc.ncu.edu.tw

Name of the CE: melon.ngpp.ngp.org.sg:2119/jobmanager-lcgpbs-twgrid
Name of the close SE: melon035.ngpp.ngp.org.sg

Name of the CE: lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-twgrid
Name of the close SE: lcg00123.grid.sinica.edu.tw
Name of the close SE: castor.grid.sinica.edu.tw
Name of the close SE: castorsc.grid.sinica.edu.tw
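When the closeSE listing has to be used in a script, it can be turned into a CE-to-SE mapping. The sketch below assumes the "Name of the CE:" / "Name of the close SE:" line format shown on this slide. The helper name parse_close_se is hypothetical.

```python
def parse_close_se(output):
    """Map each CE to the list of close SEs printed after it."""
    close = {}
    ce = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Name of the CE:"):
            # split only on the first colon: the CE id itself contains ":2119"
            ce = line.split(":", 1)[1].strip()
            close[ce] = []
        elif line.startswith("Name of the close SE:") and ce is not None:
            close[ce].append(line.split(":", 1)[1].strip())
    return close


sample = """\
Name of the CE: ce.cc.ncu.edu.tw:2119/jobmanager-lcgpbs-twgrid
Name of the close SE: se.cc.ncu.edu.tw

Name of the CE: lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-twgrid
Name of the close SE: lcg00123.grid.sinica.edu.tw
Name of the close SE: castor.grid.sinica.edu.tw
Name of the close SE: castorsc.grid.sinica.edu.tw
"""

close_se = parse_close_se(sample)
for ce_name, se_list in close_se.items():
    print(ce_name, "->", ", ".join(se_list))
```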

Arguments for lcg-infosites

High level cmd: lcg-info

• This command lists the CEs or SEs that satisfy a given set of conditions and prints the values of a given set of attributes

• The information is retrieved from the BDII named in the $LCG_GFAL_INFOSYS environment variable. E.g.:

$ env | grep GFAL
LCG_GFAL_INFOSYS=lcg00126.grid.sinica.edu.tw:2170

• [Quote] The query syntax is like this:
  • attr1 op1 value1, ... attrN opN valueN

where attrN is an attribute name, opN is =, >= or <=, and the cuts are ANDed. The cuts are comma-separated and spaces are not allowed.
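To make the query syntax concrete, here is a small sketch that parses such a string and applies the cuts to a record. It mimics the quoted description only and is not lcg-info's own parser. Values containing commas are not handled, and numeric comparison is attempted before string comparison as an assumption.

```python
import re

# attribute name, then one of the three operators, then the value
_CUT = re.compile(r"^([A-Za-z0-9_]+)(>=|<=|=)(.+)$")


def parse_query(query):
    """Split 'attr1op1value1,...' into a list of (attr, op, value) cuts."""
    cuts = []
    for cut in query.split(","):
        match = _CUT.match(cut)
        if match is None:
            raise ValueError(f"malformed cut: {cut!r}")
        cuts.append(match.groups())
    return cuts


def _compare(actual, op, wanted):
    try:
        actual, wanted = float(actual), float(wanted)   # numeric if both parse
    except ValueError:
        actual, wanted = str(actual), str(wanted)
    if op == "=":
        return actual == wanted
    if op == ">=":
        return actual >= wanted
    return actual <= wanted


def matches(record, cuts):
    """All cuts are ANDed; a missing attribute fails the cut."""
    return all(a in record and _compare(record[a], op, v) for a, op, v in cuts)


cuts = parse_query("FreeCPUs>=10,CEStatus=Production")
print(cuts)
print(matches({"FreeCPUs": "123", "CEStatus": "Production"}, cuts))
```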

High level cmd: lcg-info (II)

• Usage

lcg-info --list-ce [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]

lcg-info --list-se [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]

lcg-info --list-attrs

lcg-info --help

High level cmd: lcg-info (III)

• Arguments of lcg-info:

High level cmd: lcg-info (IV)

Attribute name  Glue object class     Glue attribute name
MaxTime         GlueCE                GlueCEPolicyMaxWallClockTime
CEStatus        GlueCE                GlueCEStateStatus
TotalJobs       GlueCE                GlueCEStateTotalJobs
CEVOs           GlueCE                GlueCEAccessControlBaseRule
TotalCPUs       GlueCE                GlueCEInfoTotalCPUs
FreeCPUs        GlueCE                GlueCEStateFreeCPUs
CE              GlueCE                GlueCEUniqueID
WaitingJobs     GlueCE                GlueCEStateWaitingJobs
RunningJobs     GlueCE                GlueCEStateRunningJobs
CloseCE         GlueCESEBindGroup     GlueCESEBindGroupCEUniqueID
CloseSE         GlueCESEBindGroup     GlueCESEBindGroupSEUniqueID
SEVOs           GlueSA                GlueSAAccessControlBaseRule
UsedSpace       GlueSA                GlueSAStateUsedSpace
AvailableSpace  GlueSA                GlueSAStateAvailableSpace
Type            GlueSE                GlueSEType
SE              GlueSE                GlueSEUniqueID
Protocol        GlueSEAccessProtocol  GlueSEAccessProtocolType
ArchType        GlueSL                GlueSLArchitectureType
Processor       GlueSubCluster        GlueHostProcessorModel
OS              GlueSubCluster        GlueHostOperatingSystemName
Cluster         GlueSubCluster        GlueSubClusterUniqueID
Tag             GlueSubCluster        GlueHostApplicationSoftwareRunTimeEnvironment
Memory          GlueSubCluster        GlueHostMainMemoryRAMSize
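The table makes it possible to express an lcg-info style cut as a raw LDAP search filter. The sketch below covers a handful of rows only and performs no escaping of special characters. Whether lcg-info builds its filters this way internally is not stated in the source, so treat it as an illustration of the mapping.

```python
# A few rows of the table: short name -> (Glue object class, Glue attribute)
GLUE_ATTRS = {
    "FreeCPUs": ("GlueCE", "GlueCEStateFreeCPUs"),
    "CEStatus": ("GlueCE", "GlueCEStateStatus"),
    "WaitingJobs": ("GlueCE", "GlueCEStateWaitingJobs"),
    "AvailableSpace": ("GlueSA", "GlueSAStateAvailableSpace"),
    "Tag": ("GlueSubCluster", "GlueHostApplicationSoftwareRunTimeEnvironment"),
}


def ldap_filter(cuts):
    """Translate (attr, op, value) cuts on one object class into an LDAP filter."""
    classes = {GLUE_ATTRS[attr][0] for attr, _, _ in cuts}
    if len(classes) != 1:
        raise ValueError("cuts must all refer to the same Glue object class")
    (object_class,) = classes
    terms = "".join(f"({GLUE_ATTRS[attr][1]}{op}{value})" for attr, op, value in cuts)
    return f"(&(objectClass={object_class}){terms})"


example = ldap_filter([("FreeCPUs", ">=", "10"), ("CEStatus", "=", "Production")])
print(example)
```

The resulting string can be passed as the filter argument of ldapsearch against a BDII, with the Glue attribute names from the right-hand column as the requested attributes.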

High level cmd: lcg-info (V)

• Check SPEC info from TWGrid VO:

$ lcg-info --list-ce --vo twgrid --attrs 'CINT2000,CFP2000'
- CE: ce.cc.ncu.edu.tw:2119/jobmanager-lcgpbs-twgrid
  - CINT2000 1107
  - CFP2000 0

- CE: lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-twgrid
  - CINT2000 1378
  - CFP2000 1583

- CE: melon.ngpp.ngp.org.sg:2119/jobmanager-lcgpbs-twgrid
  - CINT2000 381
  - CFP2000 0

• List available Tags, say CMS, from TWGrid VO:

$ lcg-info --list-ce --vo twgrid --attrs 'Tag' | grep cms
VO-cms-slc3_ia32_gcc323
VO-cms-OSCAR_3_6_5
VO-cms-ORCA_8_13_1
VO-cms-FAMOS_1_3_2
VO-cms-ORCA_8_7_4
VO-cms-OSCAR_3_9_8

glite-sd-query

• glite-sd-query helps to search for services of a given type (e.g. storage elements) published by a specific site, and it also provides extensive information about them.

• Basic query, SRM service endpoint from a specific SE:

[lcg00122] ~> glite-sd-query --host castorsc.grid.sinica.edu.tw -t SRM
Name: httpg://castorsc.grid.sinica.edu.tw:8443/srm/managerv1
Type: SRM
Endpoint: httpg://castorsc.grid.sinica.edu.tw:8443/srm/managerv1
Version: 1.1.0

• Print more details, including the SAPath and supported VOs associated with the mount point:

[lcg00122] ~> glite-sd-query --host castorsc.grid.sinica.edu.tw -t SRM -x | grep Mount
Key: cms:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/cms
Key: alice:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/alice
Key: atlas:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/atlas
Key: dteam:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/dteam
Key: apesci:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/apesci
Key: biomed:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/biomed
Key: twgrid:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/twgrid
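The "Key: ... - Value: ..." lines are easy to post-process. The sketch below collects the mount point per VO. It assumes exactly the line format shown in the output above, and parse_mount_points is a hypothetical helper name.

```python
def parse_mount_points(output):
    """Return {vo: mount point} from 'Key: <vo>:SEMountPoint - Value: <path>' lines."""
    mounts = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("Key:") or " - Value: " not in line:
            continue
        key, value = line[len("Key:"):].split(" - Value: ", 1)
        vo, _, name = key.strip().partition(":")
        if name == "SEMountPoint":
            mounts[vo] = value.strip()
    return mounts


sample = """\
Key: cms:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/cms
Key: dteam:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/dteam
Key: twgrid:SEMountPoint - Value: /castor/grid.sinica.edu.tw/sc/twgrid
"""

mounts = parse_mount_points(sample)
print(mounts["dteam"])
```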

lcg-ManageVOTag

• lcg-ManageVOTag helps to retrieve all VO tags published by a site CE, so that end users can easily get the up-to-date status of software installation/validation directly from the IS.

[lcg00122] ~> lcg-ManageVOTag -host lcg00125.grid.sinica.edu.tw -vo atlas --list
VO-atlas-release-11.0.42
VO-atlas-release-11.0.5
VO-atlas-release-11.5.0
VO-atlas-production-12.0.3
VO-atlas-tier-T1
VO-atlas-cloud-TW
VO-atlas-production-12.0.31

Example (IV): APEL Accounting

http://goc.grid-support.ac.uk/gridsite/accounting/interoperate.html

APROC: Accounting

Example (V): GridView

http://lxgate24.cern.ch/GRIDVIEW/

LDAP browser (win + linux)

• Download from http://www-unix.mcs.anl.gov/~gawor/ldap/download.html (latest version is 2.8.2)

LDAP Browser: SE (Castor)

LDAP Browser: VO Tag

lcg utilities (lcg-cr, lcg-rep etc.)

• lcg-cr -v --vo dteam -d lcg00123.grid.sinica.edu.tw file:///etc/group -l lfn:test
  • By default, without "-d", lcg-cr uses $VO_DTEAM_DEFAULT_SE as the target SE on which to register the file
  • The GlueSAPath from the IS determines where the file will be located
    • e.g.: $ ldapsearch -x -H ldap://lcg00123.grid.sinica.edu.tw:2135 -b mds-vo-name=local,o=grid | grep dteam | grep GlueSAPath
  • Before the LFC was introduced, the RLS service endpoint also had to be included in the IS if the "-l" option was specified (troubleshooting).

• lcg-rep -v --vo dteam -d castor.grid.sinica.edu.tw lfn:test
  • The name space (cf. absolute path) is retrieved from the IS, e.g.:
    • GlueSAPath: /castor/grid.sinica.edu.tw/grid/dteam
  • If the service unique ID is missing from the IP, replication fails (check gstat)
  • Use "-p" to define the relative path to which the files are replicated.

Summary

• The LCG information system consists of:
  • GIP, which generates static/dynamic information for:
    • the site BDII as well as the global (top) BDII
  • MDS
  • BDII
• Users retrieve information from the LDAP-based IS
  • High level commands are available in LCG
• Gstat has been widely used for profiling and monitoring IS problems at site and global level, and it continuously provides the grid resource status

References

• R-GMA overview page
  • http://www.r-gma.org/
• R-GMA in EGEE
  • http://hepunx.rl.ac.uk/egee/jra1-uk/
• R-GMA Documentation
  • http://hepunx.rl.ac.uk/egee/jra1-uk/glite-r1/
• GLUE Schema
  • http://infnforge.cnaf.infn.it/glueinfomodel/