www.eu-eela.eu
gLite Information System
UNIANDES OOD Team
Daniel Alberto Burbano Sefair, [email protected]
Michael Angel Pérez Cabarcas, [email protected]
Universidad de Los Andes, (Colombia)
26-27 February 2009
Overview
• BDII Introduction
• BDII Structure of oper.vo.eu-eela.eu and prod.vo.eu-eela.eu.
• Getting information with lcg-infosites and lcg-info
• Learned Experiences
• Questions to consolidate knowledge
• References
• Questions
BDII Introduction
• What is it?
– A system that collects information on the state of grid resources.
• What is it used for?
– Discovering the resources of the grid and their state
– Workload management (WMS)
– Monitoring (health status of resources)
• How does it work?
– Each resource monitors its state locally and publishes fresh data to the information system.
– All components adopt a common data model.
BDII (Berkeley Database Information Index)
The BDII is an information system based on LDAP (Lightweight Directory Access Protocol). LDAP is an application-level protocol that provides access to a directory service.
BDII Structure (from lcg-info point of view)
A GRIS can run on service nodes such as the LFC, WMS or LB.
lcg-infosites using TopBDIIs (First Level)
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu ce
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
771 270 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
771 270 1 1 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
771 270 201 200 1 gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
771 270 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
96 50 0 0 0 ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
104 104 0 0 0 ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
146 78 1 1 0 ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaprod
28 28 0 0 0 ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
290 86 0 0 0 grid012.ct.infn.it:2119/jobmanager-lcglsf-prod
112 88 0 0 0 ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaprod
22 22 0 0 0 ramses.dsic.upv.es:2119/jobmanager-lcgpbs-eela
• Getting information about the CEs from the TopBDII bdii.eela.ufrj.br, the default set in LCG_GFAL_INFOSYS
[michael@yali ~]$ echo $LCG_GFAL_INFOSYS
bdii.eela.ufrj.br:2170
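The table printed above is easy to post-process. The following is a minimal sketch, assuming the six-column layout shown on this slide (five integers followed by the CE identifier); the names CEStatus and parse_infosites_ce are hypothetical helpers, not part of the gLite tools, and the sample is an abridged copy of the output above.

```python
from typing import List, NamedTuple


class CEStatus(NamedTuple):
    """One row of `lcg-infosites ... ce` output (hypothetical record type)."""
    cpus: int
    free: int
    total_jobs: int
    running: int
    waiting: int
    ce: str


def parse_infosites_ce(text: str) -> List[CEStatus]:
    """Turn the printed table into records, skipping header and separator lines."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five integer columns followed by the CE identifier.
        if len(fields) == 6 and all(f.isdigit() for f in fields[:5]):
            records.append(CEStatus(*map(int, fields[:5]), fields[5]))
    return records


# Abridged sample copied from the slide.
sample = """\
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
771 270 201 200 1 gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
96 50 0 0 0 ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
104 104 0 0 0 ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
"""

records = parse_infosites_ce(sample)
# Queues whose CPUs are all free.
idle = [r.ce for r in records if r.free == r.cpus]
print(idle)
```

Filtering on the parsed fields (for example, queues with waiting jobs) is then a one-line list comprehension.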
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu ce --is bdii-eela.ceta-ciemat.es
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
771 269 1 1 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
771 269 3 3 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
771 269 201 200 1 gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
771 269 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
96 49 2 2 0 ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
104 104 0 0 0 ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
146 78 1 1 0 ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaprod
52 28 0 0 0 kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
28 28 0 0 0 ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
290 81 0 0 0 grid012.ct.infn.it:2119/jobmanager-lcglsf-prod
112 88 0 0 0 ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaprod
22 22 0 0 0 ramses.dsic.upv.es:2119/jobmanager-lcgpbs-eela
• Getting information about the CEs from the TopBDII bdii-eela.ceta-ciemat.es, selected with --is
[michael@yali ~]$ echo $LCG_GFAL_INFOSYS
bdii.eela.ufrj.br:2170
TopBDII: bdii.eela.ufrj.br
VO: oper.vo.eu-eela.eu
[michael@yali ~]$ lcg-infosites --vo oper.vo.eu-eela.eu ce
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
771 273 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
771 273 1 1 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
771 273 201 200 1 gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
771 273 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
96 52 0 0 0 ce-eela.ciemat.es:2119/jobmanager-lcgpbs-oper
22 22 0 0 0 ramses.dsic.upv.es:2119/jobmanager-lcgpbs-edteam
28 28 0 0 0 ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-oper
290 87 0 0 0 grid012.ct.infn.it:2119/jobmanager-lcglsf-oper
146 78 0 0 0 ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaoper
104 104 0 0 0 ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-oper
112 88 0 0 0 ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaoper
[michael@yali ~]$ lcg-infosites --vo oper.vo.eu-eela.eu ce --is bdii-eela.ceta-ciemat.es
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
771 273 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
771 273 1 1 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
771 273 201 200 1 gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
771 273 0 0 0 gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
96 52 0 0 0 ce-eela.ciemat.es:2119/jobmanager-lcgpbs-oper
22 22 0 0 0 ramses.dsic.upv.es:2119/jobmanager-lcgpbs-edteam
28 28 0 0 0 ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-oper
290 87 0 0 0 grid012.ct.infn.it:2119/jobmanager-lcglsf-oper
146 78 0 0 0 ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaoper
104 104 0 0 0 ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-oper
112 88 0 0 0 ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaoper
Getting information about the CEs
TopBDII: bdii-eela.ceta-ciemat.es
VO: oper.vo.eu-eela.eu
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is ce-eela.ceta-ciemat.es
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
104 104 0 0 0 ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
Avail Space(Kb) Used Space(Kb) Type SEs
----------------------------------------------------------
66020000 1936426 n.a se-eela.ceta-ciemat.es
lcg-infosites using SiteBDIIs (Second level)
The SiteBDII publishes the information of the site's CE and SE.
The SiteBDII may run on the CE itself (as with ce-eela.ceta-ciemat.es) or on a separate host (as with piaroa.uniandes.edu.co).
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is piaroa.uniandes.edu.co
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
52 28 0 0 0 kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
Avail Space(Kb) Used Space(Kb) Type SEs
----------------------------------------------------------
427900000 42 n.a moboro.uniandes.edu.co
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is kuragua.uniandes.edu.co
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
52 28 0 0 0 kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is ce01.eela.if.ufrj.br
#CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------
28 28 0 0 0 ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is lnx105.eela.if.ufrj.br
Avail Space(Kb) Used Space(Kb) Type SEs
----------------------------------------------------------
837640000 3281642 n.a lnx105.eela.if.ufrj.br
lcg-infosites using GRIS (Third level)
Getting information on the relation between CE and SE
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu closeSE
Name of the CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
se-eela.ciemat.es
se-eela.ciemat.es
Name of the CE: ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
se-eela.ceta-ciemat.es
Name of the CE: ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaprod
se01.macc.unican.es
Name of the CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
moboro.uniandes.edu.co
Name of the CE: ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
lnx105.eela.if.ufrj.br
Name of the CE: grid012.ct.infn.it:2119/jobmanager-lcglsf-prod
aliserv1.ct.infn.it
This command lists, for each CE, the closest SE belonging to the same site.
What happens when two SEs belong to the same site? In theory, an algorithm compares the geographic location with the speed of the access path to the storage element. A variable declares which SE is closest to the CE.
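To use the CE-to-SE relation programmatically, the closeSE listing can be turned into a dictionary. This is a minimal sketch, assuming the "Name of the CE:" line format shown on this slide; parse_close_se is a hypothetical helper, and repeated SE names (as for ce-eela.ciemat.es above) are collapsed.

```python
from typing import Dict, List, Optional

PREFIX = "Name of the CE:"


def parse_close_se(text: str) -> Dict[str, List[str]]:
    """Map each CE queue to the list of its close SEs, without duplicates."""
    close: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(PREFIX):
            current = line[len(PREFIX):].strip()
            close[current] = []
        elif current is not None and line not in close[current]:
            # Any other non-empty line is an SE hostname for the current CE.
            close[current].append(line)
    return close


# Abridged sample copied from the slide.
sample = """\
Name of the CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
se-eela.ciemat.es
se-eela.ciemat.es
Name of the CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
moboro.uniandes.edu.co
"""

close_se = parse_close_se(sample)
print(close_se["kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod"])
```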
[michael@yali ~]$ lcg-info --list-attrs
Attribute name     Glue object class   Glue attribute name
WorstRespTime      GlueCE              GlueCEStateWorstResponseTime
CEAppDir           GlueCE              GlueCEInfoApplicationDir
TotalCPUs          GlueCE              GlueCEInfoTotalCPUs
MaxRunningJobs     GlueCE              GlueCEPolicyMaxRunningJobs
CE                 GlueCE              GlueCEUniqueID
WaitingJobs        GlueCE              GlueCEStateWaitingJobs
MaxCPUTime         GlueCE              GlueCEPolicyMaxCPUTime
LRMSVersion        GlueCE              GlueCEInfoLRMSVersion
MaxTotalJobs       GlueCE              GlueCEPolicyMaxTotalJobs
CEStatus           GlueCE              GlueCEStateStatus
LRMS               GlueCE              GlueCEInfoLRMSType
CEVOs              GlueCE              GlueCEAccessControlBaseRule
AssignedJobSlots   GlueCE              GlueCEPolicyAssignedJobSlots
FreeCPUs           GlueCE              GlueCEStateFreeCPUs
RunningJobs        GlueCE              GlueCEStateRunningJobs
EstRespTime        GlueCE              GlueCEStateEstimatedResponseTime
FreeJobSlots       GlueCE              GlueCEStateFreeJobSlots
Cluster            GlueCE              GlueCEInfoHostName
lcg-info (List attributes)
This command shows the attributes that can be used to get information about the resources available to the VOs.
[michael@yali ~]$ echo $LCG_GFAL_INFOSYS
bdii.eela.ufrj.br:2170
The first column is used to get information about the Grid infrastructure with the lcg-info command. The third column is used in conditions in the JDL.
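The link between the two columns can be illustrated with a small lookup. This is a sketch under stated assumptions: the mapping is a subset of the table listed by lcg-info --list-attrs, jdl_requirement is a hypothetical helper, and the "other." prefix reflects the usual way a JDL Requirements expression refers to attributes of the candidate CE.

```python
# Subset of the lcg-info attribute names (first column) and the
# corresponding Glue attribute names (third column) from the slide.
GLUE_ATTRS = {
    "TotalCPUs": "GlueCEInfoTotalCPUs",
    "FreeCPUs": "GlueCEStateFreeCPUs",
    "RunningJobs": "GlueCEStateRunningJobs",
    "WaitingJobs": "GlueCEStateWaitingJobs",
    "MaxCPUTime": "GlueCEPolicyMaxCPUTime",
}


def jdl_requirement(attr: str, op: str, value: int) -> str:
    """Build a JDL condition from an lcg-info attribute name.

    Assumption: the JDL refers to CE attributes as other.<GlueAttribute>.
    """
    glue_name = GLUE_ATTRS[attr]  # raises KeyError for unknown attributes
    return f"other.{glue_name} {op} {value}"


line = "Requirements = " + jdl_requirement("FreeCPUs", ">", 0) + ";"
print(line)
```

The same short name (FreeCPUs) would appear in an lcg-info --query or --attrs argument, while the long Glue name is what goes into the JDL file.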
[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-ce --attrs Tag
- CE: ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
- Tag LCG-2 LCG-2_1_0 LCG-2_1_1 LCG-2_2_0 LCG-2_3_0 LCG-2_3_1 LCG-2_4_0 LCG-2_5_0 LCG-2_6_0 LCG-2_7_0 GLITE-3_0_0 GLITE-3_0_1 GLITE-3_1_0 R-GMA MPICH
- CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
- Tag LCG-2 LCG-2_1_0 LCG-2_1_1
lcg-info (Applications)
This command lists the CEs together with the applications (software tags) that can be executed for the VO prod.vo.eu-eela.eu.
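A common use of this listing is to find the CEs that publish a given tag. The sketch below assumes the "- CE:" and "- Tag" markers seen in the output above and accepts both on one line or on separate lines; parse_tags is a hypothetical helper, and the sample is abridged from the slide.

```python
from typing import Dict, Optional, Set


def parse_tags(text: str) -> Dict[str, Set[str]]:
    """Map each CE queue to the set of tags listed for it."""
    tags: Dict[str, Set[str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- CE:"):
            # The tag list may follow on the same line after "- Tag".
            head, _, rest = line[len("- CE:"):].partition("- Tag")
            current = head.strip()
            tags[current] = set(rest.split())
        elif current is not None and line.startswith("- Tag"):
            tags[current].update(line[len("- Tag"):].split())
    return tags


# Abridged sample copied from the slide.
sample = """\
- CE: ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod - Tag LCG-2 GLITE-3_1_0 R-GMA MPICH
- CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
- Tag LCG-2 LCG-2_1_0 LCG-2_1_1
"""

tags = parse_tags(sample)
with_mpich = sorted(ce for ce, t in tags.items() if "MPICH" in t)
print(with_mpich)
```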
[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-ce --attrs Tag --bdii piaroa.uniandes.edu.co:2170
- CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
- Tag LCG-2 LCG-2_1_0 LCG-2_1_1 LCG-2_2_0 LCG-2_3_0 LCG-2_3_1 LCG-2_4_0 LCG-2_5_0 LCG-2_6_0 LCG-2_7_0 GLITE-3_0_0 GLITE-3_0_1 GLITE-3_0_2 R-GMA GAUSSIAN03 RASTER3D ANGA-1.2.10 VO-cms-CMSSW_1_6_12 VO-cms-CMSSW_1_8_4
This command shows the applications that can be executed on the CE kuragua.uniandes.edu.co for the VO prod.vo.eu-eela.eu.
lcg-info (List applications of a Site)
[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-se --attrs CloseCE
- SE: gridstore.cs.tcd.ie
- CloseCE gridgate.cs.tcd.ie:2119/jobmanager-pbs-gridwebcom
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-himem
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-twoweek
          gridgate.cs.tcd.ie:2119/jobmanager-pbs-test
- SE: moboro.uniandes.edu.co
- CloseCE kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
          kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-cms
          kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-oper
          kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-dteam
          kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-ops
This command lists, for each SE, the CE queues closest to it.
lcg-info (List the closest queues)
Using a specific site:
lcg-info --vo prod.vo.eu-eela.eu --list-ce --attrs CloseSE --bdii ce-eela.ciemat.es:2170
[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-ce --query 'TotalCPUs = 52' --attrs 'RunningJobs,FreeCPUs'
- CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
- RunningJobs 0
- FreeCPUs 27
lcg-info (given a query)
This command selects the CEs that have 52 CPUs and shows two of their attributes: running jobs and free CPUs.
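When such queries are issued from a script, it helps to assemble the argument list in one place. The following is a minimal sketch that uses only the lcg-info options appearing on these slides (--vo, --list-ce, --query, --attrs, --bdii); lcg_info_cmd is a hypothetical helper, and running the command still requires a configured gLite User Interface.

```python
import shlex
from typing import List, Optional, Sequence


def lcg_info_cmd(vo: str, query: str, attrs: Sequence[str],
                 bdii: Optional[str] = None) -> List[str]:
    """Build the argument list for an `lcg-info --list-ce` query."""
    cmd = ["lcg-info", "--vo", vo, "--list-ce",
           "--query", query, "--attrs", ",".join(attrs)]
    if bdii:
        # Optional host:port of the BDII to contact instead of the default.
        cmd += ["--bdii", bdii]
    return cmd


cmd = lcg_info_cmd("prod.vo.eu-eela.eu", "TotalCPUs = 52",
                   ["RunningJobs", "FreeCPUs"])
# shlex.join quotes the query so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd) avoids shell quoting issues altogether, since the query string contains spaces.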
• The following files are used to publish the list of applications installed at a site. They are located on the CE.
– /opt/edg/var/info/prod.vo.eu-eela.eu
– /opt/edg/var/info/oper.vo.eu-eela.eu
• Restart the machine so that the new values are published. We do not know which job controls this; if you do, please send us an email.
Learned experiences
• Error Message 1
• Description
– The WMS cannot select a CE, although the job can still be submitted with glite-wms-job-submit -a -o output -r job.jdl
• Detection
– One possible cause, as in this case, is a wrong BDII_HOST variable in site-info.def. It must be set to the hostname of the TopBDII.
[dburbano@nobsa ~]$ glite-wms-job-list-match -a -o id --vo VO-name job.jdl
Warning - --vo option ignored
Connecting to the service https://bache.uniandes.edu.co:7443/glite_wms_wmproxy_server
Error - Operation failed
Unable to perform the operation: The Operation is not allowed: Error during matchmaking: Problems during rank evaluation (e.g. GRISes down, wrong JDL rank expression, etc.)
Method: jobListMatch
Learned experiences (Solution to error message 1)
1. Change the value of BDII_HOST in site-info.def to the hostname of the TopBDII (bdii.eela.ufrj.br or bdii-eela.ceta-ciemat.es), as listed at http://eoc.eu-eela.eu/doku.php?id=central_services, and run YAIM again.
2. Verify that the following variables are correctly declared in site-info.def:
   1. BDII_REGIONS="CE SE LFC PX MON" # list of the services provided by the site
   2. BDII_CE_URL="ldap://$CE_HOST:2170/mds-vo-name=local,o=grid"
   3. BDII_SE_URL="ldap://$DPM_HOST:2170/mds-vo-name=local,o=grid"
   4. BDII_RB_URL="ldap://$RB_HOST:2170/mds-vo-name=local,o=grid"
3. Verify that the "mds-vo-name" value of the BDIIs (TopBDII, SiteBDII, GRIS) in /opt/bdii/etc/bdii.conf is configured with the same value:
BDII_MODIFY_DN=yes
BDII_BIND=mds-vo-name=local,o=grid
Question: Are these parameters correct for both gLite 3.1 and gLite 3.0? No.
With gLite 3.1 we use XX_HOST:2170/mds-vo-name=local
With gLite 3.0 we use XX_HOST:2135/mds-vo-name=resource
In theory, mds-vo-name should be:
TopBDII: local
SiteBDII: the site name
GRIS: resource
The next slide shows more information.
Resource information: GRIS
• Generic Information Provider (GIP):
– Configurable information provider that separates static from dynamic information.
– Produces LDIF files and publishes them in LDAP servers.
– Information can be retrieved by contacting a given port:
globus-mds:
ldapsearch -x -H ldap://<Resource DN>:2135 -b mds-vo-name=local,o=grid
BDII:
ldapsearch -x -H ldap://<Resource DN>:2170 -b mds-vo-name=resource,o=grid
Site BDII:
ldapsearch -x -H ldap://<SiteBDII DN>:2170 -b mds-vo-name=<SITE NAME>,o=grid
Top BDII:
ldapsearch -x -H ldap://<Resource DN>:2170 -b mds-vo-name=local,o=grid
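The four port and search-base combinations above can be kept in a single table. This is a minimal sketch that only rebuilds the ldapsearch command lines shown on this slide; the level names, the ldapsearch_args helper and the site name EXAMPLE-SITE are hypothetical, and the sketch does not contact any server.

```python
from typing import Dict, List, Optional, Tuple

# Level -> (port, search base), as listed on the slide.
LEVELS: Dict[str, Tuple[int, str]] = {
    "globus-mds": (2135, "mds-vo-name=local,o=grid"),
    "resource-bdii": (2170, "mds-vo-name=resource,o=grid"),
    "site-bdii": (2170, "mds-vo-name={site},o=grid"),
    "top-bdii": (2170, "mds-vo-name=local,o=grid"),
}


def ldapsearch_args(level: str, host: str,
                    site: Optional[str] = None) -> List[str]:
    """Return the ldapsearch argument list for the given information-system level."""
    port, base = LEVELS[level]
    if "{site}" in base:
        if not site:
            raise ValueError("a site name is required for the Site BDII")
        base = base.format(site=site)
    return ["ldapsearch", "-x", "-H", f"ldap://{host}:{port}", "-b", base]


print(" ".join(ldapsearch_args("top-bdii", "bdii.eela.ufrj.br")))
# EXAMPLE-SITE is a placeholder, not the real site name.
print(" ".join(ldapsearch_args("site-bdii", "piaroa.uniandes.edu.co",
                               site="EXAMPLE-SITE")))
```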
Learned experiences (SiteBDII files) 1/2
SiteBDII: configuration of the bdii.conf file.
The configuration step generates this file correctly from the site-info.def file, so do not modify it by hand.
Learned experiences (SiteBDII files) 2/2
SiteBDII: configuration of the bdii-update.conf file.
Are the framed lines after the GIP line necessary?
Do not publish the framed lines to the TopBDIIs, because doing so would allow VO users to use these resources.
References
• GILDA Tutorials
– https://grid.ct.infn.it/twiki/bin/view/GILDA/InformationSystems
• BDII Documentation
– https://twiki.cern.ch/twiki/bin/view/EGEE/BDII
– https://twiki.cern.ch/twiki/bin/view/LCG/BdiiNotes
• LCG-2 User Guide
– https://edms.cern.ch/file/454439//LCG-2-UserGuide.html
• GLUE Schema
– http://infnforge.cnaf.infn.it/glueinfomodel
Some Exercises
• Topic-related wiki pages:
– https://grid.ct.infn.it/twiki/bin/view/GILDA/InformationSystems
Questions…