25.02.2008 AstroGrid-D Monitoring AstroGrid-D Meeting @ AIP 25.-26.2.2008 Frank Breitling Stephan Braune

AstroGrid-D Monitoring


Page 1: AstroGrid-D Monitoring

25.02.2008

AstroGrid-D Monitoring

AstroGrid-D Meeting @ AIP, 25.-26.2.2008

Frank Breitling, Stephan Braune

Page 2: AstroGrid-D Monitoring

Monitoring - F. Breitling, S. Braune (AGD Meeting @ AIP), 25.02.2008

Contents

Host Monitoring (for compute resources)
  • Status
  • Goals until the end of the project
  • Perspectives beyond the project

Robotic Telescope Monitoring
  • Status
  • Goals until the end of the project
  • Perspectives beyond the project

Page 3: AstroGrid-D Monitoring


Host Monitoring Status

An AGD monitoring solution has been available since Dec. 2007.

It builds on:
  • Audit Logging, provided by Globus Toolkit V4.0.5 and later
  • PostgreSQL Database (DB)
  • DB Triggers
  • Usage Records (UR) XML format (http://staff.psc.edu/lfm/PSC/Grid/UR-WG/)
  • XML2RDF XSLT
  • Stellaris
  • SPARQL queries

A test setup has been running at the AIP since Dec. 2007.

Page 4: AstroGrid-D Monitoring


AGD Monitoring Architecture

[Figure: architecture diagram. Components and labels shown: User Workstation, Globus Client, Browser, Globus grid Resource, Audit Database, Trigger, Stellaris, Stellaris RDF Database, globusrun_ws, globus_job_run, curl, SPARQL Queries, Timelines.]

Earlier: status information via EPR-files and monitoring.pl

Page 5: AstroGrid-D Monitoring


Activation of Audit Logging in Globus for WS GRAM (globusrun-ws)

Changes in the Globus Toolkit configuration:

  • In $GLOBUS_LOCATION/container-log4j.properties:

      ...
      # GRAM AUDIT
      log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
      log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
      log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
      log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

  • Output goes to a database (PostgreSQL or MySQL). The database connection has to be declared in $GLOBUS_LOCATION/etc/gram-service/jndi-config.xml:

      <resource ...>
        <resourceParams>
          ...
          <parameter>
            <name>url</name>
            <value>jdbc:mysql://<host>[:port]/auditDatabase</value>
          </parameter>
          <parameter><name>user</name><value>globus</value></parameter>
          <parameter><name>password</name><value>foo</value></parameter>
          ...
        </resourceParams>
      </resource>

  • The table is updated whenever a job is started or changes its status (contrary to SAGAS).

  • The database content is converted into Usage Record format and sent to Stellaris via DB triggers.
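A quick sanity check can catch a missing audit line before the container is restarted. The following is a minimal sketch, assuming the properties file uses plain `key=value` lines. The helper names `parse_properties` and `missing_audit_keys` are hypothetical and not part of the Globus Toolkit.

```python
# Hypothetical helper: checks that the four GRAM AUDIT keys from the slide
# are present in the text of a log4j properties file.
REQUIRED_AUDIT_KEYS = {
    "log4j.category.org.globus.exec.service.exec.StateMachine.audit",
    "log4j.appender.AUDIT",
    "log4j.appender.AUDIT.layout",
    "log4j.additivity.org.globus.exec.service.exec.StateMachine.audit",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_audit_keys(text):
    """Return the sorted list of required audit keys that are not set."""
    return sorted(REQUIRED_AUDIT_KEYS - set(parse_properties(text)))
```

Calling `missing_audit_keys(open(path).read())` on the properties file returns an empty list when all four lines are in place.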

Page 6: AstroGrid-D Monitoring


Activation of Audit Logging in Globus for Pre WS GRAM (globus-job-run)

Changes in the Globus Toolkit configuration:

  • In $GLOBUS_LOCATION/log4j.properties:

      ...
      # GRAM AUDIT
      log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
      log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
      log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
      log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

  • Text file output has to be configured in $GLOBUS_LOCATION/etc/globus-job-manager.conf:

      -audit-directory /tmp/globus

  • The file is converted into Usage Record format and sent to Stellaris via a cron job.

Page 7: AstroGrid-D Monitoring


Audit Fields in PostgreSQL DB

Page 8: AstroGrid-D Monitoring


DB Trigger

    CREATE FUNCTION update_stellaris() RETURNS "trigger" AS $update_stellaris$
    use strict;
    use URI;
    use Net::hostent;
    use XML::Writer;
    use HTTP::Request;
    use LWP::UserAgent;

    # Derive the record id and the canonical host name from the job's grid id
    my $job_grid_id = URI->new($_TD->{new}{job_grid_id});
    my $id = unpack("H*", $job_grid_id->query());
    my $host = gethost($job_grid_id->host())->name();

    # Build the Usage Record XML document from the new table row
    my $usage_record = "";
    my $writer = XML::Writer->new(OUTPUT => \$usage_record, NEWLINES => 1, UNSAFE => 1);
    $writer->xmlDecl("UTF-8");
    $writer->startTag("JobUsageRecord", "xmlns" => "http://www.gridforum.org/2003/ur-wg#", ...);
    $writer->startTag("RecordIdentity");
    $writer->dataElement("LocalJobId", $_TD->{new}{local_job_id});
    $writer->endTag("RecordIdentity");
    .....
    $writer->raw($_TD->{new}{job_description});
    $writer->dataElement("success_flag", $_TD->{new}{success_flag});
    $writer->dataElement("finished_flag", $_TD->{new}{finished_flag});
    $writer->endTag("JobUsageRecord");
    $writer->end();

    # Send the Usage Record to Stellaris via HTTP PUT
    my $req = HTTP::Request->new("PUT",
        "http://stellaris.astrogrid-d.org/files/hosts/".$host."/urs/".$id,
        HTTP::Headers->new(Content_Length => length($usage_record)),
        $usage_record);
    my $ua = LWP::UserAgent->new();
    my $res = $ua->request($req);
    .....
    return;
    $update_stellaris$ LANGUAGE plperlu;

    CREATE TRIGGER update_stellaris_trig BEFORE INSERT OR UPDATE ON gram_audit_table
        FOR EACH ROW EXECUTE PROCEDURE update_stellaris();

The triggers are installed in the PostgreSQL DB using:

    audit=# \i trigger.sql

Documentation is available at the AGD intranet: http://mintaka.aip.de:8080/lenya/intranet/live/workpackages/wg2/GRAM_audit_logging.pdf
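The two core steps of the trigger above, deriving the Stellaris target URL from `job_grid_id` and wrapping row fields in a `JobUsageRecord` element, can be sketched outside the database for testing. This is a simplified Python illustration and not the project's code. It skips the DNS lookup done by `gethost` and emits only the fields visible on the slide. The function names and the example URL in the usage note are assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

UR_NS = "http://www.gridforum.org/2003/ur-wg#"
STELLARIS_BASE = "http://stellaris.astrogrid-d.org/files/hosts/"


def record_url(job_grid_id):
    """Build the PUT target: host from the grid id, hex-encoded query as record id."""
    parts = urlparse(job_grid_id)
    # Equivalent of Perl's unpack("H*", ...): hex digits of the query bytes.
    record_id = parts.query.encode("utf-8").hex()
    # Assumption: the host name is used as written, without canonicalisation.
    return STELLARIS_BASE + parts.hostname + "/urs/" + record_id


def build_usage_record(row):
    """Serialise a subset of an audit row (a dict) as a JobUsageRecord string."""
    root = ET.Element("JobUsageRecord", {"xmlns": UR_NS})
    identity = ET.SubElement(root, "RecordIdentity")
    ET.SubElement(identity, "LocalJobId").text = str(row["local_job_id"])
    for field in ("success_flag", "finished_flag"):
        ET.SubElement(root, field).text = str(row[field])
    return ET.tostring(root, encoding="unicode")
```

For a made-up grid id such as `https://host.example.org:8443/wsrf/services/ManagedExecutableJobService?abc`, `record_url` yields a path ending in `/hosts/host.example.org/urs/616263`.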

Page 9: AstroGrid-D Monitoring


SPARQL Queries for Usage Statistics

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
    PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>
    SELECT ?job_grid_id ?GlobalUserName ?SubmitHost ?executable
           ?creation_time ?StartTime ?EndTime ?wdv ?Count ?CPU_Time
    WHERE { graph ?g {
        ?n1 ur:JobIdentity ?JobIdentity .
        ?JobIdentity ur:job_grid_id ?job_grid_id .
        ?n1 ur:UserIdentity ?UserIdentity .
        ?UserIdentity ur:GlobalUserName ?GlobalUserName .
        ?n1 ur:creation_time ?creation_time .
        ?n1 ur:SubmitHost ?SubmitHost .
        OPTIONAL { ?n1 ur:StartTime ?StartTime .
                   ?n1 ur:EndTime ?EndTime . }
        OPTIONAL { ?n1 ur:WallDuration ?wall_duration .
                   ?wall_duration x2r:value ?wdv . }
        OPTIONAL { ?n1 ur:Resource ?res .
                   ?res x2r:value ?executable . }
        OPTIONAL { ?n1 ur:Count ?Count . }
        OPTIONAL { ?n1 ur:CPU_Time ?CPU_Time . }
    }} ORDER BY DESC(?creation_time) LIMIT 25

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
    PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>
    SELECT distinct ?GlobalUserName ?executable ?SubmitHost sum(?CPU_Time)
    WHERE {
      graph ?g {
        ?n1 ur:JobIdentity ?JobIdentity .
        ?JobIdentity ur:job_grid_id ?job_grid_id .
        ?n1 ur:UserIdentity ?UserIdentity .
        ?UserIdentity ur:GlobalUserId ?GlobalUserName .
        ?n1 ur:SubmitHost ?SubmitHost .
        ?n1 ur:CPU_Time ?CPU_Time .
        OPTIONAL {
          ?n1 ur:Resource ?res .
          ?res x2r:value ?executable .
        }
    }} ORDER BY ?GlobalUserName
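The second query sums CPU time per user, executable, and submit host. Where a query engine does not offer a `sum()` aggregate, the same totals can be computed on the client from the rows of a plain SELECT. The sketch below assumes the rows have already been fetched and converted to dicts keyed by variable name. The result format returned by Stellaris is not shown in these slides, so that conversion is left out, and the function name is hypothetical.

```python
from collections import defaultdict


def cpu_time_by_user(rows):
    """Sum CPU_Time per (GlobalUserName, executable, SubmitHost).

    `rows` is an iterable of dicts keyed by SPARQL variable name (assumed
    shape). `executable` comes from an OPTIONAL pattern and may be missing.
    """
    totals = defaultdict(float)
    for row in rows:
        key = (row["GlobalUserName"], row.get("executable"), row["SubmitHost"])
        totals[key] += float(row["CPU_Time"])
    return dict(totals)
```

Rows without an `executable` binding are grouped under `None`, which mirrors how unbound OPTIONAL variables would form their own group.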

Page 10: AstroGrid-D Monitoring


Retrieving Usage Statistics via Stellaris

Page 11: AstroGrid-D Monitoring


Goals until the end of the project

  • Integrate monitoring info in Timeline and Resource Map
  • Provide more SPARQL query templates (see svn://svn.gac-grid.org/software/monitoring/host/)
  • Provide improved documentation and installation instructions
  • Include all AGD institutes and resources in monitoring
  • Move from test to production mode, i.e. solve remaining problems

Page 12: AstroGrid-D Monitoring


Solve unstable DB connection

  • Audit Logging establishes a DB connection only once, i.e. the first time a job is submitted to Globus
  • If the DB goes down, the connection is lost and no further data is received => a restart of the Globus Container is necessary
  • Solution: we have informed the GT developers via
      • mailing lists: gt-user & gram-user
      • bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5863
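The behaviour asked for in the bug report, re-establishing the connection instead of requiring a container restart, follows a common pattern: drop the cached connection on failure and reconnect on the next attempt. The sketch below illustrates that pattern only. It is not the Globus audit appender (which is Java). The class name, the `connect` callable, and the connection's `insert` method are all hypothetical.

```python
class ReconnectingWriter:
    """Writes records through a lazily (re)opened connection.

    `connect` is a zero-argument callable returning an object with an
    `insert(record)` method that raises ConnectionError when the DB is gone.
    """

    def __init__(self, connect, retries=1):
        self._connect = connect
        self._retries = retries
        self._conn = None

    def write(self, record):
        attempts = self._retries + 1
        for attempt in range(attempts):
            try:
                if self._conn is None:
                    # Connect on first use and again after any failure.
                    self._conn = self._connect()
                self._conn.insert(record)
                return
            except ConnectionError:
                # Forget the broken connection so the next attempt reconnects.
                self._conn = None
                if attempt == attempts - 1:
                    raise
```

With this approach a database restart costs at most one retried write rather than silently ending all further logging.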

Page 13: AstroGrid-D Monitoring


Add missing fields in audit logging

Some important information is not provided by audit logging:

  • global job id (UUID format)
  • resource usage information as reported by the UNIX time command, i.e.:
      (i) the elapsed real time
      (ii) the user CPU time
      (iii) the system CPU time
  • end time of the job, in the same format as creation_time
  • name of submission client
  • name of execution host (and maybe also the number of used CPUs)

Solution: we have informed the GT developers via
  • mailing lists: gt-user & gram-user
  • bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5864
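For reference, the three values reported by the UNIX time command can be collected for a child process as sketched below. This is only an illustration of the requested quantities on a Unix system, using Python's standard `resource` module. It is not how GRAM would gather them, and the function name is an assumption.

```python
import resource
import subprocess
import time


def run_and_measure(argv):
    """Run a command and return its real, user, and system time in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    completed = subprocess.run(argv, check=False)
    elapsed = time.monotonic() - start
    # RUSAGE_CHILDREN covers terminated children that have been waited for,
    # so the difference isolates the command just run.
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "exit_code": completed.returncode,
        "real": elapsed,
        "user": after.ru_utime - before.ru_utime,
        "sys": after.ru_stime - before.ru_stime,
    }
```

A call such as `run_and_measure(["sleep", "1"])` should report a real time near one second with user and system times close to zero.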

Page 14: AstroGrid-D Monitoring


Add Usage Record (UR) format

  • Audit logging is not compatible with the UR format, the OGF standard for monitoring information
  • Currently we construct URs via database triggers
  • Solution: we have informed the GT developers via
      • mailing lists: gt-user & gram-user
      • bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5865

Page 15: AstroGrid-D Monitoring


Simplify installation procedure

  • Currently PostgreSQL has to be recompiled with Perl support
  • DB triggers have to be installed
  • Globus configuration is necessary
  • Solution: we want to optimize the installation process, maybe with a Globus helper package

Page 16: AstroGrid-D Monitoring


Upgrade to Stellaris V 0.2.0

  • Currently a few problems also exist with Stellaris V 0.2.0
  • We continue testing and report every problem to Mikael Högqvist

Page 17: AstroGrid-D Monitoring


Perspectives beyond the project

  • Define a common policy about data privacy, since AGD resources are shared with other grid communities (e.g. LRZ), which might have different restrictions on logging of user information
  • Suggest the AGD monitoring solution to other grid communities

Page 18: AstroGrid-D Monitoring


Robotic Telescopes Status

  • New vision of the RT project, as reflected by its new name: OpenTel
  • Corresponding project page: http://www.gac-grid.org/project-products/RoboticTelescopes.html
  • OpenTel is an open network for robotic telescopes. Open means:
      • open standards
      • open source
      • open for telescopes to join
  • OpenTel is for professional and amateur astronomers
  • OpenTel is currently the only open network and therefore a unique and promising approach in robotic astronomy

Page 19: AstroGrid-D Monitoring


Project History

Progress so far:
  • D2.4 Static metadata: FB, done (15.5.2007)
  • D2.7 Dynamic metadata / Monitoring: FB, 66% complete, publication expected in March
  • D5.3 First Integration of RTs: FB, done (31.7.2007)

Goals until end of project:
  • D5.5 Resource Broker: TR, work in progress. FB will help.
  • D5.8 Scheduler: FB, TR, Thomas G., to be done

Page 20: AstroGrid-D Monitoring


Monitoring / Dynamic Metadata

Monitoring a network of robotic telescopes - Deliverable 2.7: STELLA-I & II as info providers for Stellaris

  • Same database triggers as for host monitoring
  • RDF Calendar format is used for scheduling info (understood by RDF tools)
  • Trigger templates can be easily adjusted for other telescopes
  • Software is collected in a package called “ottools”
  • Timeline showing observation schedule directly from the STELLA DB (http://photon.aip.de:25000/timeline/telescopes.html)
  • Timeplot showing weather information (tbd)

Page 21: AstroGrid-D Monitoring


Goals until the end of the project

  • Provide a general solution for the integration of other telescopes. This requires:
      • Metadata management based on user certificates
      • Software package with tools and templates (ottools): svn://svn.gac-grid.org/software/OpenTel/ottools
      • Comprehensive documentation
      • Improved user interfaces:
          • Timeline & Timeplot with menu for selection of telescopes, time windows, etc.
          • Timeplot displaying new metadata of time series (temperature, seeing, etc.)
          • Resource map displaying dynamic metadata
      • Resource Broker (D5.5)
      • Scheduler (D5.8)
  • Integrate STELLA-I & STELLA-II
  • First observation via the grid

Page 22: AstroGrid-D Monitoring


Perspectives beyond the project

  • Improve software, in particular the scheduler
  • Perform more grid observations, more testing
  • Perform first network observations
  • Integrate more telescopes, in particular from hobby astronomers. Software contributions would be welcome
  • Collaboration with other networks such as the LCOGT
  • Attract and collaborate with the amateur astronomy and open source community
  • Find an OpenTel logo