AstroGrid-D Monitoring
AstroGrid-D Meeting @ AIP, 25.-26.2.2008
Frank Breitling, Stephan Braune
Monitoring - F. Breitling, S. Braune (AGD Meeting @ AIP), 25.02.2008
Contents
- Host Monitoring (for compute resources)
  - Status
  - Goals until the end of the project
  - Perspectives beyond the project
- Robotic Telescope Monitoring
  - Status
  - Goals until the end of the project
  - Perspectives beyond the project
Host Monitoring Status
- An AGD monitoring solution has been available since Dec. 2007.
- It builds on:
  - Audit Logging provided by Globus Toolkit V4.0.5 and later
  - PostgreSQL database (DB)
  - DB triggers
  - Usage Records (UR) XML format (http://staff.psc.edu/lfm/PSC/Grid/UR-WG/)
  - XML2RDF XSLT
  - Stellaris
  - SPARQL queries
- A test setup has been running at the AIP since Dec. 2007.
AGD Monitoring Architecture
[Figure: architecture diagram. Labels: User Workstation, Globus Client, Browser, Globus grid Resource, Audit Database, Trigger, Stellaris, Stellaris RDF-Database, globusrun_ws, globus_job_run, SPARQL Queries, Timelines, curl. Note in the figure: earlier, status information was provided via EPR files and monitoring.pl.]
Activation of Audit Logging in Globus for WS GRAM (globusrun-ws)

Changes in the Globus Toolkit configuration:

- In $GLOBUS_LOCATION/container-log4j.properties:

    ...
    # GRAM AUDIT
    log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
    log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
    log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
    log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

- Output goes to a database (PostgreSQL or MySQL). The database connection has to be declared in $GLOBUS_LOCATION/etc/gram-service/jndi-config.xml:

    <resource ...>
      <resourceParams>
        ...
        <parameter>
          <name>url</name>
          <value>jdbc:mysql://<host>[:port]/auditDatabase</value>
        </parameter>
        <parameter><name>user</name><value>globus</value></parameter>
        <parameter><name>password</name><value>foo</value></parameter>
        ...
      </resourceParams>
    </resource>

- The table is updated whenever a job is started or changes its status (contrary to SAGAS).
- The database content is converted into Usage Record format and sent to Stellaris via DB triggers.
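
One way to catch a typo in the log4j part of this configuration is to parse the properties file and compare it with the four GRAM audit settings listed for WS GRAM. The following Python code is a minimal sketch and not part of the AGD tooling: the names `parse_properties` and `missing_audit_settings` are hypothetical, and the parser only understands simple `key=value` lines.

```python
# Hypothetical helper: check that a log4j properties text contains the
# GRAM audit settings quoted on this slide. Only "key=value" lines are handled.

REQUIRED_AUDIT_SETTINGS = {
    "log4j.category.org.globus.exec.service.exec.StateMachine.audit": "DEBUG, AUDIT",
    "log4j.appender.AUDIT": "org.globus.exec.utils.audit.AuditDatabaseAppender",
    "log4j.appender.AUDIT.layout": "org.apache.log4j.PatternLayout",
    "log4j.additivity.org.globus.exec.service.exec.StateMachine.audit": "false",
}


def parse_properties(text):
    """Return a dict of key/value pairs, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_audit_settings(text):
    """Return the sorted list of required keys that are absent or differ."""
    found = parse_properties(text)
    return sorted(
        key for key, expected in REQUIRED_AUDIT_SETTINGS.items()
        if found.get(key) != expected
    )
```

Calling `missing_audit_settings(open(path).read())` on the container's properties file returns an empty list when all four settings are present with the expected values.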
Activation of Audit Logging in Globus for Pre WS GRAM (globus-job-run)

Changes in the Globus Toolkit configuration:

- In $GLOBUS_LOCATION/log4j.properties:

    ...
    # GRAM AUDIT
    log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
    log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
    log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
    log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

- Text file output has to be configured in $GLOBUS_LOCATION/etc/globus-job-manager.conf:

    -audit-directory /tmp/globus

- The file is converted into Usage Record format and sent to Stellaris via a cron job.
Audit Fields in PostgreSQL DB
DB Trigger
    CREATE FUNCTION update_stellaris() RETURNS "trigger" AS $update_stellaris$
    use strict;
    use URI;
    use Net::hostent;
    use XML::Writer;
    use HTTP::Request;
    use LWP::UserAgent;

    # Derive the record id and the host name from the job's grid id (a URI).
    my $job_grid_id = URI->new($_TD->{new}{job_grid_id});
    my $id = unpack("H*", $job_grid_id->query());
    my $host = gethost($job_grid_id->host())->name();

    # Build the Usage Record XML from the new row.
    my $usage_record = "";
    my $writer = XML::Writer->new(OUTPUT => \$usage_record, NEWLINES => 1, UNSAFE => 1);
    $writer->xmlDecl("UTF-8");
    $writer->startTag("JobUsageRecord", "xmlns" => "http://www.gridforum.org/2003/ur-wg#", ...);
    $writer->startTag("RecordIdentity");
    $writer->dataElement("LocalJobId", $_TD->{new}{local_job_id});
    $writer->endTag("RecordIdentity");
    .....
    $writer->raw($_TD->{new}{job_description});
    $writer->dataElement("success_flag", $_TD->{new}{success_flag});
    $writer->dataElement("finished_flag", $_TD->{new}{finished_flag});
    $writer->endTag("JobUsageRecord");
    $writer->end();

    # Send the record to Stellaris via HTTP PUT.
    my $req = HTTP::Request->new("PUT",
        "http://stellaris.astrogrid-d.org/files/hosts/".$host."/urs/".$id,
        HTTP::Headers->new(Content_Length => length($usage_record)),
        $usage_record);
    my $ua = LWP::UserAgent->new();
    my $res = $ua->request($req);
    .....
    return;
    $update_stellaris$ LANGUAGE plperlu;

    CREATE TRIGGER update_stellaris_trig
        BEFORE INSERT OR UPDATE ON gram_audit_table
        FOR EACH ROW EXECUTE PROCEDURE update_stellaris();
The triggers are installed in the PostgreSQL DB using:

    audit=# \i trigger.sql

Documentation is available on the AGD intranet: http://mintaka.aip.de:8080/lenya/intranet/live/workpackages/wg2/GRAM_audit_logging.pdf
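
For readers less familiar with PL/Perl, the core steps of the trigger (hex-encoding the query part of the job URI, building a Usage Record, and preparing the HTTP PUT to Stellaris) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the AGD implementation: the function names are hypothetical, only the fields visible in the trigger excerpt are emitted, the elided parts are not reconstructed, and the host name is taken from the URI as-is rather than resolved via DNS as `gethost()` does.

```python
from urllib.parse import urlparse
from urllib.request import Request
from xml.sax.saxutils import escape

# Base URL as used in the PUT request of the trigger.
STELLARIS_BASE = "http://stellaris.astrogrid-d.org/files/hosts"


def record_id(job_grid_id):
    """Hex-encode the query part of the job URI, like Perl's unpack("H*", ...)."""
    return urlparse(job_grid_id).query.encode("utf-8").hex()


def build_usage_record(row):
    """Build a reduced JobUsageRecord from a dict holding the audit row."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<JobUsageRecord xmlns="http://www.gridforum.org/2003/ur-wg#">\n'
        "<RecordIdentity>\n"
        f"<LocalJobId>{escape(str(row['local_job_id']))}</LocalJobId>\n"
        "</RecordIdentity>\n"
        f"<success_flag>{escape(str(row['success_flag']))}</success_flag>\n"
        f"<finished_flag>{escape(str(row['finished_flag']))}</finished_flag>\n"
        "</JobUsageRecord>\n"
    )


def build_put_request(row):
    """Prepare (but do not send) the HTTP PUT request for one audit row."""
    host = urlparse(row["job_grid_id"]).hostname  # no DNS lookup in this sketch
    body = build_usage_record(row).encode("utf-8")
    url = f"{STELLARIS_BASE}/{host}/urs/{record_id(row['job_grid_id'])}"
    return Request(url, data=body, method="PUT",
                   headers={"Content-Length": str(len(body))})
```

The request object returned by `build_put_request` could then be sent with `urllib.request.urlopen`; the sketch stops short of that so it can be tried without network access.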
SPARQL Queries for Usage Statistics
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>
SELECT ?job_grid_id ?GlobalUserName ?SubmitHost ?executable
?creation_time ?StartTime ?EndTime ?wdv ?Count ?CPU_Time
WHERE { graph ?g {
?n1 ur:JobIdentity ?JobIdentity .
?JobIdentity ur:job_grid_id ?job_grid_id .
?n1 ur:UserIdentity ?UserIdentity .
?UserIdentity ur:GlobalUserName ?GlobalUserName .
?n1 ur:creation_time ?creation_time .
?n1 ur:SubmitHost ?SubmitHost .
OPTIONAL { ?n1 ur:StartTime ?StartTime .
?n1 ur:EndTime ?EndTime . }
OPTIONAL { ?n1 ur:WallDuration ?wall_duration .
?wall_duration x2r:value ?wdv . }
OPTIONAL { ?n1 ur:Resource ?res .
?res x2r:value ?executable . }
OPTIONAL { ?n1 ur:Count ?Count . }
OPTIONAL { ?n1 ur:CPU_Time ?CPU_Time . }
}} ORDER BY DESC(?creation_time) LIMIT 25
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>
SELECT distinct ?GlobalUserName ?executable ?SubmitHost sum(?CPU_Time)
WHERE {
graph ?g {
?n1 ur:JobIdentity ?JobIdentity .
?JobIdentity ur:job_grid_id ?job_grid_id .
?n1 ur:UserIdentity ?UserIdentity .
?UserIdentity ur:GlobalUserId ?GlobalUserName .
?n1 ur:SubmitHost ?SubmitHost .
?n1 ur:CPU_Time ?CPU_Time .
OPTIONAL {
?n1 ur:Resource ?res .
?res x2r:value ?executable .
}
}} ORDER BY ?GlobalUserName
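
Note that `sum(?CPU_Time)` in the second query is not part of SPARQL 1.0; aggregate functions were only standardized in SPARQL 1.1, so the query relies on an extension of the query engine. Where no such extension is available, the totals can be computed on the client. The Python sketch below assumes that the query results are available in the W3C SPARQL Query Results JSON format and that the `CPU_Time` values are plain numbers; both are assumptions for illustration, not statements about what Stellaris returns, and the function name is hypothetical.

```python
import json
from collections import defaultdict


def cpu_time_per_user(results_json):
    """Sum CPU_Time per (GlobalUserName, SubmitHost) from SPARQL JSON results."""
    totals = defaultdict(float)
    for binding in json.loads(results_json)["results"]["bindings"]:
        # Each bound variable is an object such as {"type": "literal", "value": "12.5"}.
        user = binding["GlobalUserName"]["value"]
        host = binding["SubmitHost"]["value"]
        totals[(user, host)] += float(binding["CPU_Time"]["value"])
    return dict(totals)
```

The input would be the result of a query like the second one with the plain `?CPU_Time` variable selected instead of `sum(?CPU_Time)`.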
Retrieving Usage Statistics via Stellaris
Goals until the end of the project
- Integrate monitoring info into Timeline and Resource Map
- Provide more SPARQL query templates (see svn://svn.gac-grid.org/software/monitoring/host/)
- Provide improved documentation and installation instructions
- Include all AGD institutes and resources in monitoring
- Move from test to production mode, i.e. solve the remaining problems
Solve unstable DB connection
- Audit Logging establishes a DB connection only once, i.e. the first time a job is submitted to Globus.
- If the DB goes down, the connection is lost and no further data is received => a restart of the Globus Container is necessary.
- Solution: we have informed the GT developers via the mailing lists gt-user & gram-user and a bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5863
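
The behaviour that would avoid the container restart is to re-establish the connection when a write fails, instead of connecting only once. The following Python sketch illustrates that pattern in general terms. It is not the Globus Toolkit implementation (which is written in Java); the class name and the `connect` factory are hypothetical, and the connection object is assumed to offer a `write` method that raises `ConnectionError` when the database is unreachable.

```python
class ReconnectingWriter:
    """Write records through a connection that is reopened after a failure."""

    def __init__(self, connect, max_attempts=2):
        self._connect = connect            # hypothetical factory returning a connection
        self._max_attempts = max_attempts  # total tries per record, at least 1
        self._conn = None                  # opened lazily on the first write

    def write(self, record):
        last_error = None
        for _ in range(self._max_attempts):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                self._conn.write(record)
                return
            except ConnectionError as error:
                # Drop the broken connection so the next attempt reconnects.
                self._conn = None
                last_error = error
        raise last_error
```

With this pattern a temporary database outage costs at most the records written while the database is down, and logging resumes on its own once the database is back.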
Add missing fields in audit logging
- Some important information is not provided by audit logging:
  - global job id (UUID format)
  - resource usage information as reported by the UNIX time command, i.e. (i) the elapsed real time, (ii) the user CPU time, (iii) the system CPU time
  - end time of the job, in the same format as creation_time
  - name of submission client
  - name of execution host (and maybe also the number of used CPUs)
- Solution: we have informed the GT developers via the mailing lists gt-user & gram-user and a bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5864
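
To make the three requested resource usage values concrete, the sketch below collects the elapsed real time, user CPU time, and system CPU time of a child process, i.e. the same quantities the UNIX time command reports. It is an illustration only and says nothing about how GRAM would record them; it assumes a Unix host (on other platforms the child CPU times may be reported as zero), and `run_and_measure` and `EXAMPLE_COMMAND` are hypothetical names.

```python
import os
import subprocess
import sys
import time

# Example command for trying the function: a short Python one-liner.
EXAMPLE_COMMAND = [sys.executable, "-c", "sum(range(100000))"]


def run_and_measure(argv):
    """Run a command and return (elapsed_real, user_cpu, system_cpu) in seconds."""
    before = os.times()
    start = time.monotonic()
    subprocess.run(argv, check=True)
    elapsed = time.monotonic() - start
    after = os.times()
    # children_user / children_system accumulate CPU time of waited-for children.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return elapsed, user, system
```

For example, `run_and_measure(EXAMPLE_COMMAND)` returns a tuple of three floats that could be stored alongside the other audit fields.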
Add Usage Record (UR) format
- Audit logging is not compatible with the UR format, the OGF standard for monitoring information.
- Currently we construct URs via database triggers.
- Solution: we have informed the GT developers via the mailing lists gt-user & gram-user and a bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5865
Simplify installation procedure
- Currently PostgreSQL has to be recompiled with Perl support
- DB triggers have to be installed
- Globus configuration is necessary
- Solution: we want to optimize the installation process, maybe with a Globus helper package
Upgrade to Stellaris V 0.2.0
- Currently, a few problems also exist with Stellaris V 0.2.0.
- We continue testing and report every problem to Mikael Högqvist.
Perspectives beyond the project
- Define a common policy about data privacy, since AGD resources are shared with other grid communities (e.g. LRZ) which might have different restrictions on logging of user information
- Suggest the AGD monitoring solution to other grid communities
Robotic Telescopes Status
- New vision of the RT project, as reflected by the new name: OpenTel
  - Corresponding project page: http://www.gac-grid.org/project-products/RoboticTelescopes.html
- OpenTel is an open network for robotic telescopes. Open means:
  - open standards
  - open source
  - open for telescopes to join
- OpenTel is for professional and amateur astronomers
- OpenTel is currently the only open network and therefore a unique and promising approach in robotic astronomy
Project History
- Progress so far
  - D2.4 Static metadata: FB, done (15.5.2007)
  - D2.7 Dynamic metadata / Monitoring: FB, 66% complete, publication expected in March
  - D5.3 First Integration of RTs: FB, done (31.7.2007)
- Goals until end of project
  - D5.5 Resource Broker: TR, work in progress. FB will help.
  - D5.8 Scheduler: FB, TR, Thomas G., to be done
Monitoring / Dynamic Metadata
Monitoring a network of robotic telescopes - Deliverable 2.7: STELLA-I & II as info providers for Stellaris
- Same database triggers as for host monitoring
- RDF Calendar format is used for scheduling info (understood by RDF tools)
- Trigger templates can be easily adjusted for other telescopes
- Software is collected in a package called "ottools"
- Timeline showing observation schedule directly from the STELLA DB (http://photon.aip.de:25000/timeline/telescopes.html)
- Timeplot showing weather information (tbd)
Goals until the end of the project
- Provide a general solution for the integration of other telescopes. This requires:
  - Metadata management based on user certificates
  - Software package with tools and templates (ottools): svn://svn.gac-grid.org/software/OpenTel/ottools
  - Comprehensive documentation
  - Improved user interfaces:
    - Timeline & Timeplot with menu for selection of telescopes, time windows, etc.
    - Timeplot displaying new metadata of time series (temperature, seeing, etc.)
    - Resource map displaying dynamic metadata
- Resource Broker (D5.5)
- Scheduler (D5.8)
- Integrate STELLA-I & STELLA-II
- First observation via the grid
Perspectives beyond the project
- Improve software, in particular the scheduler
- Perform more grid observations, more testing
- Perform first network observations
- Integrate more telescopes, in particular from hobby astronomers. Software contributions would be welcome
- Collaboration with other networks such as the LCOGT
- Attract and collaborate with the amateur astronomy and open source community
- Find an OpenTel logo