Developer APIs to Condor + A Tutorial on Condor’s Web Service Interface
Todd Tannenbaum, UW-Madison
Matthew Farrellee, Red Hat
http://www.cs.wisc.edu/condor
Interfacing Applications w/ Condor
› Suppose you have an application which needs a lot of compute cycles
› You want this application to utilize a pool of machines
› How can this be done?
Some Condor APIs
› Command Line tools: condor_submit, condor_q, etc.
› DRMAA
› Condor GAHP
› JSDL
› RDBMS
› Condor Perl Module
› Event Log
› SOAP
Command Line Tools
› Don’t underestimate them!
› Your program can create a submit file on disk and simply invoke condor_submit:
    system("echo universe=VANILLA > /tmp/condor.sub");
    system("echo executable=myprog >> /tmp/condor.sub");
    . . .
    system("echo queue >> /tmp/condor.sub");
    system("condor_submit /tmp/condor.sub");
Command Line Tools
› Your program can create a submit file and give it to condor_submit through stdin:
    Perl:
        open(SUBMIT, "|condor_submit");
        print SUBMIT "universe=VANILLA\n";
        . . .
    C/C++:
        FILE *s = popen("condor_submit", "w");
        fputs("universe=VANILLA\n", s);
        . . .
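The same stdin approach can be sketched in Java, the language used for the SOAP examples later in this deck. This is a minimal sketch, not part of the original slides: the class name `SubmitViaStdin`, its helper methods, and the `--submit` switch are hypothetical, and it assumes `condor_submit` is on the PATH when a job is actually submitted.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a submit description and pipes it to condor_submit.
final class SubmitViaStdin {

    // Render "key = value" lines in insertion order, followed by a final "queue" statement.
    static String describe(Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.append("queue\n").toString();
    }

    // Start condor_submit with no file argument and write the description to its stdin.
    // Returns the tool's exit code.
    static int submit(String description) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("condor_submit");
        pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process p = pb.start();
        // Closing the stream signals end-of-input to condor_submit.
        try (OutputStream stdin = p.getOutputStream()) {
            stdin.write(description.getBytes(StandardCharsets.UTF_8));
        }
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("universe", "vanilla");
        attrs.put("executable", "/bin/hostname");
        attrs.put("output", "job.out");
        attrs.put("log", "job.log");
        String description = describe(attrs);
        System.out.print(description);
        // Only talk to Condor when explicitly asked to.
        if (args.length > 0 && args[0].equals("--submit")) {
            System.exit(submit(description));
        }
    }
}
```

Keeping the text generation in `describe` separate from the process handling in `submit` means the description can be inspected or logged without a Condor installation.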
Command Line Tools
› Using the +Attribute with condor_submit:
    universe = VANILLA
    executable = /bin/hostname
    output = job.out
    log = job.log
    +webuser = "zmiller"
    queue
Command Line Tools
› Use -constraint and -format with condor_q:
    % condor_q -constraint 'webuser=="zmiller"'
    -- Submitter: bio.cs.wisc.edu : <128.105.147.96:37866> : bio.cs.wisc.edu
     ID        OWNER    SUBMITTED    RUN_TIME   ST PRI SIZE CMD
     213503.0  zmiller  10/11 06:00  0+00:00:00 I  0   0.0  hostname

    % condor_q -constraint 'webuser=="zmiller"' -format "%i\t" ClusterId -format "%s\n" Cmd
    213503 /bin/hostname
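Output shaped by -format is easy to consume from a program. The sketch below is not from the original slides and `CondorQFormatParser` is a hypothetical class. It assumes the exact two-column layout produced by the command above: a ClusterId, a tab, then Cmd, one job per line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for: condor_q ... -format "%i\t" ClusterId -format "%s\n" Cmd
final class CondorQFormatParser {

    // One row of the two-column output: cluster id and command.
    static final class Row {
        final int clusterId;
        final String cmd;

        Row(int clusterId, String cmd) {
            this.clusterId = clusterId;
            this.cmd = cmd;
        }
    }

    // Split the captured output into rows; lines without a tab separator are skipped.
    static List<Row> parse(String output) {
        List<Row> rows = new ArrayList<>();
        for (String line : output.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            String[] cols = line.split("\t", 2);
            if (cols.length != 2) {
                continue;
            }
            rows.add(new Row(Integer.parseInt(cols[0].trim()), cols[1]));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample line taken from the slide's example output.
        for (Row r : parse("213503\t/bin/hostname\n")) {
            System.out.println(r.clusterId + " runs " + r.cmd);
        }
    }
}
```

Choosing the separators yourself with -format is what makes this kind of parsing more robust than scraping the default human-readable table.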
Command Line Tools
› condor_wait will watch a job log file and wait for a certain job (or all jobs) to complete:
    system("condor_wait job.log");
› Can specify a timeout
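A timeout can be passed with condor_wait's -wait option, which takes a number of seconds. The following sketch is not from the original slides: `CondorWaitSketch` and its methods are hypothetical, and running it for real assumes `condor_wait` is on the PATH.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper that builds and runs a condor_wait command line.
final class CondorWaitSketch {

    // Build the argument list; a timeout of 0 or less means "wait indefinitely".
    static List<String> command(String logFile, int timeoutSeconds) {
        List<String> cmd = new ArrayList<>();
        cmd.add("condor_wait");
        if (timeoutSeconds > 0) {
            cmd.add("-wait");
            cmd.add(Integer.toString(timeoutSeconds));
        }
        cmd.add(logFile);
        return cmd;
    }

    // Run condor_wait, blocking until it exits, and return its exit code.
    static int run(String logFile, int timeoutSeconds) throws IOException, InterruptedException {
        return new ProcessBuilder(command(logFile, timeoutSeconds))
                .inheritIO()
                .start()
                .waitFor();
    }

    public static void main(String[] args) {
        // Print the command that would be executed for the slide's job.log.
        System.out.println(String.join(" ", command("job.log", 60)));
    }
}
```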
Command Line Tools
› condor_q and condor_status have an -xml option
› So it is relatively simple to build on top of Condor’s command line tools alone, and they can be used from many different languages (C, Perl, Python, PHP, etc.).
› However…
DRMAA
› DRMAA is an OGF-standardized job-submission API
› Has C (and now Java) bindings
› Is not Condor-specific
› SourceForge Project http://sourceforge.net/projects/condor-ext
DRMAA
› Unfortunately, the DRMAA 1.x API does not support some very important features, such as:
    Fault tolerance
    Transactions
Condor GAHP
› The Condor GAHP is a relatively low-level protocol based on simple ASCII messages through stdin and stdout
› Supports a rich feature set including two-phase commits, transactions, and optional asynchronous notification of events
Good stuff, Todd!
GAHP, cont
Example:
    R: $GahpVersion: 1.0.0 Nov 26 2001 NCSA\ CoG\ Gahpd $
    S: GRAM_PING 100 vulture.cs.wisc.edu/fork
    R: E
    S: RESULTS
    R: E
    S: COMMANDS
    R: S COMMANDS GRAM_JOB_CANCEL GRAM_JOB_REQUEST GRAM_JOB_SIGNAL GRAM_JOB_STATUS GRAM_PING INITIALIZE_FROM_FILE QUIT RESULTS VERSION
    S: VERSION
    R: S $GahpVersion: 1.0.0 Nov 26 2001 NCSA\ CoG\ Gahpd $
    S: INITIALIZE_FROM_FILE /tmp/grid_proxy_554523.txt
    R: S
    S: GRAM_PING 100 vulture.cs.wisc.edu/fork
    R: S
    S: RESULTS
    R: S 0
    S: RESULTS
    R: S 1
    R: 100 0
    S: QUIT
    R: S
JSDL and Condor
› GridSAM: open source web service for job submission and monitoring
› Condor plugin for GridSAM enables JSDL submissions to Condor. This plugin uses Condor’s Web Service API
http://www.cs.ucl.ac.uk/staff/c.chapman/gridsam-plugin/
RDBMS: Quill
› Condor operational data mirrored into an RDBMS
› Job, machine, historical info, …
› Read-only
› Load balancing benefits
[Diagram labels: Master, Startd …, Schedd, Job Queue log, Quill, RDBMS, Queue + History Tables]
Condor Perl Module
› Perl module to parse the “job log file”
› Can use instead of polling w/ condor_q
› Call-back event model
› (Note: job log can be written in XML)
Event Logging
› Condor can generate an “Event Log”: this is a log of significant events about the life of the jobs on a system.
› Competing objectives:
    Limit the amount of space used by the logging, so that event logging doesn’t become a problem itself
    Never “drop” events
› C++ reader library
Web Service Interface
› Simple Object Access Protocol (SOAP)
    Mechanism for doing RPC using XML (typically over HTTP or HTTPS)
    A World Wide Web Consortium (W3C) standard
› SOAP Toolkit: Transform a WSDL to a client library
Benefits of a Condor SOAP API
› Can be accessed with standard web service tools
› Condor accessible from platforms where its command-line tools are not supported
› Talk to Condor with your favorite language and SOAP toolkit
Condor SOAP API functionality
› Get basic daemon info (version, platform)
› Submit jobs
› Retrieve job output
› Remove/hold/release jobs
› Query machine status
› Advertise resources
› Query job status
Getting machine status via SOAP
[Diagram: Your program, through its SOAP library, calls queryStartdAds() on the condor_collector using SOAP over HTTP; the condor_collector returns the Machine List]
Let’s get some details…
The API
› Core API, described with WSDL, is designed to be as flexible as possible
    File transfer is done in chunks
    Transactions are explicit
› Wrapper libraries aim to make common tasks as simple as possible
    Currently in Java and C#
    Expose an object-oriented interface
Things we will cover
› Condor setup
› Necessary tools
› Job Submission
› Job Querying
› Job Retrieval
› Authentication with SSL and X.509
Condor setup
› Start with a working condor_config
› The SOAP interface is off by default
    Turn it on by adding ENABLE_SOAP=TRUE
› Access to the SOAP interface is denied by default
    Set ALLOW_SOAP and DENY_SOAP, they work like ALLOW_READ/WRITE/…
    Example: ALLOW_SOAP=*/*.cs.wisc.edu
Necessary tools
› You need a SOAP toolkit
    Apache Axis (Java) - http://ws.apache.org/axis/
    Microsoft .Net - http://microsoft.com/net/
    gSOAP (C/C++) - http://gsoap2.sf.net/
    ZSI (Python) - http://pywebsvcs.sf.net/
    SOAP::Lite (Perl) - http://soaplite.com/
› You need Condor’s WSDL files
    Find them in lib/webservice/ in your Condor release
› Put the two together to generate a client library
    $ java org.apache.axis.wsdl.WSDL2Java condorSchedd.wsdl
› Compile that client library
    $ javac condor/*.java
All our examples are in Java using Apache Axis
Client wrapper libraries
› The core API has some complex spots
› A wrapper library is available in Java and C#
    Makes the API a bit easier to use (e.g. simpler file transfer & job ad submission)
    Makes the API more OO, no need to remember and pass around transaction ids
› We are going to use the Java wrapper library for our examples
    You can download it from http://www.cs.wisc.edu/condor/birdbath/birdbath.jar
Submitting a job
› The CLI way…
cp.sub (explicit bits):
    universe = vanilla
    executable = /bin/cp
    arguments = cp.sub cp.worked
    should_transfer_files = yes
    transfer_input_files = cp.sub
    when_to_transfer_output = on_exit
    queue 1
Implicit bits:
    clusterid = X
    procid = Y
    owner = matt
    requirements = Z
$ condor_submit cp.sub
Submitting a job
› The SOAP way…
    1. Begin transaction
    2. Create cluster
    3. Create job
    4. Send files
    5. Describe job
    6. Commit transaction
Repeat to submit multiple jobs in a single cluster
Repeat to submit multiple clusters
Submission from Java
    // Schedd's location
    Schedd schedd = new Schedd("http://…");
    Transaction xact = schedd.createTransaction();
    // 1. Begin transaction; the argument is the max time between calls (seconds)
    xact.begin(30);
    // 2. Create cluster
    int cluster = xact.createCluster();
    // 3. Create job
    int job = xact.createJob(cluster);
    // 4 & 5. Send files & describe job
    File[] files = { new File("cp.sub") };
    xact.submit(cluster, job,
                "owner",                // job owner, e.g. "matt"
                UniverseType.VANILLA,
                "/bin/cp",
                "cp.sub cp.worked",
                "requirements",         // requirements, e.g. "OpSys==\"Linux\""
                null,                   // extra attributes, e.g. Out="stdout.txt" or Err="stderr.txt"
                files);
    // 6. Commit transaction
    xact.commit();
Querying jobs
› The CLI way…
    $ condor_q
    -- Submitter: localhost : <127.0.0.1:1234> : localhost
     ID   OWNER  SUBMITTED    RUN_TIME   ST PRI SIZE CMD
     1.0  matt   10/27 14:45  0+02:46:42 C  0   1.8  sleep 10000
    …
    42 jobs; 1 idle, 1 running, 1 held, 1 unexpanded
Querying jobs
› The SOAP way from Java…
    String[] statusName = { "", "Idle", "Running", "Removed", "Completed", "Held" };
    int cluster = 1;
    int job = 0;
    Schedd schedd = new Schedd("http://…");
    ClassAd ad = new ClassAd(schedd.getJobAd(cluster, job));
    int status = Integer.valueOf(ad.get("JobStatus"));
    System.out.println("Job is " + statusName[status]);
› Also, getJobAds given a constraint, e.g. "Owner==\"matt\""
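The status lookup on this slide indexes the array directly, so a missing or unexpected JobStatus value would throw. A slightly more defensive decoding might look like the sketch below. It is not part of the original slides: `JobStatusNames` is a hypothetical helper, it takes the attribute value as a plain string or int rather than going through the wrapper library, and it reports 0 as unknown where the slide's table has an empty string.

```java
// Hypothetical helper that decodes the numeric JobStatus attribute into a name.
final class JobStatusNames {

    // Same table as on the slide: index 0 is unused, 1..5 are the named states.
    private static final String[] NAMES = { "", "Idle", "Running", "Removed", "Completed", "Held" };

    // Map a numeric status to its name; out-of-range values are reported, not thrown.
    static String name(int status) {
        if (status < 1 || status >= NAMES.length) {
            return "Unknown(" + status + ")";
        }
        return NAMES[status];
    }

    // Same lookup for the raw attribute text as it might come out of a job ad.
    static String name(String raw) {
        try {
            return name(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return "Unknown(" + raw + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println("Job is " + name("2"));
    }
}
```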
Retrieving a job
› The CLI way…
› Well, if you are submitting to a local Schedd, the Schedd will have all of a job’s output written back for you
› If you are doing remote submission you need condor_transfer_data, which takes a constraint and transfers all files in spool directories of matching jobs
Retrieving a job
› The SOAP way in Java…
    int cluster = 1;
    int job = 0;
    Schedd schedd = new Schedd("http://…");
    Transaction xact = schedd.createTransaction();
    xact.begin(30);
    // Discover available files
    FileInfo[] files = xact.listSpool(cluster, job);
    for (FileInfo file : files) {
        xact.getFile(cluster, job,
                     file.getName(),             // remote file
                     file.getSize(),
                     new File(file.getName()));  // local file
    }
    xact.commit();
Authentication for SOAP
› Authentication is done via mutual SSL authentication
    Both the client and server have certificates and identify themselves
› It is not always necessary, e.g. in some controlled environments (a portal) where the submitting component is trusted
› A necessity in an open environment -- remember that the submit call takes the job’s owner as a parameter
    Imagine what happens if anyone can submit to a Schedd running as root…
Live DEMOS!
A live demo, Todd? You are nuts!!
Thank you!
Questions?
Details on setting up authenticated SOAP over HTTPS
Authentication setup
› Create and sign some certificates
› Use OpenSSL to create a CA
    CA.sh -newca
› Create a server cert and password-less key
    CA.sh -newreq && CA.sh -sign
    mv newcert.pem server-cert.pem
    openssl rsa -in newreq.pem -out server-key.pem
› Create a client cert and key
    CA.sh -newreq && CA.sh -sign && mv newcert.pem client-cert.pem && mv newreq.pem client-key.pem
Authentication config
› Config options…
    ENABLE_SOAP_SSL is FALSE by default
    <SUBSYS>_SOAP_SSL_PORT
        • Set this to a different port for each SUBSYS you want to talk to over SSL, the default is a random port
        • Example: SCHEDD_SOAP_SSL_PORT=1980
    SOAP_SSL_SERVER_KEYFILE is required and has no default
        • The file containing the server’s certificate AND private key, i.e. “keyfile” after cat server-cert.pem server-key.pem > keyfile
Authentication config
› Config options continue…
    SOAP_SSL_CA_FILE is required
        • The file containing public CA certificates used in signing client certificates, e.g. demoCA/cacert.pem
› All options except SOAP_SSL_PORT have an optional SUBSYS_* version
    For instance, turn on SSL for everyone except the Collector with
        • ENABLE_SOAP_SSL=TRUE
        • COLLECTOR_ENABLE_SOAP_SSL=FALSE
One last bit of config
› The certificates we generated have a principal name, which is not standard across many authentication mechanisms
› Condor maps authenticated names (here, principal names) to canonical names that are authentication method independent
› This is done through mapfiles, given by SEC_CANONICAL_MAPFILE and SEC_USER_MAPFILE
› Canonical map: SSL .*emailAddress=(.*)@cs.wisc.edu.* \1
› User map: (.*) \1
› “SSL” is the authentication method, “.*emailAddress….*” is a pattern to match against authenticated names, and “\1” is the canonical name, in this case the username on the email in the principal
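To see what the canonical map entry does to a principal name, the pattern can be tried out with Java's regex engine. This only approximates the mapping step and is not Condor's implementation: `CanonicalMapSketch` is hypothetical, the sample principal in `main` is made up for illustration, and Java's regex dialect is assumed to be close enough to the one Condor uses for this simple pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demonstration of the slide's canonical map pattern.
final class CanonicalMapSketch {

    // Pattern copied from the canonical map entry; group 1 plays the role of "\1".
    static final Pattern SSL_PATTERN = Pattern.compile(".*emailAddress=(.*)@cs.wisc.edu.*");

    // Return the canonical name for an authenticated name, or null when the pattern does not match.
    static String canonicalName(String authenticatedName) {
        Matcher m = SSL_PATTERN.matcher(authenticatedName);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up certificate subject used only as sample input.
        String principal = "/C=US/O=Example/CN=Matt/emailAddress=matt@cs.wisc.edu";
        System.out.println(principal + " maps to " + canonicalName(principal));
    }
}
```

The unescaped dots in `cs.wisc.edu` match any character, so a stricter entry would escape them. The sketch keeps the pattern exactly as shown on the slide.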
HTTPS with Java
› Set up keys…
    keytool -import -keystore truststore -trustcacerts -file demoCA/cacert.pem
    openssl pkcs12 -export -inkey client-key.pem -in client-cert.pem -out keystore
› All the previous code stays the same, just set some properties
    javax.net.ssl.trustStore, javax.net.ssl.keyStore, javax.net.ssl.keyStoreType, javax.net.ssl.keyStorePassword
    Example: java -Djavax.net.ssl.trustStore=truststore -Djavax.net.ssl.keyStore=keystore -Djavax.net.ssl.keyStoreType=PKCS12 -Djavax.net.ssl.keyStorePassword=pass Example https://…
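The same four properties can also be set from inside the program instead of on the command line. The sketch below is not from the original slides: `SslProperties` and its methods are hypothetical names, and the file names and password are the ones used in the example above. The properties need to be set before the JVM first initializes its default SSL context, so this would run early in `main`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that sets the JSSE system properties programmatically.
final class SslProperties {

    // Collect the four properties named on the slide for a PKCS12 client keystore.
    static Map<String, String> clientSettings(String trustStore, String keyStore, String keyStorePassword) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("javax.net.ssl.trustStore", trustStore);
        settings.put("javax.net.ssl.keyStore", keyStore);
        settings.put("javax.net.ssl.keyStoreType", "PKCS12");
        settings.put("javax.net.ssl.keyStorePassword", keyStorePassword);
        return settings;
    }

    // Copy every entry into the JVM-wide system properties.
    static void apply(Map<String, String> settings) {
        settings.forEach(System::setProperty);
    }

    public static void main(String[] args) {
        apply(clientSettings("truststore", "keystore", "pass"));
        System.out.println("keyStoreType=" + System.getProperty("javax.net.ssl.keyStoreType"));
    }
}
```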