Aug 3-4, 2003
Victoria 1
Montage, Pegasus and ROME
G. B. Berriman, J. C. Good, M. Kong, A. Laity
IPAC/Caltech
J. Jacob, A. Bergou, D. S. Katz
JPL
R. Williams
CACR
E. Deelman, G. Singh, M.-H. Su, C. Kesselman
ISI
THE US NATIONAL VIRTUAL OBSERVATORY
Montage
• Version 1.7 approved for public release
– Download page will be at montage.ipac.caltech.edu
– Complete user's guide, including caveats
• Tested and validated on 2MASS 2IDR images on single-processor Linux platforms
– Tested on 10 WCS projections with mosaics smaller than 2 x 2 degrees and coordinate transformations from Equatorial J2000 to Galactic and Ecliptic
– First release emphasizes accuracy in photometry and astrometry
• 20 modules; 7560 lines of code; 2595 test cases executed
• 119 defects reported and 116 corrected
Montage Test Results Summary
Montage: The Grid Years
• Re-projection is slow (2 min for a 2MASS image on a single-processor 1.4 GHz Linux box), which motivates parallel processing
– Grid is an abstraction: array of processors, grid of clusters, …
– Montage has loosely coupled code, so it can run in any environment
• Prototype version of a methodology for running on any “grid environment”
– Many parts of the process can be parallelized
– Build a Directed Acyclic Graph (DAG)
– DAG is a script that enables parallelization
– Describes what is to be run and when, so the flow of processing is specified
– DAG is submitted to standard tools for execution
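To illustrate how a DAG exposes parallelism, here is a minimal Python sketch. It is not Montage code: the job names (`reproject_1`, `coadd`, and so on) and the helper `parallel_stages` are hypothetical, and the standard-library `graphlib` module stands in for the grid tools that would actually execute the DAG.

```python
from graphlib import TopologicalSorter

def parallel_stages(dag):
    """Group jobs into stages; jobs in the same stage do not depend on each other."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages

# Hypothetical three-image mosaic: each key is a job, each value is the set of
# jobs that must finish before it can start.
mosaic_dag = {
    "reproject_1": set(),
    "reproject_2": set(),
    "reproject_3": set(),
    "coadd": {"reproject_1", "reproject_2", "reproject_3"},
}

stages = parallel_stages(mosaic_dag)
for number, jobs in enumerate(stages, start=1):
    print(f"stage {number}: {jobs}")
```

Under these assumptions the three re-projection jobs land in the first stage and could run on separate processors, while the co-addition waits for all of them.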
War and Peace Nebula
Montage and Pegasus
[Figure: concrete DAG showing data stage-in nodes, Montage compute nodes, data stage-out nodes, and inter-pool transfer nodes]
Pegasus takes the abstract workflow description, locates the compute resources and data and produces a concrete DAG which can be run on the Grid
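The step from an abstract workflow to a concrete one can be pictured with a small Python sketch. This is an illustration only, not Pegasus's planning algorithm or API: the function `concretize`, the file names, and the site names are all hypothetical, and inter-pool transfers are left out for brevity.

```python
def concretize(abstract_jobs, replica_locations, site, archive):
    """Expand abstract jobs (listed in dependency order) into concrete steps.

    Adds a stage-in step for every input that is not produced by an earlier
    job, and a stage-out step for every output that no later job consumes.
    """
    available = set()  # files already present at the compute site
    consumed = set()   # files read by at least one job
    steps = []
    for job in abstract_jobs:
        for name in job["inputs"]:
            consumed.add(name)
            if name not in available:
                steps.append(("stage_in", name, replica_locations[name], site))
                available.add(name)
        steps.append(("compute", job["name"], site))
        available.update(job["outputs"])
    for job in abstract_jobs:
        for name in job["outputs"]:
            if name not in consumed:
                steps.append(("stage_out", name, site, archive))
    return steps

# Hypothetical abstract workflow and replica catalog.
abstract = [
    {"name": "reproject_1", "inputs": ["raw_1.fits"], "outputs": ["proj_1.fits"]},
    {"name": "reproject_2", "inputs": ["raw_2.fits"], "outputs": ["proj_2.fits"]},
    {"name": "coadd", "inputs": ["proj_1.fits", "proj_2.fits"], "outputs": ["mosaic.fits"]},
]
replicas = {"raw_1.fits": "archive_a", "raw_2.fits": "archive_b"}

steps = concretize(abstract, replicas, site="pool_x", archive="user_store")
for step in steps:
    print(step)
```

The abstract description names only jobs and files; the concrete list also says where each file comes from, where each job runs, and where the final product is delivered.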
Why ROME and Why Not Apache?
• Apache accepts HTTP requests over a TCP/IP network and returns HTML documents
– Accepts requests anonymously, parses requests, checks if executable is in path, runs it
– Works very well when response is fast
– BUT it has no memory of the request and so cannot manage information and respond to messages
• Apache’s limitations are exposed when data processing or requests take an indeterminate time (hours, days, even weeks):
– complex database queries,
– large-scale image processing, or
– large-scale statistical analysis
ROME: a simple, portable request management environment that works in conjunction with existing browsers, HTTP services and custom client environments to provide reliable execution of long-lived jobs and to communicate more detailed status information to clients.
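The contrast with a stateless web server can be sketched in a few lines of Python: a request is recorded when it is submitted, and its status can be updated and queried long afterwards. This is an assumption-laden illustration, not ROME's schema or code; the class `RequestStore`, the table layout, and the status strings are hypothetical, and an in-memory SQLite database stands in for the DBMS.

```python
import sqlite3

class RequestStore:
    """Keeps a persistent record of each request so its status can be queried later."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")  # stand-in for a real DBMS
        self.db.execute(
            "CREATE TABLE request ("
            "id INTEGER PRIMARY KEY, user_email TEXT, command TEXT, status TEXT)"
        )

    def submit(self, user_email, command):
        cursor = self.db.execute(
            "INSERT INTO request (user_email, command, status) VALUES (?, ?, 'queued')",
            (user_email, command),
        )
        self.db.commit()
        return cursor.lastrowid

    def update(self, request_id, status):
        self.db.execute(
            "UPDATE request SET status = ? WHERE id = ?", (status, request_id)
        )
        self.db.commit()

    def status(self, request_id):
        row = self.db.execute(
            "SELECT status FROM request WHERE id = ?", (request_id,)
        ).fetchone()
        return row[0] if row else None

store = RequestStore()
job_id = store.submit("user@example.org", "build mosaic")
store.update(job_id, "running")
print(store.status(job_id))
```

Because the record outlives the HTTP exchange that created it, a client can disconnect and ask about the same request hours or days later.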
ROME Demonstration - Registration
User preferences
ROME Demonstration - Job Submission
• Custom order for mosaics of ISSA images submitted to a Linux processor
Job Information Filters
ROME Interactive Request Monitor
Rho Oph & Orion
ROME Architectural Diagram
ROME: Request Object Management Environment
[Diagram. Components shown: Client (browser, OASIS, etc.); Request Manager (application server, JVM) with User Registration, Request Submission, Update Request, Get Request, Current Request, Processor Registration, Application Message, and System Status components; DBMS with User, Request, Processor, and Message tables; Request Processor (JVM) with a main thread and worker threads that run applications over stdin/stdout and an optional message socket. Key: servlet, EJB, database table, other Java objects.]
• Clients include Browsers, NVO portals, and user-built custom code
• The heart of ROME is an EJB container tightly coupled with a DBMS
– The container provides special hooks that simplify synchronization of user and service interaction
– Choice of container and DBMS is immaterial; initial development used WebLogic and Informix
• ROME does not start processing itself; a special “processor” does this
– Contacts ROME (via servlet URL) to get job parameters
– Starts CGI program for user
– Processes messages from the CGI program through stdout
– Processes kill or abort requests
• “Processor” is currently a very simple Java VM
• Can be run anywhere on the net.
• Can in principle be implemented in other languages.
• Applications can be as simple as reusing existing CGI programs, but should support more complex processing.
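The processor's role of starting an application and relaying what it prints to stdout can be sketched as follows. The processor described here is a Java VM; this Python version is only an illustration of the pattern. The function `run_application` and the message strings are hypothetical, and a one-line Python child process stands in for a CGI program.

```python
import subprocess
import sys

def run_application(argv, forward):
    """Start an application and pass each line it prints to stdout to `forward`."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            forward(line.rstrip("\n"))  # e.g. send the message on to the request manager
    return proc.returncode

# Stand-in application: prints a start, progress, and completion message.
application = [
    sys.executable,
    "-c",
    "print('started'); print('progress 50'); print('done')",
]

received = []
exit_code = run_application(application, received.append)
print(received, exit_code)
```

Here `forward` simply appends to a list; in a real deployment it would be whatever call hands the message back to the request manager.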
ROME Processing Scenario
• User registers with ROME. This is necessary for messaging (including completion notification). The user “identity” is simply their email address.
• User submits job to ROME through the Request Submission servlet. The request is added to the DBMS.
• A processor (there can be many) asks ROME for a job to process (through the Get Next Request servlet).
• The processor starts (or potentially continues talking to) an application (e.g. a CGI program) which does the real work.
• The application at a minimum emits messages (text printed to stdout) when the job starts and when it completes. In addition, it can optionally emit progress report messages at any time.
• On completion, all data products of the application will have been saved to a temporary “workspace” in the application file system. This workspace is HTTP accessible and the completion message from the application contains a pointer to this data.
• All messages are forwarded from the processor to the ROME core where they are stored in the DBMS and forwarded to the user either directly (if they are using a client which can register a message socket with ROME), later (if the user reconnects with such a client), or eventually by email (email is usually only for completion status messages).
• The user (manually through a browser or with degrees of automation through custom GUI clients) retrieves the data.
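The message-forwarding rule in the scenario above (deliver directly, deliver on reconnection, or fall back to email) can be sketched in Python. This is a simplified reading of the scenario, not ROME's implementation: the class `MessageRouter`, its method names, and the rule that only completion messages are emailed to disconnected users are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class MessageRouter:
    connected: dict = field(default_factory=dict)  # user email -> delivery callback
    pending: dict = field(default_factory=dict)    # user email -> held messages
    stored: list = field(default_factory=list)     # stand-in for the DBMS message table
    emailed: list = field(default_factory=list)    # stand-in for outgoing email

    def deliver(self, user, text, is_completion=False):
        self.stored.append((user, text))           # every message is stored
        if user in self.connected:
            self.connected[user](text)             # client registered a message socket
            return "direct"
        if is_completion:
            self.emailed.append((user, text))      # assumed: email only for completion
            return "email"
        self.pending.setdefault(user, []).append(text)
        return "pending"

    def reconnect(self, user, callback):
        """Register a client and hand over any messages held while it was away."""
        self.connected[user] = callback
        for text in self.pending.pop(user, []):
            callback(text)

router = MessageRouter()
first = router.deliver("user@example.org", "job started")
second = router.deliver("user@example.org", "job complete", is_completion=True)
inbox = []
router.reconnect("user@example.org", inbox.append)
third = router.deliver("user@example.org", "new job started")
print(first, second, third, inbox)
```

The callbacks and lists are placeholders for the message socket, the DBMS, and the mail system that a real deployment would use.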
Whither Next?
• Submit requests through ROME for processing jobs running on grids
– Montage, others, . . .
• Support “executives” requesting complex jobs and pipelines
• Support registries
• Pass security certificates
• Open Source EJB server