SZTAKI in DataGrid 2003
What to do this year
Topics
● Application monitoring (GRM)
● Analysis and Presentation (Pulse)
● Performance of R-GMA
I. GRM+R-GMA+GridLab monitor
Grid Application Monitoring with the above tools
What's wrong with the current solution?
● Performance + losing events
– Simple loop example generating 2x100K events on my machine, producing only, not consuming: 700s vs. 1.7s with the original GRM
– If we consume as well (time doubles, of course):
● Buffer size=1000: received 40K of 200K events
● Growing the buffer size does not help; later Tomcat dies
– ? future solutions: We expect even worse performance if we try to receive all events.
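The event loss above is the usual effect of a bounded buffer between a fast producer and a slower consumer. The following is a minimal sketch of that effect only; it does not model R-GMA or Tomcat, and the function name and the per-tick rates are assumptions chosen for illustration, not measured values.

```python
from collections import deque

def simulate(total_events, buffer_size, produce_per_tick, consume_per_tick):
    """Return (received, dropped) for a producer feeding a bounded buffer."""
    buffer = deque()
    received = dropped = produced = 0
    while produced < total_events or buffer:
        # Producer: emit a burst; events that do not fit are lost.
        for _ in range(min(produce_per_tick, total_events - produced)):
            produced += 1
            if len(buffer) < buffer_size:
                buffer.append(produced)
            else:
                dropped += 1
        # Consumer: drain at its own rate.
        for _ in range(min(consume_per_tick, len(buffer))):
            buffer.popleft()
            received += 1
    return received, dropped

# Hypothetical rates: the producer is five times faster than the consumer.
received, dropped = simulate(200_000, 1000, 500, 100)
print(received, dropped)
```

Once the buffer is full, the fraction of events received is set by the ratio of the two rates, so a larger buffer only delays the first loss.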
What would be a good solution?
● Stand-alone GRM has one problem:
– How can the local component of GRM (where the application process is running) find a socket path back to GRM's main component (user's machine, practically)?
● At EuroPar'2001 we already presented the solution (as the fourth possible architecture):
– GRM needs one more component as Site Monitor that connects GRM's local components and the main component
Site Monitors
[Figure: Site Monitor architecture. Labels: Main Monitor (MM) on the Local Host; Site 1 and Site 2, each with a Site Monitor (SM); Hosts 1 and 2 with Local Monitors (LM); Application Processes attached to the Local Monitors via shm.]
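The Site Monitor architecture above can be sketched as a chain of relays: each Local Monitor forwards events to its Site Monitor, which forwards them to the Main Monitor. This is only an in-process illustration; the class names, method names and event fields are assumptions, and real components would communicate over sockets and shared memory.

```python
class MainMonitor:
    """Collects the trace on the user's machine."""
    def __init__(self):
        self.trace = []

    def receive(self, event):
        self.trace.append(event)

class SiteMonitor:
    """Relays events from the Local Monitors of one site to the Main Monitor."""
    def __init__(self, site, main_monitor):
        self.site = site
        self.main_monitor = main_monitor

    def receive(self, event):
        self.main_monitor.receive({**event, "site": self.site})

class LocalMonitor:
    """Runs on the host of the application process."""
    def __init__(self, host, site_monitor):
        self.host = host
        self.site_monitor = site_monitor

    def receive(self, event):
        self.site_monitor.receive({**event, "host": self.host})

mm = MainMonitor()
sm1, sm2 = SiteMonitor("site1", mm), SiteMonitor("site2", mm)
lm_a = LocalMonitor("host1", sm1)
lm_b = LocalMonitor("host1", sm2)
lm_a.receive({"pid": 101, "msg": "start"})
lm_b.receive({"pid": 202, "msg": "start"})
print(mm.trace)
```

The Local Monitor only needs to know its own Site Monitor, which is the point of the extra component: no direct path from the application host back to the user's machine is required.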
GridLab monitor
Tools
● GRM: Instrumentation library and trace collector
● GridLab monitor: transfer trace data from the application processes to GRM
● R-GMA: Information system for finding the application and the monitor components
GridLab monitor
● To deliver trace data from the application to the user efficiently.
– Uses TCP Socket communication
– Data in XDR format and could be optimised for TCP transmission
– Two sw. hops between application and GRM: local and main monitors
– One hw. hop: host of main monitor
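As an illustration of trace data in XDR format, the sketch below encodes one event following the XDR rules (big-endian, 4-byte aligned, strings length-prefixed and zero-padded). The event layout (pid, timestamp, name) and the function names are assumptions for the example, not GRM's actual record format.

```python
import struct

def xdr_int(value):
    """XDR signed integer: 4 bytes, big-endian."""
    return struct.pack(">i", value)

def xdr_hyper(value):
    """XDR hyper integer: 8 bytes, big-endian."""
    return struct.pack(">q", value)

def xdr_string(text):
    """XDR string: 4-byte length, bytes, zero padding to a multiple of 4."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

def encode_event(pid, timestamp_us, name):
    """Hypothetical trace record: pid, timestamp in microseconds, event name."""
    return xdr_int(pid) + xdr_hyper(timestamp_us) + xdr_string(name)

record = encode_event(4242, 1_000_000, "loop_begin")
print(len(record), record.hex())
```

A record encoded this way can be written to a TCP socket as-is; batching several records per write is one possible way to reduce per-event overhead.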
R-GMA
● For the user/GRM to find the application
– Where is it running? -> machineX.siteY
– What is its global job id? -> GID
● To find the monitor to connect to
– What is the address of the GridLab monitor running at siteY?
● For the monitor to find the application
– What processes (PIDs) belong to application GID?
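Since R-GMA presents information as relational tables, the three lookups above can be sketched as SQL queries. Here an in-memory SQLite database stands in for R-GMA; the table names, column names and sample rows are all hypothetical and do not reflect a real R-GMA schema or API.

```python
import sqlite3

# SQLite is only a stand-in for R-GMA; the schema below is an assumption.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE app_process (gid TEXT, host TEXT, site TEXT, pid INTEGER);
CREATE TABLE site_monitor (site TEXT, address TEXT);
INSERT INTO app_process VALUES ('GID-1', 'machineX', 'siteY', 4242);
INSERT INTO app_process VALUES ('GID-1', 'machineX', 'siteY', 4243);
INSERT INTO site_monitor VALUES ('siteY', 'monitor.siteY:9000');
""")

def where_is(gid):
    """User/GRM: where is the application running?"""
    rows = db.execute(
        "SELECT DISTINCT host, site FROM app_process WHERE gid = ?", (gid,))
    return rows.fetchall()

def monitor_address(site):
    """User/GRM: which monitor should be connected at this site?"""
    row = db.execute(
        "SELECT address FROM site_monitor WHERE site = ?", (site,)).fetchone()
    return row[0] if row else None

def pids_of(gid):
    """Monitor: which processes belong to the application?"""
    rows = db.execute(
        "SELECT pid FROM app_process WHERE gid = ? ORDER BY pid", (gid,))
    return [pid for (pid,) in rows]

print(where_is("GID-1"), monitor_address("siteY"), pids_of("GID-1"))
```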
Modifications to GRM
● Instrumentation library
– Publish process info into R-GMA (practically the same way as currently)
– Connect and publish trace to GridLab monitor's LM instead of R-GMA
● GRM
– Look up the application by querying R-GMA
– Connect to the GridLab monitor and query it for the trace
II. Pulse
● Support new R-GMA API in the browser part
● Merge: several input channels into one channel
● Flexible description of channel information, with the possibility to select fields
● Move the XML configuration processing code from a global function into the components -> flexibility in adding new components
● Analysing components (currently nothing exists)
● Graphical displays, output to GIF/PNG
● ? output to R-GMA ?
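The merge and field-selection items above could look roughly like the following sketch. It assumes each input channel is an iterable of event dictionaries already sorted by a "time" field; the function name and the field names are illustrative and not part of Pulse.

```python
import heapq

def merge_channels(*channels, fields=None):
    """Merge time-sorted channels into one stream, optionally keeping only `fields`."""
    merged = heapq.merge(*channels, key=lambda event: event["time"])
    for event in merged:
        yield event if fields is None else {k: event[k] for k in fields}

# Two hypothetical input channels, each sorted by time.
cpu = [{"time": 1, "host": "h1", "value": 0.2},
       {"time": 4, "host": "h1", "value": 0.9}]
net = [{"time": 2, "host": "h2", "value": 10},
       {"time": 3, "host": "h2", "value": 12}]

for event in merge_channels(cpu, net, fields=("time", "value")):
    print(event)
```

Because `heapq.merge` is lazy, the merged channel can be consumed while the inputs are still producing events.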
III. Performance of R-GMA
● To find the most relevant reasons for the bad performance of R-GMA
● Expected reasons (without measurements)
– XML <-> String conversion many times
– Java Servlets with several components
● Netlogger (?)
● What to measure, and how? Ideas welcome.
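One possible starting point for the first expected reason is a micro-benchmark of repeated XML <-> string conversion. The sketch below times such round trips in isolation using Python's standard library; it does not measure R-GMA itself, and the event fields and function names are assumptions made for the example.

```python
import time
import xml.etree.ElementTree as ET

def roundtrip(event, times):
    """Serialise an event to an XML string and parse it back, `times` times."""
    for _ in range(times):
        elem = ET.Element("event", {k: str(v) for k, v in event.items()})
        text = ET.tostring(elem, encoding="unicode")
        event = dict(ET.fromstring(text).attrib)
    return event

def time_roundtrips(event, times):
    """Return the final event and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    result = roundtrip(event, times)
    return result, time.perf_counter() - start

result, elapsed = time_roundtrips({"pid": 1, "name": "loop"}, 10_000)
print(f"{elapsed:.3f}s for 10000 round trips")
```

Comparing the time per round trip with the time per event seen end to end would indicate how much of the cost conversion alone could explain.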
“Infrastructure in SZTAKI”
● Two PCs, on two different LANs
● If strongly needed, one more PC might be used.
● For heavy duty tests, a distributed installation is needed.