Joe Meehean
Computer Sciences Department
University of Wisconsin-Madison
[email protected]
http://www.cs.wisc.edu/condor
Problems of Dynamic Service Deployment
www.cs.wisc.edu/condor
Motivation
› Dynamic Service Deployment
    Install and set up a service on the fly
    Shut down and clean up a service, also on the fly
    Dynamically deploy a Condor submit node
› Why is the problem hard?
Primary Problems
› Bootstrap
› User Privileges of Bootstrap (non-root)
› Remote Control
› Reliable Startup and Shutdown
› Executable Architecture
› Cleanup
› File dependencies (libs, etc.)
› No knowledge of system
Problem Focus
› Bootstrap
› User Privileges of Bootstrap (non-root)
› Remote Control
› Reliable Startup and Shutdown
› Executable Architecture
› Cleanup
› File dependencies (libs, etc.)
› No knowledge of system
Straw Man
Pid as a Unique Handle to a Process
Slow Pid Reuse
Improved Straw Man
Pid and Creation Time as a Unique Handle
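As a rough illustration of this handle, the sketch below pairs a pid with its creation time and treats two observations as the same process only when both agree. The names `ProcessHandle`, `read_birthday`, and `is_same_process` are hypothetical and not taken from Condor; `read_birthday` assumes a Linux `/proc` filesystem and returns `None` elsewhere.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessHandle:
    """Hypothetical composite handle: pid plus creation time."""
    pid: int
    birthday: int  # creation time in machine time units


def read_birthday(pid: int) -> Optional[int]:
    """Assumes Linux: start time in clock ticks since boot, else None."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except OSError:
        return None
    # The command name (field 2) may contain spaces or ')',
    # so split after the last ')'.
    fields = stat.rsplit(")", 1)[1].split()
    # starttime is field 22; fields[0] here is field 3 (state).
    return int(fields[19])


def is_same_process(handle: ProcessHandle, pid: int, birthday: int) -> bool:
    """A reused pid with a different birthday is a different process."""
    return pid == handle.pid and birthday == handle.birthday
```

With a pid alone, a recycled pid would be mistaken for the original process; adding the birthday rejects it as long as the two creation times are measurably different.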
Properties of Machine Time
Collision Range
Bulletproof Solution
Pid and Creation Time Handle with Collision Avoidance
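The slides do not spell out the collision-avoidance rule, so the following is only one plausible reading, stated as an assumption: the creation time is known to within some precision, any process with the same pid and a creation time inside that window could be confused with the original, and the handle is considered safe once a confirmation time past the window has been recorded while the original process still holds the pid. The names `Handle`, `collision_range`, `is_confirmed`, and `matches` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Handle:
    pid: int
    bday: int       # creation time, in machine time units
    precision: int  # assumed measurement error of bday, same units


def collision_range(h: Handle) -> Tuple[int, int]:
    """Creation times that cannot be told apart from h.bday (assumption)."""
    return (h.bday - h.precision, h.bday + h.precision)


def is_confirmed(h: Handle, confirm_time: int) -> bool:
    """True once the confirmation time lies past the collision range."""
    return confirm_time > h.bday + h.precision


def matches(h: Handle, pid: int, observed_bday: int) -> bool:
    """Same pid and a creation time inside the collision range."""
    lo, hi = collision_range(h)
    return pid == h.pid and lo <= observed_bday <= hi
```

Under this reading, a pid reused after the confirmation time necessarily has a creation time outside the window, so `matches` rejects it.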
Midwife Implementation
Isn't SIGKILL a reliable shutdown mechanism?
Reliable Shutdown Pseudo Code
while( alive && totalTime < timeout ){
    attemptShutdown()
    alive = isAlive(composite_id)
    if( alive ){
        sleep(5)
        totalTime += 5
    }
}
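The loop above can be sketched in runnable form as follows. This is a minimal Python rendering, not the Condor implementation: `attempt_shutdown` and `is_alive` are caller-supplied callables standing in for `attemptShutdown()` and `isAlive(composite_id)`, and the `sleep` parameter is an addition made here so the loop can be exercised without real delays.

```python
import time
from typing import Callable


def reliable_shutdown(
    attempt_shutdown: Callable[[], None],
    is_alive: Callable[[], bool],
    timeout: float,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry shutdown until the process is gone or the timeout expires.

    Returns True if the process was observed dead, False on timeout.
    """
    alive = True
    total_time = 0.0
    while alive and total_time < timeout:
        attempt_shutdown()
        alive = is_alive()
        if alive:
            # Give the process time to exit before trying again.
            sleep(interval)
            total_time += interval
    return not alive
```

In practice `is_alive` would check the composite handle rather than the bare pid, so that a recycled pid is not mistaken for a process that refuses to die.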
Conclusion
› Available in Condor v 6.7.18
    uniq_pid_midwife
    uniq_pid_undertaker
› Technical Report Coming Soon
PPID = 7338
PID = 7339
PRECISION = 100
TIME_UNITS_IN_SECS = 100.000000
BDAY = 579893001
CONTROL_TIME = 0
CONFIRM = 579893308
CONTROL_TIME = 0
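A listing in this `KEY = value` form is easy to read back. The sketch below is a hypothetical parser, not part of Condor; it keeps entries as an ordered list of pairs because `CONTROL_TIME` appears twice. The interpretation of the fields (for example, that `CONFIRM` and `BDAY` share the units given by `TIME_UNITS_IN_SECS`) is an assumption drawn only from their names.

```python
from typing import List, Tuple, Union

Number = Union[int, float]

SAMPLE = """\
PPID = 7338
PID = 7339
PRECISION = 100
TIME_UNITS_IN_SECS = 100.000000
BDAY = 579893001
CONTROL_TIME = 0
CONFIRM = 579893308
CONTROL_TIME = 0
"""


def parse_entries(text: str) -> List[Tuple[str, Number]]:
    """Parse 'KEY = value' lines, preserving order and duplicate keys."""
    entries = []
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, raw = (part.strip() for part in line.split("=", 1))
        try:
            value: Number = int(raw)
        except ValueError:
            value = float(raw)
        entries.append((key, value))
    return entries


entries = parse_entries(SAMPLE)
fields = dict(entries)  # later duplicates overwrite earlier ones
# Assumed meaning: time between creation and confirmation, in seconds.
elapsed_secs = (fields["CONFIRM"] - fields["BDAY"]) / fields["TIME_UNITS_IN_SECS"]
```

If the assumed units are right, the confirmation in this sample was recorded roughly three seconds after the process was created.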