Problems of Dynamic Service Deployment

Joe Meehean
Computer Sciences Department
University of Wisconsin-Madison
http://www.cs.wisc.edu/condor



Motivation
› Dynamic Service Deployment
    Install and set up a service on-the-fly
    Shut down and clean up a service, also on-the-fly
    Dynamically deploy a Condor submit node
› Why is the problem hard?


Primary Problems
› Bootstrap
› User Privileges of Bootstrap (non-root)
› Remote Control
› Reliable Startup and Shutdown
› Executable Architecture
› Cleanup
› File dependencies (libs, etc.)
› No knowledge of system


Problem Focus
› Bootstrap
› User Privileges of Bootstrap (non-root)
› Remote Control
› Reliable Startup and Shutdown
› Executable Architecture
› Cleanup
› File dependencies (libs, etc.)
› No knowledge of system


Straw Man
Pid as a Unique Handle to a Process


Slow Pid Reuse


Improved Straw Man
Pid and Creation Time as a Unique Handle
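A pid alone can be handed to a different process once it is reused, so the improved straw man pairs the pid with the process creation time. The sketch below shows one way to build such a composite handle. It assumes a Linux `/proc` filesystem, where field 22 of `/proc/<pid>/stat` is the start time in clock ticks since boot. The names `ProcessHandle`, `parse_start_ticks`, `read_handle`, and `is_same_process` are hypothetical and are not taken from Condor.

```python
from typing import NamedTuple, Optional


class ProcessHandle(NamedTuple):
    pid: int
    start_ticks: int  # creation time in clock ticks since boot (Linux assumption)


def parse_start_ticks(stat_line: str) -> int:
    """Extract field 22 (starttime) from the text of /proc/<pid>/stat."""
    # Field 2 (the command name) is parenthesised and may itself contain
    # spaces or ')', so split on the LAST ')' before counting fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 22 is rest[19].
    return int(rest[19])


def read_handle(pid: int) -> Optional[ProcessHandle]:
    """Return the handle of a live pid, or None if there is no such process."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return ProcessHandle(pid, parse_start_ticks(f.read()))
    except (FileNotFoundError, ProcessLookupError):
        return None


def is_same_process(saved: ProcessHandle) -> bool:
    """True only if the pid is alive AND was created at the saved time."""
    return read_handle(saved.pid) == saved
```

A caller would record `read_handle(pid)` right after starting the service and later pass the saved handle to `is_same_process` before signalling the pid. This comparison is exact, so it does not yet account for the limited precision of the creation time.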


Properties of Machine Time


Collision Range


Bulletproof Solution
Pid and Creation Time Handle with Collision Avoidance
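The slides do not spell out the collision avoidance rule, so the following is only one plausible reading. It assumes that a measured creation time is accurate to within some precision, that two handles with the same pid "collide" when their creation times fall within that precision of each other, and that a handle can be treated as unambiguous once the current time has moved past the collision range. The class `Handle`, the functions, and the exact margin are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handle:
    pid: int
    bday: int       # measured creation time, in clock units
    precision: int  # assumed measurement error, in the same units


def collision_range(h: Handle) -> tuple:
    """Interval of creation times that cannot be told apart from h.bday."""
    return (h.bday - h.precision, h.bday + h.precision)


def may_collide(a: Handle, b: Handle) -> bool:
    """True if the two handles are indistinguishable within the error bars."""
    return a.pid == b.pid and abs(a.bday - b.bday) <= max(a.precision, b.precision)


def is_safe_to_confirm(h: Handle, now: int) -> bool:
    """Assumed rule: confirm only after 'now' has left the collision range."""
    return now - h.bday > h.precision
```

Under this reading, a deployment tool would delay recording a handle as confirmed until `is_safe_to_confirm` holds, so that any later process reusing the pid has a creation time outside the range.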


Midwife Implementation


Isn't SIGKILL a reliable shutdown mechanism?


Reliable Shutdown Pseudo Code

    while( alive && totalTime < timeout ){
        attemptShutdown()
        alive = isAlive(composite_id)
        if( alive ){
            sleep(5)
            totalTime += 5
        }
    }
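The pseudo code can be turned into a small runnable sketch. The version below keeps the same loop structure but takes the shutdown attempt and the liveness check as caller-supplied functions, since the slides do not define `attemptShutdown` or `isAlive`. The function name `reliable_shutdown` and its parameters are hypothetical, and the 60 second default timeout is an arbitrary choice.

```python
import time
from typing import Callable


def reliable_shutdown(
    attempt_shutdown: Callable[[], None],
    is_alive: Callable[[], bool],
    timeout: float = 60,
    interval: float = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry shutdown until the process is gone or the timeout expires.

    Returns True if the process was observed dead, False on timeout.
    """
    alive = True
    total_time = 0.0
    while alive and total_time < timeout:
        attempt_shutdown()          # e.g. send a signal to the process
        alive = is_alive()          # should check the composite id, not just the pid
        if alive:
            sleep(interval)         # give the process time to exit
            total_time += interval
    return not alive
```

Passing `sleep` as a parameter lets a test replace the real delay with a no-op. The return value makes the timeout case explicit, which the pseudo code leaves to the caller.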


Conclusion
› Available in Condor v 6.7.18
    uniq_pid_midwife
    uniq_pid_undertaker
› Technical Report Coming Soon


PPID = 7338
PID = 7339
PRECISION = 100
TIME_UNITS_IN_SECS = 100.000000
BDAY = 579893001
CONTROL_TIME = 0
CONFIRM = 579893308
CONTROL_TIME = 0
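The listing above can be read back with a few lines of code. This sketch assumes the file holds one `KEY = VALUE` pair per line, which is inferred from the slide rather than from any documented format. It returns a list of pairs instead of a dictionary because `CONTROL_TIME` appears twice. The function name `parse_pid_file` is hypothetical.

```python
def parse_pid_file(text: str) -> list:
    """Parse 'KEY = VALUE' lines into (key, number) pairs, keeping order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY = VALUE line: {line!r}")
        value = value.strip()
        # Values with a decimal point are parsed as floats, the rest as ints.
        number = float(value) if "." in value else int(value)
        entries.append((key.strip(), number))
    return entries


sample = """\
PPID = 7338
PID = 7339
PRECISION = 100
TIME_UNITS_IN_SECS = 100.000000
BDAY = 579893001
CONTROL_TIME = 0
CONFIRM = 579893308
CONTROL_TIME = 0
"""
fields = dict(parse_pid_file(sample))  # later duplicates overwrite earlier ones
elapsed_units = fields["CONFIRM"] - fields["BDAY"]
```

For the sample values, `elapsed_units` is the gap between the recorded birthday and the confirmation time, expressed in the file's own time units.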