Autonomic System Design
Visa Holopainen
Enabling autonomic behavior in systems software with hot swapping, J. Appavoo et al. 2003
Focus on object-oriented systems software
With hot swapping, new algorithms and monitoring code can be added to a running system without disruption
Hot swapping is accomplished either by interpositioning of code or by replacement of code
Interpositioning involves inserting a new component between two existing ones
This enables more detailed monitoring when problems occur, while minimizing run-time costs when the system is performing acceptably
Replacement allows an active component to be switched with a different implementation of that component while the system is running
Triggering hot swapping
In many cases an object is expected to trigger a replacement itself (autonomously). For example, if an object is designed to support small files and it registers an increase in file size, then the object can trigger a hot swap with an object that supports large files
In other cases, the system infrastructure is expected to determine the need for an object replacement through a hot swap. Monitoring is required for this purpose.
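A minimal Python sketch of the two mechanisms above, under stated assumptions: the class names (`FileHandle`, `SmallFileStore`, `LargeFileStore`), the 4096-byte threshold and the monitor callback are all invented for illustration and are not K42's API. A real hot swap must also wait for in-flight calls to finish before transferring state, which this single-threaded sketch ignores.

```python
class SmallFileStore:
    """Hypothetical implementation tuned for small files (one contiguous buffer)."""
    LIMIT = 4096  # bytes; illustrative threshold, not taken from the paper

    def __init__(self):
        self.data = bytearray()

    def write(self, chunk):
        self.data.extend(chunk)

    def size(self):
        return len(self.data)

    def wants_swap(self):
        # The object itself notices that it has outgrown its design point.
        return len(self.data) > self.LIMIT


class LargeFileStore:
    """Hypothetical implementation for large files (keeps a list of chunks)."""

    def __init__(self, data=b""):
        self.chunks = [bytes(data)] if data else []

    def write(self, chunk):
        self.chunks.append(bytes(chunk))

    def size(self):
        return sum(len(c) for c in self.chunks)

    def wants_swap(self):
        return False


class FileHandle:
    """Indirection point: clients hold the handle, never the implementation."""

    def __init__(self):
        self._impl = SmallFileStore()
        self._monitors = []

    def interpose(self, monitor):
        # Interpositioning: extra code is inserted between caller and component.
        self._monitors.append(monitor)

    def write(self, chunk):
        for monitor in self._monitors:
            monitor("write", len(chunk))
        self._impl.write(chunk)
        if self._impl.wants_swap():
            # Replacement: state is handed over to a different implementation.
            self._impl = LargeFileStore(self._impl.data)

    def size(self):
        return self._impl.size()

    def impl_name(self):
        return type(self._impl).__name__


handle = FileHandle()
calls = []
handle.interpose(lambda op, nbytes: calls.append((op, nbytes)))
handle.write(b"x" * 1000)   # still served by SmallFileStore
handle.write(b"y" * 5000)   # crosses the threshold and triggers the swap
```

The caller keeps using the same `handle` before and after the swap, which is the property that lets the replacement happen without disruption.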
Adaptive code vs. hot swapping
Among other features, hot swapping allows systems software to react to changes in environment
A more traditional approach to handling varying environments is to use adaptive code
In a system using adaptive code, all possible configurations must be built into the system beforehand
Adaptive code has many problematic features (presented below)
Illustration of adaptive code vs. hot swapping
An adaptive code implementation (A) vs a hot-swapping implementation (B) of the same function
The adaptive code approach is monolithic and includes monitoring code that collects the data needed by the adaptive algorithm to choose a particular code path
With hot swapping, each algorithm is implemented independently (resulting in reduced complexity per component), and is hot swapped in when needed
Benefits of hot swapping
Hot swapping can be beneficial at least in the following respects:
Optimizing for the (non) common case
• Dynamic replacement allows efficient implementations of common paths to be used when suitable, and less-efficient, less-common implementations to be switched in when necessary
Optimizing for a wide range of file attribute values
• For example, although the vast majority of files accessed are small (< 4 KB), OSs must also support large files
Access patterns
• Researchers have shown up to 30 percent fewer cache misses by using the appropriate cache management policy
Multiprocessor optimizations
• Some applications perform better when distributed to many processors while others perform better when run on a single processor
Enabling client-specific customization
Exporting system structure information
• Always gathering the necessary profiling information increases overhead
Testing system
A research operating system (K42) has been developed to test the hot swapping approach
Runs on PowerPC and MIPS architectures (soon available for x86 also)
K42 scales well to multiprocessor systems
Performance advantages of hot swapping have been demonstrated in K42
K42 is available at http://www.research.ibm.com/K42
Adding Autonomic Functionality to object-oriented applications, M. Schanne, W. Tichy, T. Gelhausen, 2003
The goal is to separate autonomic functionality from applications (similar to hot swapping)
This is accomplished by creating a system based on class renaming and proxy/wrapper generation
A list of the proxy objects is kept in a registry
A proxy object always has a pointer to the latest version of the actual object and access to its member functions
This is accomplished with the Byte Code Engineering Library (BCEL)
Wrapper functions ensure synchronization of variables
The design ensures that users do not need to adapt their source code in any way or even restart the program
Supported environment: the Java 2 platform and similar
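The registry-and-proxy idea can be sketched in Python as follows. This is only an illustration of the pattern: the paper works on Java bytecode with BCEL, whereas the names here (`Proxy`, `Registry`, `CounterV1`, `CounterV2`) and the field-copying upgrade step are assumptions made for the example.

```python
class Proxy:
    """Stands in for the real object and forwards attribute access to it."""

    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for the target's members.
        return getattr(object.__getattribute__(self, "_target"), name)

    def __setattr__(self, name, value):
        setattr(object.__getattribute__(self, "_target"), name, value)


class Registry:
    """Keeps track of every proxy so that targets can be replaced later."""

    def __init__(self):
        self._proxies = []

    def create(self, cls, *args, **kwargs):
        proxy = Proxy(cls(*args, **kwargs))
        self._proxies.append(proxy)
        return proxy

    def upgrade(self, old_cls, new_cls):
        """Re-point each proxy of old_cls to a new_cls object with the same fields."""
        for proxy in self._proxies:
            target = object.__getattribute__(proxy, "_target")
            if type(target) is old_cls:
                replacement = new_cls.__new__(new_cls)
                replacement.__dict__.update(target.__dict__)  # carry state over
                object.__setattr__(proxy, "_target", replacement)


class CounterV1:
    def __init__(self, start=0):
        self.value = start

    def step(self):
        self.value += 1
        return self.value


class CounterV2:
    def __init__(self, start=0):
        self.value = start

    def step(self):
        self.value += 10  # new behavior introduced by the updated class
        return self.value


registry = Registry()
counter = registry.create(CounterV1, 5)
before = counter.step()                 # runs CounterV1.step
registry.upgrade(CounterV1, CounterV2)
after = counter.step()                  # same reference, now runs CounterV2.step
```

Client code holds only `counter` and never notices that the object behind it changed, which mirrors the claim that the program needs neither source changes nor a restart.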
Usable Autonomic Computing Systems: the Administrator’s Perspective, R. Barrett, P. Maglio, E. Kandogan, J. Bailey, 2004
Autonomic computing seeks to solve the problem of increasingly complex configurations through increased automation
However, the AC strategy of managing complexity through automation runs the risk of making management harder (more powerful commands)
This is why autonomic systems should:
Provide facilities that make rehearsing and planning easy
Be designed to allow administrators to quickly undo changes, making operations (whether on production systems or test systems) less risky and therefore easier
Inform the administrator if undoing a command will not be (easily) possible
Have enhanced capabilities for testing complex end-to-end systems so that administrators will be confident that their changes are not having unintended consequences
Provide access to arbitrary levels of configuration detail if need be
Autonomic systems should also contain a command line interface (in addition to a GUI)
An Architectural Approach to Autonomic Computing, S. White, J. Hanson, I. Whalley, D. Chess, J. Kephart, 2004
An autonomic system can be decomposed into 1) interfaces, 2) interactions and 3) design patterns
A somewhat RFC-style paper with MUST and SHOULD statements about Autonomic Elements (AE)
MUST examples:
An AE MUST be self-managing
An AE MUST handle problems locally whenever possible
An AE MUST be capable of establishing and maintaining relationships with other autonomic elements
SHOULD examples:
An AE SHOULD ask for a realistic set of requirements when requesting a service from another element
An AE SHOULD offer a range of performance, reliability, availability and security associated with its service
An AE SHOULD protect itself against inappropriate service requests and responses
Use of policies
The use of policies is essential for autonomic systems
Three (3) policy levels presented
1) Action policies (IF condition THEN action)
• An AE employing action policies MUST measure and/or synthesize the quantities stated in the condition
2) Goal policies ("Response time must not exceed 2 sec.")
• AEs employing goal policies MUST possess sufficient modeling or planning capabilities to translate goals into actions
3) Utility function policies (automatically determine the most valuable goal in any situation)
• AEs employing utility function policies MUST have sophisticated modeling and optimization capabilities to translate utility functions into actions
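The difference between the three policy levels can be illustrated with a small Python sketch. Everything here is assumed for the example: the performance model (response time = load / servers), the action names, and the utility function are invented, not taken from the paper.

```python
def action_policy(state):
    """Action policy: IF condition THEN action. Needs only the measured quantity."""
    if state["response_time"] > 2.0:
        return "add_server"
    return "no_op"


def predicted_response_time(load, servers):
    # Hypothetical model; a goal or utility policy needs some model like this.
    return load / servers


def goal_policy(load, goal=2.0, max_servers=10):
    """Goal policy: find the smallest configuration whose predicted
    response time meets the goal."""
    for servers in range(1, max_servers + 1):
        if predicted_response_time(load, servers) <= goal:
            return servers
    return max_servers


def utility_policy(load, max_servers=10, cost_per_server=1.0):
    """Utility function policy: pick the configuration with the highest
    value of (benefit of low response time) minus (cost of servers)."""
    def utility(servers):
        rt = predicted_response_time(load, servers)
        return 20.0 / (1.0 + rt) - cost_per_server * servers

    return max(range(1, max_servers + 1), key=utility)


action = action_policy({"response_time": 3.0})
servers_for_goal = goal_policy(load=8)
servers_for_utility = utility_policy(load=8)
```

The action policy only reacts to a measurement, the goal policy has to plan with a model, and the utility policy has to optimize over the model, which matches the increasing capability requirements stated for the three levels.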
Interfaces
Making a system autonomic requires additional interfaces to be added to the system
Monitoring and test interfaces
• Enable an element to be monitored by any other element that has established the appropriate administrative relationships with it
Lifecycle interfaces
• Enable administrative elements to determine the lifecycle state of an element (e.g. starting, paused), to cause a state change, and to determine the lifecycle model that applies to the element
Policy interfaces
• Enable administrative elements to send new policies to an element, and to determine the policies currently in use by the element
Negotiation and binding interfaces
• Permit an element to request a service from other elements, or to request to provide a service
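One way to picture the four interface groups is as an abstract base class. The sketch below is hypothetical: the method names, the `RUNNING` state and the `StorageElement` example are assumptions for illustration, not definitions from the paper.

```python
from abc import ABC, abstractmethod
from enum import Enum


class LifecycleState(Enum):
    STARTING = "starting"
    RUNNING = "running"   # assumed state, not named in the text
    PAUSED = "paused"


class AutonomicElement(ABC):
    def __init__(self, name):
        self.name = name
        self._state = LifecycleState.STARTING
        self._policies = []

    # Monitoring and test interface
    @abstractmethod
    def read_metrics(self):
        """Return a dict of current measurements."""

    # Lifecycle interface
    def lifecycle_state(self):
        return self._state

    def change_state(self, new_state):
        self._state = new_state

    # Policy interface
    def install_policies(self, policies):
        self._policies = list(policies)

    def current_policies(self):
        return list(self._policies)

    # Negotiation and binding interface
    @abstractmethod
    def request_service(self, requirements):
        """Return True if this element agrees to provide the service."""


class StorageElement(AutonomicElement):
    """Hypothetical element that hands out storage capacity."""

    def __init__(self, name, free_gb):
        super().__init__(name)
        self.free_gb = free_gb

    def read_metrics(self):
        return {"free_gb": self.free_gb}

    def request_service(self, requirements):
        needed = requirements.get("gb", 0)
        if self._state is LifecycleState.RUNNING and needed <= self.free_gb:
            self.free_gb -= needed
            return True
        return False


store = StorageElement("store-1", free_gb=100)
store.change_state(LifecycleState.RUNNING)
store.install_policies(["IF free_gb < 10 THEN refuse requests"])
accepted = store.request_service({"gb": 40})
```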
Relationships
When an AE has agreed to provide service to another AE, then those two elements have a relationship
Relationships are typically formed at run-time
Autonomic systems are built by relationships
Request-response paradigm used to form relationships
From autonomic elements to autonomic systems
Assembling an autonomic system requires:
1) A collection of AEs that implement the desired function
2) Additional autonomic elements to implement system functions that enable the needed system-level behaviors (= infrastructure elements)
3) Design patterns for system self-management
An infrastructure element can be a
Registry (provides mechanisms for elements to find one another)
Sentinel (provides monitoring services to other elements)
Aggregator (combines two or more existing elements and uses them to provide improved service)
Broker (facilitates interaction)
Negotiator (assists elements with complex negotiations)
Towards Requirements-Driven Autonomic Systems Design, A. Lapouchnian, S. Liaskos, J. Mylopoulos, Y. Yu, 2005
There are three basic ways to make a system autonomic:
1) Design the system to support a space of possible behaviors
2) Equip the system with planning and social capabilities so that it can delegate tasks to external software components (agents)
3) Build the system so that it has evolutionary capabilities (like biological systems)
The first approach was studied in the paper
Requirements engineering
Development of a framework for capturing and analyzing stakeholder intentions to generate functional and non-functional requirements
Illustration of requirements engineering: goal model
Top-level "hard" goal: Schedule meeting
AND-composed of lower level hard goals
4 top-level "softgoals": Good quality schedule, Minimal effort, Minimal disturbances, Accurate constraints
Lower level softgoals can be related to higher levels by help (+), hurt (-), make (++) or break (--) relationships
6 alternative ways to fulfill the goal “Schedule Meeting”
An autonomic system should address all different ways of fulfilling the top-level goals
Goal model -> Feature model -> Component-Connector model
Goal model is integrated into the knowledge of an autonomic element
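A small Python sketch of how an AND/OR goal model yields a space of alternative behaviors, and how softgoal contributions can rank them. The subgoals, tasks, contribution links and numeric weights below are invented for illustration and are not the decomposition shown in the paper's figure (which has 6 alternatives; this toy model has 4).

```python
from itertools import product

# Numeric weights for contribution links; the values are an assumption.
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}

# Each node is (kind, name, children); kind is "and", "or" or "task".
GOAL = ("and", "Schedule meeting", [
    ("or", "Collect timetables", [
        ("task", "collect by person", []),
        ("task", "collect by system", []),
    ]),
    ("or", "Choose schedule", [
        ("task", "choose manually", []),
        ("task", "choose automatically", []),
    ]),
])

# Hypothetical help/hurt/make/break links from tasks to softgoals.
CONTRIBUTIONS = {
    "collect by person": {"Minimal effort": "-", "Accurate constraints": "+"},
    "collect by system": {"Minimal effort": "+", "Minimal disturbances": "+"},
    "choose manually": {"Good quality schedule": "+", "Minimal effort": "--"},
    "choose automatically": {"Minimal effort": "++"},
}


def alternatives(goal):
    """Return every list of leaf tasks that satisfies the goal tree."""
    kind, name, children = goal
    if kind == "task":
        return [[name]]
    child_alts = [alternatives(child) for child in children]
    if kind == "or":
        # Any single child's alternative satisfies an OR node.
        return [alt for alts in child_alts for alt in alts]
    # AND node: combine one alternative from every child.
    return [sum(combo, []) for combo in product(*child_alts)]


def score(alternative):
    """Sum the weighted softgoal contributions of all tasks in an alternative."""
    return sum(WEIGHTS[label]
               for task in alternative
               for label in CONTRIBUTIONS.get(task, {}).values())


all_alternatives = alternatives(GOAL)
best = max(all_alternatives, key=score)
```

An autonomic element holding such a model could switch to another alternative at run time when the currently selected one stops satisfying its softgoals.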
Architectural Design of a Distributed Application with Autonomic Quality Requirements, D. Weyns, K. Schelfthout and T. Holvoet, 2005
A reference architecture for situated multi-agent systems (situated MAS) was developed
This reference architecture was applied to a real-world software system
The architecture:
A situated MAS consists of an environment populated with agents (autonomous entities)
Intelligence in a situated MAS originates from the interaction between agents, rather than from their individual capabilities
The architecture holds three abstractions: agents, ongoing activities and the environment
High-level model view of the architecture
The Perception module maps the local state of the environment onto a percept for the agent
The Consumption module handles the effects of environment changes that affect the agent
The Decision module is responsible for action selection
The application
A system in which robots transport loads from one place to another within a warehouse and recharge themselves whenever needed
Old system: a centralized server controlled the robots
Main problem: inflexibility; robots can’t adapt to changing situations
Improvement: robots are agents acting in a MAS
Drawback: more complicated system
Module view of the application
Two kinds of agents: transport agents and AGV (automated guided vehicle) agents
Transport agents are "managers"; they determine the priority of the transport, assign transports to AGVs and ensure that the transport succeeds
AGV agents are responsible for executing the assigned transport
Architecture of the environment
To cope with the complexity of the environment, it is presented through a layered architecture
The virtual environment uses a middleware layer that enables agents to communicate with each other
The virtual environment enables agent routing and prevents collisions
The agent observes a 3-5 meter circle of the virtual environment at a time
In this circle the agent marks the path it is going to use and removes this path when leaving the circle
This way collisions can be avoided
Transport agents use the virtual environment to locate AGV agents
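The mark-and-remove scheme for collision avoidance can be sketched as a reservation table. This is a simplified, single-process Python illustration: the `VirtualEnvironment` class, the grid cells and the agent identifiers are assumptions, whereas the real system is distributed over middleware and works on a 3-5 meter circle around each vehicle.

```python
class VirtualEnvironment:
    """Hypothetical shared map of path segments marked by agents."""

    def __init__(self):
        self._reserved = {}  # cell -> id of the agent that marked it

    def reserve_path(self, agent, cells):
        """Mark all cells for the agent, or none if another agent holds one."""
        if any(self._reserved.get(cell, agent) != agent for cell in cells):
            return False
        for cell in cells:
            self._reserved[cell] = agent
        return True

    def release_path(self, agent):
        """Remove every mark of the agent, e.g. when it leaves the area."""
        self._reserved = {cell: owner
                          for cell, owner in self._reserved.items()
                          if owner != agent}


env = VirtualEnvironment()
first = env.reserve_path("agv-1", [(0, 0), (0, 1), (0, 2)])
blocked = env.reserve_path("agv-2", [(1, 1), (0, 1)])   # crosses agv-1's path
env.release_path("agv-1")
retry = env.reserve_path("agv-2", [(1, 1), (0, 1)])
```

Because a path is reserved either completely or not at all, two agents can never both believe they own the same segment.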
A Control Theory Foundation for Self-Managing Computing Systems, Y. Diao, J. Hellerstein, S. Parekh, R. Griffith, G. Kaiser, D. Phung, 2005
Control theory used as a way to identify a number of requirements for and challenges in building self-managing systems
What does control theory bring to the table in terms of self-management?
Autonomic computing and control theory have slightly different points of focus: autonomic computing focuses on the specification and construction of management components that interoperate well, while the focus of control theory is on analyzing and/or developing components and algorithms so that the resulting system achieves the control objectives
For example, control theory provides design techniques for determining the values of parameters in commonly used control algorithms so that the resulting control system is stable and settles quickly in response to disturbances
Feedback Control Theory
Reference input: desired output (as specified by the human)
Control error: reference input minus measured output
Control input: parameters which affect the behavior of the system
Disturbance input: affects the control input
Controller: changes the control input to achieve the reference input
Measured output: measurable feature of the system
Noise input: affects the measured output
Transducer: transforms the measured output so that it can be compared with the reference input
Properties of Control Systems
SASO
Stable: bounded input produces bounded output; unstable systems are not usable in mission-critical work
Accurate: measured output converges to the reference (desired) input
Short settling times: converges to the stable value quickly
No overshoot: achieves objectives in a steady manner
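Three of the SASO properties can be estimated from a recorded response to a step in the reference input. The Python sketch below is an illustration under assumptions: the function name, the 2 % tolerance band and the sample trace are invented, and the last sample is taken as the steady-state value. Stability itself cannot be established from a finite trace and is left out.

```python
def saso_metrics(outputs, reference, tolerance=0.02):
    """Estimate accuracy, settling time and overshoot from a step response.

    Assumes an upward step and treats the final sample as the steady state.
    """
    final = outputs[-1]
    band = tolerance * abs(reference)
    settling_time = 0
    for k, y in enumerate(outputs):
        if abs(y - final) > band:
            settling_time = k + 1  # first step after the last out-of-band sample
    return {
        "steady_state_error": reference - final,      # accuracy
        "settling_time": settling_time,               # in sample intervals
        "overshoot": max(0.0, max(outputs) - final),  # peak above steady state
    }


trace = [0.0, 0.5, 0.9, 1.2, 1.05, 1.0, 1.0]  # made-up measured outputs
metrics = saso_metrics(trace, reference=1.0)
```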
Control Analysis and Design
Example system: Notes server, with control input $u(k)$ = MaxUsers and measured output $y(k)$ = actual RIS
Model of system dynamics: $y(k+1) = a_1 y(k) + b_1 u(k)$, with $a_1 = 0.43$ and $b_1 = 0.47$
Transfer function: $N(z) = \frac{b_1}{z - a_1}$
The transfer function and the Z-transform are used to model and control response times and settling times
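A short Python sketch of how such a model is used in controller design. It assumes a first-order model of the form y(k+1) = a y(k) + b u(k) with a = 0.43 and b = 0.47 (the values that appear on the slide) and adds an integral controller u(k+1) = u(k) + K e(k); the controller form, the gains and the function names are assumptions for illustration. With this controller the closed-loop poles are the roots of z^2 - (1 + a) z + (a + b K) = 0, and the loop is stable when both poles lie inside the unit circle.

```python
import cmath

A, B = 0.43, 0.47  # model coefficients, as read from the slide


def closed_loop_poles(k_i):
    """Roots of z^2 - (1 + A) z + (A + B * k_i) = 0."""
    disc = cmath.sqrt((1 + A) ** 2 - 4 * (A + B * k_i))
    return ((1 + A + disc) / 2, (1 + A - disc) / 2)


def is_stable(k_i):
    return all(abs(pole) < 1 for pole in closed_loop_poles(k_i))


def simulate(k_i, reference=1.0, steps=200):
    """Simulate the model under integral control; returns the output trace."""
    y, u = 0.0, 0.0
    trace = []
    for _ in range(steps):
        trace.append(y)
        error = reference - y          # control error e(k)
        u_next = u + k_i * error       # integral control law
        y = A * y + B * u              # plant update uses the current u(k)
        u = u_next
    return trace


gentle = simulate(k_i=0.1)      # poles at about 0.9 and 0.53: converges
aggressive = simulate(k_i=2.0)  # poles outside the unit circle: diverges
```

A gain that is too large makes the controller overreact to each error, so the output oscillates with growing amplitude instead of settling at the reference.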
Example: control theory approach to web server management
Objective: CPU utilization < 50%
Measured output: CPU utilization
Control input: “MaxClients”
During the first 300 s, the system operates without feedback control. When the controller is turned on, a reference input of 0.5 is used. At this point, the system begins to oscillate and the amplitude of the oscillations increases. This is a result of a controller design that overreacts to the stochastics in the CPU utilization measurement.
<username>, I Need You! Initiative and Interaction in Autonomic Systems, P. Kaminski, P. Agrawal, H. Kienle, H. Müller, 2005
Autonomic job requirements
If I hired a person instead, what qualities would I look for?
Attention to detail, strong communication skills, initiative, tempered by job boundaries, self-knowledge and willingness to seek help
Treat users as partners, not masters
Basic idea:
The system has an optimization engine that decides whether the preferred course of action in a given situation is to 1) contact a human or 2) try to repair the system
The decision is based on 1) explicit instructions and 2) learning
Balance match, bother, rush, risk
The system learns from human actions and becomes more competent in solving problems on its own
Balance initiative and interaction
Send messages via e-mail, instant messenger, etc.
Human (operator) is added to the traditional autonomic computing cycle
[Figure: Autonomic interaction manager. The autonomic cycle (Monitor, Analyze, Plan, Execute, Knowledge) is extended with "ask for help" and "receive advice" links to the human operator]