Autonomic System Design Visa Holopainen, [email protected]


Enabling autonomic behavior in systems software with hot swapping, J. Appavoo et al. 2003

Focus on object-oriented systems software

By hot swapping, new algorithms and monitoring code can be added to a running system without disruption

Hot swapping is accomplished either by interpositioning of code or by replacement of code

Interpositioning involves inserting a new component between two existing ones.

This enables more detailed monitoring when problems occur, while minimizing run-time costs when the system is performing acceptably

Replacement allows an active component to be switched with a different implementation of that component while the system is running
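Interpositioning can be illustrated with a minimal sketch. All class and function names here (`BlockCache`, `CountingInterposer`, `Client`, `interpose`, `remove_interposer`) are hypothetical and are not taken from K42; the sketch only assumes that the client reaches its component through a single reference that can be redirected at run time.

```python
class BlockCache:
    """Hypothetical component used by a client."""

    def lookup(self, key):
        return key * 2  # stand-in for real work


class CountingInterposer:
    """Monitoring component that sits between the client and its target."""

    def __init__(self, target):
        self.target = target
        self.calls = 0

    def lookup(self, key):
        self.calls += 1                 # monitoring code runs first
        return self.target.lookup(key)  # then the call is forwarded unchanged


class Client:
    """Reaches its component only through the `component` reference."""

    def __init__(self, component):
        self.component = component

    def read(self, key):
        return self.component.lookup(key)


def interpose(client):
    """Insert a monitor between the client and its current component."""
    client.component = CountingInterposer(client.component)
    return client.component


def remove_interposer(client):
    """Restore the direct path once detailed monitoring is no longer needed."""
    client.component = client.component.target
```

While the interposer is in place every call pays for the extra counting step; once it is removed the client is back on the direct path, which is the cost trade-off described above.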

Triggering hot swapping

In many cases an object is expected to trigger a replacement itself (autonomously). For example, if an object is designed to support small files and it registers an increase in file size, then the object can trigger a hot swap with an object that supports large files

In other cases, the system infrastructure is expected to determine the need for an object replacement through a hot swap. Monitoring is required for this purpose.
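The small-file/large-file case can be sketched as follows. This is only a conceptual illustration: the names `SmallFileStore`, `LargeFileStore` and `FileHandle`, and the 4096-byte threshold, are assumptions made for the example and do not describe how K42 implements hot swapping.

```python
class SmallFileStore:
    """Hypothetical implementation tuned for small files: one contiguous buffer."""

    LIMIT = 4096  # bytes; illustrative threshold, not taken from K42

    def __init__(self):
        self.buffer = bytearray()

    def write(self, chunk):
        self.buffer += chunk

    def size(self):
        return len(self.buffer)

    def wants_swap(self):
        # The object itself notices that it has outgrown its design point.
        return len(self.buffer) > self.LIMIT


class LargeFileStore:
    """Hypothetical implementation for large files: a list of chunks."""

    def __init__(self, initial=b""):
        self.chunks = [bytes(initial)] if initial else []

    def write(self, chunk):
        self.chunks.append(bytes(chunk))

    def size(self):
        return sum(len(c) for c in self.chunks)

    def wants_swap(self):
        return False


class FileHandle:
    """Indirection point: clients hold the handle, never the implementation."""

    def __init__(self):
        self.impl = SmallFileStore()

    def write(self, chunk):
        self.impl.write(chunk)
        if self.impl.wants_swap():
            # Transfer state to the new implementation, then switch over.
            self.impl = LargeFileStore(self.impl.buffer)

    def size(self):
        return self.impl.size()
```

Clients keep calling `FileHandle.write` before and after the swap; only the object behind the handle changes.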

Adaptive code vs. hot swapping

Among other features, hot swapping allows systems software to react to changes in environment

A more traditional approach to handling varying environments is to use adaptive code

In a system using adaptive code, all possible configurations must be built into the system beforehand

Adaptive code has many problematic features (presented below)

Illustration of adaptive code vs. hot swapping

An adaptive code implementation (A) vs a hot-swapping implementation (B) of the same function

The adaptive code approach is monolithic and includes monitoring code that collects the data needed by the adaptive algorithm to choose a particular code path

With hot swapping, each algorithm is implemented independently (resulting in reduced complexity per component), and is hot swapped in when needed

Benefits of hot swapping

Hot swapping can be beneficial in at least the following respects:

Optimizing for the (non) common case: dynamic replacement allows efficient implementations of common paths to be used when suitable, and less-efficient, less-common implementations to be switched in when necessary

Optimizing for a wide range of file attribute values: for example, although the vast majority of files accessed are small (< 4 KB), OSs must also support large files

Access patterns: researchers have shown up to 30 percent fewer cache misses by using the appropriate cache management policy

Multiprocessor optimizations: some applications perform better when distributed to many processors, while others perform better when run on a single processor

Enabling client-specific customization

Exporting system structure information

Always gathering the necessary profiling information increases overhead

Testing system

A research operating system (K42) has been developed to test the hot swapping approach

Runs on PowerPC and MIPS architectures (soon available for x86 also)

K42 scales well to multiprocessor systems

Performance advantages of hot swapping have been demonstrated in K42

K42 is available at http://www.research.ibm.com/K42

Adding Autonomic Functionality to object-oriented applications, M. Schanne, W. Tichy, T. Gelhausen, 2003

The goal is to separate autonomic functionality from applications (similar to hot swapping)

This is accomplished by creating a system based on class renaming and proxy/wrapper generation

A list of the proxy objects is kept in a registry

A proxy object always has a pointer to the latest version of the actual object and access to its member functions

This is accomplished with the Byte Code Engineering Library (BCEL)

Wrapper functions ensure synchronization of variables

The design ensures that users need not adapt their source code in any way or even restart the program

Supported environment: the Java 2 platform and similar
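The proxy-and-registry idea can be sketched conceptually as below. The paper's system works on Java bytecode through BCEL; this Python analogue only illustrates the indirection, and every name in it (`Proxy`, `Registry`, `create`, `upgrade`, `GreeterV1`, `GreeterV2`) is hypothetical.

```python
class Proxy:
    """Stands in for the real object and always forwards to its latest version."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only for attributes the proxy itself lacks: forward them.
        return getattr(self._target, name)


class Registry:
    """Keeps track of all proxies so that their targets can be replaced later."""

    def __init__(self):
        self.proxies = []

    def create(self, cls, *args):
        proxy = Proxy(cls(*args))
        self.proxies.append(proxy)
        return proxy

    def upgrade(self, old_cls, new_cls):
        """Point every proxy of old_cls at an instance of new_cls."""
        for proxy in self.proxies:
            if type(proxy._target) is old_cls:
                new_obj = new_cls.__new__(new_cls)               # skip __init__
                new_obj.__dict__.update(proxy._target.__dict__)  # carry state over
                proxy._target = new_obj


class GreeterV1:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "hello " + self.name


class GreeterV2:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name + "!"
```

Usage: code that holds `p = registry.create(GreeterV1, "Ann")` keeps calling `p.greet()`; after `registry.upgrade(GreeterV1, GreeterV2)` the same reference reaches the new behavior without a restart.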

Usable Autonomic Computing Systems: the Administrator’s Perspective, R. Barrett, P. Maglio, E. Kandogan, J. Bailey, 2004

Autonomic computing seeks to solve the problem of increasingly complex configurations through increased automation

However, the AC strategy of managing complexity through automation runs the risk of making management harder (commands become more powerful, so mistakes have larger effects)

This is why autonomic systems should:

Provide facilities that make rehearsing and planning easy

Be designed to allow administrators to quickly undo changes, making operations (whether on production systems or test systems) less risky and therefore easier

Inform the administrator if undoing a command will not be (easily) possible

Have enhanced capabilities for testing complex end-to-end systems so that administrators will be confident that their changes are not having unintended consequences

Provide access to arbitrary levels of configuration detail if need be

Autonomic systems should also:

Contain a command line interface (in addition to a GUI)

An Architectural Approach to Autonomic Computing, S. White, J. Hanson, I. Whalley, D. Chess, J. Kephart, 2004

An autonomic system can be decomposed into 1) interfaces, 2) interactions and 3) design patterns

The paper is written somewhat in RFC style, with MUST and SHOULD statements about Autonomic Elements (AEs)

MUST examples:

An AE MUST be self-managing

An AE MUST handle problems locally whenever possible

An AE MUST be capable of establishing and maintaining relationships with other autonomic elements

SHOULD examples:

An AE SHOULD ask for a realistic set of requirements when requesting a service from another element

An AE SHOULD offer a range of performance, reliability, availability and security associated with its service

An AE SHOULD protect itself against inappropriate service requests and responses

Use of policies

The use of policies is essential for autonomic systems

Three policy levels are presented:

1) Action policies (IF condition THEN action)

• An AE employing action policies MUST measure and/or synthesize the quantities stated in the condition

2) Goal policies (”Response time must not exceed 2 sec.”)

• AEs employing goal policies MUST possess sufficient modeling or planning capabilities to translate goals into actions

3) Utility function policies (automatically determine the most valuable goal in any situation)

• AEs employing utility function policies MUST have sophisticated modeling and optimization capabilities to translate utility functions into actions
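The difference between the three policy levels can be sketched with a toy server-sizing example. Everything in it is an assumption made for illustration: the response-time model `predicted_response_time`, the utility function, the cost of 3 per server and the limit of 20 servers do not come from the paper.

```python
def action_policy(response_time, servers):
    """Action policy: IF response time > 2 s THEN add one server."""
    return servers + 1 if response_time > 2.0 else servers


def predicted_response_time(servers, load):
    """Hypothetical model: response time grows with the load per server."""
    return load / (10.0 * servers)


def goal_policy(load, max_servers=20):
    """Goal policy: 'response time must not exceed 2 s'.

    The element needs a model to plan an action; here it picks the smallest
    server count whose predicted response time meets the goal.
    """
    for servers in range(1, max_servers + 1):
        if predicted_response_time(servers, load) <= 2.0:
            return servers
    return max_servers


def utility(servers, load):
    """Hypothetical utility: value of fast responses minus a cost per server."""
    return 100.0 / (1.0 + predicted_response_time(servers, load)) - 3.0 * servers


def utility_policy(load, max_servers=20):
    """Utility function policy: choose the configuration with the highest utility."""
    return max(range(1, max_servers + 1), key=lambda s: utility(s, load))
```

The action policy only needs a measurement of the condition, the goal policy additionally needs a model to turn the goal into an action, and the utility policy needs both a model and an optimization step, which mirrors the increasing MUST requirements listed above.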

Interfaces

Making a system autonomic requires additional interfaces to be added to the system

Monitoring and test interfaces: enable an element to be monitored by any other element that has established the appropriate administrative relationships with it

Lifecycle interfaces: enable administrative elements to determine the lifecycle state of an element (e.g. starting, paused), to cause a state change, and to determine the lifecycle model that applies to the element

Policy interfaces: enable administrative elements to send new policies to an element, and to determine the policies currently in use by the element

Negotiation and binding interfaces: permit an element to request a service from other elements, or to request to provide a service

Relationships

When an AE has agreed to provide service to another AE, then those two elements have a relationship

Relationships are typically formed at run-time

Autonomic systems are built by relationships

A request-response paradigm is used to form relationships

From autonomic elements to autonomic systems

Assembling an autonomic system requires:

1) A collection of AEs that implement the desired function

2) Additional autonomic elements to implement system functions that enable the needed system-level behaviors (= infrastructure elements)

3) Design patterns for system self-management

An infrastructure element can be:

Registry (provides mechanisms for elements to find one another)

Sentinel (provides monitoring services to other elements)

Aggregator (combines two or more existing elements and uses them to provide improved service)

Broker (facilitates interaction)

Negotiator (assists elements with complex negotiations)
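How a registry and a request-response exchange could lead to a run-time relationship is sketched below. The classes `Registry` and `Element`, their methods, and the capacity-based acceptance rule are hypothetical simplifications, not an interface defined by the paper.

```python
class Registry:
    """Infrastructure element: lets elements find providers of a service."""

    def __init__(self):
        self.providers = {}

    def register(self, service, element):
        self.providers.setdefault(service, []).append(element)

    def find(self, service):
        return list(self.providers.get(service, []))


class Element:
    """Simplified autonomic element that can request and provide services."""

    def __init__(self, name, capacity=0):
        self.name = name
        self.capacity = capacity   # how many clients it is willing to serve
        self.relationships = []    # (service, partner name) pairs

    def handle_request(self, requester, service):
        """Response side: accept only while capacity remains (self-protection)."""
        if self.capacity <= 0:
            return False
        self.capacity -= 1
        self.relationships.append((service, requester.name))
        return True

    def request_service(self, registry, service):
        """Request side: ask each known provider until one agrees."""
        for provider in registry.find(service):
            if provider.handle_request(self, service):
                self.relationships.append((service, provider.name))
                return provider
        return None
```

A provider that has run out of capacity declines, so the requester falls through to the next provider found in the registry, and the relationship is recorded on both sides once a request is accepted.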

Towards Requirements-Driven Autonomic Systems Design, A. Lapouchnian, S. Liaskos, J. Mylopoulos, Y. Yu, 2005

There are three basic ways to make a system autonomic:

1) Design the system to support a space of possible behaviors

2) Equip the system with planning and social capabilities so that it can delegate tasks to external software components (agents)

3) Build the system so that it has evolutionary capabilities (like biological systems)

The first approach was studied in the paper

Requirements engineering

Development of a framework for capturing and analyzing stakeholder intentions to generate functional and non-functional requirements

Illustration of requirements engineering: goal model

Top-level ”hard” goal: Schedule meeting

AND-composed of lower level hard goals

4 top-level ”softgoals”: Good quality schedule, Minimal effort, Minimal disturbances, Accurate constraints

Lower level softgoals can be related to higher levels by help (+), hurt (-), make (++) or break (--) relationships

6 alternative ways to fulfill the goal “Schedule Meeting”

An autonomic system should address all different ways of fulfilling the top-level goals
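Enumerating the alternative ways of fulfilling a top-level goal can be sketched with a small AND/OR tree. The decomposition used here is hypothetical and was chosen only so that it yields six alternatives; the paper's actual goal model may be structured differently. The tuple encoding `(kind, name, children)` and the function `alternatives` are likewise assumptions of this sketch.

```python
from itertools import product


def alternatives(goal):
    """Return every set of leaf tasks that satisfies the given goal."""
    kind, name, children = goal
    if kind == "leaf":
        return [[name]]
    child_alts = [alternatives(child) for child in children]
    if kind == "or":
        # Any single child suffices: collect the alternatives of all children.
        return [alt for alts in child_alts for alt in alts]
    # "and": pick one alternative for every child and combine them.
    return [sum(combo, []) for combo in product(*child_alts)]


# Hypothetical decomposition of the "Schedule meeting" goal.
model = ("and", "Schedule meeting", [
    ("or", "Collect timetables", [
        ("leaf", "collect by person", []),
        ("leaf", "collect from agents", []),
        ("leaf", "collect from users", []),
    ]),
    ("or", "Choose schedule", [
        ("leaf", "choose manually", []),
        ("leaf", "choose automatically", []),
    ]),
])
```

Each returned alternative is one behavior the system would need to support; softgoal contributions (help, hurt, make, break) could then be used to rank them, which this sketch leaves out.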

Goal model -> Feature model -> Component Connector model

Goal model is integrated into the knowledge of an autonomic element

Architectural Design of a Distributed Application with Autonomic Quality Requirements, D. Weyns, K. Schelfthout and T. Holvoet, 2005

A reference architecture for situated multi-agent systems (situated MAS) was developed

This reference architecture was applied to a real-world software system

The architecture:

A situated MAS consists of an environment populated with agents (autonomous entities)

Intelligence in a situated MAS originates from the interaction between agents, rather than from their individual capabilities

The architecture holds three abstractions: agents, ongoing activities and the environment

High-level model view of the architecture

The Perception module maps the local state of the environment onto a percept for the agent

The Consumption module handles the effects of environment changes that affect the agent

The Decision module is responsible for action selection

The application

A system in which robots transport loads from one place to another within a warehouse and recharge themselves whenever needed

Old system: a centralized server controlled the robots

Main problem: inflexibility; robots can’t adapt to changing situations

Improvement: robots are agents acting in a MAS

Drawback: a more complicated system

Module view of the application

Two kinds of agents: transport agents and AGV (automated guided vehicle) agents

Transport agents are ”managers”; they determine the priority of the transport, assign transports to AGVs and ensure that the transport succeeds

AGV agents are responsible for executing the assigned transport

Architecture of the environment

To cope with the complexity of the environment, it is presented through a layered architecture

The virtual environment uses a middleware layer that enables agents to communicate with each other

The virtual environment enables agent routing and prevents collisions

An agent observes a 3-5 meter circle of the virtual environment at a time

In this circle the agent marks the path it is going to use and removes this path when leaving the circle

This way collisions can be avoided

Transport agents use the virtual environment to locate AGV agents
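The path-marking idea can be sketched as a reservation table. The grid-cell representation, the class `VirtualEnvironment` and its methods are assumptions made for this example; the real system works on a warehouse layout and a distributed middleware layer that are not modeled here.

```python
class VirtualEnvironment:
    """Shared map in which agents mark the cells of the path they intend to use."""

    def __init__(self):
        self.reserved = {}  # cell -> id of the agent that marked it

    def mark_path(self, agent, path):
        """Mark all cells of a path, or nothing if another agent holds any cell."""
        if any(self.reserved.get(cell, agent) != agent for cell in path):
            return False
        for cell in path:
            self.reserved[cell] = agent
        return True

    def remove_path(self, agent):
        """Remove an agent's marks, e.g. when it leaves the observed circle."""
        self.reserved = {
            cell: owner for cell, owner in self.reserved.items() if owner != agent
        }
```

An agent whose `mark_path` call fails has to wait or pick another route, so two agents never hold the same cell at the same time.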

A Control Theory Foundation for Self-Managing Computing Systems, Y. Diao, J. Hellerstein, S. Parekh, R. Griffith, G. Kaiser, D. Phung, 2005

Control theory is used as a way to identify a number of requirements for and challenges in building self-managing systems

What does control theory bring to the table in terms of self-management?

Autonomic computing and control theory have slightly different points of focus: autonomic computing focuses on the specification and construction of management components that interoperate well, while the focus of control theory is on analyzing and/or developing components and algorithms so that the resulting system achieves the control objectives

For example, control theory provides design techniques for determining the values of parameters in commonly used control algorithms so that the resulting control system is stable and settles quickly in response to disturbances

Feedback Control Theory

Reference Input (I/P): Desired Output (O/P) (as specified by the human)

Control Error: (Reference I/P - Measured O/P)

Control Input: Parameters which affect behavior of the system

Disturbance I/P: affects Control I/P

Controller: Changes Control I/P to achieve Reference I/P

Measured O/P: Measurable feature of the system

Noise I/P: affects Measured O/P

Transducer: Transforms Measured O/P to compare with Reference I/P

Properties of Control Systems

SASO properties:

Stable: bounded input produces bounded output; unstable systems are not usable in mission-critical work

Accurate: measured output converges to the reference (desired) input

Short Settling Times: converges to the stable value quickly

No Overshoot: achieves objectives in a steady manner

Control Analysis and Design

Notes Server: input $u(k)$ = MaxUsers, output $y(k)$ = Actual RIS

Model of System Dynamics: $y(k+1) = a_1 y(k) + b_1 u(k)$, with $a_1 = 0.43$ and $b_1 = 0.47$

Transfer Function: $N(z) = \frac{b_1}{z - a_1} = \frac{0.47}{z - 0.43}$

The transfer function and the Z-transform are used to model and control response times and settling times
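The effect of controller design on stability can be explored with a short simulation. It assumes the first-order model y(k+1) = a y(k) + b u(k) with a = 0.43 and b = 0.47 (an assumption about how the slide's values map to the model) and an integral controller u(k) = u(k-1) + K e(k); the controller form, the gains and the function name `simulate` are choices made for this sketch, not the paper's design.

```python
def simulate(k_i, reference=1.0, steps=60, a=0.43, b=0.47):
    """Simulate y(k+1) = a*y(k) + b*u(k) under integral control.

    k_i is the integral gain; returns the list of measured outputs.
    """
    y, u = 0.0, 0.0
    outputs = []
    for _ in range(steps):
        error = reference - y   # control error: reference minus measured output
        u = u + k_i * error     # integral control law accumulates the error
        y = a * y + b * u       # system dynamics produce the next output
        outputs.append(y)
    return outputs


def is_settled(outputs, reference=1.0, tolerance=1e-6):
    """True if the last output is within the tolerance of the reference."""
    return abs(outputs[-1] - reference) < tolerance
```

For this closed loop the characteristic polynomial is z^2 - (1 + a - b K) z + a, so with the values above the loop is stable for roughly 0 < K < 6.1. A moderate gain such as `simulate(0.5)` settles at the reference, while an aggressive gain such as `simulate(8.0)` produces oscillations of growing amplitude, the kind of overreaction that a poorly chosen controller gain can cause.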

Example: control theory approach to web server management

Objective: CPU Utilization < 50%

Measured Output: CPU utilization

Control Input: “MaxClients”

During the first 300 s, the system operates without feedback control. When the controller is turned on, a reference input of 0.5 is used. At this point, the system begins to oscillate and the amplitude of the oscillations increases. This is a result of a controller design that overreacts to the stochastics in the CPU utilization measurement.

<username>, I Need You! Initiative and Interaction in Autonomic Systems, P. Kaminski, P. Agrawal, H. Kienle, H. Müller, 2005

Autonomic job requirements

If I hired a person instead, what qualities would I look for?

Attention to detail, strong communication skills, initiative, tempered by job boundaries, self-knowledge and willingness to seek help

Treat users as partners, not masters

Basic idea:

The system has an optimization engine that decides whether the preferred mode of action in some situation is to 1) contact a human or 2) try to repair the system

The decision is based on 1) explicit instructions and 2) learning

Balance match, bother, rush, risk

The system learns from human actions and becomes more competent in solving problems on its own

Balance initiative and interaction

Send messages via e-mail, instant messenger, etc.

Human (operator) is added to the traditional autonomic computing cycle

[Figure: autonomic interaction manager. The Monitor, Analyze, Plan, Execute cycle around shared Knowledge, extended with a human operator: the system can ask for help and receive advice.]
