31
Autonomic Systems ... Gaurav S. Kc ... Septem Autonomic Systems ... Gaurav S. Kc ... Septem ber 26th, 2002 ber 26th, 2002 1 Autonomic Autonomic Systems Systems Autonomic Autonomic : adaptive : adaptive Self-healing Self-healing : cluster systems via node restart cluster systems via node restart Self-optimizing: Self-optimizing: variable encoding schemes for web audio variable encoding schemes for web audio streaming services streaming services Self-regulating Self-regulating : : apache web server periodically kills child apache web server periodically kills child processes processes Maintenance Maintenance: expensive expensive , time-consuming , time-consuming I want my availability, but I won’t do it myself I want my availability, but I won’t do it myself Automated maintenance Automated maintenance : : Cheaper Cheaper Quicker response than human Quicker response than human 24/7 watch, can afford to “forget and leave 24/7 watch, can afford to “forget and leave running” running”

Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Embed Size (px)

Citation preview

Page 1: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 11

Autonomic SystemsAutonomic Systems AutonomicAutonomic: adaptive : adaptive

Self-healingSelf-healing:: cluster systems via node restartcluster systems via node restart

Self-optimizing:Self-optimizing: variable encoding schemes for web audio streaming variable encoding schemes for web audio streaming

servicesservices Self-regulatingSelf-regulating : :

apache web server periodically kills child processesapache web server periodically kills child processes

MaintenanceMaintenance:: expensiveexpensive, time-consuming , time-consuming

I want my availability, but I won’t do it myselfI want my availability, but I won’t do it myself

Automated maintenanceAutomated maintenance:: CheaperCheaper Quicker response than humanQuicker response than human 24/7 watch, can afford to “forget and leave running”24/7 watch, can afford to “forget and leave running”

Page 2: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 22

Items for discussionItems for discussion Can large-scale, distributed applications be self-Can large-scale, distributed applications be self-

healing, self-regulating, self-optimizing?healing, self-regulating, self-optimizing?

Important issues with respect to automated Important issues with respect to automated maintenance of large-scale, software systemsmaintenance of large-scale, software systems Harder to build. Focus on reusable componentsHarder to build. Focus on reusable components Specify maintenance operations during developmentSpecify maintenance operations during development Considering maintenance as runtime adaptationsConsidering maintenance as runtime adaptations Gracefully handle unfamiliar, exceptional conditionsGracefully handle unfamiliar, exceptional conditions

Proposal: design methodologyProposal: design methodology Separation of concerns: Separation of concerns:

Application code vs. Application code vs. adaptation mechanisms {decision logic, implementation}adaptation mechanisms {decision logic, implementation}

Introspection:Introspection: Communicate runtime data to decision logicCommunicate runtime data to decision logic

Intercession:Intercession: Transport reconfiguration code from decision logicTransport reconfiguration code from decision logic

Page 3: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 33

Build large-scale systems Build large-scale systems with reusable componentswith reusable components Inherent problem with the development of large-Inherent problem with the development of large-

scale systemsscale systems Hugely complex, unwise for one group of developers Hugely complex, unwise for one group of developers

to create the whole thing from scratchto create the whole thing from scratch Outsource sub-projects to experts vs. license their Outsource sub-projects to experts vs. license their

technologytechnology Integrate with COTS components:Integrate with COTS components:

Cheaper than to re-implement themCheaper than to re-implement them

Software engineering and practicality reasonsSoftware engineering and practicality reasons component has already been implementedcomponent has already been implemented available immediatelyavailable immediately no duplication of effortno duplication of effort 3 types of software components:3 types of software components:

COTSCOTS In-houseIn-house One-use, specific-purpose componentOne-use, specific-purpose component

Page 4: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 44

Component-based Component-based Software EngineeringSoftware Engineering Software componentSoftware component: :

unit of software that conforms to a component modelunit of software that conforms to a component model e.g. COM+, JavaBeanse.g. COM+, JavaBeans

Defines standards:Defines standards: Composition: how components are composed togetherComposition: how components are composed together Interaction: IDL description of interface elementsInteraction: IDL description of interface elements

Two stages of CBSETwo stages of CBSE1.1. Component developmentComponent development

No feedback from customerNo feedback from customer No waterfall model with iterationsNo waterfall model with iterations Exhibit openness, adaptability, Exhibit openness, adaptability,

2.2. Integrating component into applicationsIntegrating component into applications Requirements analysisRequirements analysis Choose component with required functionalityChoose component with required functionality

Take it or leave it ... Take it or leave it ... but then go on looking for another implementationbut then go on looking for another implementation

Page 5: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 55

Component-based Component-based Software Engineering – iiSoftware Engineering – ii Imperfect match in functionality and requirementsImperfect match in functionality and requirements

““Fixed” contractFixed” contract No means for component evolutionNo means for component evolution

Active Interfaces [12]Active Interfaces [12] Adaptation interface. Open policiesAdaptation interface. Open policies Static adaptation of component functionalityStatic adaptation of component functionality

Interface IncompatibilitiesInterface Incompatibilities Granularity of operations and data-types, interaction Granularity of operations and data-types, interaction

mechanisms, implementation languagesmechanisms, implementation languages Component wrappersComponent wrappers Connectors [14]Connectors [14] SWIG, JNI, SWIG, JNI, popen(..)popen(..), , system(..)system(..)

ConsiderationsConsiderations Application builder is Application builder is notnot going to re-implement the going to re-implement the

componentcomponent Want to maintain encapsulation, information hidingWant to maintain encapsulation, information hiding

Page 6: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 66

Items for discussionItems for discussion Can large-scale, distributed applications be self-Can large-scale, distributed applications be self-

healing, self-regulating, self-optimizing?healing, self-regulating, self-optimizing?

Important issues with respect to automated Important issues with respect to automated maintenance of large-scale, software systemsmaintenance of large-scale, software systems Harder to build. Focus on reusable componentsHarder to build. Focus on reusable components Specify maintenance operations during developmentSpecify maintenance operations during development Considering maintenance as runtime adaptationsConsidering maintenance as runtime adaptations Gracefully handle unfamiliar, exceptional conditionsGracefully handle unfamiliar, exceptional conditions

Proposal: design methodologyProposal: design methodology Separation of concerns: Separation of concerns:

Application code vs. Application code vs. adaptation mechanisms {decision logic, implementation}adaptation mechanisms {decision logic, implementation}

Introspection:Introspection: Communicate runtime data to decision logicCommunicate runtime data to decision logic

Intercession:Intercession: Transport reconfiguration code from decision logicTransport reconfiguration code from decision logic

Page 7: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 77

Static modeling of possible Static modeling of possible runtime reconfigurationsruntime reconfigurations Runtime adaptation of softwareRuntime adaptation of software

Ever-changing resource availabilityEver-changing resource availability Dynamic execution environmentDynamic execution environment

Separation of concernsSeparation of concerns:: application logic vs. adaptationapplication logic vs. adaptation

Granularity of adaptationGranularity of adaptation Micro-level: Micro-level:

component developer-enabled mechanism, setting component developer-enabled mechanism, setting switches via Active Interfaces [12, 13, 16]switches via Active Interfaces [12, 13, 16]

Medium-level: Medium-level: change how components interact with the system, change how components interact with the system,

modify the interface [13, 14]modify the interface [13, 14] Macro-level:Macro-level:

phase in/out (groups of) components as part of the phase in/out (groups of) components as part of the dynamic adaptation [13, 14]dynamic adaptation [13, 14]

Page 8: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 88

Static modeling of possible Static modeling of possible runtime reconfigurations – iiruntime reconfigurations – ii Self-contained adaptation within componentSelf-contained adaptation within component

Automatic generation of adaptation codeAutomatic generation of adaptation code Compiler and language support for high-level Compiler and language support for high-level

specification of adaptation mechanism [13]specification of adaptation mechanism [13] Pre-packaged adaptation mechanism [16]Pre-packaged adaptation mechanism [16]

Automatic integration of new component Automatic integration of new component versionsversions Configuration management [15]Configuration management [15]

Installations, updates, un-installationsInstallations, updates, un-installations Tentative use of new versions [14]Tentative use of new versions [14]

Transparent testing in deployed environmentTransparent testing in deployed environment

Page 9: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 99

Items for discussionItems for discussion Can large-scale, distributed applications be self-Can large-scale, distributed applications be self-

healing, self-regulating, self-optimizing?healing, self-regulating, self-optimizing?

Important issues with respect to automated Important issues with respect to automated maintenance of large-scale, software systemsmaintenance of large-scale, software systems Harder to build. Focus on reusable componentsHarder to build. Focus on reusable components Specify maintenance operations during developmentSpecify maintenance operations during development Considering maintenance as runtime adaptationsConsidering maintenance as runtime adaptations Gracefully handle unfamiliar, exceptional conditionsGracefully handle unfamiliar, exceptional conditions

Proposal: design methodologyProposal: design methodology Separation of concerns: Separation of concerns:

Application code vs. Application code vs. adaptation mechanisms {decision logic, implementation}adaptation mechanisms {decision logic, implementation}

Introspection:Introspection: Communicate runtime data to decision logicCommunicate runtime data to decision logic

Intercession:Intercession: Transport reconfiguration code from decision logicTransport reconfiguration code from decision logic

Page 10: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1010

Writing code toWriting code toimplement dynamic adaptationsimplement dynamic adaptations Hard to dynamically adapt componentsHard to dynamically adapt components

Lack proper understanding of the internalsLack proper understanding of the internals Execute (un) trusted, unfamiliar code, with no idea Execute (un) trusted, unfamiliar code, with no idea

how to fix if things failhow to fix if things fail

RecognizeRecognize the need to adapt the need to adapt

Utilize the available runtime mechanismsUtilize the available runtime mechanisms Pre-existing reconfiguration mechanismsPre-existing reconfiguration mechanisms

Dispatch directives to carry out local micro-adaptationsDispatch directives to carry out local micro-adaptations

Use adaptability of middleware to effectively carry out Use adaptability of middleware to effectively carry out medium- and macro-scale adaptationsmedium- and macro-scale adaptations

Architectural design-driven adapted, guided by Architectural design-driven adapted, guided by component-interaction specificationscomponent-interaction specifications

The inability to reconfigure when required, is a form of failureThe inability to reconfigure when required, is a form of failure

Page 11: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1111

Items for discussionItems for discussion Can large-scale, distributed applications be self-Can large-scale, distributed applications be self-

healing, self-regulating, self-optimizing?healing, self-regulating, self-optimizing?

Important issues with respect to automated Important issues with respect to automated maintenance of large-scale, software systemsmaintenance of large-scale, software systems Harder to build. Focus on reusable componentsHarder to build. Focus on reusable components Specify maintenance operations during developmentSpecify maintenance operations during development Considering maintenance as runtime adaptationsConsidering maintenance as runtime adaptations Gracefully handle unfamiliar, exceptional conditionsGracefully handle unfamiliar, exceptional conditions

Proposal: design methodologyProposal: design methodology Separation of concerns: Separation of concerns:

Application code vs. Application code vs. adaptation mechanisms {decision logic, implementation}adaptation mechanisms {decision logic, implementation}

Introspection:Introspection: Communicate runtime data to decision logicCommunicate runtime data to decision logic

Intercession:Intercession: Transport reconfiguration code from decision logicTransport reconfiguration code from decision logic

Page 12: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1212

Self-healing systemsSelf-healing systems Failure is inevitableFailure is inevitable: [20]: [20]

human error: human error: stress level proportional to probability of making a stress level proportional to probability of making a

mistake [22]mistake [22] can shield from user error, systems lack protection from can shield from user error, systems lack protection from

administrator's errors [22]administrator's errors [22]

unanticipated problem:unanticipated problem: beyond careful and thorough testingbeyond careful and thorough testing directed security attackdirected security attack lack of handling mechanismlack of handling mechanism

software aging: transient bugssoftware aging: transient bugs recovery requires a restartrecovery requires a restart build-up of transient bugsbuild-up of transient bugs failure-prone state during executionfailure-prone state during execution

Page 13: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1313

Self-healing systems – iiSelf-healing systems – ii Availability of system Availability of system

Highly resilientHighly resilient Programmed to handle every expected problemProgrammed to handle every expected problem Self-heals: manages to survive Self-heals: manages to survive unexpectedunexpected situations situations

Availability ratio: MTTF / (MTTF+MTTR)Availability ratio: MTTF / (MTTF+MTTR) increaseincrease base longevity period (BLP) base longevity period (BLP) decreasedecrease recovery time recovery time

Problem-handling mechanismProblem-handling mechanism:: reactive, failure-driven:reactive, failure-driven:

detect occurred failure, follow with restart of affected detect occurred failure, follow with restart of affected subsystems from a stable statesubsystems from a stable state

preventive/proactive, failure-avoidance: preventive/proactive, failure-avoidance: detect increased likelihood of failure, and gradual detect increased likelihood of failure, and gradual

degradation of performance, avert imminent failuredegradation of performance, avert imminent failure

Page 14: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1414

Technique:Technique:Software Rejuvenation [18, 19]Software Rejuvenation [18, 19] Graceful termination, Immediate restartGraceful termination, Immediate restart

Restart at a clean, internal stateRestart at a clean, internal state Build-up of transient bugsBuild-up of transient bugs Numerical accumulation errors, unreleased system Numerical accumulation errors, unreleased system

resources, memory leak, data corruptionresources, memory leak, data corruption

Levels of rejuvenationLevels of rejuvenation Total rejuvenation Total rejuvenation

Scheduled downtime can be fairly cheapScheduled downtime can be fairly cheap Minimal interruption during low usage periodsMinimal interruption during low usage periods

Partial rejuvenationPartial rejuvenation Transparently rejuvenate selected subcomponentsTransparently rejuvenate selected subcomponents Decoupling between subcomponentsDecoupling between subcomponents Reduced recovery time only for subsystem restartReduced recovery time only for subsystem restart

Recursive rejuvenation [21]Recursive rejuvenation [21] Rejuvenate progressively larger subsystems recursivelyRejuvenate progressively larger subsystems recursively Functional or data dependencies between Functional or data dependencies between

subcomponentssubcomponents

Page 15: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1515

Program check-pointingProgram check-pointing Periodically save program state to persistent storagePeriodically save program state to persistent storage Can rewind to previous statesCan rewind to previous states

auditing, logsauditing, logs recovery to a valid staterecovery to a valid state install corrective patch, resume [22]install corrective patch, resume [22]

The power of hindsight to enable retroactive repairThe power of hindsight to enable retroactive repair Demonstrates “Demonstrates “what ifwhat if” semantics” semantics Database systems: Database systems:

rollback to consistent state if cannot commit safelyrollback to consistent state if cannot commit safely

Zero-tolerance of system compromiseZero-tolerance of system compromise Pre-emptive defense against security attacksPre-emptive defense against security attacks

Randomized, but valid binary code sequenceRandomized, but valid binary code sequence Sanity checking of control structuresSanity checking of control structures Choose immediate shutdown rather than have system get Choose immediate shutdown rather than have system get

compromisedcompromised Immediate restart, with new randomized codeImmediate restart, with new randomized code

Other self-healing techniquesOther self-healing techniques

Page 16: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1616

Items for discussionItems for discussion Can large-scale, distributed applications be self-Can large-scale, distributed applications be self-

healing, self-regulating, self-optimizing?healing, self-regulating, self-optimizing?

Important issues with respect to automated Important issues with respect to automated maintenance of large-scale, software systemsmaintenance of large-scale, software systems Harder to build. Focus on reusable componentsHarder to build. Focus on reusable components Specify maintenance operations during developmentSpecify maintenance operations during development Considering maintenance as runtime adaptationsConsidering maintenance as runtime adaptations Gracefully handle unfamiliar, exceptional conditionsGracefully handle unfamiliar, exceptional conditions

Proposal: design methodologyProposal: design methodology Separation of concerns: Separation of concerns:

Application code vs. Application code vs. adaptation mechanisms {decision logic, implementation}adaptation mechanisms {decision logic, implementation}

Introspection:Introspection: Communicate runtime data to decision logicCommunicate runtime data to decision logic

Intercession:Intercession: Transport reconfiguration code from decision logicTransport reconfiguration code from decision logic

Page 17: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1717

Dynamic profiling, Dynamic profiling, generation of runtime datageneration of runtime data Adaptation subsystemAdaptation subsystem::

Monitoring logic and decision-makingMonitoring logic and decision-making Execution of adaptation mechanismExecution of adaptation mechanism

Automated decision and implementationAutomated decision and implementation Adaptation for recovery or otherwise, without human Adaptation for recovery or otherwise, without human

interventionintervention

Runtime model of the system architectureRuntime model of the system architecture Decision based on evolving modelDecision based on evolving model Runtime data generated by each componentRuntime data generated by each component

Embedded probes: PSLEmbedded probes: PSL Static-adaptable Active Interfaces [12]Static-adaptable Active Interfaces [12]

Context-dependent data format and contentContext-dependent data format and content E-mail management system: size, frequency, E-mail management system: size, frequency,

sender/recipient addresses, types of attachments, sender/recipient addresses, types of attachments, encryption strengthencryption strength

Page 18: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1818

Communication of Communication of runtime data to decision logicruntime data to decision logic Extended RPC-style communicationExtended RPC-style communication

Client communicates with server at unknown locationClient communicates with server at unknown location RPC clients (execution logic) should be unaware of RPC clients (execution logic) should be unaware of

the presence of RPC servers (decision logic)the presence of RPC servers (decision logic) Need to multiplex emitted dataNeed to multiplex emitted data Asynchronous callbackAsynchronous callback

I can't wait, let me know when you're done!I can't wait, let me know when you're done! Basic Basic Message Passing Message Passing to unknown recipientsto unknown recipients

Event notification systemEvent notification systemSubscribe to published events-of-interestSubscribe to published events-of-interest

Item of interestItem of interest Something that happened somewhere, runtime dataSomething that happened somewhere, runtime data

Generators of items of interestGenerators of items of interest Core system execution, reporting runtime dataCore system execution, reporting runtime data

Consumers of items of interestConsumers of items of interest Monitoring subsystem, interested in runtime dataMonitoring subsystem, interested in runtime data

Page 19: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 1919

Event systemsEvent systems Centralized event systemsCentralized event systems

event-driven GUI programmingevent-driven GUI programming Event Delegation Model: AWT, SWING, JavaBeansEvent Delegation Model: AWT, SWING, JavaBeans

Tightly-coupled client-server model: JINITightly-coupled client-server model: JINI Indirection, anonymity of servers via mediator objectIndirection, anonymity of servers via mediator object

Stable execution environmentStable execution environment Well-ordered delivery mechanismsWell-ordered delivery mechanisms Fast, reliable, predictableFast, reliable, predictable

Distributed event systemsDistributed event systems Supercharged mediator between decoupled entitiesSupercharged mediator between decoupled entities

FilteringFiltering AggregatingAggregating Store-and-forward, Store-and-retrieveStore-and-forward, Store-and-retrieve Mutual anonymityMutual anonymity

Unreliable execution environmentUnreliable execution environment Delayed deliveryDelayed delivery Data lossData loss

Page 20: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2020

Channel-based routingChannel-based routing:: Single channel per event type [9]Single channel per event type [9]

birds of a feather flock togetherbirds of a feather flock together faster turnaround time; simple, efficient deliveryfaster turnaround time; simple, efficient delivery not scalable to large classes of eventsnot scalable to large classes of events

Subject-based routingSubject-based routing: : NNTP: events on a common theme / interestNNTP: events on a common theme / interest Mailing lists, CVS notificationsMailing lists, CVS notifications

Content-based (semantic) routingContent-based (semantic) routing:: Interested in a subset of a class of eventsInterested in a subset of a class of events selective delivery via specifying acceptability criteriaselective delivery via specifying acceptability criteria Event-data determines propagationEvent-data determines propagation Data replication only if necessary [10, 11]Data replication only if necessary [10, 11] Event composition [8]Event composition [8]

Distributed event systemsDistributed event systems

Page 21: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2121

Centralized routing nodeCentralized routing node Approximation of localized event systemApproximation of localized event system

Hierarchical collection of nodesHierarchical collection of nodes Subscriptions only go up, notifications cascade downSubscriptions only go up, notifications cascade down DisadvantagesDisadvantages

Overloading of higher-level routing nodesOverloading of higher-level routing nodes Network partitioning via single node failureNetwork partitioning via single node failure

AdvantagesAdvantages Simple routing algorithmsSimple routing algorithms Simple client-server relationships amongst routing Simple client-server relationships amongst routing

nodesnodes

(A)cyclic peer-to-peer network(A)cyclic peer-to-peer network Sophisticated routing algorithmsSophisticated routing algorithms Improved fault-toleranceImproved fault-tolerance

Content-basedContent-basedevent routing topologiesevent routing topologies

Page 22: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2222

Items for discussionItems for discussion Can large-scale, distributed applications be self-Can large-scale, distributed applications be self-

healing, self-regulating, self-optimizing?healing, self-regulating, self-optimizing?

Important issues with respect to automated Important issues with respect to automated maintenance of large-scale, software systemsmaintenance of large-scale, software systems Harder to build. Focus on reusable componentsHarder to build. Focus on reusable components Specify maintenance operations during developmentSpecify maintenance operations during development Considering maintenance as runtime adaptationsConsidering maintenance as runtime adaptations Gracefully handle unfamiliar, exceptional conditionsGracefully handle unfamiliar, exceptional conditions

Proposal: design methodologyProposal: design methodology Separation of concerns: Separation of concerns:

Application code vs. Application code vs. adaptation mechanisms {decision logic, implementation}adaptation mechanisms {decision logic, implementation}

Introspection:Introspection: Communicate runtime data to decision logicCommunicate runtime data to decision logic

Intercession:Intercession: Transport reconfiguration code from decision logicTransport reconfiguration code from decision logic

Page 23: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2323

Activation of reconfiguration codeActivation of reconfiguration code Re-use eventsRe-use events

the source (client/decision logic) determines who gets the source (client/decision logic) determines who gets reconfigured, so cannot have the server (execution reconfigured, so cannot have the server (execution logic) subscribe to theselogic) subscribe to these

event systems not designed to carry large amount of event systems not designed to carry large amount of binary code, if needed for component installation, etcbinary code, if needed for component installation, etc

Mobile agents [5]Mobile agents [5]autonomous program that executes on someone’s behalfautonomous program that executes on someone’s behalf

decision logic instructs agents to carry out runtime decision logic instructs agents to carry out runtime reconfiguration tasksreconfiguration tasks

Late-binding of reconfiguration mechanism at targetLate-binding of reconfiguration mechanism at target AsynchronousAsynchronous primary advantage of agents: reconfiguration might primary advantage of agents: reconfiguration might

consist of significant amount of computing, ideally consist of significant amount of computing, ideally performed locally at execution logic rather than a long performed locally at execution logic rather than a long series of RPC invocationsseries of RPC invocations

Page 24: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2424

Mobile code infrastructuresMobile code infrastructures ConstituentsConstituents

Server: hosting, execution, transportationServer: hosting, execution, transportation Place [6]Place [6] Agent Server [1, 3, 7]Agent Server [1, 3, 7] Worklet Virtual Machine: PSLWorklet Virtual Machine: PSL

AgentsAgents

Incorporate dynamic interfacesIncorporate dynamic interfaces Agent installs specific-purpose interfaces to Agent installs specific-purpose interfaces to

components for customized accesscomponents for customized access ““Wrapper while you wait”, but can configure as Wrapper while you wait”, but can configure as

neededneeded

Page 25: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2525

Automatic mobility of programsAutomatic mobility of programs Strong mobilityStrong mobility

OS support for process relocation [5]OS support for process relocation [5]

Weak mobilityWeak mobility State- and code-transfer at application levelState- and code-transfer at application level Programming-language, runtime support [6]Programming-language, runtime support [6]

Special-purpose language [6]Special-purpose language [6] Scripting languages [6]Scripting languages [6]

Agent code is in textual formAgent code is in textual form General purpose language [23]General purpose language [23]

Late-binding of class definitions by dynamic code loadingLate-binding of class definitions by dynamic code loading Serialization of objectsSerialization of objects

Simulated strong mobilitySimulated strong mobility Local function continuations [2]Local function continuations [2] Modified JVM [4]Modified JVM [4]

Page 26: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2626

Security issues: mobile codeSecurity issues: mobile code A greater vulnerability: unknown codeA greater vulnerability: unknown code

Protect agent from server, and vice versa [1, 3, 7]Protect agent from server, and vice versa [1, 3, 7]

Language supportLanguage support Bytecode verification in JVM Bytecode verification in JVM

Type-system protection from malicious classesType-system protection from malicious classes Integrity-checking of bytecode instructionsIntegrity-checking of bytecode instructions

Cannot define / load core system classesCannot define / load core system classes

Application-level security considerationsApplication-level security considerations:: Authentication, authorizationAuthentication, authorization

Permissions model based on certification, credentialsPermissions model based on certification, credentials Data encryption during transitData encryption during transit Tampering detection via digital signaturesTampering detection via digital signatures

Page 27: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2727

Conclusions, future directionsConclusions, future directions Autonomic large-scale, distributed systemsAutonomic large-scale, distributed systems

Criteria for construction and automated maintenanceCriteria for construction and automated maintenance State of the art researchState of the art research

Autonomic systems exist for specific domainsAutonomic systems exist for specific domains Technologies / tools available for building general Technologies / tools available for building general

framework for adaptationframework for adaptation

Dynamic architectural modelingDynamic architectural modeling Accurate modeling of the system during executionAccurate modeling of the system during execution Decision made on evolving modelDecision made on evolving model Adaptation heuristics based on:Adaptation heuristics based on:

Historical patternsHistorical patterns Temporal dataTemporal data

Page 28: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2828

Bibliography – Mobile agentsBibliography – Mobile agents1.1. Design of the Ajanta System for Mobile Agent ProgrammingDesign of the Ajanta System for Mobile Agent Programming

Anand R. Tripathi, Neeran M. Karnik, Tanvir Ahmed, Ram D. Singh, Arvind Prakash, Vineet Kakani, Anand R. Tripathi, Neeran M. Karnik, Tanvir Ahmed, Ram D. Singh, Arvind Prakash, Vineet Kakani, Manish K. Vora, Mukta PathakManish K. Vora, Mukta PathakJournal of Systems and Software, May 2002Journal of Systems and Software, May 2002

2.2. How to Migrate AgentsHow to Migrate AgentsMatthew Hohlfeld, Bennet YeeMatthew Hohlfeld, Bennet YeeTechnical Report CS98-588, Computer Science and Engineering Department, University of Technical Report CS98-588, Computer Science and Engineering Department, University of California at San Diego, La Jolla, CA, June 1998California at San Diego, La Jolla, CA, June 1998

3.3. Experiences and Future Challenges in Mobile Agent ProgrammingExperiences and Future Challenges in Mobile Agent ProgrammingAnand R. Tripathi, Tanvir Ahmed, Neeran M. KarnikAnand R. Tripathi, Tanvir Ahmed, Neeran M. KarnikMicroprocessor and Microsystems 2001Microprocessor and Microsystems 2001

4.4. Pickling threads state in the Java systemPickling threads state in the Java systemS. Bouchenak, D. HagimontS. Bouchenak, D. HagimontIn Proc. of the Technology of Object-Oriented Languages and Systems (TOOLS), 2000In Proc. of the Technology of Object-Oriented Languages and Systems (TOOLS), 2000

5.5. Mobile Agents: Are they a good idea?Mobile Agents: Are they a good idea?Colin G. Harrison, David M. Chess, Aaron KershenbaumColin G. Harrison, David M. Chess, Aaron KershenbaumIBM Research Report, T.J.Watson Research Center, NY, 1995IBM Research Report, T.J.Watson Research Center, NY, 1995

6.6. Programming languages for mobile codeProgramming languages for mobile codeTommy ThornTommy ThornACM Computing Surveys, 29(3):213-239, 1997. Also Technical Report 1083, University of Rennes ACM Computing Surveys, 29(3):213-239, 1997. Also Technical Report 1083, University of Rennes IRISAIRISA

7.7. Design Issues in Mobile Agent Programming SystemsDesign Issues in Mobile Agent Programming SystemsNeeran M. Karnik, Anand R. TripathiNeeran M. Karnik, Anand R. TripathiIEEE Concurrency, July-Sep 1998IEEE Concurrency, July-Sep 1998

Page 29: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 2929

Bibliography – Event systemsBibliography – Event systems8.8. Generic Support for Distributed ApplicationsGeneric Support for Distributed Applications

Jean Bacon, Ken Moody, John Bates, Richard Hayton, Chaoying Ma, Andrew McNeil, Oliver Seidel, Jean Bacon, Ken Moody, John Bates, Richard Hayton, Chaoying Ma, Andrew McNeil, Oliver Seidel, Mark SpiteriMark SpiteriIEEE Computer, pages 68-77, March 2000IEEE Computer, pages 68-77, March 2000

9.9. Host Groups: A Multicast Extension to the Internet ProtocolHost Groups: A Multicast Extension to the Internet ProtocolS. E. Deering, D. R. CheritonS. E. Deering, D. R. CheritonNetwork Working Group: RFC 0966Network Working Group: RFC 0966

10.10. State of the Art Review of Distributed Event ModelsState of the Art Review of Distributed Event ModelsRené MeierRené MeierDept. of Computer Science, Trinity College Dublin, Ireland, March 2000. Technical report TCD-CS-Dept. of Computer Science, Trinity College Dublin, Ireland, March 2000. Technical report TCD-CS-2000-162000-16

11.11. Achieving Expressiveness and Scalability in an Internet-Scale Event Notification ServiceAchieving Expressiveness and Scalability in an Internet-Scale Event Notification ServiceAntonio Carzaniga, David S. Rosenblum, Alexander L. WolfAntonio Carzaniga, David S. Rosenblum, Alexander L. WolfIn Proceedings of the Nineteenth ACM Symposium on Principles of Distributed Computing (PODC In Proceedings of the Nineteenth ACM Symposium on Principles of Distributed Computing (PODC 2000)2000)

Page 30: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 3030

Bibliography – System adaptationBibliography – System adaptation12.12. A Model for Designing Adaptable Software ComponentsA Model for Designing Adaptable Software Components

George HeinemanGeorge HeinemanIn 22nd Annual International Computer Software and Applications Conference, pages 121--127, In 22nd Annual International Computer Software and Applications Conference, pages 121--127, Vienna, Austria, August 1998. In 22nd Annual International Computer Software and Applications Vienna, Austria, August 1998. In 22nd Annual International Computer Software and Applications Conference, pages 121--127, Vienna, Austria, August 1998Conference, pages 121--127, Vienna, Austria, August 1998

13.13. Language and Compiler Support for Adaptive Distributed ApplicationsLanguage and Compiler Support for Adaptive Distributed ApplicationsVikram Adve, Vinh Vi Lam, Brian EnsinkVikram Adve, Vinh Vi Lam, Brian EnsinkACM SIGPLAN Workshop on Optimization of Middleware and Distributed Systems (OM 2001) ACM SIGPLAN Workshop on Optimization of Middleware and Distributed Systems (OM 2001) Snowbird, Utah, June 2001 (in conjunction with PLDI2001)Snowbird, Utah, June 2001 (in conjunction with PLDI2001)

14.14. Increasing the Confidence in Off-the-Shelf Components: A Software Connector-Based Increasing the Confidence in Off-the-Shelf Components: A Software Connector-Based ApproachApproachMarija Rakic, Nenad MedvidovicMarija Rakic, Nenad MedvidovicProceedings of SSR '01 on 2001 Symposium on Software Reusability : Putting Software Reuse in Proceedings of SSR '01 on 2001 Symposium on Software Reusability : Putting Software Reuse in ContextContext

15.15. A Cooperative Approach to Support Software Deployment Using the Software DockA Cooperative Approach to Support Software Deployment Using the Software DockRichard S. Hall, Dennis Heimbigner, Alexander L. WolfRichard S. Hall, Dennis Heimbigner, Alexander L. WolfInternational Conference on Software Enginering, May 1999International Conference on Software Enginering, May 1999

16.16. The Illinois GRACE Project: Global Resource Adaptation through CoopErationThe Illinois GRACE Project: Global Resource Adaptation through CoopErationSarita V. Adve, Albert F. Harris, Christopher J. Hughes, Douglas L. Jones, Robin H. Kravets, Klara Sarita V. Adve, Albert F. Harris, Christopher J. Hughes, Douglas L. Jones, Robin H. Kravets, Klara Nahrstedt, Daniel Grobe Sachs, Ruchira Sasanka, Jayanth Srinivisan, Wanghong YuanNahrstedt, Daniel Grobe Sachs, Ruchira Sasanka, Jayanth Srinivisan, Wanghong YuanIn proceedings of Workshop on Self-Healing, Adaptive and self-MANaged Systems (SHAMAN) 2002In proceedings of Workshop on Self-Healing, Adaptive and self-MANaged Systems (SHAMAN) 2002

Page 31: Autonomic Systems... Gaurav S. Kc... September 26th, 20021 Autonomic Systems Autonomic: adaptive Autonomic: adaptive Self-healing: Self-healing: cluster

Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002Autonomic Systems ... Gaurav S. Kc ... September 26th, 2002 3131

Bibliography – Bibliography – Dynamic healing, MiscellaneousDynamic healing, Miscellaneous17.17. Autonomic ComputingAutonomic Computing

Paul Horn, IBM ResearchPaul Horn, IBM Research

18.18. Software Rejuventation: Analysis, Module and ApplicationsSoftware Rejuventation: Analysis, Module and ApplicationsYennun Huang, Chandra Kintala, Nick Kolettis, N. Dudley FultonYennun Huang, Chandra Kintala, Nick Kolettis, N. Dudley FultonProceedings of the 25th International Symposium on Fault-Tolerant Computing (FTCS-25), Proceedings of the 25th International Symposium on Fault-Tolerant Computing (FTCS-25), Pasadena, CA, pp. June 1995, pp. 381-390Pasadena, CA, pp. June 1995, pp. 381-390

19.19. IBM director software rejuvenationIBM director software rejuvenation..White paperWhite paper

20.20. Recovery Oriented Computing (ROC): Motivation, Definition, Techniques, and Case StudiesRecovery Oriented Computing (ROC): Motivation, Definition, Techniques, and Case StudiesDavid Patterson, Aaron Brown, Pete Broadwell, George Candea, Mike Chen, James Cutler, Patricia David Patterson, Aaron Brown, Pete Broadwell, George Candea, Mike Chen, James Cutler, Patricia Enriquez, Armando Fox, Emre Kiciman, Matthew Merzbacher, David Oppenheimer, Naveen Sastry, Enriquez, Armando Fox, Emre Kiciman, Matthew Merzbacher, David Oppenheimer, Naveen Sastry, William Tetzlaff, Jonathan Traupmann, Noah TreuhaftWilliam Tetzlaff, Jonathan Traupmann, Noah TreuhaftUC Berkeley Computer Science Technical Report UCB//CSD-02-1175, March 15, 2002UC Berkeley Computer Science Technical Report UCB//CSD-02-1175, March 15, 2002

21.21. Reducing Recovery Time in a Small Recursively Restartable SystemReducing Recovery Time in a Small Recursively Restartable SystemGeorge Candea, James Cutler, Armando Fox, Rushabh Doshi, Priyank Garg, Rakesh GowdaGeorge Candea, James Cutler, Armando Fox, Rushabh Doshi, Priyank Garg, Rakesh GowdaAppears in Proceedings of the International Conference on Dependable Systems and Networks Appears in Proceedings of the International Conference on Dependable Systems and Networks (DSN-2002), June 2002(DSN-2002), June 2002

22.22. Rewind, Repair, Replay: Three R's to DependabilityRewind, Repair, Replay: Three R's to DependabilityAaron B. Brown, David A. PattersonAaron B. Brown, David A. PattersonTo appear in 10th ACM SIGOPS European Workshop, Saint-Emilion, France, September 2002To appear in 10th ACM SIGOPS European Workshop, Saint-Emilion, France, September 2002

23.23. Dynamic Class Loading in the Java(TM) Virtual MachineDynamic Class Loading in the Java(TM) Virtual MachineSheng Liang, Gilad BrachaSheng Liang, Gilad BrachaConference on Object-oriented programming, systems, languages, and applications (OOPSLA'98)Conference on Object-oriented programming, systems, languages, and applications (OOPSLA'98)