Self-stabilization in NEST Mikhail Nesterenko (based on presentation by Anish Arora, Ohio State University)


Co-conspirators
• Mohamed Gouda, UTexas, Austin
• Ted Herman, UIowa
• Sandeep Kulkarni, Michigan State
• Mikhail Nesterenko, Kent State



Goals

Scalable dependability via new notions of stabilization
• e.g., weak, protective, bounded stabilization

Stabilization at all levels of NEST system stack
• e.g., at application level, via component frameworks and automated synthesis
• e.g., at middleware level, via stabilizing monitoring

Stabilization Notions: Original Concept

Legitimate states: states from which safety and liveness are satisfied

Illegitimate states: states reached possibly due to faults

• Closure: The set of legitimate states is closed under system execution
• Convergence: Starting from any system state, every system computation eventually reaches a legitimate state
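The two conditions above can be checked mechanically on a small finite transition system. The following is a minimal sketch, not part of the original slides: the dictionary encoding of transitions, the function names, and the four-state example are all assumptions made for illustration, and fairness is ignored (a computation is any maximal path).

```python
def all_states(trans, legit):
    """Collect every state mentioned as a source, a target, or a legitimate state."""
    states = set(trans) | set(legit)
    for succs in trans.values():
        states |= succs
    return states


def is_closed(trans, legit):
    """Closure: no transition leaves the set of legitimate states."""
    return all(t in legit for s in legit for t in trans.get(s, ()))


def converges(trans, legit):
    """Convergence: every maximal computation from every state reaches legit.

    Least fixpoint: a state is 'good' if it is legitimate, or if it has at
    least one successor and all of its successors are already good.
    """
    states = all_states(trans, legit)
    good = set(legit)
    changed = True
    while changed:
        changed = False
        for s in states - good:
            succs = trans.get(s, set())
            if succs and succs <= good:
                good.add(s)
                changed = True
    return good == states


# Hypothetical example: a counter that counts down to the legitimate state 0.
legit = {0}
trans_ok = {0: {0}, 1: {0}, 2: {1}, 3: {2}}
# Same system, but state 3 may also loop on itself forever.
trans_loop = {0: {0}, 1: {0}, 2: {1}, 3: {2, 3}}

closed_ok = is_closed(trans_ok, legit)
stabilizing_ok = converges(trans_ok, legit)
stabilizing_loop = converges(trans_loop, legit)
```

In the second system a computation can stay in state 3 indefinitely, so the check for (strong) convergence fails even though closure still holds.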

Weak Stabilization

• Closure
• Weak Convergence: Starting from any system state, some system computation eventually reaches a legitimate state
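For a finite system, weak convergence reduces to reachability: every state must have at least one path into the legitimate set. The sketch below is an illustration under assumed names and an assumed dictionary encoding of transitions; it is not taken from the original slides.

```python
def weakly_converges(trans, legit):
    """Weak convergence: from every state, SOME computation reaches legit.

    Implemented as backward reachability from the legitimate states.
    """
    # Build the reversed transition relation.
    preds = {}
    for s, succs in trans.items():
        for t in succs:
            preds.setdefault(t, set()).add(s)

    reached = set(legit)
    frontier = list(legit)
    while frontier:
        t = frontier.pop()
        for s in preds.get(t, ()):
            if s not in reached:
                reached.add(s)
                frontier.append(s)

    states = set(trans) | set(legit)
    for succs in trans.values():
        states |= succs
    return states <= reached


# Hypothetical example: state 3 may loop on itself, but can also move to 2.
legit = {0}
trans_may_loop = {0: {0}, 1: {0}, 2: {1}, 3: {2, 3}}
# Here state 3 can only loop on itself, so no computation from 3 recovers.
trans_stuck = {0: {0}, 1: {0}, 2: {1}, 3: {3}}

weak_may_loop = weakly_converges(trans_may_loop, legit)
weak_stuck = weakly_converges(trans_stuck, legit)
```

The first system is weakly convergent although one of its computations (looping in state 3) never reaches a legitimate state; the second is not weakly convergent at all.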

Protective Stabilization

• Closure
• Convergence (strong or weak)
• Protection: No transition is unsafe

Bounded Stabilization

• Closure
• Bounded Convergence:
  • The set of fault-span states is closed under system execution
  • Starting from any fault-span state, every system computation reaches a legitimate state in bounded time

[Figure labels: fault-span states; convergence time is bounded]
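As an illustration of bounded convergence, the sketch below computes a worst-case recovery bound for a finite system. It rests on several assumptions that are not in the original slides: time is measured in transition steps, the legitimate states are contained in the fault span, and the function name and example system are hypothetical.

```python
def worst_case_steps(trans, legit, span):
    """Return the maximum number of steps any computation needs to reach
    legit from a fault-span state, or None if the span is not closed or
    some computation inside the span never reaches legit.
    """
    # The fault span must be closed under system execution.
    if any(t not in span for s in span for t in trans.get(s, ())):
        return None

    # rank[s] = worst-case number of steps from s to a legitimate state.
    rank = {s: 0 for s in legit}
    changed = True
    while changed:
        changed = False
        for s in span - set(rank):
            succs = trans.get(s, set())
            if succs and all(t in rank for t in succs):
                rank[s] = 1 + max(rank[t] for t in succs)
                changed = True

    if not all(s in rank for s in span):
        return None
    return max((rank[s] for s in span), default=0)


# Hypothetical example: states 1..3 count down to 0; state 4 is stuck forever
# but lies outside the fault span, so it does not affect the bound.
legit = {0}
trans = {0: {0}, 1: {0}, 2: {1}, 3: {2}, 4: {4}}

bound_in_span = worst_case_steps(trans, legit, {0, 1, 2, 3})
bound_with_stuck_state = worst_case_steps(trans, legit, {0, 1, 2, 3, 4})
```

The guarantee only covers states inside the fault span: once the stuck state 4 is added to the span, no bound exists.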

Stabilization in NEST System Stack

[Figure: NEST system stack. Labels, grouped by project:]
• Stabilization synthesis framework: nonstabilizing application, synthesis, component framework, stabilizing application
• Implementing stabilizing apps: AP, Timed AP, APC
• Stabilizing system/app monitoring

Project: Stabilizing Monitoring Service

Model:
• apps/daemons/nodes periodically send a refresh to the service
• the period is chosen within some interval [LF .. HF]

Service ensures, in a stabilizing manner, that:
• apps/daemons/nodes are up
• the monitoring service of a node is up
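The refresh model above can be sketched as a small timeout table. This is only an illustration, not the project's implementation: the class and method names are hypothetical, the upper end of the refresh interval is assumed to act as the timeout (called `max_period` here), and the handling of corrupted timestamps is one assumed way to keep the table itself stabilizing.

```python
class RefreshMonitor:
    """Suspect any client whose last refresh is older than max_period."""

    def __init__(self, max_period):
        self.max_period = max_period  # assumed: upper bound of the refresh interval
        self.last_seen = {}           # client name -> time of last refresh

    def refresh(self, client, now):
        """Record a refresh message from a client."""
        self.last_seen[client] = now

    def suspects(self, now):
        """Return the set of clients that missed their refresh deadline."""
        late = set()
        for client, seen in self.last_seen.items():
            if seen > now:
                # A timestamp in the future can only come from corrupted
                # state; clamp it so it cannot hide a dead client forever.
                self.last_seen[client] = now
                seen = now
            if now - seen > self.max_period:
                late.add(client)
        return late


# Hypothetical usage with explicit clock values (seconds).
monitor = RefreshMonitor(max_period=5.0)
monitor.refresh("daemon_a", now=0.0)
monitor.refresh("daemon_b", now=3.0)
first_check = monitor.suspects(now=6.0)

monitor.last_seen["daemon_b"] = 1000.0   # simulate a transient fault
second_check = monitor.suspects(now=7.0)
third_check = monitor.suspects(now=13.0)
```

Passing the clock value explicitly keeps the sketch deterministic; a real service would read a system clock and act on the suspicion (for example, by restarting the client).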

Layered Architecture

Layer 0: Hardware watchdog
• implements a hardware self-rebooting mechanism

Layer 1: Basic monitoring
• ensures that registered apps/daemons are up

Layer 2: Remote and advanced monitoring
• ensures other nodes and distributed process groups are up
• generation of suspicions for dependent apps/daemons
• adaptation of refresh periods & registered apps/daemons

Project: Implementing Stabilizing Applications

Input: a (weakly-)stabilizing protocol consisting of processes communicating via messages, in Abstract Protocol (AP) notation

Output: a weakly-stabilizing implementation using UNIX processes and UDP communication

Approach

AP (input) → Timed AP → APC (output)

• AP to Timed AP: preserves all safety and liveness properties
• Timed AP to APC: preserves some properties, including weak stabilization

AP:
• Abstract timeouts
• Zero message delay
• Action/fault atomicity
• Action fairness

Timed AP:
• Real timeouts
• Non-zero message delay
• Action/fault atomicity
• Action fairness

APC:
• Real timeouts
• Non-zero message delay
• Event/weak fault atomicity
• Weak action fairness

Project: Stabilization Synthesis Framework

[Figure: stabilization synthesis framework. Labels: nonstabilizing AP, synthesis procedure, stabilizing AP; nonstabilizing APC, dependability component framework, stabilizing APC]

Approach

• Exponential-time synthesis procedure, with an adequate polynomial-time heuristic
  • sufficient for synthesis of Byzantine agreement

• Dependability component framework
  • enables reuse of application-independent aspects of stabilization
  • application-dependent parameters are used to instantiate this framework, e.g., network type, communication patterns

Sample Component Frameworks

• Reactive link-predicate stabilization component
  • Retransmission based
  • Use of ACK/NACKs

• Proactive link-predicate stabilization component
  • Forward error correction based
  • Sending parity packets in advance

• Group-of-nodes state-predicate stabilization component
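To make the proactive approach concrete, the sketch below shows the simplest form of forward error correction: one XOR parity packet sent in advance lets the receiver rebuild a single lost packet without retransmission. The slides do not say which code the component uses, so the XOR scheme, the function names, and the equal-length packet assumption are all illustrative choices.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Parity packet: XOR of all data packets (assumed to have equal length)."""
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Rebuild at most one missing packet (marked as None) from the parity.

    If zero or more than one packet is missing, the input is returned
    unchanged, since a single parity packet cannot repair more than one loss.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        return list(received)
    present = [p for p in received if p is not None]
    restored = reduce(xor_bytes, present, parity)
    repaired = list(received)
    repaired[missing[0]] = restored
    return repaired


# Hypothetical usage: the second packet is lost in transit.
sent = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(sent)
arrived = [b"abcd", None, b"ijkl"]
repaired = recover(arrived, parity)
```

The reactive component trades this extra bandwidth for latency: it sends nothing in advance and instead retransmits when an ACK is missing or a NACK arrives.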

Deliverables and Milestones

• Stabilizing Monitoring Framework:
  Aug’02: Implementation of basic node monitoring
  Aug’03: Implementation of advanced node/group monitoring
  Apr’04: Demo of monitoring service use by NEST application

• Implementing Stabilizing Applications:
  Aug’02: AP-to-APC transformer implementation
  Apr’03: Demo of stabilizing transformer-based NEST application
  Aug’04: Transformer for stabilization of sequential processes

• Stabilizing Synthesis Framework:
  Aug’02: Demo of tool for synthesis of stabilizing AP protocols
  Apr’03: BNF & semantics of APC dependability component composition language
  Aug’03: Application-independent code for reactive & proactive component frameworks
  Apr’04: Demo of stabilizing framework-based NEST application