Membership and Clique Avoidance in TTP/C Gunther Bauer, Michael Paulitsch Presented by Michael Sirivianos 02/01/2005

Membership and Clique Avoidance in TTP/C

Gunther Bauer, Michael Paulitsch

Presented by Michael Sirivianos, 02/01/2005

Overview

Membership in hard real-time systems: what is it and why?
Objectives
TTP/C overview
Group membership
Clique Avoidance and Implicit Acks
Cluster model and fault model
General properties
Analysis
Conclusions

What is an RT Membership Service?

Safety-critical real-time (RT) systems use a bus system for communication.

A class C system offers the required fault tolerance (FT).

A membership service gives timely and consistent information on the state of all nodes.

Why do we need it?

A membership service establishes replica-deterministic agreement on all messages.
It prevents clique formation and certain classes of arbitrary faults.
It allows global knowledge and thus a consistent and timely reaction to faults.

Membership is a critical function for the correct operation of the communication system. It should be placed below the application layer, within the TTP layer.

TTP/C Overview

Services:
  Message transport at specific time instances, with minimal jitter.
  Fault-tolerant clock synchronization.
  Fault-tolerant membership management.

TDMA media access:
  Time slots are not necessarily of equal size.
  The MEssage Descriptor List (MEDL) contains the TDMA schedule and groups several TDMA rounds into cluster cycles.
  It is statically assigned to all nodes.
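A static TDMA schedule with unequal slot sizes can be pictured as a small lookup table. The sketch below is only an illustration: the node names, slot lengths and the helper `slot_at` are hypothetical and do not reflect the real MEDL layout.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical static schedule: (sending node, slot length in microseconds).
# Slots need not be equally sized.
TDMA_ROUND = [("A0", 200), ("A1", 350), ("A2", 200), ("A3", 500)]

# Start offset of each slot within one TDMA round.
SLOT_STARTS = [0] + list(accumulate(length for _, length in TDMA_ROUND))[:-1]
ROUND_LENGTH = sum(length for _, length in TDMA_ROUND)


def slot_at(global_time_us):
    """Return (slot index, sending node) for a global time in microseconds."""
    offset = global_time_us % ROUND_LENGTH
    index = bisect_right(SLOT_STARTS, offset) - 1
    return index, TDMA_ROUND[index][0]
```

Because every node holds the same table and the same global time, each node can work out locally who is allowed to send, without any arbitration on the bus.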

TTP/C Overview, cont.

State of the distributed system (C-state). It comprises:
  Membership
  The global time at which the last frame B/C started
  Number of the current TDMA slot

I frames (protocol state info) and X frames (protocol + app. data info) are transmitted periodically and carry the C-state.

N frames (app. data info): consistency of the C-state is determined by calculating the CRC over both the app. data and the C-state.
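The N-frame check can be sketched as follows. This is a minimal illustration under stated assumptions: `zlib.crc32` stands in for the protocol's own CRC, and the C-state field layout in `encode_c_state` is invented for the example.

```python
import struct
import zlib


def encode_c_state(global_time, slot, membership_bits):
    """Pack a simplified C-state: global time, slot number, membership vector."""
    return struct.pack(">IHH", global_time, slot, membership_bits)


def make_n_frame(app_data, sender_c_state):
    """The frame carries only app. data plus a CRC that also covers the C-state."""
    return app_data, zlib.crc32(app_data + sender_c_state)


def receive_n_frame(frame, local_c_state):
    """Accept the frame only if the CRC matches under the receiver's own C-state."""
    app_data, crc = frame
    return zlib.crc32(app_data + local_c_state) == crc


sender_state = encode_c_state(1000, 3, 0b11111)
frame = make_n_frame(b"speed=42", sender_state)
agrees = receive_n_frame(frame, encode_c_state(1000, 3, 0b11111))
disagrees = receive_n_frame(frame, encode_c_state(1000, 3, 0b11110))
```

Here `agrees` is True and `disagrees` is False: a receiver whose membership differs by one node rejects the frame even though the C-state itself was never transmitted.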

TTP/C Overview, cont.

A node in the cluster that is included in the schedule but has been inactive can be integrated using the global time and C-state info from the I/X frames.

TTP Protocol Stack

[Figure: TTP protocol stack. Layers and interfaces as listed on the slide: Host Layer, FTU CNI, FTU Layer, RM Layer, SRU Layer, Data Link/Physical Layer, Basic CNI. Functions shown: application software in host, FTU membership, redundancy management, SRU membership, clock synchronization, media access (TDMA).]

TTP Protocol Stack (cont.)

Data Link/Physical Layer: provides the means to exchange frames between the nodes.
SRU Layer: stores the data fields of the received frames.
RM Layer: provides the mechanisms for the cold start of a TTP/C cluster.
FTU Layer: groups two or more nodes into FTUs.
Host Layer: provides the application software.
Basic CNI: a data-sharing interface between the RM layer and the FTU layer.
FTU CNI: the interface between the FTU layer and the Host Layer.

Structure of a TTP/C Based System

Timeline in TTP/C

TDMA Cycle
  One FTU sends a message twice.
  The pattern is repeated when the TDMA round ends.

Cluster Cycle
  A cluster cycle involves scheduling all possible messages and tasks.

TTP/C Frame Structure

N-Frame:

Paper Objective

Investigate properties of the Clique Avoidance algorithm. Performance analysis and study of interaction with Implicit Acks mechanism.

Study ability to resolve and detect conflicts in membership views of nodes within a cluster.

Provide time bounds for detecting and removing faulty members.

For their analysis, they assume arbitrary failures with bounded frequency.

Initial TTP/C Fault Hypothesis: Nodes

Only one faulty node within the duration of a TDMA round.

A node may become faulty only after any previously faulty node has either shut down or operates correctly again.

A transmission fault is consistent (all nodes will consistently consider the respective frame faulty or correct).

A node does not send faulty or correct data outside its assigned sending slots.

A node never hides its identity when sending frames.

Initial TTP/C Fault Hypothesis: Network

Only one channel can be faulty during a TDMA slot.
A channel does not spontaneously create correct frames.
A channel will deliver a frame either within some known time bounds or never.

A Bus Guardian transforms node errors so that they comply with the hypothesis.

A Central Guardian is a more cost-effective solution. It handles several arbitrary faults.

Cluster Model - Extended Fault Hypothesis

No failures other than the one that caused the cluster partition can occur within two TDMA rounds before and after that failure. Thus, initially there is a single clique to which all nodes are assigned.

The partition failure should cause both partitions to contain more than one member. It should affect both channels and be inconsistent, contrary to the initial hypothesis.

TTP/C can handle faults in violation of hypothesis, but in this case there is no guarantee it selects the correct clique.

Group Membership Protocol

Clique Avoidance algorithm
  Removes faulty nodes from the cluster.
  Prevents several coexisting cliques.

Implicit Acknowledgement
  The node inspects the membership lists sent by the receiving nodes to determine whether its message was correctly received.

Cluster Model - Slot

n slots per TDMA round

Clique Avoidance

A reception is considered correct if the received C-state matches the local C-state and the data are not corrupted, i.e. the transmission time is correct and the memberships match after adding the sender.
After a successful reception, the sender is added to the receiver's membership list (ML).
After an incorrect reception, the sender is removed from the ML.
If the ML of the receiver differs only by the sender, then the reception is successful.
The Accept Counter (AC) is increased for every successful reception.
The Failed Counter (FC) is increased for every incorrect reception.
If FC >= AC, the node raises an Ack Error and shuts down (freezes).
FC and AC are reset to 0 in each TDMA round.
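The receiver-side rules above can be sketched as a small class. The class and method names are illustrative, not taken from the TTP/C specification, and the frame is reduced to the sender's name and membership list. The usage at the bottom feeds in the frames that node A4 sees in slots 0 to 3 of the partition example later in the deck.

```python
class Node:
    """Clique-avoidance state of one receiver (illustrative names)."""

    def __init__(self, name, members, ac=0, fc=0):
        self.name = name
        self.ml = set(members)  # local membership list (ML)
        self.ac = ac            # Accept Counter
        self.fc = fc            # Failed Counter

    def on_frame(self, sender, sender_ml, corrupted=False):
        """Evaluate one received frame; return True if it was accepted."""
        # Memberships must match after adding the sender to the local view.
        if not corrupted and set(sender_ml) == self.ml | {sender}:
            self.ml.add(sender)
            self.ac += 1
            return True
        self.ml.discard(sender)
        self.fc += 1
        return False

    def must_freeze(self):
        """Check made before the node's own sending slot: freeze if FC >= AC."""
        return self.fc >= self.ac


everyone = {"A0", "A1", "A2", "A3", "A4"}
a4 = Node("A4", everyone, ac=1)
a4.on_frame("A0", everyone, corrupted=True)          # slot 0: frame lost
a4.on_frame("A1", {"A1", "A2", "A3", "A4"})          # slot 1: same view
a4.on_frame("A2", {"A0", "A2", "A3", "A4"})          # slot 2: other clique
a4.on_frame("A3", {"A0", "A2", "A3", "A4"})          # slot 3: other clique
```

After these four frames A4 holds AC = 2, FC = 3 and the view {A1, A4}, so `a4.must_freeze()` is True, which agrees with the Slot 4 preparation phase of the example.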

Clique formation under the extended fault hypothesis

Prior to the failure, there is consensus on membership.
A transient failure occurs at slot 0, when node A is transmitting: an asymmetric send fault.
As a result, several nodes in the cluster correctly received A's transmission and the rest did not.
Two cliques are formed: one whose members' membership lists include A, and one whose members' lists do not include A.

Implicit Acks - Successors

After a successful transmission, A increases its AC. B checks the frame for correctness.

1. A waits for the expected message from B.
2. If reception was successful, B adds A to its ML and transmits a non-corrupted message.
3. If the MLs are the same, or B's differs only by A, then A considers B its successor.
   If the MLs are the same, then A is acked (case 1). It increases its AC and adds B to its ML.
   If B's ML differs by A, then A increases its FC and removes B: B's reception was not successful and B removed A (case 2).
4. Otherwise, A removes B from its ML. It increases its FC unless B did not transmit at all. A goes back to step 1 (case 3).

Implicit Acks - Successors

5. A waits for the expected message from the subsequent node C.
6. If A finds a successor C whose ML contains A, then A is acknowledged. B is assumed faulty, and both FC and ML were updated correctly. A increases its AC and adds C to its ML (case 4).
   However, if C's ML does not include A, A considers itself erroneous. A removes itself from its local list and adds both B and C. It increases its AC. It now has the same ML as B and C (case 5).
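The sender's decision in cases 1 to 5 can be sketched as one classification function. The function name and return labels are hypothetical; membership lists are modelled as Python sets, and the sketch covers only what the slides state.

```python
def classify_successor(own_name, own_ml, succ_name, succ_ml, first):
    """Return the action the sender takes after hearing a successor's frame.

    own_ml and succ_ml are sets of node names; `first` is True while the
    sender is still looking for its first successor.
    """
    expected = own_ml | {succ_name}
    if succ_ml == expected:
        return "acked"                       # cases 1 and 4
    if first and succ_ml == expected - {own_name}:
        return "first_successor_disagrees"   # case 2
    if not first and own_name not in succ_ml:
        return "defect"                      # case 5
    return "remove_successor"                # case 3


# A0 sent in slot 0; A1 and A2 both missed the frame.
after_a1 = classify_successor(
    "A0", {"A0", "A1", "A2", "A3", "A4"},
    "A1", {"A1", "A2", "A3", "A4"}, first=True)
after_a2 = classify_successor(
    "A0", {"A0", "A2", "A3", "A4"},
    "A2", {"A1", "A2", "A3", "A4"}, first=False)
```

With these inputs `after_a1` is "first_successor_disagrees" and `after_a2` is "defect", which is the situation of the defection example later in the deck.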

Implicit Acks - Defector

In case 5, A changes clique membership and becomes a defector.

Other nodes become aware of a defector only in its next sending round, by the transmitted ML.

If the defector becomes implicitly acknowledged, then it is no longer a defector. If not, it freezes due to Clique Avoidance (CA).

Partition failure. Slot 0 Preparation Phase

Node AC FC View

A0 5 0 A0 A1 A2 A3 A4

A1 4 0 A0 A1 A2 A3 A4

A2 3 0 A0 A1 A2 A3 A4

A3 2 0 A0 A1 A2 A3 A4

A4 1 0 A0 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 0 Transmission Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Transmitted membership
A0    A0 A1 A2 A3 A4

Partition failure. Slot 0 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Correctly received frame?
A0    Yes (itself)
A1    No
A2    Yes
A3    Yes
A4    No

Partition failure. Slot 1 Preparation Phase

Node AC FC View

A0 1 0 A0 A1 A2 A3 A4

A1 4 1 A1 A2 A3 A4

A2 4 0 A0 A1 A2 A3 A4

A3 3 0 A0 A1 A2 A3 A4

A4 1 1 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 1 Transmission Phase

Node  Transmitted membership
A1    A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 1 Evaluation Phase

Node  Same membership?  Action
A0    No, by A0         A1 becomes 1st successor. Remove A1, inc. FC
A1    Yes (itself)      Inc. AC
A2    No, by A0         Remove A1, inc. FC
A3    No, by A0         Remove A1, inc. FC
A4    Yes               Inc. AC

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 2 Preparation Phase

Node AC FC View

A0 1 1 A0 A2 A3 A4

A1 1 0 A1 A2 A3 A4

A2 4 1 A0 A2 A3 A4

A3 3 1 A0 A2 A3 A4

A4 2 1 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 2 Transmission Phase

Node  Transmitted membership
A2    A0 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 2 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Same membership?  Action
A0    Yes               A2 becomes 2nd successor. Inc. AC. It is acked.
A1    No, by A0, A1     Remove A2, inc. FC
A2    Yes (itself)      Inc. AC
A3    Yes               Inc. AC
A4    No, by A0, A1     Remove A2, inc. FC

Partition failure. Slot 3 Preparation Phase

Node AC FC View

A0 2 1 A0 A2 A3 A4

A1 1 1 A1 A3 A4

A2 1 0 A0 A2 A3 A4

A3 4 1 A0 A2 A3 A4

A4 2 2 A1 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 3 Transmission Phase

Node  Transmitted membership
A3    A0 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 3 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Same membership?   Action
A0    Yes                Inc. AC
A1    No, by A0, A1, A2  Remove A3, inc. FC
A2    Yes                Inc. AC (it is acked)
A3    Yes (itself)       Inc. AC
A4    No, by A0, A1, A2  Remove A3, inc. FC

Partition failure. Slot 4 Preparation Phase

Node AC FC View

A0 3 1 A0 A2 A3 A4

A1 1 2 A1 A4

A2 2 0 A0 A2 A3 A4

A3 1 0 A0 A2 A3 A4

A4 2 3 A1 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 4 Preparation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

FC > AC

Node A4 Freezes!

Partition failure. Slot 4 Evaluation Phase

Node  Null message?  Action
A0    Yes            Remove A4
A1    Yes            Remove A4
A2    Yes            Remove A4
A3    Yes            Remove A4
A4    Frozen         Frozen

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 5 Preparation Phase

Node AC FC View

A0 3 1 A0 A2 A3

A1 1 2 A1

A2 2 0 A0 A2 A3

A3 1 0 A0 A2 A3

A4 Frozen Frozen Frozen

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 5 Transmission Phase

Node  Transmitted membership
A0    A0 A2 A3

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 5 Evaluation Phase

Node  Same membership?       Action
A0    Yes (itself)           Inc. AC
A1    No, by A0, A1, A2, A3  Inc. FC
A2    Yes                    Inc. AC
A3    Yes                    Inc. AC
A4    Frozen                 Frozen

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 6 Preparation Phase

Node AC FC View

A0 1 0 A0 A2 A3

A1 1 3 A1

A2 3 0 A0 A2 A3

A3 2 0 A0 A2 A3

A4 Frozen Frozen Frozen

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 6 Preparation Phase

FC > AC

Node A1 Freezes!

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure. Slot 6 Evaluation Phase

Node  Null message?  Action
A0    Yes            Remove A1
A1    Frozen         Frozen
A2    Yes            Remove A1
A3    Yes            Remove A1
A4    Frozen         Frozen

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9
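The slot-by-slot tables of this example can be replayed with a short simulation. This is a sketch under simplifying assumptions: it applies only the receiver rules and the FC >= AC test, and leaves out the successor bookkeeping of Implicit Acks, which does not alter any counter in this particular scenario because no node defects. The function name `simulate` and its parameters are made up for the illustration.

```python
NODES = ["A0", "A1", "A2", "A3", "A4"]


def simulate(missed_slot0, slots=7):
    """Replay the partition scenario; return frozen nodes and final views."""
    ml = {n: set(NODES) for n in NODES}
    # Counters at slot 0: A0 sent longest ago, A4 most recently.
    ac = {n: len(NODES) - i for i, n in enumerate(NODES)}
    fc = {n: 0 for n in NODES}
    frozen = {}                              # node -> slot in which it froze

    for slot in range(slots):
        sender = NODES[slot % len(NODES)]
        frame_ml = None                      # None models a null frame
        if sender not in frozen:
            if fc[sender] >= ac[sender]:     # clique-avoidance test
                frozen[sender] = slot
            else:
                ac[sender], fc[sender] = 1, 0   # reset, count own frame
                ml[sender].add(sender)
                frame_ml = set(ml[sender])
        for node in NODES:
            if node == sender or node in frozen:
                continue
            lost = slot == 0 and node in missed_slot0
            if frame_ml is None:
                ml[node].discard(sender)     # null frame: no FC increment
            elif not lost and frame_ml == ml[node] | {sender}:
                ml[node].add(sender)
                ac[node] += 1
            else:
                ml[node].discard(sender)
                fc[node] += 1
    return frozen, ml


frozen, views = simulate(missed_slot0={"A1", "A4"})
```

For this input, `frozen` is `{"A4": 4, "A1": 6}` and the surviving nodes A0, A2 and A3 all end with the view {A0, A2, A3}, as in the tables above.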

Partition failure-Defection. Slot 0 Preparation Phase

Node AC FC View

A0 5 0 A0 A1 A2 A3 A4

A1 4 0 A0 A1 A2 A3 A4

A2 3 0 A0 A1 A2 A3 A4

A3 2 0 A0 A1 A2 A3 A4

A4 1 0 A0 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 0 Transmission Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Transmitted membership
A0    A0 A1 A2 A3 A4

Partition failure-Defection. Slot 0 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Correctly received frame?
A0    Yes (itself)
A1    No
A2    No
A3    Yes
A4    No

Partition failure-Defection. Slot 1 Preparation Phase

Node AC FC View

A0 1 0 A0 A1 A2 A3 A4

A1 4 1 A1 A2 A3 A4

A2 3 1 A1 A2 A3 A4

A3 3 0 A0 A1 A2 A3 A4

A4 1 1 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 1 Transmission Phase

Node  Transmitted membership
A1    A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 1 Evaluation Phase

Node  Same membership?  Action
A0    No, by A0         A1 becomes 1st successor. Remove A1, inc. FC
A1    Yes (itself)      Inc. AC
A2    Yes               Inc. AC
A3    No, by A0         Remove A1, inc. FC
A4    Yes               Inc. AC

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 2 Preparation Phase

Node AC FC View

A0 1 1 A0 A2 A3 A4

A1 1 0 A1 A2 A3 A4

A2 4 1 A1 A2 A3 A4

A3 3 1 A0 A2 A3 A4

A4 2 1 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 2 Transmission Phase

Node  Transmitted membership
A2    A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 2 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Same membership?  Action
A0    No, by A0, A1     A2 becomes 2nd successor. Inc. AC. Defects.
A1    Yes               Inc. AC
A2    Yes (itself)      Inc. AC
A3    No, by A0, A1     Remove A2, inc. FC
A4    Yes               Inc. AC

Partition failure-Defection. Slot 3 Preparation Phase

Node AC FC View

A0 2 1 A1 A2 A3 A4

A1 2 0 A1 A2 A3 A4

A2 1 0 A1 A2 A3 A4

A3 3 2 A0 A3 A4

A4 3 1 A1 A2 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 3 Transmission Phase

Node  Transmitted membership
A3    A0 A3 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 3 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Same membership?   Action
A0    No, by A0, A1, A2  Remove A3, inc. FC
A1    No, by A0, A1, A2  Remove A3, inc. FC
A2    No, by A0, A1, A2  Remove A3, inc. FC
A3    Yes (itself)       Inc. AC
A4    No, by A0, A1, A2  Remove A3, inc. FC

Partition failure-Defection. Slot 4 Preparation Phase

Node AC FC View

A0 2 2 A1 A2 A4

A1 2 1 A1 A2 A4

A2 1 1 A1 A2 A4

A3 1 0 A0 A3 A4

A4 3 2 A1 A2 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 4 Transmission Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Transmitted membership
A4    A1 A2 A4

Partition failure-Defection. Slot 4 Evaluation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node  Same membership?       Action
A0    Yes                    Inc. AC
A1    Yes                    Inc. AC
A2    Yes                    Inc. AC
A3    No, by A0, A1, A2, A3  Remove A4, inc. FC
A4    Yes (itself)           Inc. AC

Partition failure-Defection. Slot 5 Preparation Phase

Node AC FC View

A0 3 2 A1 A2 A4

A1 3 1 A1 A2 A4

A2 2 1 A1 A2 A4

A3 1 1 A0 A3

A4 1 0 A1 A2 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 5 Transmission Phase

Node  Transmitted membership
A0    A0 A1 A2 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 5 Evaluation Phase

Node  Same membership?       Action
A0    Yes (itself)           Inc. AC
A1    Yes                    Inc. AC
A2    Yes                    Inc. AC
A3    No, by A1, A2, A3, A4  Remove A0, inc. FC
A4    Yes                    Inc. AC

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 6 Preparation Phase

Node AC FC View

A0 1 0 A0 A1 A2 A4

A1 4 1 A0 A1 A2 A4

A2 3 1 A0 A1 A2 A4

A3 1 2 A3

A4 2 0 A0 A1 A2 A4

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Partition failure-Defection. Slot 8 Preparation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Node AC FC View

A0 3 0 A0 A1 A2 A4

A1 2 0 A0 A1 A2 A4

A2 1 0 A0 A1 A2 A4

A3 1 4 A3

A4 4 0 A0 A1 A2 A4

Partition failure-Defection. Slot 8 Preparation Phase

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

FC > AC

Node A3 Freezes!

Partition failure-Defection. Slot 8 Evaluation Phase

Node  Null message?  Action
A0    Yes            Remove A3
A1    Yes            Remove A3
A2    Yes            Remove A3
A3    Frozen         Frozen
A4    Yes            Remove A3

Node A0 A1 A2 A3 A4 A0 A1 A2 A3 A4

Time 0 1 2 3 4 5 6 7 8 9

Cluster Model - IoI

The length of the Interval of Interest (IoI) is one TDMA round. Each node has its own.

CA General Properties

Property 1. A node becomes a defector iff it sends in slot 0 (the moment of failure).

Proof. Suppose the partition created cliques A and B.

Let A0 ∈ A. If A1, A2 ∈ B, then by case 5 A0 reconsiders its clique membership. If the failure is such that members of A are still able to communicate with B, then in the next round A0 will send a correct frame containing B's ML including itself, and it will be added to B.

Let Ax, x > 0, belong either to the same clique as Ax+1 or to another. If Ax and Ax+1 are in the same clique, then Ax is acked.

If not, there is a preceding slot in which another member of Ax's or Ax+1's clique has sent. Thus Ax and Ax+1 disagree on membership status, and Ax+1 does not become Ax's 1st successor.

CA General Properties

Property 2. The results of the CA algorithm during the first TDMA round do not depend on the existence of a defector.

Proof. Let A0 ∈ A. No Ax with x > 0 knows whether A0 is a defector, because the other nodes evaluate the membership of A0 during the evaluation phase of slot 0, and it takes until the evaluation phase of slot 2 for A0 to become a defector.

A0 is not influenced either, since it performs CA in its preparation phase. At that time (slot 0) it is not a defector, since by hypothesis no failure can occur in the previous round.

Analysis

Property 4. If there is no defector, no member of the larger clique will shut down. All members of the smaller clique will shut down within 2 TDMA rounds of clique formation. No member of the smaller clique will send in the 2nd TDMA round.

Proof. Let |A| + |B| = n. A node freezes iff FC >= n/2. Let |A| > |B|.

The maximum FC for members of A is FC_A <= |B| < n/2.

The minimum AC for members of A is AC_A >= |A| > n/2.

Thus, no member of A will shut down.

Members of B that did not shut down by the end of the 1st round will have FC_B = |A| > n/2 and AC_B <= |B| < n/2, and will shut down in the 2nd round.
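The counting argument can be checked with plain arithmetic. The helper below is hypothetical and only restates the bounds of the proof together with the FC >= AC freeze test from the Clique Avoidance slide; it is not a simulation of the protocol.

```python
def property4_outcome(size_a, size_b):
    """Apply the Property 4 bounds to cliques A and B with |A| > |B|."""
    if not size_a > size_b > 0:
        raise ValueError("expects |A| > |B| > 0")
    # Member of A: worst-case FC_A = |B|, compared with best-case AC_A = |A|.
    winner_freezes = size_b >= size_a
    # Member of B still alive after a full round: FC_B = |A|, AC_B <= |B|.
    loser_freezes = size_a >= size_b
    return winner_freezes, loser_freezes


outcome = property4_outcome(3, 2)
```

For the five-node example (|A| = 3, |B| = 2), `outcome` is `(False, True)`: members of the larger clique keep running and members of the smaller clique freeze.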

Property 4 Example

Node A0 A1 A2 A3 A4 A5 A0 A1 A2 A3 A4 A5

Time 0 1 2 3 4 5 6 7 8 9 10 11

A4 freezes in the 1st TDMA round after the failure. In the slot 4 preparation phase it has AC = FC = 3.

A2 freezes in the 2nd TDMA round after the failure. In the slot 8 preparation phase it has FC = 5, AC = 1.

Analysis (cont.)

Property 5. If there is no defector and |A| = |B| = n/2, the clique whose last member sends before the last member of the other clique wins. No member of the winning clique will shut down, and no member of the losing clique will send in the 2nd round.

Proof. As long as neither A_L nor B_L has sent, no clique will shut down, since FC_A <= |B| - 1 = n/2 - 1 and FC_B <= |A| - 1.

Let A_L have its sending slot first. A_L will not shut down. All members of B will shut down because FC_B = |A|. At least one member of B will die in the 1st round after the failure.

Analysis (cont.)

Property 6. If there is a defector, all other nodes will shut down or continue just as if there were no defector. The defector will shut down if it switches to the losing clique.

Proof. By Property 2 nothing changes, so we can apply Properties 4 and 5.

The defector will have a new clique. Let |A| > n/2.

If it switches to A and initially |A| > |B|, it will find FC <= |B| < AC = |A|.

If initially |B| = |A|, then at least one member of B has shut down in the 1st round, so FC <= |B| - 1 < AC = |A|. In both cases the defector survives and becomes a member of A.

Analysis (cont.)

However, if |A| > |B| and it switches to the losing clique B, it will see FC = |A| > AC, with AC <= |B|, and freeze. (FC = |A|: the |A| - 1 successive members of A plus the 1st successor, which is in B. AC <= |B|: by Property 4, members of B may have shut down during its IoI.)

Members of A in the 2nd TDMA round will find FC <= |B| - 2 < AC = |A| - 1: the defector has left the clique and shut down, and the 1st and 2nd successors of the defector in B are also down.

If initially |A| = |B|, at least one member of B shuts down in the 1st round, so the defector has FC = |A| > AC, with AC <= |B| - 1, and will shut down.

Integration

An integrating node copies the ML from any I frame and waits for its first slot.
It neither performs Clique Avoidance nor sends data.
In a partitioned cluster, it joins the clique of the node it copied the ML from.
It only sends in its next assigned slot, so that its counters are affected by all nodes in the cluster.
This is to avoid sending in the second TDMA round after a partition while it is in the losing clique.
If it joins the losing clique, it will have FC > AC and freeze.

Analysis Summary

The clique with the majority of nodes always wins.
After partitioning, one of two equally sized partitions wins.
A defector does not change the clique selection.
No member of the winners, including a newly joined defector, will ever shut down.
All losers will freeze at the latest in the second round after the failure.
There will always be at least ceil(n/2 - 1) survivors.

Conclusions

The algorithm prevents cluster partitioning after an instability interval of one TDMA round.

It provides nodes with a consistent membership view among the active nodes.

This is true even after a single arbitrary fault within 2 TDMA rounds. Such a fault does not comply with the initial fault consistency hypothesis.

Open Issues

A0 makes an "educated" guess based on its successors' feedback. If both successors are in the losing clique, then it guessed wrong.

How can we reduce the probability of defecting to the losers? Perhaps by adding more states, e.g. a 3rd successor state?

Analysis of the probability of defecting to the losers with respect to the number of nodes in the cluster and the number of successor states.

References

G. Bauer and M. Paulitsch. An Investigation of Membership and Clique Avoidance in TTP/C. 19th IEEE SRDS (2000).

H. Kopetz and G. Grünsteidl. TTP - A Protocol for Fault-Tolerant Real-Time Systems. IEEE Comp. Society Press (1994).

G. Bauer, H. Kopetz, and W. Steiner. Byzantine Fault Containment in TTP/C. Workshop on Real-Time LANs in the Internet Age (2002).