Presented at University of Alabama CIS, Birmingham Monday, April 9, 2001 Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance Aniruddha Gokhale [email protected] In collaboration with Balachandran Natarajan ([email protected] ) Douglas C. Schmidt ([email protected] ) Shalini Yajnik ([email protected])


Page 1: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Presented at

University of Alabama CIS, Birmingham

Monday, April 9, 2001

Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Aniruddha Gokhale [email protected]

In collaboration with

Balachandran Natarajan ([email protected])

Douglas C. Schmidt ([email protected])

Shalini Yajnik ([email protected])

Page 2: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Aniruddha Gokhale Patterns for FT CORBA

Motivation

• Distributed applications are becoming more complex & mission-critical
• Increasing demand for COTS-based multi-dimensional quality-of-service (QoS) support
  • E.g., simultaneous requirements for efficiency, predictability, scalability, security, & dependability
• Key open challenge is QoS-enabled dependability

[Figure labels: wide area network; satellites; tracking station peers; status info; commands; bulk data transfer; local area network; ground station peers; gateway]

Page 3: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Promising Solution: Fault Tolerant (FT) Distributed Object Computing Middleware

Challenges

• Limitations of non-OO FT strategies that focus on application processes
• Techniques based on process-based failure detection & recovery are not applicable to distributed object computing applications due to:
  1. Overly coarse granularity
  2. Inability to restore complex object relationships
  3. Restrictions on process checkpointing & recovery

Page 4: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Overview of Fault Tolerant CORBA

Overview
• Provides a standard set of CORBA interfaces, policies, & services
• Entity redundancy of objects is used for fault tolerance via
  • Replication
  • Fault detection
  • Recovery from failure

Features
• Interoperable Object Group References (IOGR)
• Replication Manager
• Fault Detector & Notifier
• Message logging for recovery
• Fault tolerance domains

Page 5: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Interoperable Object Group References

• Composite & enhanced Interoperable Object Reference (IOR) for referencing server object groups
• Comprises one or more TAG_INTERNET_IOP profiles, each of which must contain a TAG_FT_GROUP component and zero or more TAG_IIOP_ALTERNATE_ADDRESS components
• TAG_PRIMARY component in at most one TAG_INTERNET_IOP profile
• Client ORBs operate on IOGRs in the same way as with IORs
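The structural rules on this slide can be sketched as a small validity check. This is only an illustration in Python: the `Profile` class, the `validate_iogr` function, the host names, and the port numbers are hypothetical, and components are reduced to the tag names used on the slide rather than real encoded IOR components.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    """Stand-in for one TAG_INTERNET_IOP profile (hypothetical model)."""
    host: str
    port: int
    components: list = field(default_factory=list)  # tag names only

def validate_iogr(profiles):
    """Return a list of rule violations; an empty list means well formed."""
    errors = []
    if not profiles:
        errors.append("an IOGR needs at least one TAG_INTERNET_IOP profile")
    for p in profiles:
        # Every profile must carry the group component.
        if "TAG_FT_GROUP" not in p.components:
            errors.append(f"{p.host}:{p.port} lacks TAG_FT_GROUP")
    # At most one profile may be marked as the primary.
    primaries = [p for p in profiles if "TAG_PRIMARY" in p.components]
    if len(primaries) > 1:
        errors.append("more than one profile carries TAG_PRIMARY")
    return errors

good = [Profile("hostA", 2809, ["TAG_FT_GROUP", "TAG_PRIMARY"]),
        Profile("hostB", 2809, ["TAG_FT_GROUP"])]
bad = [Profile("hostA", 2809, ["TAG_PRIMARY"]),
       Profile("hostB", 2809, ["TAG_FT_GROUP", "TAG_PRIMARY"])]
```

Here `validate_iogr(good)` returns no violations, while `validate_iogr(bad)` reports both the missing group component and the duplicate primary.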

Page 6: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

DOORS & FT-CORBA

• DOORS is a “Distributed OO Reliable Service” developed prior to FT-CORBA
  • Uses the service strategy to provide FT to CORBA objects
• Patterns and mechanisms in DOORS were integrated into the FT-CORBA standard
• DOORS implements most of the FT-CORBA standard
  • Focus on passive replication
  • Available as open-source for non-commercial use from Lucent
• Runs atop the TAO open-source real-time ORB
  • www.theaceorb.com

Page 7: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

FT-CORBA Component Interaction

1. External object asks RM to set properties for replica group and create it
2. RM delegates replica creation to local factories
3. The local factories create CORBA objects
4. The local factories send the replica IORs to the RM for it to create the IOGR
5. The RM registers the IOGR with a CORBA Naming Service (NS)
6. The RM asks fault detectors to initiate fault monitoring of replicas
7. Clients contact the NS for IOGR
8. Client sends requests to the primary
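The eight interaction steps can be traced with a toy, in-process model. Everything here is a hypothetical stand-in written in Python: the class names, the string "IORs", the dictionary "IOGR", and the `initial_replicas` property are assumptions for illustration, not the FT-CORBA IDL interfaces or the DOORS implementation.

```python
class LocalFactory:
    def __init__(self, host):
        self.host = host

    def create_object(self, type_id):
        # Steps 3-4: create the replica and hand back a stand-in IOR.
        return f"IOR:{type_id}@{self.host}"

class NamingService:
    def __init__(self):
        self.bindings = {}

    def bind(self, name, ref):
        self.bindings[name] = ref

    def resolve(self, name):
        return self.bindings[name]

class ReplicationManager:
    def __init__(self, factories, naming, monitored):
        self.factories = factories
        self.naming = naming
        self.monitored = monitored   # stands in for the fault detectors

    def create_group(self, name, type_id, properties):
        # Step 2: delegate replica creation to the local factories.
        count = properties["initial_replicas"]
        iors = [f.create_object(type_id) for f in self.factories[:count]]
        # Step 4: compose the IOGR; the first replica acts as primary.
        iogr = {"primary": iors[0], "members": iors}
        self.naming.bind(name, iogr)      # step 5
        self.monitored.extend(iors)       # step 6
        return iogr

naming, monitored = NamingService(), []
rm = ReplicationManager([LocalFactory("hostA"), LocalFactory("hostB")],
                        naming, monitored)
rm.create_group("Tracker", "IDL:Tracker:1.0", {"initial_replicas": 2})  # step 1
iogr = naming.resolve("Tracker")   # step 7
target = iogr["primary"]           # step 8: requests go to the primary
```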

Page 8: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Fault Detection and Recovery

1. Fault detector detects failure of primary
2. Detector propagates fault to Notifier
3. Notifier pushes fault to RM
4. RM promotes backup to primary
5. RM requests local factory to create a new backup and gets new IOR
6. RM creates new IOGR and informs all replicas of it
7. RM registers new IOGR with NS
8. Client sends request to old primary
9. Old primary throws exception LOCATION_FORWARD
10. Client sends request to new primary

[Figure labels: PRIMARY, BACKUP]
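Steps 8 to 10 amount to the client transparently retrying at the forwarded location. A minimal Python sketch of that redirection follows; it assumes LOCATION_FORWARD can be modeled as an ordinary exception, and the `Replica` class, the `send` helper, and the `max_forwards` bound are hypothetical names introduced for illustration.

```python
class LocationForward(Exception):
    """Stand-in for a LOCATION_FORWARD reply carrying the new target."""
    def __init__(self, new_target):
        super().__init__("forwarded")
        self.new_target = new_target

class Replica:
    def __init__(self, name):
        self.name = name
        self.forward_to = None   # set once this replica is no longer primary

    def invoke(self, request):
        if self.forward_to is not None:
            raise LocationForward(self.forward_to)   # step 9
        return f"{self.name} handled {request}"

def send(target, request, max_forwards=3):
    """Follow forwards on the caller's behalf, up to a fixed bound."""
    for _ in range(max_forwards + 1):
        try:
            return target.invoke(request)
        except LocationForward as fwd:
            target = fwd.new_target   # step 10: retry at the new primary
    raise RuntimeError("too many forwards")

old_primary, new_primary = Replica("replica-1"), Replica("replica-2")
old_primary.forward_to = new_primary    # outcome of steps 4-7
reply = send(old_primary, "get_status")  # step 8
```

Bounding the number of forwards keeps the retry loop from spinning if two stale references point at each other.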

Page 9: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Optimization Opportunities to Improve Fault Tolerant CORBA Performance

ORB Core Optimizations
• Efficient IOGR parsing & connection establishment
• Reliable handling & ordering of GIOP messages
• Predictable behavior during transparent connection establishment & retransmission
• Tracking requests with respect to the server object group

CORBA Service Optimizations
• Support for dynamic system configuration
• Bounded recovery time
• Minimize overhead of FT CORBA components

Page 10: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Effect of Polling Interval on Failure Detection Times

Fault detection time is measured as the time between the failure of a replica & the FaultDetector detecting the failure

Analysis
• Failure detection time increases with the polling interval
• Average failure detection time is half the polling interval

Challenge
• Choosing a small polling interval
• Minimizing message overhead
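The half-interval figure and the competing message overhead can be checked with a short calculation. This Python sketch assumes that failures are equally likely at any point within a polling interval and that poll and reply latency are negligible; the function names and sample values are hypothetical.

```python
import math

def detection_delay(failure_time, polling_interval):
    """Time from a failure until the next poll, when it is noticed."""
    next_poll = math.ceil(failure_time / polling_interval) * polling_interval
    return next_poll - failure_time

def average_delay(polling_interval, samples=1000):
    """Average delay over failure times spread evenly across one interval."""
    times = [(i + 0.5) * polling_interval / samples for i in range(samples)]
    return sum(detection_delay(t, polling_interval) for t in times) / samples

def polls_per_second(replicas, polling_interval):
    """Message overhead: one poll per replica per interval."""
    return replicas / polling_interval
```

Under these assumptions `average_delay(2.0)` comes out at about 1.0, half the interval, while halving the interval doubles `polls_per_second`, which is the tension named under Challenge.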

Page 11: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Effect of Polling Interval on Recovery Time

Recovery Time = Failure Detection Time + Replica Group Management Time

Analysis
• Average failure detection time is half the polling interval
• Replica Group Management time is constant

Challenge
• Minimize replica group management time
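Combining the two analysis bullets gives the expected recovery time. The symbols below are introduced here for illustration and do not appear on the slide: T_poll is the polling interval and T_rgm is the constant replica group management time.

```latex
\[
  \mathbb{E}[T_{\text{recovery}}]
    = \mathbb{E}[T_{\text{detect}}] + T_{\text{rgm}}
    = \frac{T_{\text{poll}}}{2} + T_{\text{rgm}}
\]
```

With purely illustrative values T_poll = 2 s and T_rgm = 0.5 s, the expected recovery time would be 1.5 s. Once the polling interval is small, the constant term dominates, which is why the slide singles out replica group management time.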

Page 12: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Overview of Patterns

Patterns codify expert knowledge to help generate software architectures by capturing recurring structures & dynamics and resolving common design forces

• Design patterns capture the static & dynamic roles & relationships in solutions that occur repeatedly
• Architectural patterns express a fundamental structural organization for software systems: they provide a set of predefined subsystems, specify their relationships, & include the rules and guidelines for organizing the relationships between them
• Optimization principle patterns document rules for avoiding common design & implementation mistakes that degrade performance

www.posa.uci.edu/

Page 13: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Decoupling Polling and Recovery

[Figure labels: Fault Detector, Polling Thread, Replica (×4), Fault Notifier, HANGS]

Context
• Periodic polling & recovery request done in the same polling thread can block the thread

Problem
• Blocking can cause missed polls

Forces
• Must guarantee polling of other objects while recovery request is sent
• Must minimize concurrency overhead

Solution
• Apply the Leader-Followers or AMI architectural pattern
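The effect of the decoupling can be sketched by handing fault reports to a separate worker so the polling loop never waits on them. The Python below is closer in spirit to an asynchronous hand-off than to a full Leader/Followers thread pool, and it is not the DOORS code; `poll_all`, `slow_notify`, and the replica names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def poll_all(replicas, is_alive, notify, executor):
    """Poll every replica once; fault reports are submitted to `executor`
    instead of being sent from the polling thread."""
    polled, pending = [], []
    for replica in replicas:
        polled.append(replica)
        if not is_alive(replica):
            pending.append(executor.submit(notify, replica))
    return polled, pending

release = threading.Event()
reported = []

def slow_notify(replica):
    release.wait()            # stands in for a report that blocks
    reported.append(replica)

with ThreadPoolExecutor(max_workers=1) as pool:
    polled, pending = poll_all(["r1", "r2", "r3"], lambda r: r != "r1",
                               slow_notify, pool)
    finished_early = any(f.done() for f in pending)
    release.set()             # let the blocked report complete
# Leaving the with-block waits for the outstanding report to finish.
```

Even though the report for `r1` is still blocked when polling ends, all three replicas have been polled, which is the property the Forces section asks for.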

Page 14: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Decoupling Recovery Initiation From Recovery Execution

Context
• Replication Manager serializes failure reports

Problem
• Reduced responsiveness

Forces
• Bounded amount of time for failure recovery irrespective of number of failure reports

Solution
• Apply the Active Object design pattern
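A minimal Active Object can be sketched as a queue of requests drained by a dedicated thread, so that reporting a failure returns immediately while recovery runs elsewhere. This Python sketch is an assumption-laden illustration: `RecoveryActiveObject`, `report_failure`, and the recovery callback are hypothetical names, and `Future` objects are created directly only to keep the example short.

```python
import queue
import threading
from concurrent.futures import Future

class RecoveryActiveObject:
    """Callers enqueue failure reports; one scheduler thread runs recovery."""

    def __init__(self, recover):
        self._recover = recover
        self._requests = queue.Queue()   # activation queue
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def report_failure(self, replica):
        """Return at once with a Future for the recovery outcome."""
        result = Future()
        self._requests.put((replica, result))
        return result

    def shutdown(self):
        self._requests.put(None)         # sentinel stops the scheduler
        self._thread.join()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:
                break
            replica, result = item
            try:
                result.set_result(self._recover(replica))
            except Exception as exc:
                result.set_exception(exc)

rm = RecoveryActiveObject(lambda replica: f"promoted backup for {replica}")
futures = [rm.report_failure(r) for r in ("r1", "r2")]
outcomes = [f.result(timeout=5) for f in futures]
rm.shutdown()
```

The reporter is decoupled from execution time: `report_failure` costs one queue insertion regardless of how long each recovery takes.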

Page 15: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Supporting Interchangeable Behavior

Context
• FT properties can be set statically (as defaults) or set dynamically

Problem
• Hard-coding properties makes the FT-CORBA design inflexible & non-extensible

Forces
• Need highly extensible services that can be composed transparently from configurable properties

Solution
• Apply the Strategy design pattern
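As an illustration of the Strategy pattern in this setting, the replication style can be an interchangeable object rather than a hard-coded branch. The Python classes below are hypothetical and model only one decision, namely which replicas receive a request under passive versus active replication.

```python
class ReplicationStyle:
    """Strategy interface: decides which replicas receive a request."""
    def targets(self, members, primary):
        raise NotImplementedError

class PassiveReplication(ReplicationStyle):
    def targets(self, members, primary):
        return [primary]          # only the primary executes requests

class ActiveReplication(ReplicationStyle):
    def targets(self, members, primary):
        return list(members)      # every replica executes requests

class ObjectGroup:
    def __init__(self, members, style):
        self.members = members
        self.primary = members[0]
        self.style = style        # can be replaced at run time

    def request_targets(self):
        return self.style.targets(self.members, self.primary)

group = ObjectGroup(["r1", "r2", "r3"], PassiveReplication())
passive_targets = group.request_targets()
group.style = ActiveReplication()     # property changed dynamically
active_targets = group.request_targets()
```

Swapping `group.style` changes behavior without touching `ObjectGroup`, which is the flexibility the Forces section calls for.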

Page 16: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Consolidating Strategies

Context
• FT CORBA implementations can have many properties
  • E.g., membership, replication, consistency, monitoring, # of replicas, etc.

Problem
• Risk of combining semantically incompatible properties

Forces
• Ensure semantically compatible properties
• Simplify management of properties

Solution
• Apply the Abstract Factory design pattern
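One way to picture the Abstract Factory here is a factory per family of properties, so that callers cannot mix values from different families. In this Python sketch the two families and the rule that membership and consistency control must match are assumptions made for illustration, not requirements quoted from the FT-CORBA standard; all class and function names are hypothetical.

```python
from abc import ABC, abstractmethod

class PropertyFactory(ABC):
    """Creates one family of FT properties intended to be used together."""

    @abstractmethod
    def membership_style(self): ...

    @abstractmethod
    def consistency_style(self): ...

    def create_properties(self, replicas):
        return {"membership": self.membership_style(),
                "consistency": self.consistency_style(),
                "replicas": replicas}

class InfrastructureControlled(PropertyFactory):
    def membership_style(self):
        return "infrastructure"

    def consistency_style(self):
        return "infrastructure"

class ApplicationControlled(PropertyFactory):
    def membership_style(self):
        return "application"

    def consistency_style(self):
        return "application"

def create_group_properties(factory, replicas=2):
    # Client code sees only the abstract interface.
    return factory.create_properties(replicas)

props = create_group_properties(InfrastructureControlled(), replicas=3)
```

Because each concrete factory produces a whole family at once, an incompatible pairing can only be introduced by writing a new factory, which makes it easy to review.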

Page 17: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Dynamic Configuration

Context
• There are many potential FT properties that can be used

Problem
• Static configuration of properties is inflexible & overly resource intensive

Forces
• The behavior of FT-CORBA properties should be decoupled from the time when they are actually configured

Solution
• Apply the Component Configurator design pattern
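The Component Configurator idea can be sketched as a registry that links components in and out at run time through uniform lifecycle hooks. The Python below is a toy: the `load`/`remove` directive syntax, the `PullMonitor` component, and its interval argument are hypothetical and are not the configuration format of any real ORB.

```python
class Component:
    """Lifecycle hooks that the configurator calls on every component."""
    def init(self, args):
        pass

    def fini(self):
        pass

class PullMonitor(Component):
    def init(self, args):
        self.interval = float(args[0]) if args else 1.0
        self.running = True

    def fini(self):
        self.running = False

class Configurator:
    def __init__(self, available):
        self.available = available   # name -> component class
        self.active = {}             # name -> live component

    def process(self, directive):
        action, name, *args = directive.split()
        if action == "load":
            component = self.available[name]()
            component.init(args)
            self.active[name] = component
        elif action == "remove":
            self.active.pop(name).fini()
        else:
            raise ValueError(f"unknown directive: {directive}")

config = Configurator({"pull_monitor": PullMonitor})
config.process("load pull_monitor 0.5")
monitor = config.active["pull_monitor"]
config.process("remove pull_monitor")
```

Only components named in a directive consume resources, and the choice of what to run is deferred from build time to the moment a directive is processed.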

Page 18: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Efficient Property Name-Value Lookup

Context
• FT-CORBA mandates a hierarchical lookup of properties based on strings
• Property lookup is required during object group creation & recovery

Problem
• Inefficient property lookup degrades QoS

Forces
• Efficient lookups of properties guided by the order specified in the FT-CORBA standard

Solution
• Use the Chain of Responsibility design pattern & Perfect Hashing optimizations
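A hierarchical lookup maps naturally onto a chain in which each level answers if it holds the property and otherwise defers to its successor. In this Python sketch the three level names and the sample property values are illustrative assumptions, and an ordinary dictionary stands in for the perfect hash function; generating a true perfect hash (for example with a tool such as gperf) is not shown.

```python
class PropertyLevel:
    """One handler in the chain; defers to `successor` on a miss."""

    def __init__(self, name, values, successor=None):
        self.name = name
        self.values = values        # dict stands in for a perfect hash table
        self.successor = successor

    def lookup(self, key):
        if key in self.values:
            return self.values[key], self.name
        if self.successor is None:
            raise KeyError(key)
        return self.successor.lookup(key)

# Most specific level first; each level falls back to the next.
domain = PropertyLevel("domain default",
                       {"ReplicationStyle": "COLD_PASSIVE",
                        "MinimumNumberReplicas": 2})
by_type = PropertyLevel("type", {"ReplicationStyle": "WARM_PASSIVE"},
                        successor=domain)
group = PropertyLevel("object group", {}, successor=by_type)

style = group.lookup("ReplicationStyle")
minimum = group.lookup("MinimumNumberReplicas")
```

The chain fixes the search order in one place, while the per-level table determines the cost of each individual probe.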

Page 19: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Research Directions

Middleware for ad hoc/wireless networks
• FT CORBA enhancements for JINI-like systems
• CORBA Pluggable Protocol for Bluetooth devices
• Middleware enhancements for 3G wireless/mobile internet

Fault tolerance
• Sequenced initialization and recovery (dealing with object dependencies)
• Handling failure groups and collocated groups
• Fault escalation strategies and fault analysis
• Growth/degrowth, runtime upgrades

QoS-enabled framework of middleware components
• Higher level middleware framework shielding applications from lower level middleware
• Multi-dimensional QoS support
• Patterns-based architecture of plug & play components
• Code generation tools for repetitive tasks

Page 20: Patterns-based Fault Tolerant CORBA Implementation for Predictable Performance

Concluding Remarks

| # | Problem | Pattern | Category |
|---|---------|---------|----------|
| 1 | Missed polls in FaultDetector | Leader/Followers, AMI | Architectural |
| 2 | Excessive overhead of recovery | Active Object | Design |
| 3 | Excessive overhead of service lookup | Optimize for common case | Optimization |
|   |  | Store extra information | Optimization |
|   |  | Eliminate gratuitous waste | Optimization |
| 4 | Tight coupling of data structures | Strategy | Design |
|   |  | Abstract Factory | Design |
| 5 | Lack of dynamic configuration | Component Configurator | Design |
| 6 | Inefficient property value lookup | Chain of Responsibility | Design |
|   |  | Perfect hash functions | Optimization |

• Researchers & developers of distributed systems face common challenges, e.g.:
  • Connection management, service initialization, error handling, flow control, event demuxing, distribution, concurrency control, fault tolerance, synchronization, scheduling, & persistence
• The application of patterns, frameworks, & components can help to resolve these challenges
• Carefully applying these techniques can yield efficient, scalable, predictable, dependable, & flexible middleware & applications
