
Integrating Transaction and Replication Management Services in CORBA

Hadi Salimi
Computer Engineering Department
Iran University of Science and Technology
Tehran, Iran
[email protected]
January 14, 2006

Where Are We?

(Diagram)
Object Orientation (60s), Fault Tolerant Software (70s), Distributed Systems (80s)
CORBA (90s)
    Availability: FT-CORBA
    Safety: OTS
We Are Here!!!

CORBA

It stands for Common Object Request Broker Architecture.
OMG (a consortium of over 800 companies) has produced the CORBA standard.
CORBA [1] allows objects to invoke services from other objects, hiding differences in location, programming language, or platform.

CORBA, A Bird's-Eye View

(Layered diagram, top to bottom; adopted from [2])
Application Services: Part Manager, Accounting Systems, Financial Services
Domain-Dependent Object Services: Weather, EMS, WFMS, ...
Object Services: Transactions, Concurrency, Events, Persistence, Security, Naming Service, ...
Object Request Broker
Distributed Computing Services (RPC, Message Passing)
Networking (TCP/IP)
Operating System/Hardware

CORBA Features

Uniform view of the world: everything is an object
Uniform view of communication: via request invocation
Communication between objects is independent of:
    Programming languages
    Physical locations
    Platform types
    Networking protocols
Provides useful Object Services:
    change management, naming, transactions, security, query, etc.

Interoperable Object Reference (IOR)

(Diagram: IOR layout)
IOR: a sequence of Profiles
Profile: IIOP Version, Host, Port, Object Key, Components
Object Key: POA ID, Object ID
Component: Service Specific Info

IORs are used by the ORB to transparently locate objects.
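The layout above can be pictured as a small data structure. The following is only an illustrative C++ sketch: the type and field names (`Ior`, `IiopProfile`, `ObjectKey`, `TaggedComponent`, `endpointOf`) are hypothetical, and it models the fields named on the slide rather than the actual on-the-wire encoding.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Identifies the target object inside the server (hypothetical field names).
struct ObjectKey {
    std::string poaId;
    std::string objectId;
};

// Optional, service-specific data attached to a profile.
struct TaggedComponent {
    std::uint32_t tag;
    std::string serviceSpecificInfo;
};

// One way of reaching the object over IIOP.
struct IiopProfile {
    std::uint8_t versionMajor;
    std::uint8_t versionMinor;
    std::string host;
    std::uint16_t port;
    ObjectKey objectKey;
    std::vector<TaggedComponent> components;
};

// An IOR carries one or more profiles.
struct Ior {
    std::vector<IiopProfile> profiles;
};

// Returns the "host:port" endpoint a client would connect to for this profile.
std::string endpointOf(const IiopProfile& profile)
{
    return profile.host + ":" + std::to_string(profile.port);
}
```

A client-side ORB would pick a profile from the reference, connect to its endpoint, and pass the object key along with the request.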

Software Fault Tolerance

Dependability (adopted from [3])
    Impairments: Fault, Error, Failure
    Means: Construction, Validation
    Attributes: Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability

Safety

Software safety can be defined [7] as the features and procedures which ensure that the likelihood of an unplanned event is minimized and its consequences are controlled, thereby preventing accidental injury or death, whether intentional or unintentional.

Availability

Measure of the time that the system is available for use
A = MTBF / (MTBF + MTTR), where
    MTBF is Mean Time Between Failures
    MTTR is Mean Time to Repair
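As a worked instance of the availability formula, the sketch below computes A from MTBF and MTTR. The function name `availability` and the sample figures are illustrative assumptions, not values from the thesis.

```cpp
#include <stdexcept>

// Steady-state availability: A = MTBF / (MTBF + MTTR).
// Both arguments must be expressed in the same time unit (e.g. hours).
double availability(double mtbf, double mttr)
{
    if (mtbf < 0.0 || mttr < 0.0 || mtbf + mttr == 0.0)
        throw std::invalid_argument("MTBF and MTTR must be non-negative and not both zero");
    return mtbf / (mtbf + mttr);
}
```

For a hypothetical system with an MTBF of 990 hours and an MTTR of 10 hours, A = 990 / (990 + 10) = 0.99.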

Software Techniques

(Diagram: dependable software attributes and the techniques that support them)
    Safety: Transactions
    Availability: Redundancy (Time, Data, Software)
    …

Transaction

It was originally developed in the context of database management systems.
From a database point of view, it is a very elegant way to keep the data consistent even in the presence of highly concurrent data accesses [4].

"There are two mistakes one can make along the road to truth. Not going all the way and not starting." ---Buddha

Transaction (cont.)

Distributed Systems' point of view:
    They allow a process to access and modify multiple distributed resources as a single atomic operation [8].
    If the process backs out halfway during the transaction, everything is restored to the point just before the transaction started.
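The all-or-nothing behaviour described above can be illustrated with a deliberately simplified, single-process sketch. Everything here is an assumption made for illustration: the resources are entries of an in-memory map, and the names `Accounts` and `transfer` are hypothetical. A real OTS transaction spans remote resources and uses two-phase commit instead of a local snapshot.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using Accounts = std::map<std::string, long>;

// Moves `amount` between two accounts as one atomic step: either both
// updates take effect, or the accounts are restored to their prior state.
bool transfer(Accounts& accounts, const std::string& from,
              const std::string& to, long amount)
{
    const Accounts snapshot = accounts;   // state just before the transaction
    try {
        accounts.at(from) -= amount;      // first update
        if (accounts.at(from) < 0)
            throw std::runtime_error("insufficient funds");
        accounts.at(to) += amount;        // second update; throws if `to` is unknown
        return true;                      // commit
    } catch (const std::exception&) {
        accounts = snapshot;              // back out: restore the starting point
        return false;
    }
}
```

If the second update fails after the first has already been applied, the first is undone as well, so no observer sees a half-finished transfer.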

Redundancy

Software redundancy includes additional programs, modules, functions, or objects used to support fault tolerance [9].
Kinds of redundancy: Data, Time, Software

CORBA & FT

Transactions + CORBA: Object Transaction Service
Redundancy + CORBA: Fault Tolerant CORBA

OTS Architecture [10]

(Diagram)
Transaction Client: begins or ends transactions
Transaction Server: hosts the Transactional Object; not involved in transaction completion
Recoverable Server: hosts the Recoverable Object and its Resource; registers Resources and takes part in the 2-Phase Commit Protocol
Transaction Service (Transaction Coordinator): the transaction context flows between it and each participant

FT-CORBA [11]

In 1998, OMG issued the RFP
In early 2000, the first version was released
The last version: December 2001
Using:
    Replication
    Object Groups

Replication

The various approaches to fault-tolerant CORBA are alike in their use of replication [12].
The idea behind it is to mask the failure of an object by creating extra copies of it.
In the case of a failure, the fault-tolerant ORB transparently redirects a failed request to a live replica.
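The redirection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the FT-CORBA mechanism itself: a replica is modeled as a callable that reports success or failure, and the names `Replica` and `invokeWithFailover` are hypothetical.

```cpp
#include <functional>
#include <string>
#include <vector>

// A replica handles a request and fills in `reply`; it returns false if it has failed.
using Replica = std::function<bool(const std::string& request, std::string& reply)>;

// Sends the request to the group members in turn and stops at the first one
// that answers. Returns false only if every replica in the group has failed.
bool invokeWithFailover(const std::vector<Replica>& group,
                        const std::string& request, std::string& reply)
{
    for (const Replica& replica : group) {
        if (replica(request, reply))
            return true;   // a live replica answered
        // this replica failed: redirect the same request to the next one
    }
    return false;          // no live replica left
}
```

The caller sees a single successful reply and does not need to know which replica produced it.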

Replication Models

(Diagram)
Object Replication
    Active: Rep 1, Rep 2, Rep 3
    Passive (Warm or Cold): Primary, Backup 1, Backup 2
How to keep replicas consistent?

Object Groups

An object group represents a replicated object, and the group members represent the individual replicas of the object.
Each object group has an Interoperable Object Group Reference (IOGR).
Using an IOGR, a fault-tolerant ORB is capable of accessing each object replica.

FT-CORBA Architecture

(Diagram)
Replication Manager: Property Manager, Object Group Manager, Generic Factory
Fault Notifier
Fault Detector
Host H: Client Object C
Host H1: Server Replica S1, Fault Detector, Factory, Logging
Host H2: Server Replica S2, Fault Detector, Factory, Logging
ORB

FT-CORBA Implementations

Delta-4 [13]: 1990, supported by CEC through the ESPRIT Project
Arjuna [14]: 1994, Newcastle University
Orbix-Isis [15]: 1994, IONA Technologies Co.
Electra [16]: 1995, Zurich University
DOORS [17]: 1997, Bell Labs Research
OGS [31]: 1998, Swiss Federal Institute of Technology
IRL [18]: 1999, Rome University
AQuA [30]: 1999, Illinois University
Eternal [19]: 2001, US Air Force Research Lab.

Why Integration?

            Replication     Transaction
Standard    FT-CORBA        OTS
Tier        Control Tier    Data Tier
Attribute   Availability    Safety
Technique   Roll-forward    Roll-back

Roll-Forwarding

Replication has brought a new termination style into the transaction literature.
Roll-forwarding means that, although there are errors during the execution of a transaction, the transaction can be committed safely, just by redirecting the failed request to a fresh replica.
(Figure: execution steps 0 to 3, ending in "committed")

Related Work

The new models of replication and transaction are a bit different from their traditional concepts.
As an example:
    Concurrency is considered separately.
    The support of transaction server variety is essential.
    The granularity of transactional elements is coarser.

Related Work (cont.)

So, the related work can be categorized into:
    Integration of traditional concepts
        Not of interest here; see [21], [22], [23], [24], [25] and [26].
    Integration of new concepts
        Has mainly focused on reconciling replication and transactions in distributed systems, especially those built by means of CORBA.

Related Work 1

Felber and Narasimhan, who are pioneers in FT-CORBA, have recently issued a new publication [27].
They have indicated that reconciling replication and transaction concepts in CORBA is still an open issue.

Related Work 2

Frolund and Guerraoui have claimed [28] that the current standards cannot be integrated due to some limitations:
    OTS does not support fine-grained replication.
    Duplicate elimination cannot be guaranteed.
    Output determinism problem.

Related Work 3

Felber and Narasimhan have introduced a protocol [20] in order to use transactional operations on replicated objects.
They have clarified their protocol with a simple bank balance transfer operation example.

Related Work 3 (cont.)

(Figure: Roll-Forward Approach)

Related Work 3 (cont.)

(Figure: Roll-Back Approach)

Related Work 3 (cont.)

(Figure: their proposed approach)

Related Work 4

Zhao and Moser [29] have shown that sending a multicast message to all transaction resources can decrease the overhead of the 2-phase commit protocol.

Related Work 4 (cont.)

(Figure)

Why they do not add up?

Almost all of them suppose that middle-tier objects are stateless.
All of them use total order multicast protocols. The drawback of these protocols will be highlighted.
All of them suppose the existence of nested transactions.

Total Order Multicast Protocols

(Diagram: Client Object O1 calling Object Replicas; Object O2 crashes)
Suppose that O1 intends to make an atomic call to some objects.
What will happen if O2 fails during this operation?

Total-Order Multicast Protocols

(Diagram: client C updating groups G1 and G2)
What will happen if C fails after updating G1, but just before updating G2?

Nested Transaction

A nested transaction can be committed successfully even if a failure occurs in one of its sub-transactions.

Replication-Aware Transactions (RATs)

From another point of view, any failure in the scope of a transaction that is executing on a group of replicated objects can be easily ignored by using Replication-Aware Transactions (RATs):
    In the case of stateless objects, redirecting a failed request to a live replica is the remedy.
    But in the case of stateful objects, this technique is not enough.

Replication-Aware Transactions (cont.)

In the case of stateful objects, omitting the object from its group and recreating it will solve the problem.
We call this termination style roll-over.
For each object recreation, the state of the object should be transmitted.
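A rough sketch of the roll-over step is given below. It is an illustration only: the names `Replica` and `rollOver` are hypothetical, the state is reduced to a single number, and the sketch assumes that the state is copied from a surviving group member, which the slides do not spell out.

```cpp
#include <vector>

// A stateful replica; `alive` becomes false once the replica has crashed.
struct Replica {
    long state;
    bool alive;
};

// Replaces every failed member of the group by a freshly created replica that
// receives the state of a surviving member. Returns the number of recreated
// replicas, or -1 if no live member is left to take the state from.
int rollOver(std::vector<Replica>& group)
{
    const Replica* survivor = nullptr;
    for (const Replica& r : group) {
        if (r.alive) {
            survivor = &r;
            break;
        }
    }
    if (survivor == nullptr)
        return -1;                       // group failure: nothing to copy from

    const long state = survivor->state;  // the state that must be transmitted
    int recreated = 0;
    for (Replica& r : group) {
        if (!r.alive) {
            r = Replica{state, true};    // drop the failed object, create a fresh one
            ++recreated;
        }
    }
    return recreated;
}
```

The cost of each recreation grows with the size of the transmitted state, which is why object size matters for the performance of this termination style.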

Roll back-forward-over

(Figure: three execution traces)
(a) Roll-back: Ordinary Object; ends not committed
(b) Roll-Forward: Stateless Replicated Object; ends committed
(c) Roll-over: Stateful Replicated Object; ends committed

OTS Extension

void
ResourceManager::registerResource(
    CosTransactions::Resource_ptr r, const char* name) throw()
{
    ResourceRecord record;
    record.name = name;
    // ... other initializations for record
    // A resource whose name carries the marker belongs to a replicated object.
    if (strstr(name, "~$Replicated$~"))
        record.setReplicated(true);
    else
        record.setReplicated(false);
    resources_.push_back(record);
}

CosTransactions::Vote
ResourceManager::prepare()
{
    …..
    switch (v)
    {
    case CosTransactions::VoteRollback:
        // A rollback vote from a replicated resource removes that resource
        // instead of being returned as the outcome.
        if (res->isReplicated())
            resources_.Remove(res);
        else
            return v;
    // other cases come here
    }
}

Fault Detection

(Diagram: a Fault Detector monitoring a Resource Object and a Replicated Object on Host H)
How to detect the:
    Crash of H
    Crash of Replicated Object
    Crash of Resource Object

Group Failure

Resource name: "~$Replicated$~" (constant) followed by the Replica Object Group Identifier (variable)
What will happen if all replica objects fail due to a:
    Network failure
    Rare but possible group failure
In this case, the resource objects that belong to an object group should be distinguishable.
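One way to make the resources of a group distinguishable is to read the group identifier back out of the resource name. The sketch below assumes that the identifier directly follows the constant marker; the helper names `makeReplicatedName` and `parseGroupId` are hypothetical and not part of OTS.

```cpp
#include <string>

// The constant part of the name used to mark resources of replicated objects.
const std::string kReplicatedMarker = "~$Replicated$~";

// Builds a resource name from the constant marker and a group identifier.
std::string makeReplicatedName(const std::string& groupId)
{
    return kReplicatedMarker + groupId;
}

// Extracts the group identifier from a resource name.
// Returns false if the name does not carry the marker.
bool parseGroupId(const std::string& name, std::string& groupId)
{
    const std::string::size_type pos = name.find(kReplicatedMarker);
    if (pos == std::string::npos)
        return false;
    groupId = name.substr(pos + kReplicatedMarker.size());
    return true;
}
```

With the identifier available, a coordinator could tell that all remaining rollback votes come from the same group and treat that as a group failure rather than as independent replica failures.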

Replication Style Support

Active Replication:
    To update a group of actively replicated objects, RATs can be helpful.
(Diagram: client C updating Group G1)

Replication Style Support (cont.)

Warm-Passive Replication:
    In this case, the primary object should update the backup objects in the scope of a RAT.
(Diagram: Primary Object O1 performs an update operation on the Backup Objects)

Replication Style Support (cont.)

Cold-Passive Replication:
    Checkpointing should be performed in the scope of the transaction in which the primary object participates.
(Diagram: Primary Object, Current Transaction, Backup Objects)

Implementation

My implementations:
    Extending OTS in such a way that it can support the roll-over approach.
    Implementing a light-weight Replication Manager.
    Implementing a prototype to evaluate our model.

Replication Manager

Our implemented replication manager:
    Keeps an object factory for each host
    Recreates objects whenever they fail
    Keeps object groups as a bunch of replica objects
    Assigns each object group a set of properties (e.g. minimum number of replica objects)
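The responsibilities listed above can be sketched as a small class. This is not the thesis implementation: all names (`ReplicaInfo`, `ObjectGroup`, `ReplicationManager`, `repair`) are hypothetical, and a replica is reduced to a host name and a liveness flag.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct ReplicaInfo {
    std::string host;
    bool alive;
};

struct ObjectGroup {
    std::size_t minReplicas = 0;       // group property: minimum number of replicas
    std::vector<ReplicaInfo> members;  // the replica objects of the group
};

class ReplicationManager {
public:
    // One factory per host; calling it creates a fresh replica on that host.
    void registerFactory(const std::string& host, std::function<ReplicaInfo()> factory)
    {
        factories_[host] = factory;
    }

    // Returns the group with this identifier, creating an empty one if needed.
    ObjectGroup& group(const std::string& groupId) { return groups_[groupId]; }

    // Recreates every failed member through the factory of its host.
    // Returns true if the group then satisfies its minimum-replica property.
    bool repair(const std::string& groupId)
    {
        ObjectGroup& g = groups_[groupId];
        for (ReplicaInfo& r : g.members) {
            auto it = factories_.find(r.host);
            if (!r.alive && it != factories_.end())
                r = it->second();
        }
        std::size_t live = 0;
        for (const ReplicaInfo& r : g.members)
            if (r.alive)
                ++live;
        return live >= g.minReplicas;
    }

private:
    std::map<std::string, std::function<ReplicaInfo()>> factories_;
    std::map<std::string, ObjectGroup> groups_;
};
```

A failed replica on a host without a registered factory stays failed in this sketch, so `repair` can still report that the group is below its minimum.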

The Prototype

(Diagram, adopted from [29])
Client with Client Interceptor
Extended OTS Manager
Replication Manager
Account A Replica Objects
Account B Replica Objects
MS SQL Server DBMS

Measurement Parameters

We ran the implemented prototype while varying:
    Number of replica objects (N)
    Failure probability of a replica object (P)
    Transaction model (TM)
And for each run we measured:
    The overall transaction throughput (T), in terms of the number of committed transactions per second.
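The throughput measure can be written down directly. The helper below is a minimal sketch; the function name `throughput` and the sample numbers are illustrative assumptions, not measurements from the prototype.

```cpp
#include <stdexcept>

// Transaction throughput T: committed transactions per second of run time.
double throughput(long committedTransactions, double elapsedSeconds)
{
    if (committedTransactions < 0 || elapsedSeconds <= 0.0)
        throw std::invalid_argument("need a non-negative count and a positive duration");
    return static_cast<double>(committedTransactions) / elapsedSeconds;
}
```

For instance, a hypothetical run that commits 30 transactions in 20 seconds has T = 1.5; aborted transactions do not count toward T.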

Performance Evaluation

We compared the transaction throughput of our approach with the transaction throughput of:
    Felber's approach [20], which uses nested transactions
    Zhao's approach [29], which uses the roll-back approach.

Experimental Results

(Chart, N=2: throughput T versus failure probability P for Roll-Back, Roll-Over, and Nested)

Experimental Results (cont.)

(Chart, N=3: throughput T versus failure probability P for Roll-Back, Roll-Over, and Nested)

Experimental Results

(Chart, N=4: throughput T versus failure probability P for Roll-Back, Roll-Over, and Nested)

Experimental Results

(Chart, N=5: throughput T versus failure probability P for Roll-Back, Roll-Over, and Nested)

All together

(Four charts, N=2, N=3, N=4, and N=5: throughput T versus failure probability P)

Object Size

The object size can affect the performance of RATs, because the object state needs to be transmitted on each object recreation.
(Chart: throughput T versus failure probability P for Roll-Back and for object sizes of 100B, 1K, and 10K)

Conclusion

We showed that current replica consistency techniques that are based on total order multicast protocols cannot guarantee system safety.
We presented a new transaction model that can be applied to replicated objects.
    In this model, a failure in the scope of a transaction that is running on a group of replicated objects can be ignored.

Conclusion (cont.)

We implemented a prototype to evaluate this extension.
The experimental results show that this model is particularly beneficial:
    When dealing with crowded object groups
    In faulty environments
    For light-weight objects

Other Open Issues

Active_With_Voting (Quorum-Based or Consensus-Based) Replication [33]
Applying transactions on the active-with-voting replication style
Fault Tolerant CCM
Modeling of this approach [32]
Transaction and Concurrency Integration

References

[1] Object Management Group, “The Common Object Request Broker: Architecture and Specification, 2.6 Edition,” OMG Technical Committee Document formal/02-01-02, Jan. 2002.
[2] D. Slama, J. Garbis, and P. Russell, “Enterprise CORBA”, Prentice Hall, 1999.
[3] J. C. Laprie, “Dependable Computing: Concepts, Limits, Challenges,” Proceedings of FTCS-25, Pasadena, 1995, pp. 42-54.
[4] G. Weikum, G. Vossen, “Transactional Information Systems”, Academic Press, 2002, San Francisco, CA, USA.
[5] P. Narasimhan, “Transparent Fault Tolerance for CORBA”, PhD thesis, University of California, December 1999.
[6] Object Oriented Concepts, Inc., “CORBA/C++ Programming with ORBacus”, Student Workbook, Version 1.0.5, December 2000. http://www.ooc.com
[7] L. Pullum, “Software Fault Tolerance Techniques and Implementations”, Artech House, ISBN 1-58053-137-7.
[8] A. S. Tanenbaum, M. V. Steen, “Distributed Systems: Principles and Paradigms”, Prentice Hall, 2002, ISBN: 0-13-088893-1.
[9] Guerraoui, R. and Schiper, A. (1997) Software-Based Replication for Fault Tolerance. IEEE Computer, 30(4), pp. 68-74.
[10] Object Management Group, “Object Transaction Service Specification”, Version 1.4, OMG Technical Committee Document, formal/03-09-02, September 2003.
[11] Object Management Group, “Fault Tolerant CORBA (Final Adopted Specification)”, OMG Technical Committee Document formal/01-12-29, Dec. 2001.
[12] Felber, P., Guerraoui, R., and Schiper, A. (2000) Replication of CORBA Objects. Lecture Notes in Computer Science, 1752, Springer Verlag, Berlin, pp. 254-76.
[13] D. Powell, “Distributed Fault Tolerance: Lessons from Delta-4”, IEEE Computer, 1994.
[14] G. D. Parrington, S. K. Shrivastava, S. M. Wheater, M. C. Little, “Design and Implementation of Arjuna”, Technical Report, Department of Computing Science, University of Newcastle.
[15] IONA and Isis, An Introduction to Orbix+Isis, IONA Technologies Ltd. and Isis Distributed Systems, Inc., 1994.
[16] S. Maffeis, “Run-Time Support for Object-Oriented Distributed Programming”, PhD thesis, University of Zurich, February 1995.
[17] B. Natarajan, A. Gokhale and S. Yajnik, “DOORS: Toward High-performance Fault Tolerant CORBA”, in the Proceedings of the 2nd Distributed Objects and Applications (DOA) Conference, Antwerp, Belgium, September 2000.
[18]
[19] L. E. Moser, P. M. Melliar-Smith, and P. Narasimhan, “Consistent object replication in the Eternal system”, Theory and Practice of Object Systems, 4(2):81-92, 1998.
[20] P. Felber, P. Narasimhan, “Reconciling Replication and Transactions for the End-to-End Reliability of CORBA Applications”, in the Proceedings of the International Symposium on Distributed Objects and Applications (DOA’02), pp. 737-754, October 2002.
[21] D. R. Cheriton and D. Skeen, “Understanding the limitations of causally and totally ordered communication”, Proc. of 14th ACM Symp. on Operating Systems Principles, Operating Systems Review, 27 (5), pp. 44-57, December 1993.
[22] Rebuttals from Cornell, Operating Systems Review, 28 (1), January 1994.
[23] A. Schiper and M. Raynal, “From Group Communication to Transactions in Distributed Systems”, CACM, 39(4), April 1996.
[24] M. Martynezy, R. Peris, and S. Evalo, “Group Transactions: An Integrated Approach to Transaction and Group Communication”, Workshop on Concurrency in Dependable Computing, June 2001.
[25] S. Frolund and R. Guerraoui, “Implementing e-Transactions with Asynchronous Replication,” IEEE Transactions on Parallel and Distributed Systems, vol. 12, No. 2, pp. 133-146 (2001).
[26] M. C. Little and S. K. Shrivastava, “Integrating Group Communication with Transactions for Implementing Persistent Replicated Objects,” Lecture Notes in Computer Science, Vol. 1752, Springer-Verlag, 2000.
[27] Felber P. and Narasimhan P., “Experiences, Strategies, and Challenges in Building Fault-Tolerant CORBA Systems”, IEEE Transactions on Computers, Vol. 53, No. 5, May 2004.
[28] S. Frolund and R. Guerraoui, “CORBA Fault Tolerance: why it does not add up?”, In Proc. of the IEEE Workshop on Future Trends in Distributed Systems, Cape Town, December 1999.
[29] W. Zhao, L. E. Moser and P. M. Melliar-Smith, “Unification of Replication and Transaction Processing in Three-Tier Architectures”, In Proc. of the International Conference on Distributed Systems, 2002.
[30] M. Cukier, J. Ren, C. Sabnis, W.H. Sanders, D.E. Bakken, M.E. Berman, D.A. Karr, and R. Schantz, “AQuA: An Adaptive Architecture that Provides Dependable Distributed Objects,” Proc. IEEE 17th Symposium on Reliable Distributed Systems, pp. 245-253, Oct. 1998.
[31] S. Maffeis, “Run-Time Support for Object-Oriented Distributed Programming”, PhD thesis, University of Zurich, February 1995.
[32] I. Majzik and G. Huszerl, “Towards Dependability Modeling of FT-CORBA Architectures”, In Proc. 4th European Dependable Computing Conference (EDCC-4), pp. 121-139, Toulouse, France, 23-25 October 2002.
[33] J. C. Martinez and J. F. Paris, “A New Voting Approach to Fault-Tolerant CORBA”, Proceedings of the International Conference on Software Engineering Applied to Networking and Parallel/Distributed Computing (SNDP ’00), Reims, France, May 2000, pages 358-365.

Thanks …

Hadi Salimi
[email protected]
January 14, 2005