102
1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

1

ICS 214B: Transaction Processing and Distributed Data Management

Distributed Database Systems

Page 2: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 2

So far: Centralized DB systems

Software:

ApplicationSQL Front EndQuery ProcessorTransaction Proc.File Access

P

M ...

• Simplifications: single front end one place to keep locks if processor fails, system fails, ...

Page 3: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 3

Next: distributed database systems

• Multiple processors ( + memories)• Heterogeneity and autonomy of

“components”

Page 4: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 4

Why do we need Distributed Databases?

• Example: Big Corp. has offices in London, New York, and Hong Kong.

• Employee data:– EMP(ENO, NAME, TITLE, SALARY, …)

• Where should the employee data table reside?

Page 5: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 5

Big Corp. Data Access Pattern

• Mostly, employee data is managed at the office where the employee works– E.g., payroll, benefits, hire and fire

• Periodically, Big Corp needs consolidated access to employee data– E.g., Big Corp. changes benefit plans and

that affects all employees.– E.g., Annual bonus depends on global net

profit.

Page 6: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 6

EMP

Internet

LondonPayroll app

London

New YorkPayroll app

New York

Hong KongPayroll app

Hong Kong

Problem:NY and HK payrollapps run very slowly!

Page 7: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 7

LondonEmp

Internet

LondonPayroll app

London

New YorkPayroll app

New York

Hong KongPayroll app

Hong Kong

HKEmp

NYEmp

Much better!!

Page 8: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 8

Internet

LondonPayroll app

Annual Bonus app

London

New YorkPayroll app

New York

Hong KongPayroll app

Hong Kong

LondonEmp NY

Emp

HKEmp

Distribution providesopportunities forparallel execution

Page 9: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 9

Internet

LondonPayroll app

Annual Bonus app

London

New YorkPayroll app

New York

Hong KongPayroll app

Hong Kong

LondonEmp NY

Emp

HKEmp

Page 10: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 10

Internet

LondonPayroll app

Annual Bonus app

London

New YorkPayroll app

New York

Hong KongPayroll app

Hong Kong

Lon, NYEmp NY, HK

Emp

HK, LonEmp

Replication improvesavailability

Page 11: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 11

Heterogeneity and Autonomy

Application

RDBMS FilesStocktickertape

Portfolio History ofdividends,ratios,...

Page 12: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 12

• data management with multiple processors and

possible autonomy, heterogeneity

– Impact on: Data organization Query processing Access structures Concurrency control Recovery

Page 13: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 13

• transaction monitors – Coordinate transaction execution

Multiple DBMSs High performance

– Have workflow facilities– Manage communications with client

“terminals”

Page 14: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 14

DB architectures

(1) Shared memory

P P P...

M

Page 15: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 15

DB architectures

(2) Shared disk

...

...

P

M

P P

M M

Page 16: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 16

DB architectures(3) Shared nothing

P

M

P

M

P

M

...

Page 17: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 17

DB architectures(4) Hybrid example – Hierarchical or Clustered

M

P P P...

M

P P P...

Page 18: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 18

Issues for selecting architecture

• Reliability• Scalability• Geographic distribution of data• Data “clusters”• Performance• Cost

Page 19: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 19

Parallel or distributed DB system?

• More similarities than differences!

Page 20: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 20

• Typically, parallel DBs:– Fast interconnect– Homogeneous software– High performance is goal– Transparency is goal

Page 21: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 21

• Typically, distributed DBs:– Geographically distributed– Data sharing is goal (may run into heterogeneity, autonomy)– Disconnected operation possible

Page 22: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 22

Distributed Database Challenges

• Distributed Database Design– Deciding what data goes where– Depends on data access patterns of

major applications– Two subproblems:

Fragmentation: partition tables into fragments

Allocation: allocate fragments to nodes

Page 23: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 23

Distributed Database Challenges

• Distributed Query Processing– Centralized query plan goal: minimize

number of disk I/Os– Additional factors in distributed

scenario: Communication costs Opportunity for parallelism

– Space of possible query plans is much larger!

Page 24: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 24

Distributed Database Challenges

• Distributed Concurrency Control– Transactions span nodes

Must be globally serializable

– Two main approaches: Locking Timestamps

– Distributed Deadlock Management– Multiple data copies – need to be kept

in sync when updates occur

Page 25: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 25

Distributed Database Challenges

• Reliability of Distributed Databases– Centralized database failure model:

processor fails

– Distributed database failure model: One or more processors may fail Network may fail Network may be partitioned

– Data must be kept in sync

Page 26: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 26

To illustrate synchronization problems: “Two Generals” Problem

Page 27: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 27

The one general problem (Trivial!)

Battlefield

G

Troops

Page 28: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 28

The two general problem:

<------------------------------->

messengers

Blue army Red army

BlueG Red

G

Enemy

Page 29: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 29

• Blue and red army must attack at same time

• Blue and red generals synchronize through messengers

• Messengers can be lost

Rules:

Page 30: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 30

Application

RDBMS FilesStocktickertape

Portfolio History ofdividends,ratios,...

Distributed Database Challenges

• Heterogeneity

Page 31: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 31

Example: unable to get statisticsfor query optimization

Example: blue general may have mind of his (or her) own!

Distributed Database Challenges

• Autonomy

Page 32: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 32

• Distributed DB Design

Page 33: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 33

Distributed DB DesignTop-down approach: - have DB…- how to split and allocate the sites

Bottom-up approach: - multi-database (possibly heterogeneous,

autonomous)- no design issues!

Page 34: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 34

Two issues in DDB design:

• Fragmentation• Allocation

Note: issues not independent, but will cover separately

Page 35: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 35

Employee relation E (#,name,loc,sal,…) 40% of queries: 40% of queries: Qa: select * Qb: select *

from E from Ewhere loc=Sa where

loc=Sband… and ...

Motivation: Two sites: Sa, Sb Qa QbSa Sb

Page 36: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 36

# NM Loc Sal

E 578

Sa 10Sally Sb 25Tom Sa 15

Joe

# NM Loc Sal # NM Loc Sal58

Sa 10Tom Sa 15Joe 7 Sb 25Sally

..

..

....

F

At SaAt Sb

Page 37: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 37

F = { F1, F2 }

F1 = loc=Sa(E) F2 = loc=Sb(E)

called primary horizontal fragmentation

Page 38: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 38

Fragmentation• Horizontal Primary

depends on local attributes

R Derived depends on foreign

relation

• Vertical

R

Page 39: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 39

Three common horizontal fragmentation techniques

• Round robin• Hash partitioning• Range partitioning

Used mostly in parallel dbs

Used in parallel dbs and distributed dbs

Page 40: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 40

• Round robinR D0 D1 D2

t1 t1t2 t2t3 t3t4 t4... t5• Evenly distributes data

• Good for scanning full relation

• Not good for point or range queries

• Not suitable for databases distributed over WAN

Page 41: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 41

• Hash partitioningR D0 D1 D2

t1h(k1)=2 t1t2h(k2)=0 t2t3h(k3)=0 t3t4h(k4)=1 t4...• Good for point queries on key; also for joins on key

• Not good for range queries; point queries not on key

• If hash function good, even distribution

• Not suitable for databases distributed over a WAN

Page 42: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 42

• Range partitioningR D0 D1 D2

t1: A=5 t1t2: A=8 t2t3: A=2 t3t4: A=3 t4...

4 7

partitioningvector

V0 V1

• Good for point queries on A; also for joins on A

• Good for some range queries on A

• Need to select good vector: else unbalanced

• data skew, execution skew

Page 43: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 43

Which are good fragmentations?

Example:

F = { F1, F2 }

F1 = sal<10 E F2 = sal>20 E

Problem: Some tuples lost!

Page 44: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 44

Which are good fragmentations?Second example:

F = { F3, F4 }

F3 = sal<10 E F4 = sal>5 E

Tuples with 5 < sal < 10 are duplicated...

Page 45: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 45

Better design

Example: F = { F5, F6, F7 }

F5 = sal 5 E F6 = 5<sal<10 E

F7 = sal 10 E

Then replicate F6 if convenient (part of allocation problem)

Page 46: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 46

Desired properties for fragmentationR F = {F1, F2, …, Fn}• Completeness

– For every data item x R, FiF such that xFi

• Disjointness xFi, Fj such that xFj, i j

• Reconstruction– There is function g such that R = g(F1, F2, …, Fn)

Page 47: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 47

Desired properties for horizontal fragmentation

R F = {F1, F2, …, Fn}

• Completeness– For every tuple tR, FiF such that

tFi

• Disjointness tFi, Fj such that tFj, i j

• Reconstruction – can safely ignore– Completeness R = Fi

FiF

Page 48: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 48

How do we get completeness and disjointness?

(1) Check it “manually”!

e.g., F1 = sal<10 E ; F2 = sal10 E

Page 49: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 49

How do we get completeness and disjointness?

(2) “Automatically” generate fragments with these properties

• Horizontal fragments are defined by selection predicates

• Generate a set of selection predicates with the desired properties

Page 50: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 50

Example of generation• Say queries use predicates:

A<10, A>5, Loc = SA, Loc = SB• Next: - generate “minterm”

predicates- eliminate useless ones

• Given simple predicates Pr= { p1, p2,.. pn }minterm predicates are of the form p1* p2* … pn*

where pk* is pk or is ¬pk

Page 51: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 51

Minterm predicates (part I)(1) A<10 A>5 Loc=SA Loc=SB

(2) A<10 A>5 Loc=SA ¬(Loc=SB)(3) A<10 A>5 ¬(Loc=SA) Loc=SB

(4) A<10 A>5 ¬(Loc=SA) ¬(Loc=SB)(5) A<10 ¬(A>5) Loc=SA Loc=SB

(6) A<10 ¬(A>5) Loc=SA ¬(Loc=SB)(7) A<10 ¬(A>5) ¬(Loc=SA) Loc=SB

(8) A<10 ¬(A>5) ¬(Loc=SA) ¬(Loc=SB)

A 5

5 < A < 10

Page 52: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 52

Minterm predicates (part II)

(9) ¬(A<10) A>5 Loc=SA Loc=SB

(10) ¬(A<10) A>5 Loc=SA ¬(Loc=SB)(11) ¬(A<10) A>5 ¬(Loc=SA) Loc=SB

(12) ¬(A<10) A>5 ¬(Loc=SA) ¬(Loc=SB)(13) ¬(A<10) ¬(A>5) Loc=SA Loc=SB

(14) ¬(A<10) ¬(A>5) Loc=SA ¬(Loc=SB)(15) ¬(A<10) ¬(A>5) ¬(Loc=SA) Loc=SB

(16) ¬(A<10) ¬(A>5) ¬(Loc=SA) ¬(Loc=SB)

A 10

Page 53: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 53

Final fragments:

F2: 5 < A < 10 Loc=SA F3: 5 < A < 10 Loc=SB F6: A 5 Loc=SA F7: A 5 Loc=SB F10: A 10 Loc=SA F11: A 10 Loc=SB

Page 54: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 54

Note: elimination of useless fragments depends on application semantics:

e.g.: if LOC could be SA, SB, we need to add fragments

F4: 5 <A <10 Loc SA Loc SB

F8: A 5 Loc SA Loc SB

F12: A 10 Loc SA Loc SB

Page 55: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 55

Why does this algorithm work?

• Must prove that the set of fragments is:– Complete– Disjoint

Page 56: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 56

Summary• Given simple predicates Pr= { p1, p2,..

pn }minterm predicates are

M={m | m = pk*, 1 k n }

where pk* is pk or is ¬ pk

pkPr

• Fragments m R for all m M are

complete and disjoint

Page 57: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 57

.

Distributed commit problem

Action:a1,a2

Action:a3

Action:a4,a5

Transaction T

Commit must be atomic

Page 58: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 58

Distributed commit problem

• Commit must be atomic– site failures– communication failures– network partitions– timeout failures

• Solution: Atomic commit protocol– must ensure that despite failures, if all failures repaired,

then transactions commits or aborts at all sites.• Most common ACP: Two-phase commit (2PC)

– Centralized 2PC– Distributed 2PC– Linear 2PC– Many other variants…

Page 59: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 59

Terminology

• Resource Managers (RMs)– Usually databases

• Participants– RMs that did work on behalf of

transaction

• Coordinator– Component that runs two-phase

commit on behalf of transaction

Page 60: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 60

Coord

inato

r

Part

icip

ant

REQUEST-TO-PREPARE

PREPARED*

COMMIT*

DONE

Page 61: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 61

Coord

inato

r

Part

icip

ant

REQUEST-TO-PREPARE

NO

ABORT

DONE

Page 62: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 62

States of the Transaction

• At Coordinator:– Initiated (I) -- transaction known to system– Preparing (P) -- prepare message sent to participants– committed (C) -- has committed– Aborted (A) -- has aborted

• At participant:– Initiated (I)– Prepared (P) -- prepared to commit, if the coordinator

so desires– committed (C) – Aborted (A)

Page 63: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 63

Protocol Database

• Coordinator maintains a protocol database (in main memory) for each transaction

• Protocol database – enables coordinator to execute 2PC– answers inquiries by participants about status of

transaction cohorts may make such inquiries if they fail during recovery

– entry for transaction deleted when coordinator is sure that no one will ever inquire about transaction again (when it has been acked by all the participants)

Page 64: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 64

two-phase commit (messages)

Coordinator Participant

I

P

C

A

I

P

C

A

commit-request request-prepare*

no abort*

prepared* Commit*

commit ack

request-prepare prepared

request-prepare no

abort ack

F ack*

ack*

Page 65: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 65

• Notation: Incoming message Outgoing message

( * = everyone)• When participant enters “P” state:

– it must have acquired all resources– it can only abort or commit if so

instructedby a coordinator

• Coordinator only enters “C” state if all participants are in “P”, i.e., it is certain that all will eventually commit

Page 66: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 66

Two phase commit -- normal actions (coordinator)

– make entry into protocol database for transaction marking its status as initiated when coordinator first learns about transaction

– Add participant to the cohort list in protocol database when coordinator learns about the cohorts

– Change status of transaction to preparing before sending prepare message. (it is assumed that coordinator will know about all the participants before this step)

– On receipt of PREPARE message from cohort, mark cohort as PREPARED. If all cohorts PREPARED, then change status to COMMITTED and send COMMIT message.

must force a commit log record to disk before sending commit message.– on receipt of ACK message from cohort, mark cohort as ACKED. When all

cohorts have acked, then delete entry of transaction from protocol database. Must write a completed log record to disk before deletion from protocol

database. No need to force the write though.

Page 67: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 67

Two Phase Commit - normal actions (participant)

• On receipt of PREPARE message, write PREPARED log record before sending PREPARED message– needs to be forced to disk since coordinator

may now commit.

• On receipt of COMMIT message, write COMMIT log record before sending ACK to coordinator– cohort must ensure log forced to disk before

sending ack -- but no great urgency for doing so.

Page 68: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 68

Timeout actions• At various stages of protocol, transaction waits from messages at both

coordinator and participants. • If message not received, on timeout, timeout action is executed:• Coordinator Timeout Actions

– waiting for votes of participants: ABORT transaction, send aborts to all.– waiting for ack from some participant: forward the transaction to

recovery process that periodically will send COMMIT to participant. When participant will recover, and all participants send an ACK, coordinator writes a completion log record and deletes entry from protocol database.

• Cohort timeout actions:– waiting for prepare: abort the transaction, send abort message to

coordinator. Alternatively, it could wait for the coordinator to ask for prepare. – Waiting for decision: forward transaction to recovery process. Recovery

process executes status-transaction call to the coordinator. Such a transaction is blocked for recovery of failure. The participant could have used a different termination protocol -- e.g., polling other participants. (cooperative Termination)

Page 69: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 69

2PC is blockingSample scenario:

Coord P2W

P1 P3W

P4W

Page 70: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 70

Case I:P1 “W”; coordinator sent commits

P1 “C”Case II:P1 NO; P1 A

P2, P3, P4 (surviving participants)

cannot safely abort or commit transaction

coord

P1

P2

P3

P4w

w

w

Page 71: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 71

Recovery Actions (cohort)

• All sites execute REDO-UNDO pass• Detection: A site knows it is a cohort if it

finds a prepared log record for a transaction• If the log does not contain a commit log

record:– reacquire all locks for the transaction– ask coordinator for the status of transaction

• If log contains a commit log record– do nothing

Page 72: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 72

Recovery Action (coordinator)• If protocol database was made fault-tolerant by logging every change,

simply reconstruct the protocol database and restart 2PC from the point of failure.

• However, since we have only logged the commit and completion transitions and nothing else:– if the log does not contain a commit. Simply abort the transaction.

If a cohort asks for status in the future, its status is not in the protocol database and it will be considered as aborted.

– If commit log record, but no completion log record, recreate transactions entry committed in the protocol

database and the recovery process will ask all the participants if they are still waiting for a commit message. If no one is waiting, the completion entry will be written.

– If commit log record + completion log record do nothing.

Page 73: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 74

Variants of 2PC

• Linear

Coord

• Hierarchical

ok ok ok

commit commit commit

Page 74: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 75

• Distributed

– Nodes broadcast all messages– Every node knows when to commit

Variants of 2PC

Page 75: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 76

Cooperative Termination Protocol

• Bad case– Participant P recovers from failure– Has prepared record for transaction T– No commit or abort record for T– Coordinator is down

• Participant P is blocked until coordinator recovers

Page 76: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 77

Cooperative termination protocol

• But perhaps some other participant can help?

• Requires participants “know” each other!

Page 77: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 78

Cooperative Termination Protocol

• Participant P sends a DECISION-REQUEST message to other participants

• Alive participants respond with COMMIT, ABORT, or UNCERTAIN

• If any participant replies with a decision (COMMIT or ABORT), P acts on decision– And sends decision to UNCERTAIN

participants

Page 78: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 79

Cooperative Termination Protocol

• When P receives a DECISION-REQUEST– If it knows decision, responds with

COMMIT or ABORT– If it has not prepared transaction,

responds ABORT– If it is prepared but does not know

decision, responds UNCERTAIN

Page 79: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 80

Cooperative TerminationSample scenario:

Coord P1C

P2W

P3W

Page 80: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 81

Cooperative TerminationSample scenario:

Coord P1W

P2W

P3A

Page 81: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 82

Cooperative TerminationSample scenario:

Coord P1W

P2W

P3W

Page 82: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 83

Is there a non-blocking protocol?

Theorem: If communications failure or total site failures (i.e.,

all sites are down simultaneously) are possible, then every atomic protocol may cause processes to become blocked.

Two exceptions:if we ignore communication failures, it is possible to design such a protocol (Skeen et. al. 83)If we impose some restrictions on transactions (I.e., what data they can read/write) such a protocol can also be designed (Mehrotra et. al. 92)

Page 83: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 84

Next…• Three-phase commit (3PC)

– Nonblocking if reliable network (no communications failure) and no total site failures

– Handling communications failures

Page 84: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 85

Why 2PC blocks?

• Since operational site on timeout in prepare state does not know if the failed site(s) had committed or aborted the transaction.

• Polling all operational sites does not work since all the operational sites might be in doubt.

Page 85: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 86

Approach to Making ACP Non-blocking• For a given state S of a transaction T in the ACP, let the concurrency

set of S be the set of states that other sites could be in. • For example, in 2PC, the concurrency set of PREPARE state is

{PREPARE, ABORT, COMMIT}• We develop non-blocking protocol, we will

– ensures that concurrency set of a transaction does not contain both a commit and an abort

– There exists no non-committable state whose concurrency set contains a commit. A state is committable if occupancy of the state by any site implies everyone has voted to commit the transaction.

• Necessity of these conditions illustrated by considering a situation with only 1 site operational. If either of the above violated, there will be blocking.

• Sufficiency illustrated by designing a termination protocol that will terminate the protocol correctly if the above assumptions hold.

Page 86: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 90

Coord

inato

r

Part

icip

ant

REQUEST-TO-PREPARE

PREPARED

COMMIT

DONE

PRECOMMIT

ACK

Page 87: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 91

Coord

inato

r

Part

icip

ant

REQUEST-TO-PREPARE

NO

ABORT

DONE

Page 88: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 92

Coordinator Participant

Log start-3PC record(participant list)

Log commit record(state C)

Log prepared record(state W)

Log committed record(state C)

REQUEST-PREPARE

PREPARED

COMMIT

PRECOMMIT

ACK

Page 89: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 93

Coordinator Participant

REQUEST-PREPARE

PREPARED

COMMIT

PRECOMMIT

ACK

1. Timeout: Abort

2. Timeout: ignore

1. Timeout: abort

2. TimeoutTermination Protocol

3. TimeoutTermination Protocol

Page 90: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 94

Process categories

• Three categories– Operational

Process has been up since start of 3PC

– Failed Process has halted since start of 3PC, or

is recovering

– Recovered Process that failed and has completed

recovery

Page 91: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 95

Three Phase Commit - Termination Protocol

• Choose a backup coordinator from the remaining operational sites.

• Backup coordinator sends messages to other operational sites to make transition to its local state (or to find out that such a transition is not feasible) and waits for response.

• Based on response as well as its local state, it continues to commit or abort the transaction.

• It commits, if its concurrency set includes a commit state. Else, it aborts.

Page 92: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 96

Termination Protocol

Start 3PC

Coordinatorfails

Decisionreached

All siteslearn decision

• Only operational processes participate in termination protocol. • Recovered processes wait until decision is reached and then learn decision

Page 93: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 97

Coordinator Participant

REQUEST-PREPARE

PREPARED

COMMIT

PRECOMMIT

ACK

Abortable (A)

Uncertain (U)

Precommitted (PC)

Committed (C)

Page 94: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 98

Termination Protocol• Elect new coordinator

– Use Election Protocol (coming soon…)

• New coordinator sends STATE-REQUEST to participants

• Makes decision using termination rules

• Communicates to participants

Page 95: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 99

Coord

inato

r

Part

icip

ant

STATE-REQUEST*

ABORTABLE

ABORT*

Page 96: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 100

Coord

inato

r

Part

icip

ant

STATE-REQUEST*

COMMITTED

COMMIT*

Page 97: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 101

Coord

inato

r

Part

icip

ant

STATE-REQUEST*

UNCERTAIN*

ABORT*

Page 98: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 102

Coord

inato

r

Part

icip

ant

STATE-REQUEST*

PRECOMMITTED, NO COMMITTED

COMMIT*

PRECOMMIT*

ACK*

Page 99: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 103

Termination ProtocolSample scenario:

Coord P1W

P2W

P3W

Page 100: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 104

Termination ProtocolSample scenario:

Coord P1W

P2W

P3PC

Page 101: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 105

Note: 3PC unsafe with communication failures!

W

W

W

P

P

abort commit

Page 102: 1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems

ICS214B Notes 11 106

• After coordinator receives DONE message, it can forget about the transaction– E.g., cleanup control structures