55
Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - [email protected] 2012-2013

Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - [email protected] 2012-2013

Embed Size (px)

Citation preview

Page 1: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Parallel DatabasesCOMP3017 Advanced Databases

Dr Nicholas Gibbins - [email protected]

Page 2: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

2

Definitions

Parallelism

–An arrangement or state that permits several operations or tasks to be performed simultaneously rather than consecutively

Parallel Databases

– have the ability to split

– processing of data

– access to data

– across multiple processors

Page 3: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

3

Why Parallel Databases

•Hardware trends

•Reduced elapsed time for queries

• Increased transaction throughput

• Increased scalability

• Better price/performance

• Improved application availability

• Access to more data

• In short, for better performance

Page 4: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

4

• Tightly coupled

• Symmetric Multiprocessor (SMP)

P = processor

M = memory

Shared Memory Architecture

P

Global Memory

PP

Page 5: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

5

• Less complex database software

• Limited scalability

• Single buffer

• Single database storage

Software – Shared Memory

P

Global Memory

PP

Page 6: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

6

• Loosely coupled

• Distributed Memory

Shared Disc Architecture

PPP

MMM

S

Page 7: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

7

• Avoids memory bottleneck

• Same page may be in more than one buffer at once – can lead to incoherence

• Needs global locking mechanism

• Single logical database storage

• Each processor has its own database buffer

Software – Shared Disc

PPP

MMM

S

Page 8: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

8

• Massively Parallel

• Loosely Coupled

• High Speed Interconnect (between processors)

Shared Nothing Architecture

PPP

MMM

Page 9: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

9

• One page is only in one local buffer – no buffer incoherence

• Needs distributed deadlock detection

• Needs multiphase commit protocol

• Needs to break SQL requests into multiple sub-requests

• Each processor has its own database buffer

• Each processor owns part of the data

Software - Shared Nothing

PPP

MMM

Page 10: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

10

Hardware vs. Software Architecture

• It is possible to use one software strategy on a different hardware arrangement

• Also possible to simulate one hardware configuration on another

–Virtual Shared Disk (VSD) makes an IBM SP shared nothing system look like a shared disc setup (for Oracle)

• From this point on, we deal only with shared nothing software mode

Page 11: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

11

Shared Nothing Challenges

• Partitioning the data

•Keeping the partitioned data balanced

• Splitting up queries to get the work done

• Avoiding distributed deadlock

•Concurrency control

•Dealing with node failure

Page 12: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

12

Hash Partitioning

Table

Page 13: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

13

Range Partitioning

A-H

I-P

Q-Z

Page 14: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

14

Schema Partitioning

Table 1

Table 2

Page 15: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

15

Rebalancing Data

Data in proper balance

Data grows, performance drops

Add new nodes and disc

Redistribute data to new nodes

Page 16: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

16

Dividing up the Work

Application

Coordinator Process

WorkerProcess

WorkerProcess

WorkerProcess

Page 17: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

17

DB Software on each node

App1

DBMS

W1 W2

C1

DBMS

W1 W2

App2

DBMS

W1 W2

C2

Page 18: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

18

Transaction Parallelism

Improves throughput

Inter-Transaction

–Run many transactions in parallel, sharing the DB

–Often referred to as Concurrency

Page 19: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

19

Query Parallelism

Improves response times (lower latency)

Intra-operator (horizontal) parallelism

–Operators decomposed into independent operator instances, which perform the same operation on different subsets of data

Inter-operator (vertical) parallelism

–Operations are overlapped

– Pipeline data from one stage to the next without materialisation

Page 20: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

20

Intra-Operator Parallelism

SQL Query

SubsetQueries

SubsetQueries

SubsetQueries

SubsetQueries

Processor

Processor

Processor

Processor

Page 21: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

21

Intra-Operator Parallelism

Example query:

– SELECT c1,c2 FROM t1 WHERE c1>5.5

Assumptions:

– 100,000 rows

– Predicates eliminate 90% of the rows

Page 22: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

22

I/O Shipping

Coordinatorand Worker

Network

Worker Worker Worker Worker

25,000 rows

25,000 rows

25,000 rows

25,000 rows

10,000 (c1,c2)

Page 23: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

23

Function Shipping

Coordinator

Network

Worker Worker Worker Worker

2,500 rows 2,500 rows 2,500 rows 2,500 rows

10,000 (c1,c2)

Page 24: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

24

Function Shipping Benefits

•Database operations are performed where the data are, as far as possible

•Network traffic in minimised

• For basic database operators, code developed for serial implementations can be reused

• In practice, mixture of Function Shipping and I/O Shipping has to be employed

Page 25: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

25

Inter-Operator Parallelism

time

Scan Join Sort

ScanJoin

Sort

Page 26: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

26

Resulting Parallelism

time

Scan Join Sort

ScanJoin

Sort

ScanScan

JoinJoin

SortSort

Page 27: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

27

Basic Operators

• The DBMS has a set of basic operations which it can perform

• The result of each operation is another table

– Scan a table – sequentially or selectively via an index – to produce another smaller table

– Join two tables together

– Sort a table

– Perform aggregation functions (sum, average, count)

• These are combined to execute a query

Page 28: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

28

The Volcano Architecture

• The Volcano Query Processing System was devised by Goetze Graefe (Oregon)

• Introduced the Exchange operator

– Inserted between the steps of a query to:

– Pipeline results

–Direct streams of data to the next step(s), redistributing as necessary

• Provides mechanism to support both vertical and horizontal parallelism

• Informix ‘Dynamic Scalable Architecture’ based on Volcano

Page 29: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

29

Exchange Operators

Example query:

– SELECT county, SUM(order_item)FROM customer, orderWHERE order.customer_id=customer_idGROUP BY countyORDER BY SUM(order_item)

Page 30: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

30

Exchange Operators

SORT

GROUP

HASHJOIN

SCAN

SCAN

Customer Order

Page 31: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

31

Exchange Operators

EXCHANGE

SCAN

SCAN

Customer

HASHJOIN

HASHJOIN

HASHJOIN

Page 32: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

32

Exchange Operators

EXCHANGE

SCAN

SCAN

Customer

HASHJOIN

HASHJOIN

HASHJOIN

EXCHANGE

SCAN

SCAN

SCAN

Order

Page 33: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

33

EXCHANGE

SCAN

SCAN

Customer

HASHJOIN

HASHJOIN

HASHJOIN

EXCHANGE

SCAN

SCAN

SCAN

Order

EXCHANGE

EXCHANGE

GROUPGROUP

SORT

Page 34: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Query Processing

Page 35: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

35

Some Parallel Queries

• Enquiry

•Collocated Join

•Directed Join

• Broadcast Join

•Repartitioned Join

Page 36: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

36

Orders Database

CUSTKEY

C_NAME … C_NATION …

ORDERKEY

DATE …CUSTKEY

SUPPKEY

S_NAME … S_NATION …

SUPPKEY

CUSTOMER

ORDER

SUPPLIER

Page 37: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

37

Enquiry/Query

“How many customers live in the UK?”

SCAN

COUNT

COORDINATOR

SLAVE TASKS

SUM

Multiple partitionsof customer table

Return subcountsto coordinator

Return to applicationData from application

Page 38: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

38

Collocated Join

“Which customers placed orders in July?”

SCAN

UNION

ORDER

Tables both partitioned onCUSTKEY (the same key) andtherefore corresponding entriesare on the same node

Requires a JOIN of CUSTOMER and ORDER

SCAN

JOIN

CUSTOMER

Page 39: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

39

Directed Join

“Which customers placed orders in July?”(tables have different keys)

CUSTOMER

SCAN

ORDER

SCAN

JOIN

Slave Task 1 Slave Task 2

Coordinator

Return to application

ORDER partitioned on ORDERKEY, CUSTOMER partitioned on CUSTKEYRetrieve rows from ORDER, then use ORDER.CUSTKEY to direct appropriate rows to nodes with CUSTOMER

Page 40: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

40

Broadcast Join

“Which customers and suppliers are in the same country?”

CUSTOMER

SCAN

SUPPLIER

SCAN

JOIN

Slave Task 1 Slave Task 2

Coordinator

Return to application

SUPPLIER partitioned on SUPPKEY, CUSTOMER on CUSTKEY.Join required on *_NATIONSend all SUPPLIER to each CUSTOMER node

BROADCAST

Page 41: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

41

Repartitioned Join

“Which customers and suppliers are in the same country?”

CUSTOMER

SCAN

SUPPLIER

Slave Task 1

Coordinator

SUPPLIER partitioned on SUPPKEY, CUSTOMER on CUSTKEY.Join required on *_NATION. Repartition both tables on *_NATION tolocalise and minimise the join effort

SCAN

Slave Task 2

JOIN

Slave Task 3

Page 42: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

42

Query Handling

•Critical aspect of Parallel DBMS

•User wants to issue simple SQL, and not be concerned with parallel aspects

•DBMS needs structures and facilities to parallelise queries

•Good optimiser technology is essential

• As database grows, effects (good or bad) of optimiser become increasingly significant

Page 43: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

43

Further Aspects

• A single transaction may update data in several different places

•Multiple transactions may be using the same (distributed) tables simultaneously

•One or several nodes could fail

•Requires concurrency control and recovery across multiple nodes for:

– Locking and deadlock detection

– Two-phase commit to ensure ‘all or nothing’

Page 44: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Deadlock Detection

Page 45: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

45

Locking and Deadlocks

•With Shared Nothing architecture, each node is responsible for locking its own data

•No global locking mechanism

•However:

– T1 locks item A on Node 1 and wants item B on Node 2

– T2 locks item B on Node 2 and wants item A on Node 1

–Distributed Deadlock

Page 46: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

46

Resolving Deadlocks

•One approach – Timeouts

• Timeout T2, after wait exceeds a certain interval

– Interval may need random element to avoid ‘chatter’i.e. both transactions give up at the same time and then try again

•Rollback T2 to let T1 to proceed

•Restart T2, which can now complete

Page 47: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

47

Resolving Deadlocks

•More sophisticated approach (DB2)

• Each node maintains a local ‘wait-for’ graph

•Distributed deadlock detector (DDD) runs at the catalogue node for each database

• Periodically, all nodes send their graphs to the DDD

•DDD records all locks found in wait state

• Transaction becomes a candidate for termination if found in same lock wait state on two successive iterations

Page 48: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Reliability

Page 49: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

49

Reliability

We wish to preserve the ACID properties for parallelised transactions

– Isolation is taken care of by 2PL protocol

– Isolation implies Consistency

–Durability can be taken care of node-by-node, with proper logging and recovery routines

–Atomicity is the hard part. We need to commit all parts of a transaction, or abort all parts

Two-phase commit protocol (2PC) is used to ensure that Atomicity is preserved

Page 50: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

50

Two-Phase Commit (2PC)

Distinguish between:

– The global transaction

– The local transactions into which the global transaction is decomposed

Global transaction is managed by a single site, known as the coordinator

Local transactions may be executed on separate sites, known as the participants

Page 51: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Phase 1: Voting

•Coordinator sends “prepare T” message to all participants

• Participants respond with either “vote-commit T” or “vote-abort T”

•Coordinator waits for participants to respond within a timeout period

Page 52: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Phase 2: Decision

• If all participants return “vote-commit T” (to commit), send “commit T” to all participants. Wait for acknowledgements within timeout period.

• If any participant returns “vote-abort T”, send “abort T” to all participants. Wait for acknowledgements within timeout period.

•When all acknowledgements received, transaction is completed.

• If a site does not acknowledge, resend global decision until it is acknowledged.

Page 53: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

53

Normal Operation

C Pprepare T

vote-commit T

commit T

ack

vote-commit Treceived from allparticipants

Page 54: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

Parallel Utilities

Page 55: Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk 2012-2013

55

Parallel Utilities

• Ancillary operations can also exploit the parallel hardware

– Parallel Data Loading/Import/Export

– Parallel Index Creation

– Parallel Rebalancing

– Parallel Backup

– Parallel Recovery