Db presentation google_megastore


Department of Information Systems
Course: IS533 - Advanced Topics in Database

Done by: Alanoud Saad Alqoufi
ID: 435920068

Supervised by: Prof. Almetwally Mohamed Mostafa
Date: 5/3/2015

Google Megastore

Outline
• Introduction
• Architecture and data model
• Transactions and concurrency control
• Replication
• Data structures and algorithms
• Failure detection
• Throughput
• Limitations
• Related work
• Experience
• Questions

What is Google Megastore?

A database built over Bigtable with high availability
• Widely deployed in Google
• Used by more than 100 applications
• Handles more than 3 billion write and 20 billion read transactions daily
• Stores nearly a petabyte of primary data
• Available on GAE since Jan 2011


Motivation
• Scalability
• Availability
• Consistency
• Responsiveness
• Rapid development


Megastore = RDBMS + NoSQL

RDBMS              NoSQL
Slow performance   High performance
Not scalable       Scalable
Fixed data model   No schema
Easier to code     Complicated
Consistent         Less consistent


Megastore


Toward Availability and Scalability

1. Data replication: Paxos algorithm

2. Data partitioning: entity groups


Entity Group Operations


Design of Megastore
• De-normalized data

Data Model
• The data model is declared in a schema
• Each schema has a set of tables
• A table is either a root table or a child table
• A root entity together with all its child entities is called an entity group


Sample schema for Photo Sharing Service

CREATE SCHEMA PhotoApp;

CREATE TABLE User {
  required int64 user_id;
  required string name;
} PRIMARY KEY(user_id), ENTITY GROUP ROOT;

CREATE TABLE Photo {
  required int64 user_id;
  required int32 photo_id;
  required int64 time;
  required string full_url;
  optional string thumbnail_url;
  repeated string tag;
} PRIMARY KEY(user_id, photo_id),
  IN TABLE User,
  ENTITY GROUP KEY(user_id) REFERENCES User;

CREATE LOCAL INDEX PhotosByTime
  ON Photo(user_id, time);

CREATE GLOBAL INDEX PhotosByTag
  ON Photo(tag) STORING (thumbnail_url);

Indexes
Secondary indexes are supported:

• Local index
• Global index
• Storing clause
• Repeated index
• Inline index
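To make the repeated index and the STORING clause more concrete, here is a minimal Python sketch. It is not Megastore's implementation: it assumes a Photo entity is a plain dict, the helper name `index_entries` is hypothetical, and the entry layout (indexed value followed by the primary key) is an assumption chosen for illustration. It shows that a repeated property such as `tag` yields one index entry per value, and that stored properties are copied into each entry.

```python
def index_entries(photo, indexed, storing=()):
    """Build illustrative index entries for one Photo entity.

    `indexed` names the indexed property. If its value is a list (a repeated
    property), one entry is produced per element. `storing` lists properties
    copied into the entry so a reader could skip the primary-table lookup.
    """
    primary_key = (photo["user_id"], photo["photo_id"])
    values = photo[indexed]
    if not isinstance(values, list):
        values = [values]
    stored = {name: photo.get(name) for name in storing}
    # Assumed entry layout: (indexed value, user_id, photo_id).
    return [{"key": (value,) + primary_key, "stored": stored} for value in values]


photo = {
    "user_id": 101,
    "photo_id": 500,
    "time": 12345,
    "full_url": "http://example.com/full.jpg",
    "thumbnail_url": "http://example.com/thumb.jpg",
    "tag": ["dinner", "paris"],
}

# Mirrors: CREATE GLOBAL INDEX PhotosByTag ON Photo(tag) STORING (thumbnail_url)
entries = index_entries(photo, "tag", storing=("thumbnail_url",))
for entry in entries:
    print(entry)
```

The photo carries two tags, so the sketch produces two entries, each holding the thumbnail URL alongside the key.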


Mapping to Bigtable
Bigtable column name = Megastore table name + property name
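The naming rule can be illustrated with a short Python sketch. The function name `to_bigtable_row`, the "." separator between table and property name, and the comma-joined row key are assumptions made for this example, not details stated on the slide.

```python
def to_bigtable_row(table, key, properties):
    """Map one Megastore-style entity to an illustrative (row key, columns) pair."""
    # Assumption: the row key is the primary key parts joined with commas.
    row_key = ",".join(str(part) for part in key)
    # Column name = Megastore table name + property name ("." is an assumed separator).
    columns = {f"{table}.{name}": value for name, value in properties.items()}
    return row_key, columns


user_row = to_bigtable_row("User", (101,), {"name": "John"})
photo_row = to_bigtable_row(
    "Photo", (101, 500), {"time": 12345, "tag": ["dinner", "paris"]}
)
print(user_row)
print(photo_row)
```

Because the Photo key starts with the same `user_id` as its root User, the two row keys share a prefix, which is one way rows of the same entity group could end up adjacent.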


Transactions and concurrency control

Concurrency Control: MVCC
• Read consistency
  • Current
  • Snapshot
  • Inconsistent reads
• Write consistency
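The three read levels can be contrasted with a toy multiversion store in Python. This is a sketch under simplifying assumptions, not Megastore's code: a single entity group, integer log positions used as timestamps, and hypothetical names (`EntityGroup`, `apply_next`, `only_keys`). A current read applies every committed transaction first, a snapshot read uses the last fully applied transaction, and an inconsistent read returns the latest stored value regardless of log state.

```python
class EntityGroup:
    """Toy multiversion store for a single entity group (illustrative only)."""

    def __init__(self):
        self.log = []       # committed transactions, each a dict of mutations
        self.applied = 0    # number of log entries fully applied
        self.versions = {}  # key -> list of (timestamp, value), ascending

    def commit(self, mutations):
        """Append mutations to the log; applying may happen later."""
        self.log.append(dict(mutations))

    def apply_next(self, only_keys=None):
        """Apply the oldest unapplied log entry.

        If `only_keys` is given, write just those keys and leave the entry
        marked unapplied, simulating a partially applied transaction.
        """
        timestamp = self.applied + 1
        for key, value in self.log[self.applied].items():
            if only_keys is not None and key not in only_keys:
                continue
            history = self.versions.setdefault(key, [])
            if not history or history[-1][0] != timestamp:
                history.append((timestamp, value))
        if only_keys is None:
            self.applied = timestamp

    def _read_at(self, key, timestamp):
        visible = [v for ts, v in self.versions.get(key, []) if ts <= timestamp]
        return visible[-1] if visible else None

    def current_read(self, key):
        # Apply all committed writes, then read at the latest committed position.
        while self.applied < len(self.log):
            self.apply_next()
        return self._read_at(key, len(self.log))

    def snapshot_read(self, key):
        # Read at the last fully applied position without waiting.
        return self._read_at(key, self.applied)

    def inconsistent_read(self, key):
        # Ignore the log state and return the newest stored value.
        history = self.versions.get(key, [])
        return history[-1][1] if history else None


group = EntityGroup()
group.commit({"a": 1, "b": 1})
group.apply_next()
group.commit({"a": 2, "b": 2})
group.apply_next(only_keys={"a"})  # second transaction is only half applied

snapshot = (group.snapshot_read("a"), group.snapshot_read("b"))
inconsistent = (group.inconsistent_read("a"), group.inconsistent_read("b"))
current = (group.current_read("a"), group.current_read("b"))
print(snapshot, inconsistent, current)
```

In this run the snapshot read sees only the first transaction, the inconsistent read sees a mix of both, and the current read sees the second transaction in full.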


Complete transaction lifecycle in Megastore
1. Read
2. Application logic
3. Commit
4. Apply
5. Clean up
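The five steps above can be sketched in Python for a single, unreplicated entity group. This is only an illustration: the commit step here is a plain check on the log position standing in for the Paxos round that Megastore uses, and all names (`ToyEntityGroup`, `run_transaction`, `ConflictError`) are hypothetical.

```python
class ConflictError(Exception):
    """Raised when another writer claimed the log position first."""


class ToyEntityGroup:
    def __init__(self):
        self.log = []   # committed log entries (dicts of mutations)
        self.data = {}  # applied state

    def read(self):
        """Step 1: return the next log position and a snapshot of the data."""
        return len(self.log), dict(self.data)

    def commit(self, read_position, mutations):
        """Step 3: append at read_position only if nobody else wrote there.

        A real deployment would reach consensus across replicas here; this
        sketch replaces that with a local length check.
        """
        if len(self.log) != read_position:
            raise ConflictError("log advanced since read; retry the transaction")
        self.log.append(dict(mutations))
        return read_position  # index of the new log entry

    def apply(self, position):
        """Step 4: write the committed mutations into the applied state."""
        self.data.update(self.log[position])


def run_transaction(group, logic):
    position, snapshot = group.read()          # 1. Read
    mutations = logic(snapshot)                # 2. Application logic
    index = group.commit(position, mutations)  # 3. Commit
    group.apply(index)                         # 4. Apply
    # 5. Clean up: nothing to delete in this toy model.
    return index


def increment(snapshot):
    return {"count": snapshot.get("count", 0) + 1}


group = ToyEntityGroup()
run_transaction(group, increment)
run_transaction(group, increment)

# Simulate a lost race: a stale read position is rejected at commit time.
stale_position, _ = group.read()
run_transaction(group, lambda snapshot: {"other": 1})
try:
    group.commit(stale_position, {"count": 99})
    conflict = False
except ConflictError:
    conflict = True
print(group.data, conflict)
```

The stale writer loses because the log position it read has already been taken, which is the optimistic concurrency behaviour the lifecycle implies: the losing transaction would re-read and retry.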


Queues
• Example: Calendar application

Two-phase commit


Paxos


Replications

Modified Paxos


• Fast reads
• Fast writes


New Replica Types
• Full replicas
• Witness replicas
• Read-only replicas
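One way to picture how the replica types differ is by whether they vote and whether they hold data. The Python sketch below encodes that view: full replicas vote and store data, witnesses vote without storing entity data, and read-only replicas store data without voting. The names (`ReplicaType`, `has_write_quorum`) and the simple majority rule are illustrative assumptions, not Megastore's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaType:
    name: str
    votes: bool        # participates in write votes
    stores_data: bool  # holds entity and index data


FULL = ReplicaType("full", votes=True, stores_data=True)
WITNESS = ReplicaType("witness", votes=True, stores_data=False)
READ_ONLY = ReplicaType("read-only", votes=False, stores_data=True)


def has_write_quorum(replicas, acks):
    """Return True if the acknowledging replicas form a majority of the voters."""
    voters = [r for r in replicas if r.votes]
    voting_acks = [r for r in acks if r.votes]
    return len(voting_acks) > len(voters) // 2


replicas = [FULL, FULL, WITNESS, READ_ONLY]  # three voters, one non-voter
with_witness = has_write_quorum(replicas, [FULL, WITNESS])
with_read_only = has_write_quorum(replicas, [FULL, READ_ONLY])
print(with_witness, with_read_only)
```

A witness can break a tie between two full replicas, while an acknowledgement from a read-only replica does not count toward the majority.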


Replicated Logs


Data structures and algorithms

Reads


Writes


Failure Detection

• Chubby lock service


Write Throughput
• Sharding entity groups
• Placing replicas in the same region
• Bulk processing


Limitations
• Latency
• Chain gang throttling
• Does not enforce policies on physical layout


Experience


Related Work
• NoSQL: Bigtable, Cassandra, Yahoo PNUTS, Amazon SimpleDB
• Data replication process: HBase, CouchDB, Dynamo
• Paxos algorithm: Scalaris, Keyspace


Questions
1. Megastore is built upon ______.
2. Synchronous replication is based upon ______.
3. Data is partitioned into a vast space of small databases, each with its own replicated ______.
4. The data model is declared in a strongly typed ______.
5. Megastore tables are either ______ or ______ tables.
6. The 3 levels of read consistency are: ______.
7. Cross entity group updates are supported by: ______.

Answers
1. Bigtable
2. Paxos
3. Log
4. Schema
5. Root, child
6. Current, snapshot, inconsistent
7. Two-phase commit

Any Questions?