Nonrelational Databases

Non-relational Databases: a new kind of database for handling web scale

Agenda

The problem
The solution
Benefits
Cost
Example: Cassandra

The problem

The Web introduces a new scale for applications, in terms of:

Concurrent users (millions of requests per second)
Data (petabytes generated daily)
Processing (all this data needs processing)
Exponential growth (surging, unpredictable demand)

The problem (contd.)

Websites with very heavy traffic cannot handle this scale with existing RDBMS solutions:

Oracle
MS SQL
Sybase
MySQL
PostgreSQL

Not even with their high-end clustering solutions

The problem (contd.)

Applications that use a normalized database schema depend on joins, which perform poorly over large data volumes and/or many nodes
Existing RDBMS clustering solutions rely on scale-up, which is limited and does not keep pace with exponential growth
Machines have upper limits on capacity, and sharding the data and processing across machines is very complex and application-specific

The problem (contd.)

Why not just use sharding?
  Very problematic when adding or removing nodes
  Basically, you end up denormalizing everything and losing all the benefits of relational databases

Who faced this problem?

Web applications dealing with high traffic, massive data, a large user base and user-generated content, such as:

Google
Yahoo!
Amazon
Facebook
Twitter
LinkedIn
& many more

One difference, though

Compared to traditional large applications (telco, financial, etc.), these web applications are usually free and can therefore sacrifice some data integrity / consistency

No one will sue them for not receiving the most current:

Status of their friends (Facebook/Twitter)
Web search result (Google/Yahoo!)
Item added to cart (Amazon)

The solution

These companies had to come up with a new kind of DBMS, capable of handling web scale
  Possibly sacrificing some level of consistency or some other feature

Must we sacrifice something?

In 2000, Eric Brewer (co-founder of Inktomi) formulated the CAP theorem, which states that a distributed system can fully provide only 2 of these 3 properties:

Consistency
Availability
Partition tolerance

The theorem was proved in 2002 by MIT researchers (Seth Gilbert and Nancy Lynch)

Simple example

When you have a lot of data that needs to be highly available, you usually need to partition it across machines and also replicate it for fault tolerance
This means that when writing a record, all of its replicas must be updated too
Now you need to choose between:
  Lock all relevant replicas during the update => be less available
  Don't lock the replicas => be less consistent

The consequence

You need to either:

Drop partition tolerance (CA)
Drop availability (CP)
Drop consistency (AP)

"Drop" here is usually not binary, but rather tunable

Non-relational databases

The solution these companies came up with is a family of databases for handling web scale:

BigTable (developed at Google)
HBase (started at Powerset, modeled on BigTable)
Dynamo (developed at Amazon)
Cassandra (developed at Facebook)
Voldemort (developed at LinkedIn)
& a few more: Riak, Redis, CouchDB, MongoDB, Hypertable

Benefits

Massively scalable
Extremely fast
Highly available, decentralized and fault tolerant (no single point of failure)
Transparent sharding (consistent hashing)
Elasticity
Parallel processing
Dynamic schema
Automatic conflict resolution

Consistent hashing

Replication

Replication: node joining

Replication: node leaving

Scale-out / elasticity?
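The consistent hashing and replication slides above survive only as titles, so a small sketch may help show why this design scales out. It is a minimal Python illustration, not any particular database's implementation: `HashRing`, `vnodes` and the node names are hypothetical, and MD5 is used only to spread keys around the ring.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns many small arcs of the ring, which evens out the load.
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        # The first ring position clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}

moved = sum(before[k] != after[k] for k in keys)
print(f"keys moved after adding a node: {moved / len(keys):.0%}")
```

Only the keys taken over by the new node move. With plain modulo sharding (hash of the key modulo the number of nodes), most keys would map to a different node after the same change, which is the add/remove problem raised earlier.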

O(1) distributed hash table
Runs on a large number of cheap commodity machines
Replication
Gossip protocol
Transparently handles adding/removing nodes

Tunable consistency?
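Tunable consistency is commonly expressed with three numbers, the N, R, W parameters this slide mentions. The sketch below assumes the Dynamo-style reading of them: N replicas per record, W replicas that must acknowledge a write, and R replicas consulted on a read. The function names are hypothetical. It checks the rule of thumb R + W > N against a brute-force enumeration of replica sets.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Rule of thumb: every read set meets every write set when R + W > N."""
    return r + w > n


def read_can_miss_latest_write(n: int, r: int, w: int) -> bool:
    """Brute force: is there a read set that shares no replica with some write set?"""
    replicas = range(n)
    return any(
        not set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


for n, r, w in [(3, 1, 1), (3, 2, 2), (3, 1, 3), (3, 3, 1)]:
    verdict = "reads see the latest write" if quorums_overlap(n, r, w) else "reads may be stale"
    print(f"N={n} R={r} W={w}: {verdict}")
```

Lower R and W mean fewer replicas to wait for, so latency drops and availability rises, at the price of possibly stale reads. That is the tuning knob.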

Levels of consistency:

Strict consistency
Read-your-writes consistency
Session consistency
Monotonic read consistency
Eventual consistency

Tunable means: how many replicas to lock on write

N, R, W parameters
Quorum

Dealing with inconsistency
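Vector clocks, one of the two techniques this slide names, let a store tell whether one version of a record supersedes another or whether the two were written concurrently and conflict. A minimal sketch, assuming a clock is a plain mapping from node name to update counter; the function and node names are hypothetical.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock after `node` performs one more update."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every update that clock `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    a_covers_b, b_covers_a = descends(a, b), descends(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "a is newer"
    if b_covers_a:
        return "b is newer"
    return "conflict"  # concurrent updates: neither version supersedes the other


def merge(a: dict, b: dict) -> dict:
    """Clock for a reconciled version: the element-wise maximum of both clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


v1 = increment({}, "node-a")    # first write, handled by node-a
v2 = increment(v1, "node-a")    # a later write through node-a
v3 = increment(v1, "node-b")    # a concurrent write through node-b

print(compare(v2, v1))
print(compare(v2, v3))
```

When `compare` reports a conflict, both versions are kept until something reconciles them (the application, or a rule such as last-write-wins), and the reconciled version carries the merged clock.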

Read-repair (when encountering inconsistency)
Vector clock conflict resolution

Dynamic schema

Column families (basically a sparse table)
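One way to picture the sparse table above is as a map of maps, where each row stores only the columns it actually has. This is a hedged Python sketch of the idea, not a real client API; the `users` family, its rows and `get_column` are made up for illustration.

```python
from collections import defaultdict

# A column family modelled as: row key -> {column name -> value}.
users = defaultdict(dict)

users["alice"]["email"] = "alice@example.com"
users["alice"]["city"] = "Paris"
users["bob"]["email"] = "bob@example.com"
users["bob"]["twitter"] = "@bob"  # a column that no other row has


def get_column(family, row_key, column, default=None):
    """Read one column; a missing column is simply absent, not a NULL cell."""
    return family.get(row_key, {}).get(column, default)


print(get_column(users, "bob", "twitter"))
print(get_column(users, "alice", "twitter"))
```

No schema change is needed to give one row a new column, which is what makes the schema dynamic.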

Dynamic schema (contd.)

A supercolumn is a collection of columns
A record can have several supercolumns

Data processing
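The word-count example this slide names can be sketched in a few lines. This is a single-process Python illustration of the pattern only, with hypothetical function names; a real system would run the map step on the nodes that hold the data and the reduce step in parallel.

```python
from collections import defaultdict


def map_words(text):
    """Map step: emit a (word, 1) pair for every word in one text."""
    return [(word.lower(), 1) for word in text.split()]


def reduce_counts(word, counts):
    """Reduce step: combine all the counts emitted for one word."""
    return word, sum(counts)


def map_reduce(texts):
    grouped = defaultdict(list)
    for text in texts:  # each text could be mapped on the node that stores it
        for word, count in map_words(text):
            grouped[word].append(count)  # group the emitted pairs by key
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


print(map_reduce(["the cat", "The dog and the cat"]))
```

The programmer writes only `map_words` and `reduce_counts`; distribution, grouping and retries are left to the framework.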

Map/Reduce: an API exposed by non-relational databases to process data

A functional programming pattern for parallelizing work
Brings the workers to the data: an excellent fit for non-relational databases
Reduces the programming to 2 simple functions (map & reduce)
Example: count appearances of a word in a giant table of large texts

Map/Reduce (contd.)

Storage

Cost

Allows sacrificing consistency (ACID) under certain circumstances (but can deal with it)
Non-standard, new API model
Non-standard, new schema model
New knowledge required to tune/optimize
Less mature

API model
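The key-value style of API this slide describes can be sketched as a small in-memory class. The four method names follow the slide; everything else is an assumption for illustration, in particular the meaning given here to `execute` (apply a function to the values stored under a list of keys), which the slide does not spell out.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a key-value database client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)  # deleting a missing key is a no-op

    def execute(self, operation, key_list):
        # Assumed semantics: run `operation` next to the data, one key at a time.
        return {key: operation(self._data[key]) for key in key_list if key in self._data}


store = KeyValueStore()
store.put("greeting", "hello")
store.put("profile:1", {"name": "alice", "city": "Paris"})  # value as a record

print(store.get("greeting"))
print(store.execute(len, ["greeting", "profile:1", "missing"]))
```

There is no query language and no join in this model: everything is addressed by key, which is what lets the store spread keys across machines.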

Usually similar to a key-value map:

Get(key)
Put(key, value)
Delete(key)
Execute(operation, key_list)

value can be:

an opaque serialized object
a record (list of columns: )

Schema model

Kind of a sparse table
No schema

Example: Cassandra

Features:

O(1) DHT
Eventual consistency
  tunable: consistency vs. latency
Values are structured, indexed
Columns / column families
Slicing with predicates (queries)
PartitionOrderer

Cassandra performance
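Part of the explanation this slide gives for fast writes is that they involve no reads or seeks. The toy Python sketch below illustrates that idea with an append-only log plus an in-memory table. It is an assumption-laden simplification, not Cassandra's code: `LogStructuredStore`, the tab-separated log format and the file name are all hypothetical.

```python
import os
import tempfile


class LogStructuredStore:
    """Toy write path: append to a log file, then update an in-memory table."""

    def __init__(self, log_path):
        self._log = open(log_path, "a", encoding="utf-8")
        self._memtable = {}

    def write(self, key, value):
        # Sequential append: the old value is never looked up or overwritten on disk.
        self._log.write(f"{key}\t{value}\n")
        self._log.flush()
        self._memtable[key] = value

    def read(self, key):
        return self._memtable.get(key)

    def close(self):
        self._log.close()


log_path = os.path.join(tempfile.mkdtemp(), "commit.log")
store = LogStructuredStore(log_path)
store.write("user:1", "alice")
store.write("user:1", "alicia")  # an update is just another append
store.close()

print(store.read("user:1"))
```

An update in a B-tree based engine typically has to locate the existing row first. Here both writes cost one append each, and the newest value wins on read.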

Benchmark against MySQL (50GB)

MySQL:
  300ms write
  350ms read
Cassandra:
  0.12ms write
  15ms read

How come writes are so fast?
  Writes involve no reads/seeks
  Use any node (closest to you)

Cassandra API

Cassandra API (contd.)

Example: Cassandra (contd.)

Java API

Simple DAO
Simple client

Cassandra usage

Very high-traffic sites:

Facebook
Digg
Twitter

Further information

The Dynamo paper:

http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

NoSQL patterns:

http://horicky.blogspot.com/2009/11/nosql-patterns.html

NoSQL conference videos:

https://nosqleast.com/2009/

Hebrew podcast covering NoSQL & Cassandra (episodes 56, 57 & more):

http://www.reversim.com/

Further information (contd.)

Ran Tavori's lecture (video + slides):

http://prettyprint.me/2010/01/09/introduction-to-nosql-and-cassandra-part-1/

http://prettyprint.me/2010/01/20/introduction-to-nosql-and-cassandra-part-2/
