57
Università degli Studi di Roma Tor VergataDipartimento di Ingegneria Civile e Ingegneria Informatica NoSQL Data Stores Corso di Sistemi e Architetture per Big Data A.A. 2017/18 Valeria Cardellini The reference Big Data stack Valeria Cardellini - SABD 2017/18 1 Resource Management Data Storage Data Processing High-level Interfaces Support / Integration

NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Università degli Studi di Roma “Tor Vergata” Dipartimento di Ingegneria Civile e Ingegneria Informatica

NoSQL Data Stores

Corso di Sistemi e Architetture per Big Data A.A. 2017/18

Valeria Cardellini

The reference Big Data stack

Valeria Cardellini - SABD 2017/18

1

Resource Management

Data Storage

Data Processing

High-level Interfaces Support / Integration

Page 2: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Traditional RDBMSs

•  RDBMSs: the traditional technology for storing structured data in web and business applications

•  SQL is good –  Rich language and toolset –  Easy to use and integrate –  Many vendors

•  RDBMSs promise ACID guarantees

Valeria Cardellini - SABD 2017/18

2

ACID properties

•  Atomicity –  All included statements in a transaction are either executed

or the whole transaction is aborted without affecting the database (“all or nothing” principle)

•  Consistency –  A database is in a consistent state before and after a

transaction

•  Isolation –  Transactions cannot see uncommitted changes in the

database (i.e., the results of incomplete transactions are not visible to other transactions)

•  Durability –  Changes are written to a disk before a database commits a

transaction so that committed data cannot be lost through a power failure.

Valeria Cardellini - SABD 2017/18

3

Page 3: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

RDBMS constraints

•  Domain constraints –  Restricts the domain of each attribute or the set of possible

values for the attribute

•  Entity integrity constraint –  No primary key value can be null

•  Referential integrity constraint –  To maintain consistency among the tuples in two relations:

every value of one attribute of a relation should exist as a value of another attribute in another relation

•  Foreign key –  To cross-reference between multiple relations: it is a key in a

relation that matches the primary key of another relation

Valeria Cardellini - SABD 2017/18

4

Pros and cons of RDBMS

•  Well-defined consistency model

•  ACID guarantees •  Relational integrity

maintained through entity and referential integrity constraints

•  Well suited for OLTP apps OLTP: OnLine Transaction Processing

•  Sound theoretical foundation •  Stable and standardized

DBMSs available •  Well understood

Valeria Cardellini - SABD 2017/18

5

•  Performance as major constraint, scaling is difficult

•  Limited support for complex data structures

•  Complete knowledge of DB structure required to create ad hoc queries

•  Commercial DBMSs are expensive

•  Some DBMSs have limits on fields size

•  Data integration from multiple RDBMSs can be cumbersome

Pros Cons

Page 4: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

RDBMS challenges

•  Web-based applications caused spikes –  Internet-scale data size –  High read-write rates –  Frequent schema changes

•  Let’s scale RDBMSs –  RDBMS were not designed to be distributed

•  Possible solutions: –  Replication –  Sharding

Valeria Cardellini - SABD 2017/18

6

Replication

•  Primary backup with master/worker architecture •  Replication improves read scalability •  Write operations?

Valeria Cardellini - SABD 2017/18

7

Page 5: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Sharding

•  Horizontal partitioning of data across many separate servers

•  Scales read and write operations •  Cannot execute transactions across shards

(partitions) •  Consistent hashing is one form of sharding

Valeria Cardellini - SABD 2017/18

8

-  Hash both data and nodes using the same hash function in a same ID space

Scaling RDBMSs is expensive and inefficient

Valeria Cardellini - SABD 2017/18

9

Source: Couchbase technical report

Page 6: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

NoSQL data stores •  NoSQL = Not Only SQL

–  SQL-style querying is not the crucial objective

•  Main features of NoSQL data stores –  Support flexible schema

•  No requirement for fixed rows in a table schema –  Scale horizontally

•  Partitioning of data and processing over multiple nodes –  Provide scalability and high availability by replicating data in

multiple nodes, often across datacenters –  Multiprocessor support –  Mainly utilize shared-nothing architecture

•  With exception of graph-based database Valeria Cardellini - SABD 2017/18

10

NoSQL data stores (2) •  Main features of NoSQL data stores

(continued) –  Avoid unneeded complexity

•  E.g., elimination of join operations

–  Useful when working with Big data when the data’s nature does not require a relational model

–  Support weaker concurrency models than the standard ACID transaction model

•  Rather BASE: compromising reliability for better performance

Valeria Cardellini - SABD 2017/18

11

Page 7: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

ACID vs BASE

12

•  Two design philosophies at opposite ends of the consistency-availability spectrum -  Keep in mind the CAP theorem! Pick two of Consistency, Availability and Partition tolerance

•  ACID: the traditional approach to address the consistency issue in RDBMS –  A pessimistic approach: prevent conflicts from

occurring •  Usually implemented with write locks managed by the

system

–  But ACID does not scale well when handling petabytes of data (remember of latency!)

Valeria Cardellini - SABD 2017/18

ACID vs BASE (2)

13

•  BASE stands for Basically Available, Soft state, Eventual consistency –  An optimistic approach

•  Lets conflicts occur, but detects them and takes action to sort the out

•  Approaches: •  conditional updates: test the value just before updating •  save both updates: record that they are in conflict and

then merge them

–  Basically Available: the system is available most of the time and there could exist a subsystem temporarily unavailable

–  Soft state: data is not durable in the sense that its persistence is in the hand of the user that must take care of refresh them

–  Eventually consistent: the system eventually converge to a consistent state

•  Usually adopted in NoSQL databases Valeria Cardellini - SABD 2017/18

Page 8: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Consistency

•  Biggest change from a centralized RDBMS to a cluster-oriented NoSQL

•  RDBMS: strong consistency –  Traditional RDBMS are CA systems (or CP

systems, depending on the configuration) •  NoSQL systems: mostly eventual consistency

–  AP systems

Valeria Cardellini - SABD 2017/18

14

Consistency: an example •  Ann is trying to book a room of the Ace Hotel in New

York on a node located in London of a booking system

•  Pathin is trying to do the same on a node located in Mumbai

•  The booking system uses a replicated database with the master located in Mumbai and the slave in London

•  There is only a room available •  The network link between the two servers breaks

Valeria Cardellini - SABD 2017/18

15

Ann Pathin

London Mumbay

Page 9: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Consistency: an example •  CA system: neither user can book any hotel room

–  No tolerance to network partitions

•  CP system: –  Pathin can make the reservation –  Ann can see the inconsistent room information but cannot

book the room

•  AP: both nodes accept the hotel reservation –  Overbooking!

•  Remember that the tolerance to this situation depends on the application type –  Blog, financial exchange, shopping chart, …

Valeria Cardellini - SABD 2017/18

16

Pessimistic vs. optimistic approach

Valeria Cardellini - SABD 2017/18 17

•  Concurrency involves a fundamental tradeoff between: -  Safety (avoiding errors such as update conflicts)

and

-  Liveness (responding quickly to clients)

•  Pessimistic approaches often: -  Severely degrade the responsiveness of a system

-  Leads to deadlocks, which are hard to prevent and debug

Page 10: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

NoSQL cost and performance

Valeria Cardellini - SABD 2017/18

18

Source: Couchbase technical report

Pros and cons of NoSQL

•  Easy to scale-out •  Higher performance for

massive data scale •  Allows sharing of data

across multiple servers •  Most solutions are either

open-source or cheaper •  HA and fault tolerance

provided by data replication •  Supports complex data

structures and objetcs •  No fixed schema, supportrs

unstructured data •  Very fast retrieval of data,

suitable for real-time apps Vale

ria C

arde

llini

- S

AB

D 2

017/

18

19

•  Do not provide ACID guarantees, less suitable for OLTP apps

•  No fixed schema, no common data storage model

•  Limited support for aggregation (sum, avg, count, group by)

•  Performance for complex join is poor

•  No well defined approach for DB design (different solutions have different data models)

•  Lack of consistent model can lead to solution lock-in

Pros Cons

Page 11: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Barriers to NoSQL

•  Main barriers to NoSQL adoption –  No full ACID transaction support –  Lack of standardized interfaces –  Huge investments already made in existing

RDBMSs

•  A commercial example –  AWS launched two NoSQL services (SimpleDB in

2007 and later DynamoDB in 2012) and one RDBMS service (RDS in 2009)

Valeria Cardellini - SABD 2017/18

20

NoSQL data models

•  A number of largely diverse data stores not based on the relational data model

Valeria Cardellini - SABD 2017/18

21

Page 12: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

NoSQL data models

•  A data model is a set of constructs for representing the information –  Relational model: tables, columns and rows

•  Storage model: how the DBMS stores and manipulates the data internally

•  A data model is usually independent of the storage model

•  Data models for NoSQL systems: –  Aggregate-oriented models: key-value,

document, and column-family –  Graph-based models

Valeria Cardellini - SABD 2017/18

22

Aggregates •  Data as units that have a complex structure

–  More structure than just a set of tuples –  E.g.: complex record with simple fields, arrays,

records nested inside •  Aggregate pattern in Domain-Driven Design

–  A collection of related objects that we treat as a unit

–  A unit for data manipulation and management of consistency

•  Advantages of aggregates –  Easier for application programmers to work with –  Easier for database systems to handle operating

on a cluster See http://thght.works/1XqYKB0 Valeria Cardellini - SABD 2017/18

23

Page 13: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Transactions?

•  Relational databases do have ACID transactions!

•  Aggregate-oriented databases: –  Support atomic transactions, but only within a

single aggregate –  Don’t have ACID transactions that span multiple

aggregates •  Update over multiple aggregates: possible inconsistent

reads

–  Part of the consideration for deciding how to aggregate data

•  Graph databases tend to support ACID transactions

Valeria Cardellini - SABD 2017/18

24

Key-value data model •  Simple data model in which data is represented as a

collection of key-value pairs –  Associative array (map or dictionary) as fundamental data

model

•  Strongly aggregate-oriented –  Lots of aggregates –  Each aggregate has a key

•  Data model: –  A set of <key,value> pairs –  Value: an aggregate instance

•  The aggregate is opaque to the database –  Just a big blob of mostly meaningless bit

•  Access to an aggregate: –  Lookup based on its key

•  Richer data models can be implemented on top Valeria Cardellini - SABD 2017/18

25

Page 14: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Key-value data model: example

Valeria Cardellini - SABD 2017/18

26

Types of key-value stores

Valeria Cardellini - SABD 2017/18

27

•  Consistency models –  Range from eventual consistency to serializability

•  AP: Dynamo, Voldemort, Riak •  CP: Scalaris, Redis, Berkeley DB

•  Some data stores support ordering of keys

•  Some maintain data in memory (RAM), while others employ HDs or SSDs

Page 15: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Query features in key-value data stores

•  Only query by the key! –  There is a key and there is the rest of the data (the value)

•  It is not possible to use some attribute of the value column

•  The key needs to be suitably chosen –  E.g., session ID for storing session data

•  What if we don’t know the key? –  Some system allows the search inside the value using a full-

text search (e.g., using Apache Solr)

Valeria Cardellini - SABD 2017/18

28

Suitable use cases for key-value data stores •  Storing session information in web apps

–  Every session is unique and is assigned a unique sessionId value

–  Store everything about the session using a single put, request or retrieved using get

•  User profiles and preferences –  Almost every user has a unique userId, username, …, as

well as preferences such as language, which products the user has access to, …

–  Put all into an object, so getting preferences of a user takes a single get operation

•  Shopping cart data –  All the shopping information can be put into the value where

the key is the userId

Valeria Cardellini - SABD 2017/18

29

Page 16: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Examples of key-value stores

•  Amazon’s Dynamo is the most notable example –  Riak: open-source implementation

•  Other key-value stores include: –  Amazon DynamoDB

•  Data model and name from Dynamo, but different implementation –  Berkeley DB (ordered keys) –  Memcached, Redis, Hazelcast (in-memory data stores) –  Oracle NoSQL Database –  Encache (Java-based cache) –  Scalaris –  Voldemort (used by LinkedIn) –  upscaledb –  LevelDB (written by Google’s fellows and open source)

Valeria Cardellini - SABD 2017/18

30

Document data model

•  Strongly aggregate-oriented –  Lots of aggregates –  Each aggregate has a key

•  Similar to a key-value store (unique key), but API or query/update language to query or update based on the internal structure in the document –  The document content is no longer opaque

•  Similar to a column-family store, but values can have complex documents, instead of fixed format

•  Document: encapsulates and encodes data in some standard formats or encodings –  XML, YAML, JSON, BSON, …

Valeria Cardellini - SABD 2017/18

31

Page 17: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Document data model •  Data model:

–  A set of <key, document> pairs –  Document: an aggregate instance

•  Structure of the aggregate is visible –  Limits on what we can place in it

•  Access to an aggregate –  Queries based on the fields in the aggregate

•  Flexible schema –  No strict schema to which documents must conform, which

eliminates the need of schema migration efforts

Valeria Cardellini - SABD 2017/18

32

Document data model

Valeria Cardellini - SABD 2017/18

33

•  The data model resembles the JSON object

Page 18: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Document data store API

•  Usual CRUD operations (although not standardized) –  Creation (or insertion) –  Retrieval (or query, search, finds)

•  Not only simple key-to-document lookup •  API or query language that allows the user to retrieve

documents based on content (or metadata)

–  Update (or edit) •  Replacement of the entire document or individual

structural pieces of the document

–  Deletion (or removal)

Valeria Cardellini - SABD 2017/18

34

Examples of document stores

•  MongoDB and CouchDB are the two major representatives –  You will use MongoDB in lab –  Documents grouped together to form collections –  Collections organized into databases

•  Other popular document stores include: –  Couchbase –  DocumentDB: PaaS service in Microsoft’s Azure cloud platform –  RethinkDB –  RavenDB (for .NET)

Valeria Cardellini - SABD 2017/18

35

Page 19: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Key-value vs. document stores •  Key-value store

–  A key plus a big blob of mostly meaningless bits –  We can store whatever we like in the aggregate –  We can only access an aggregate by lookup based on its

key

•  Document store –  A key plus a structured aggregate –  More flexibility in access –  We can submit queries to the database based on the fields

in the aggregate –  We can retrieve part of the aggregate rather than the whole

thing –  Indexes based on the contents of the aggregate

Valeria Cardellini - SABD 2017/18

36

Suitable use cases for document stores

•  Good for storing and managing big data-size collections of semi-structured data with a varying number of fields –  Text and XML documents, email messages –  Conceptual documents like de-normalized

(aggregate) representations of DB entities such as product or customer

–  Sparse data in general, i.e., irregular (semi-structured) data that would require an extensive use of nulls in RDBMS

•  Nulls being placeholders for missing or nonexistent values

Valeria Cardellini - SABD 2017/18

37

Page 20: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

When not to use document stores

•  Complex transactions spanning different operations –  Document stores unsuited for atomic cross-

document operations

•  Queries against varying aggregate structure –  Data is saved as an aggregate in the form of

application entities. If the structure of the aggregate constantly changes, the aggregate is saved at the lowest level of granularity. In this scenario, document stores may not work

Valeria Cardellini - SABD 2017/18

38

Column-family data model •  Strongly aggregate-oriented

–  Lots of aggregates –  Each aggregate has a key

•  Similar to a key-value store, but the value can have multiple attributes (columns)

•  Data model: a two-level map structure –  A set of <row-key, aggregate> pairs –  Each aggregate is a group of pairs <column-key, value> –  Column: a set of data values of a particular type

•  Aggregate structure is visible •  Columns can be organized in families

–  Data usually accessed together

Valeria Cardellini - SABD 2017/18

39

Page 21: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Column-family data model: example

Valeria Cardellini - SABD 2017/18

40

•  Representing customer information in a column-family structure

Column-store vs. row-store

Valeria Cardellini - SABD 2017/18

41

•  Row-store systems: store and process data by row –  However, DBMSs support indexes to improve the performance

of set-wide operations on the whole tables

•  Column-store systems: store and process data by column –  Can access faster the needed data rather than scanning and

discarding unwanted data in row –  Examples: C-Store (pre-NoSQL) and its commercial fork Vertica

https://bit.ly/2pwrBMN –  Bigtable and Cassandra: no column stores in the original sense

of the term, but column families are stored separately

Page 22: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Properties of column-family stores

Valeria Cardellini - SABD 2017/18

42

•  In many analytical databases queries, few attributes are needed –  Example: aggregate queries (avg, max, …) –  Column-store is preferable for performance

•  Moreover, both rows and columns are split over multiple nodes using sharding to achieve scalability

•  So column-family stores are suitable for read-mostly, read-intensive, large data repositories

Properties of column-family stores •  Operations also allow picking out a particular column

See slide 40: get('1234', 'name')!•  Each column:

–  Has to be part of a single column family –  Acts as unit for access

•  You can add any column to any row, and rows can have very different columns

•  You can model a list of items by making each item a separate column

•  Two ways to look at data: –  Row-oriented

•  Each row is an aggregate •  Column families represent useful chunks of data within that

aggregate –  Column-oriented

•  Each column family defines a record type •  Row as the join of records in all column families

Valeria Cardellini - SABD 2017/18

43

Page 23: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Suitable use cases for column-family stores

•  Queries that involve only a few columns •  Aggregation queries against vast amounts of

data -  E.g., average age of all of your users

•  Column-wise compression •  Well-suited for OLAP-like workloads (e.g.,

data warehouses) which typically involve highly complex queries over all data (possibly petabytes)

Valeria Cardellini - SABD 2017/18

44

Examples of column-family stores •  Google’s Bigtable is the most notable example •  Other column-family stores include:

–  Apache HBase: open-source implementation providing Bigtable-like capabilities on top of Hadoop and HDFS

–  Apache Accumulo: based on the design of BigTable and powered by Apache Hadoop, Zookeeper, and Thrift

•  Different APIs and different nomenclature from HBase, but same in operational and architectural standpoint

•  Better security –  Cassandra –  Hypertable –  Amazon Redshift

Valeria Cardellini - SABD 2017/18

45

Page 24: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Graph data model •  Uses graph structures with nodes, edges, and

properties to represent stored data –  Nodes are the entities and have a set of attributes –  Edges are the relationships between the entities

•  E.g.: an author writes a book –  Edges can be directed or undirected –  Nodes and edges also have individual properties consisting of

key-value pairs

•  Replaces relational tables with structured relational graphs of interconnected key-value pairs

•  Powerful data model –  Differently from other types of NoSQL stores, it concerns itself

with relationships –  Focus on visual representation of information (more human-

friendly than other NoSQL stores) –  Other types of NoSQL stores are poor for interconnected data

Valeria Cardellini - SABD 2017/18

46

Graph data model: example

•  Small example from Neo4j

Valeria Cardellini - SABD 2017/18

47

Page 25: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Graph database

•  Explicit graph structure •  Each node knows its adjacent nodes

–  As the number of nodes increases, the cost of local step (or hop) remains the same

•  Plus an index for lookups

•  Cons: –  Sharding: data partitioning is difficult –  Horizontal scalability

•  When related nodes are stored on different servers, traversing multiple servers is not performance-efficient

–  Require rewiring your brain Valeria Cardellini - SABD 2017/18

48

Examples of graph databases

•  Neo4j •  OrientDB •  InfiniteGraph •  AllegroGraph •  HyperGraphDB

Valeria Cardellini - SABD 2017/18

49

Page 26: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Suitable use cases for graph databases •  Good for applications where you need to model

entities and relationships between them –  Social networking applications –  Pattern recognition –  Dependency analysis –  Recommendation systems –  Solving path finding problems raised in navigation systems –  …

•  Good for applications in which the focus is on querying for relationships between entities and analyzing relationships –  Computing relationships and querying related entities is

simpler and faster than in RDBMS

Valeria Cardellini - SABD 2017/18

50

Case studies

•  Key-value data stores: Amazon’s Dynamo (and Riak KV), Redis

•  Document-oriented data stores: MongoDB •  Column-family data stores: Google’s Bigtable

(and HBase), Cassandra •  Graph databases: Neo4j

In blue: Lab

Valeria Cardellini - SABD 2017/18

51

Page 27: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Case study: Amazon’s Dynamo •  Highly available and scalable distributed key-

value data store built for Amazon’s platform –  A very diverse set of Amazon applications with

different storage requirements –  Need for storage technologies that are always

available on a commodity hardware infrastructure •  E.g., shopping cart service: “Customers should be able to view

and add items to their shopping cart even if disks are failing, network routes are flapping, or data centers are being destroyed by tornados”

–  Meet stringent Service Level Agreements (SLAs) •  E.g., “service guaranteeing that it will provide a response within

300ms for 99.9% of its requests for a peak client load of 500 requests per second.”

Valeria Cardellini - SABD 2017/18

52

G. DeCandia et al., "Dynamo: Amazon's highly available key-value store", Proc. of ACM SOSP 2007.

Dynamo: Features

•  Simple key-value API –  Simple operations to read (get) and write (put) objects uniquely

identified by a key –  Each operation involves only one object at time

•  Focus on eventually consistent store –  Sacrifices consistency for availability –  BASE rather than ACID

•  Efficient usage of resources •  Simple scale-out schema to manage increasing data

set or request rates

•  Internal use of Dynamo –  Security is not an issue since operation environment is

assumed to be non-hostile

Valeria Cardellini - SABD 2017/18

53

Page 28: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Design principles •  Sacrifice consistency for availability (CAP theorem) •  Use optimistic replication techniques •  Possible conflicting changes which must be detected

and resolved: when to resolve them and who resolves them? –  When: execute conflict resolution during reads rather than

writes, i.e. “always writeable” data store –  Who: data store or application; if data store, use simple

policy (e.g., “last write wins”)

•  Other key principles: –  Incremental scalability

•  Scale-out with minimal impact on the system –  Simmetry and decentralization

•  P2P techniques –  Heterogeneity

Valeria Cardellini - SABD 2017/18

54

Dynamo: API

•  Each stored object has an associated key •  Simple API including get() and put() operations to

read and write objects get(key)!

•  Returns single object or list of objects with conflicting versions and context

•  Conflicts are handled on reads, never reject a write put(key, context, object)!

•  Determines where the replicas of the object should be placed based on the associated key, and writes the replicas to disk

•  Context encodes system metadata, e.g., version number –  Both key and object treated as opaque array of bytes –  Key: 128-bit MD5 hash applied to client supplied key

Valeria Cardellini - SABD 2017/18

55

Page 29: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Used techniques

Valeria Cardellini - SABD 2017/18

56

Problem Technique Advantage

Partitioning Consistent hashing Incremental scalability

High Availability for writes Vector clocks with reconciliation during reads

Version size is decoupled from update rates

Handling temporary failures

Sloppy Quorum and hinted handoff

Provides high availability and durability guarantee

when some of the replicas are not available

Recovering from permanent failures

Anti-entropy using Merkle trees

Synchronizes divergent replicas in the background

Membership and failure detection

Gossip-based membership protocol and

failure detection

Preserves symmetry and avoids having a

centralized registry for storing membership and

node liveness information

Dynamo: Data partitioning •  Consistent hashing: output range of a hash is treated

as a ring (similar to Chord) –  MD5(key) -> node (position on the ring) –  Differently from Chord: zero-hop DHT

•  “Virtual nodes” –  Each node can be responsible for more than one virtual

node –  Work distribution proportional to the capabilities of the

individual node

Valeria Cardellini - SABD 2017/18

57

Page 30: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Replication

•  Each object is replicated on N nodes –  N is a parameter configured per-instance by the application

•  Preference list: list of nodes that is responsible for storing a particular key –  More than N nodes to account for node failures –  See figure: object identified by key K is replicated on nodes

B, C and D

Valeria Cardellini - SABD 2017/18

58

•  Node D will store the keys in the ranges (A, B], (B, C], and (C, D]

Dynamo: Used techniques

Valeria Cardellini - SABD 2017/18

59

Problem Technique Advantage

Partitioning Consistent hashing Incremental scalability

High availability for writes Vector clocks with reconciliation during reads

Version size is decoupled from update rates

Handling temporary failures

Sloppy Quorum and hinted handoff

Provides high availability and durability guarantee

when some of the replicas are not available

Recovering from permanent failures

Anti-entropy using Merkle trees

Synchronizes divergent replicas in the background

Membership and failure detection

Gossip-based membership protocol and

failure detection

Preserves symmetry and avoids having a

centralized registry for storing membership and

node liveness information

Page 31: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Data versioning •  A put() call may return to its caller before the

update has been applied at all the replicas •  A get() call operation may return an object that

does not have the latest updates •  Version branching can also happen due to node/

network failures •  Problem: multiple versions of an object, that the

system needs to reconcile •  Solution: use vectorial clocks to capture the casuality

among different versions of the same object –  If causal: older version can be forgotten –  If concurrent: conflict exists, requiring reconciliation

Valeria Cardellini - SABD 2017/18

60

Dynamo: Used techniques

Valeria Cardellini - SABD 2017/18

61

Problem Technique Advantage

Partitioning Consistent hashing Incremental scalability

High Availability for writes Vector clocks with reconciliation during reads

Version size is decoupled from update rates

Handling temporary failures

Sloppy Quorum and hinted handoff

Provides high availability and durability guarantee

when some of the replicas are not available

Recovering from permanent failures

Anti-entropy using Merkle trees

Synchronizes divergent replicas in the background

Membership and failure detection

Gossip-based membership protocol and

failure detection

Preserves symmetry and avoids having a

centralized registry for storing membership and

node liveness information

Page 32: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Sloppy quorum

•  R/W: minimum numebr of nodes at must participate in a successful read/write operation

•  Setting R + W > N yields a quorum-like system –  The latency of a get (or put) operation is dictated by the

slowest of the R (or W) replicas –  R and W are usually configured to be less than N, to provide

better latency –  Typical configuration in Dynamo: (N, R, W) = (3, 2, 2)

•  Balances performance, durability, and availability

•  Sloppy quorum –  Due to partitions, quorums might not exist –  Sloppy quorum: create transient replicas (called hinted

replicas) •  N healthy nodes from the preference list (may not always

be the first N nodes encountered while walking the consistent hashing ring)

Valeria Cardellini - SABD 2017/18

62

Dynamo: Put and get operations •  put operation

–  Coordinator generates new vector clock and writes the new version locally

–  Send to N nodes –  Wait for response from W nodes

•  get operation –  Coordinator requests existing versions from N

•  Wait for response from R nodes

–  If multiple versions, return all versions that are causally unrelated

–  Divergent versions are then reconciled –  Reconciled version written back

Valeria Cardellini - SABD 2017/18

63

Page 33: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Hinted handoff

•  Consider N = 3; if A is temporarily down or unreachable, put will use D

•  D knows that the replica belongs to A

•  Later, D detects A is alive –  Sends the replica to A –  Removes the replica

Valeria Cardellini - SABD 2017/18

64

•  Hinted handoff for transient failures

•  Again, “always writeable” principle

Dynamo: Used techniques

Valeria Cardellini - SABD 2017/18

65

Problem Technique Advantage

Partitioning Consistent hashing Incremental scalability

High Availability for writes Vector clocks with reconciliation during reads

Version size is decoupled from update rates

Handling temporary failures

Sloppy Quorum and hinted handoff

Provides high availability and durability guarantee

when some of the replicas are not available

Recovering from permanent failures

Anti-entropy using Merkle trees

Synchronizes divergent replicas in the background

Membership and failure detection

Gossip-based membership protocol and

failure detection

Preserves symmetry and avoids having a

centralized registry for storing membership and

node liveness information

Page 34: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Dynamo: Membership management

•  Administrator explicitly adds and removes nodes

•  Gossiping to propagate membership changes –  Eventually consistent view –  O(1) hop overlay

Valeria Cardellini - SABD 2017/18

66

Dynamo: Failure detection and management •  Passive failure detection

–  Use pings only for detection from failed to alive –  In the absence of client requests, node A doesn’t need to

know if node B is alive

•  Anti-entropy mechanism to keep replica synchronized •  Use Merkle trees to rapidly detect inconsistency and

limit the amount of data transferred –  Merkle tree: every leaf node is labeled with the hash of a

data block and every non-leaf node is labeled with the cryptographic hash of the labels of its child nodes

Valeria Cardellini - SABD 2017/18

67

-  Nodes maintain Merkle tree of each key range

-  Exchange root of Merkle tree to check if the key ranges are updated

Page 35: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Riak KV

•  Distributed NoSQL key-value data store inspired by Dynamo –  Open-source version, http://docs.basho.com/riak/kv/2.2.3/

•  Like Dynamo –  Employs consistent hashing to partition and replicate data

around the ring –  Makes use of gossiping to propagate membership changes –  Uses vector clocks to resolve conflicts –  Nodes can be added and removed from the Riak cluster as

needed –  Two ways of resolving update conflicts:

•  Last write wins •  Both values are returned allowing the client to resolve the

conflict

•  Can run on Mesos Valeria Cardellini - SABD 2017/18

68

Case study: Google’s Bigtable

Valeria Cardellini - SABD 2017/18

69

•  Built on GFS, Chubby (lock service), SSTable (log-structured storage) and a few other Google technologies –  Data storage organized in tables, whose rows are distributed

over GFS

•  In 2015 made available as a service on Google Cloud Platform: Cloud Bigtable

•  Underlies Google Cloud Datastore •  Used by a number of Google applications, including:

–  Web indexing, MapReduce, Google Maps, Google Earth, YouTube and Gmail

•  LevelDB is based on concepts from Bigtable –  Stores entries lexicographically sorted by keys, but widely noted

for being unreliable Chang et al., "Bigtable: A Distributed Storage System for Structured Data", ACM Trans. Comput. Syst., 2008.

Page 36: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Motivation

•  Lots of semi-structured data at Google –  URLs, geographical locations, ...

•  Big data –  Billions of URLs, hundreds of millions of users,

100+TB of satellite image data, …

Valeria Cardellini - SABD 2017/18

70

Bigtable: Main features

•  Distributed storage structured as a large table –  Distributed multi-dimensional sorted map

•  Fault-tolerant

•  Scalable and self-managing

•  CP system: strong consistency and network partition tolerance

Valeria Cardellini - SABD 2017/18

71

Page 37: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Data model •  Table

–  Distributed, multi-dimensional, sparse and sorted map –  Indexed by rows

•  Rows –  Sorted in lexicographical order by row key –  Every read or write in a row is atomic: no concurrent ops on

same row

•  Columns –  The basic unit of data access –  Sparse table: different rows may use different columns –  Column family: group of columns

•  Data within a column family usually of the same type –  Column family allows for specific optimization for better

access control, storage and data indexing –  Column naming: column-family:column Valeria Cardellini - SABD 2017/18

72

Bigtable: Data model •  Multi-dimensional: rows, column families and

columns provide a three-level naming hierarchy in identifying data

Valeria Cardellini - SABD 2017/18

73

Page 38: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Data model •  Time-based

–  Each column family may keep multiple versions, each one having a timestamp

•  Bigtable data model vs. relational data model

Valeria Cardellini - SABD 2017/18

74

Bigtable: Tablet •  Tablet: group of consecutive rows of a table stored

together –  The basic unit for data storing and distribution –  Table sorted by row keys: select row keys properly to

improve data locality

•  Auto-sharding: tablets are split by the system when they become too large

•  Each tablet is served by exactly one tablet server

Valeria Cardellini - SABD 2017/18

75

Page 39: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: API

•  Metadata operations –  Create/delete tables, column families, change

metadata •  Write operations: single-row, atomic

–  Write/delete cells in a row, delete all cells in a row •  Read operations: read arbitrary cells in a

Bigtable table –  Each row read is atomic –  One row, all or specific columns, certain

timestamps, ...

Valeria Cardellini - SABD 2017/18

76

Bigtable: Writing and reading examples

Valeria Cardellini - SABD 2017/18

77

Page 40: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

•  Main components: –  Master server –  Tablet server –  Client library

Bigtable: Architecture

Valeria Cardellini - SABD 2017/18

78

Bigtable: Master server

•  One master server •  Assigns tablets to tablet servers •  Balances tablet server load •  Garbage collection of unneeded files in GFS •  Handles schema changes, e.g., table and

column family creations

Valeria Cardellini - SABD 2017/18

79

Page 41: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Tablet server

•  Many tablet servers •  Can be added or removed dynamically •  Each manages a set of tablets (typically

10-1000 tablets/server) •  Handles read/write requests to tablets •  Splits tablets when too large

Valeria Cardellini - SABD 2017/18

80

Bigtable: Client library

•  Library that is linked into every client •  Client data do not move through the master •  Clients communicate directly with tablet

servers for reads/writes

Valeria Cardellini - SABD 2017/18

81

Page 42: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Building blocks

•  The external building blocks of Bigtable are: –  Google File System (GFS): raw storage –  Chubby: distributed lock service –  Cluster scheduler: schedules jobs onto machines

Valeria Cardellini - SABD 2017/18

82

Bigtable: Chubby lock service •  Distributed and highly available lock service used in

many Google’s products –  File system {directory/file} for locking –  Uses Paxos for consensus to keep replicas consistent –  A client leases a session with the service

•  In Bigtable Chubby is used to: –  Ensure there is only one active master –  Store bootstrap location of Bigtable data –  Discover tablet servers –  Store Bigtable schema information –  Store access control lists (ACL)

Valeria Cardellini - SABD 2017/18

83

Page 43: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Locating rows

•  Three-level indexing hierarchy •  Root tablet stores location of all METADATA tablets

in a special METADATA tablet •  Each METADATA tablet contains location of user

data tablets •  For efficiency the client library caches tablet locations

Valeria Cardellini - SABD 2017/18

84

Bigtable: Master startup

•  The master executes the following steps at startup –  Grabs a unique master lock in Chubby (to prevent

concurrent master instantiations) –  Scans the tablet servers directory in Chubby to

find the live servers –  Communicates with each live tablet server to find

what tablets are assigned to each server –  Scans the METADATA table to learn the full set of

tablets and builds a set of unassigned tablet servers, which are eligible for tablet assignment

Valeria Cardellini - SABD 2017/18

85

Page 44: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Tablet assignment

•  A tablet is assigned to one tablet server at a time

•  Master uses Chubby to keep tracks of live tablet serves and unassigned tablets

•  When a tablet server starts, it creates and acquires an exclusive lock in Chubby

•  Master detects the status of the lock of each tablet server by checking it periodically

•  Master is responsible for finding when tablet server is no longer serving its tablets and reassigning those tablets as soon as possible

Valeria Cardellini - SABD 2017/18

86

Bigtable: SSTable

•  Sorted Strings Table (SSTable) file format used internally to store Bigtable data

•  Immutable file of key/value string pairs, sorted by keys

•  Each SSTable is stored in a GFS file •  Chunks of data plus a block index

–  A block index is used to locate blocks –  The index is loaded into memory when the SSTable is opened

Valeria Cardellini - SABD 2017/18

87

Page 45: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Serving tablets

•  Updates committed to a commit log •  Recently committed updates are stored in memory

(memtable) •  Older updates are stored in a sequence of SSTables

Valeria Cardellini - SABD 2017/18

88

Write operations are logged

Recent updates kept sorted in memory

memtable and SSTables are merged to serve a read request

Bigtable: Loading tablets

•  To load a tablet, a tablet server: –  Finds location of tablet through its METADATA

•  Metadata for a tablet includes list of SSTables and set of redo points

–  Read SSTables index blocks into memory –  Read the commit log since the redo point and

reconstructs the memtable

Valeria Cardellini - SABD 2017/18

89

Page 46: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Bigtable: Consistency and availability

•  Strong consistency –  CP system –  Only one tablet server is responsible for a given

piece of data –  Replication is handled on the GFS layer

•  Tradeoff with availability –  If a tablet server fails, its portion of data is

temporarily unavailable until a new server is assigned

Valeria Cardellini - SABD 2017/18

90

Comparing Dynamo and Bigtable

Valeria Cardellini - SABD 2017/18

91

Dynamo Bigtable Data model Key-value, row store Column store API Single value Single value and range Data partition Random Ordered Optimized for Writes Reads Consistency Eventual Atomic Multiple version Version Timestamp Replication Quorum GFS Data center aware Yes Yes Persistency Local and pluggable Replicated and

distributed file system Architecture Decentralized Hierarchical (master/

worker) Client library Yes Yes

Page 47: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Cloud Bigtable •  Bigtable as a Cloud service •  Sparsely populated table that can scale to

billions of rows and thousands of columns –  Table: sorted key/value map –  Table composed of rows, each of which describes

a single entity, and columns, which contain individual values for each row

–  Row indexed by a single row key –  Columns that are related to one another are

grouped together into a column family •  Column identified by column family and column qualifier

Valeria Cardellini - SABD 2017/18

92

Cloud Bigtable: table example

Valeria Cardellini - SABD 2017/18

93

•  Application: social network for United States presidents

•  Table: tracks who each president is following

Page 48: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Cloud Bigtable: use cases •  When to use

–  To store large amounts of single-keyed data with low latency for apps that need high throughput and scalability for non-structured key/value data

•  Value no larger than 10 MB

–  Some examples •  Marketing data (e.g., purchase histories, customer

preferences) •  Financial data (e.g., transaction histories, stock prices) •  IoT data (e.g., usage reports from energy meters and

home appliances) •  Time-series data (e.g., CPU and memory usage over

time for multiple servers)

Valeria Cardellini - SABD 2017/18

94

Cloud Bigtable: architecture

Valeria Cardellini - SABD 2017/18

95

Page 49: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Case study: Cassandra

Valeria Cardellini - SABD 2017/18

96

•  Some large production deployments: -  Apple: over 75,000, over 10 PB of data -  Netflix: 2,500 nodes, 420 TB, over 1 trillion requests per day

•  Initially developed at Facebook •  A mixture of Amazon’s Dynamo and Google’s

BigTable

Dynamo BigTable P2P architecture (replication & partitioning), gossip-based discovery and error detection

Sparse column-oriented data model, storage architecture

(SSTable disk storage)

Cassandra

Cassandra: features •  High availability and incremental scalability •  Robust support for systems spanning multiple data

centers –  Asynchronous master-less replication allowing low latency

operations

•  Data model: structured key-value store where columns are added only to specified rows –  Distributed multi-dimensional map indexed by a row key –  As in BigTable: different rows can have different number of

columns and columns are grouped into column families –  Emphasizes denormalization instead of normalization and joins –  See example at http://bit.ly/2n1Vfua

•  Write-oriented system –  Bigtable designed for intensive read workloads

Valeria Cardellini - SABD 2017/18

97

Page 50: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Cassandra: Consistency

•  Decentralized –  No master node to coordinate reads and writes

•  Tunable read and write consistency for each query SELECT points! FROM fantasyfootball.playerpoints USING CONSISTENCY QUORUM! WHERE playername = ‘Tom Brady’;!

•  Achieved through quorum-based protocol –  If N+W > R and W >= N/2 +1 you have strong consistency –  Some available consistency levels (see http://bit.ly/2n26EdE)

•  ONE: only a single replica must respond •  QUORUM: a majority of the replicas must respond •  ALL: all of the replicas must respond •  LOCAL_QUORUM: a majority of the replicas in the local datacenter must

respond •  Tunable tradeoffs between consistency and latency

Valeria Cardellini - SABD 2017/18

98

Cassandra Query Language (CQL) •  An SQL-like language CREATE COLUMNFAMILY Customer (!KEY varchar PRIMARY KEY,!name varchar,!city varchar,!web varchar);!!INSERT INTO Customer (KEY,name,city,web)!VALUES ('mfowler',!'Martin Fowler',!'Boston',!'www.martinfowler.com');!!SELECT * FROM Customer!SELECT name,web FROM Customer!SELECT name,web FROM Customer WHERE city='Boston'!

Valeria Cardellini - SABD 2017/18

99

Page 51: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Case study: Neo4j

•  Support for ACID •  Properties of nodes and relationships captured in the

form of multiple attributes (key-value pairs) –  Relationships are unidirectional

•  Nodes tagged with labels –  Labels used to represent different roles in the modeled

domain

•  Neo4j architecture –  Clustered installation

•  Single read/write master and multiple read-only slaves –  Improve data throughput via a multi-level caching scheme

Valeria Cardellini - SABD 2017/18

100

Neo4j: Twitter graph example

Valeria Cardellini - SABD 2017/18

101

Page 52: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Neo4j: Cypher •  Cypher: declarative SQL-like language for performing

CRUD operations and more –  See http://bit.ly/2o0kytc –  ASCII-Art to represent patterns –  Relationships are an arrow --> between two nodes

•  Some examples: –  Create a node with labels and properties !CREATE (movie: Movie {title: “Shrek”})!

–  Search for a specified pattern: MATCH corresponds to SELECT in SQL !MATCH (me {name: “Rick”})-[:KNOWS *2..3]-> (remote_friend) RETURN remote_friend.name!•  Nodes that are a variable number of relationship→node hops

away can be found using the following syntax: ! !-[:TYPE*minHops..maxHops]�!

Valeria Cardellini - SABD 2017/18

102

Neo4j: Cypher

•  Native support for shortest path queries •  Example: find the shortest path between two airports

(SFO and MSO)

–  shortestPath returns a single shortest path between two nodes

•  In the example, the path length is between 0 and 2 –  allShortestPath returns all the shortest paths between

two nodes Valeria Cardellini - SABD 2017/18

103

Page 53: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Neo4j: Cypher

•  Shortest path queries (see Twitter graph on slide 98)

•  More queries on Twitter graph at http://network.graphdemos.com –  You can map your Twitter network and analyze it with Neo4j

Valeria Cardellini - SABD 2017/18

104

Find all the shortest paths between two Twitter users linked by Follows as long as the path is min 0 and max 3 long

Find all the shortest paths between two Twitter users as long as the path is min 0 and max 5 long

Performance comparison of NoSQL data stores •  Unsolved issue: no standard benchmark

–  Yahoo Cloud Serving Benchmark (YCSB): open source workload generator used in some studies

•  We consider two papers: 1.  “Solving Big Data Challenges for Enterprise Application

Performance Management”, VLDB 2012 http://bit.ly/1vYYihO 2.  “Is Elasticity of Scalable Databases a Myth?”, BigData 2016

•  Their focus is on performance evaluation –  Throughput and latency as metrics

•  Overall conclusion: –  No single “winner takes all” among NoSQL data stores –  Results depend on use case, workload and deployment

conditions

Valeria Cardellini - SABD 2017/18

105

Page 54: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Performance comparison @VLDB’12 •  Target: Application Performance Monitoring (APM)

platform for monitoring big data •  Workload R: 95% reads and only 5% writes

Valeria Cardellini - SABD 2017/18

106

Throughput Read latency

Write latency •  Cassandra: linear scalability in terms of throughput but slow at reading and writing

•  HBase: suffers in scalability, slow at reading but fast at writing

•  Redis: being in-memory, the fastest at reading

Performance comparison @VLDB’12

•  Workload WR: 50% reads and 50% writes

Valeria Cardellini - SABD 2017/18

107

Study conclusions: -  Linear scalability for Cassandra, HBase, and Voldemort in

most of the tests -  Cassandra’s throughput dominated in all the tests,

however high latency

-  HBase achieved least throughput but exhibited a low write latency at the cost of a high read latency

Throughput Read latency

Page 55: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

Performance comparison @BigData’16 Va

leria

Car

delli

ni -

SA

BD

201

7/18

108

•  Couchbase achieves highest throughput and lowest latency •  Cassandra benefits the most from larger cluster sizes in contrast

to MongoDB -  Cassandra shows best results also in “NoSQL Databases:

MongoDB vs Cassandra”

Which data store to use?

•  Different data stores designed to solve different problems

•  Using a single database engine for all of the requirements… –  storing transactional data –  caching session information –  traversing graph of customers –  performing OLAP operations

•  … usually leads to non-performing solutions •  Different needs for availability, consistency, or

backup requirements

Valeria Cardellini - SABD 2017/18

109

Page 56: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

How to choose about consistency?

Valeria Cardellini - SABD 2017/18

110

•  Let us consider the differences about consistency

Poliglot persistence •  Better approach: rather than a single solution

use multiple data storage technologies –  Choose them based upon the way data are used

by applications or components of a single application

–  Different kinds of data are best dealt with different data stores: pick the right tool for the right use case

Valeria Cardellini - SABD 2017/18

111

•  See http://bit.ly/2plRJYq

Page 57: NoSQL Data Stores - ce.uniroma2.it · • RDBMS: strong consistency – Traditional RDBMS are CA systems (or CP systems, depending on the configuration) • NoSQL systems: mostly

References

•  Sadalage and Fowler, “NoSQL distilled”, Addison-Wesley, 2009. •  Golinger at al., “Data management in cloud environments:

NoSQL and NewSQL data stores”, J. Cloud Comp., 2013. http://bit.ly/2oRKA5R

•  DeCandia et al., "Dynamo: Amazon's highly available key-value store", ACM SOSP 2007. http://bit.ly/2pmXzsr

•  Chang et al., “Bigtable: a distributed storage system for structured data”, OSDI 2006. http://bit.ly/2nywNRg

•  Lakshman and Malik, “Cassandra - a decentralized structured storage system”, LADIS 2009. http://bit.ly/2nyGSxE

Valeria Cardellini - SABD 2017/18

112