Overview of NoSQL in general, its types and available most pop
Sergejus Barinovas | Microsoft MVP
@sergejusb, sergejus.blogas.lt
NoSQL – What’s that?
NoSQL
WHY?
• Limited SQL scalability
  • Horizontal partitioning (sharding)
  • Vertical partitioning
NoSQL – Why?
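A minimal sketch of horizontal partitioning (sharding), assuming a fixed number of shards and hash-based routing. The shard names and the `route` helper are hypothetical and only illustrate how a key can be mapped to one of several servers.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to servers.
SHARDS = ["shard-0", "shard-1", "shard-2"]


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(key: str) -> str:
    """Return the name of the shard that should store the given key."""
    return SHARDS[shard_for(key, len(SHARDS))]


if __name__ == "__main__":
    for user in ["sergejusb", "tdagys"]:
        print(user, "->", route(user))
```

With plain modulo routing, changing the number of shards moves most keys to a different shard, which is one reason sharding a relational database by hand tends to be painful.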
• Limited SQL availability
  • Master / slave configuration
NoSQL – Why?
• SQL limitations for storing huge amounts of data
  • Key / value / type columns
NoSQL – Why?
• Limited SQL speed of read/write operations
  • Multiple read replicas
NoSQL – Why?
• 2009, Eric Evans
• NoSQL – open source distributed databases, not relational SQL databases
• NoSQL – not only SQL
• NoSQL → Big Data
NoSQL History
• The ability to horizontally scale simple-operation throughput over many servers
NoSQL Characteristics (scalability)
• A “weaker” concurrency model than the ACID transactions in most SQL systems
NoSQL Characteristics (BASE)
• Efficient use of distributed indexes and RAM for data storage
NoSQL Characteristics (distributed)
• The ability to dynamically define new attributes or data schema
NoSQL Characteristics (schema-less)
• Atomicity – all or nothing
• Consistency – state integrity
• Isolation – no reads of uncommitted data
• Durability – recover committed transactions
ACID (transactions)
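A small sketch of the "all or nothing" property, using Python's built-in `sqlite3` module. The `accounts` table and the `transfer` helper are made up for illustration; the in-memory database does not demonstrate durability, and isolation is not shown either.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two illustrative accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts VALUES ('a', 100)")
    conn.execute("INSERT INTO accounts VALUES ('b', 0)")
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balances as a name -> balance mapping."""
    return dict(conn.execute("SELECT name, balance FROM accounts").fetchall())
```

When the second update fails, the credit already applied to `dst` is rolled back as well, so the balances stay consistent.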
• 2000, Eric Brewer
• It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
• Consistency
• Availability
• Partition tolerance
CAP Theorem
• Basically Available – partial system failures are OK
• Soft state – inconsistency is OK
• Eventual consistency – stale data is OK
BASE (eventual consistency)
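A toy simulation of eventual consistency, assuming two replicas that resolve conflicts by keeping the entry with the highest version number. The `Replica` class and `sync` function are hypothetical; real systems use more elaborate schemes (vector clocks, read repair and so on).

```python
class Replica:
    """One copy of the data; entries are stored as key -> (version, value)."""

    def __init__(self) -> None:
        self.data = {}

    def put(self, key, value, version: int) -> None:
        """Accept the write only if it is newer than what is already stored."""
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        """Return the locally known value, which may be stale."""
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a: Replica, b: Replica) -> None:
    """Anti-entropy pass: exchange entries so both replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.put(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.put(key, value, version)


if __name__ == "__main__":
    r1, r2 = Replica(), Replica()
    r1.put("status", "old", version=1)
    sync(r1, r2)
    r1.put("status", "new", version=2)
    print("before sync:", r2.get("status"))  # r2 has not seen version 2 yet
    sync(r1, r2)
    print("after sync:", r2.get("status"))
```

Between the write and the next `sync`, a reader hitting the second replica sees stale data; after the sync both replicas agree.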
NoSQL Databases
• Key / value store
• Document database
• Graph database
• Columnar database
NoSQL Categories
• <key, value> or Tuple<key, v1, ..., vn>
• Simple operations
  • Get
  • Put
  • Delete
Key / value store
Byte[] Byte[]
Key Value
Key / value store
Key              Value
“current_date”   2023-04-08
“sergejusb”      Binary Object
“sergejusb”      JSON Object
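The three operations can be sketched as a thin wrapper over a dictionary, assuming opaque byte-string keys and values as on the Byte[] / Byte[] slide. `KeyValueStore` is an illustrative in-memory class, not the API of any product listed here.

```python
from typing import Dict, Optional


class KeyValueStore:
    """In-memory key / value store with opaque byte-string keys and values."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        """Store the value, overwriting any previous value for the key."""
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        """Remove the key; return True if it existed."""
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    store = KeyValueStore()
    store.put(b"current_date", b"2023-04-08")
    print(store.get(b"current_date"))
    store.delete(b"current_date")
    print(store.get(b"current_date"))
```

The store never inspects the value, so a date string, a binary object or a serialized JSON object are all handled the same way.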
• Dynamo*
• Membase
• Voldemort
• Redis
• Azure Table Storage
• Riak
Key / value store
Name: Dynamo
Created: 2007, Amazon (proprietary)
Implementation: ?
Distributed: Yes
Replication: Multiple Servers
CAP: AP
API: ?
Key / value store
Name: Membase
Created: 2010, sponsored by Zynga
Implementation: C / C++ / Erlang
Distributed: Yes
Replication: Multiple Servers
CAP: CP
API: Memcached API, JSON
Key / value store
Name: Voldemort
Created: 2008, LinkedIn
Implementation: Java
Distributed: Yes
Replication: Multiple Servers
CAP: AP
API: Java
Key / value store
Name: Redis
Created: 2009, sponsored by VMware
Implementation: C
Distributed: No
Replication: Master / Slave
CAP: CP
API: Various Languages
Key / value store
Name: Azure Table Storage
Created: 2008, Microsoft
Implementation: ?
Distributed: Yes
Replication: Multiple Servers (DFS)
CAP: CP
API: .NET API, JSON
Key / value store
Name: Riak
Created: 2008, Basho (from Akamai)
Implementation: Erlang
Distributed: Yes
Replication: Multiple Servers
CAP: AP
API: JSON
Key / value store
• Document == complex object
  • XML
  • YAML
  • JSON / BSON
• Support for secondary indexes
• Schema can be defined at runtime
• Optional support for simple querying using Map / Reduce
Document database
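A rough sketch of a document store with a secondary index, assuming documents are plain dictionaries whose fields can differ from one document to the next. `DocumentStore` and its methods are hypothetical names for illustration; indexed field values are assumed to be hashable and document ids sortable.

```python
from collections import defaultdict


class DocumentStore:
    """Schema-less store: each document is a dict with arbitrary fields."""

    def __init__(self) -> None:
        self._docs = {}     # doc id -> document
        self._indexes = {}  # field name -> {field value -> set of doc ids}

    def create_index(self, field: str) -> None:
        """Build a secondary index over a field for all existing documents."""
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self._indexes[field] = index

    def delete(self, doc_id: str) -> None:
        """Remove a document and its index entries, if it exists."""
        old = self._docs.pop(doc_id, None)
        if old is None:
            return
        for field, index in self._indexes.items():
            if field in old:
                index[old[field]].discard(doc_id)

    def put(self, doc_id: str, doc: dict) -> None:
        """Insert or replace a document and keep the indexes up to date."""
        self.delete(doc_id)
        self._docs[doc_id] = doc
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field: str, value) -> list:
        """Use the index when one exists, otherwise scan every document."""
        if field in self._indexes:
            ids = self._indexes[field].get(value, set())
            return [self._docs[i] for i in sorted(ids)]
        return [d for d in self._docs.values() if d.get(field) == value]
```

New attributes need no schema change: a document with an extra field is simply stored, and an index can be added for that field later.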
• MongoDB
• CouchDB
• RavenDB
Document database
Name: MongoDB
Created: 2008, 10gen
Implementation: C++
Distributed: Yes via Shards
Replication: Master / Slave
CAP: CP
API: BSON
Document database
Name: CouchDB
Created: 2005
Implementation: Erlang
Distributed: Sort of
Replication: Master / Master
CAP: AP
API: JSON
Document database
Name: RavenDB
Created: 2010, Ayende Rahien
Implementation: C#
Distributed: Yes via Shards
Replication: Master / Master
CAP: AP
API: .NET API, JSON
Document database
• Graph == network
• Basic constructs
  • Node
  • Edge
  • Properties
Graph database
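The three basic constructs can be sketched with two small data classes, assuming directed, labeled edges. The `Graph` class is hypothetical, and how the example nodes are wired together in the usage below is an assumption for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    id: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str
    properties: Dict[str, object] = field(default_factory=dict)


class Graph:
    """Directed graph whose nodes and edges both carry properties."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, **properties) -> None:
        self.nodes[node_id] = Node(node_id, dict(properties))

    def add_edge(self, source: str, target: str, label: str, **properties) -> None:
        """Connect two existing nodes with a labeled edge."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(source, target, label, dict(properties)))

    def neighbors(self, node_id: str, label: str) -> List[str]:
        """Follow outgoing edges with the given label from a node."""
        return [e.target for e in self.edges
                if e.source == node_id and e.label == label]


if __name__ == "__main__":
    g = Graph()
    g.add_node("sergejus", kind="person")
    g.add_node("tdagys", kind="person")
    g.add_node("sergejus.blogas.lt", kind="blog")
    # Assumed wiring, for illustration only.
    g.add_edge("sergejus", "tdagys", "knows")
    g.add_edge("sergejus", "sergejus.blogas.lt", "authors")
    print(g.neighbors("sergejus", "knows"))
```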
[Figure: example graph. Nodes: sergejus, tdagys, sergejus.blogas.lt. Edge labels: authors, reads, knows, knows.]
• FlockDB
• Neo4J
Graph database
Name: FlockDB
Created: 2010, Twitter
Implementation: Scala
Distributed: Yes
Replication: Multiple Servers
CAP: AP
API: Thrift, Ruby
Graph database
Name: Neo4J
Created: 2003, Neo Technology
Implementation: Java
Distributed: No
Replication: Master / Slave
CAP: CP
API: JSON, Various Languages
Graph database
• For HUGE amounts of data
• Columns are added at runtime
• Great scalability
  • Horizontal
  • Vertical
Columnar database
• Unusual data model
  • Key Space == Database
  • Column Family == Table
  • Columns and Super Columns
  • Super Column == array of Columns
  • Column == Tuple<Key, Value, Timestamp, TTL>
Columnar database
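The data model above can be written down as a few type definitions. This is a sketch only: the names mirror the slide, while the `is_expired` and `write` helpers, and the rule that the newest timestamp wins, are assumptions for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class Column(NamedTuple):
    """Column == Tuple<Key, Value, Timestamp, TTL>."""
    key: str
    value: bytes
    timestamp: float
    ttl: Optional[float] = None  # seconds to live; None means no expiry

    def is_expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.timestamp + self.ttl


class SuperColumn(NamedTuple):
    """Super Column == array of Columns under one key."""
    key: str
    columns: List[Column]


# Column Family == Table: row key -> column key -> Column.
ColumnFamily = Dict[str, Dict[str, Column]]
# Key Space == Database: column family name -> Column Family.
KeySpace = Dict[str, ColumnFamily]


def write(family: ColumnFamily, row_key: str, column: Column) -> None:
    """Add a column to a row at runtime; the newest timestamp wins (assumed)."""
    row = family.setdefault(row_key, {})
    existing = row.get(column.key)
    if existing is None or column.timestamp >= existing.timestamp:
        row[column.key] = column


if __name__ == "__main__":
    users: ColumnFamily = {}
    write(users, "sergejusb", Column("blog", b"sergejus.blogas.lt", timestamp=1.0))
    write(users, "sergejusb", Column("session", b"abc", timestamp=1.0, ttl=60.0))
    print(sorted(users["sergejusb"]))
```

Rows in the same column family do not need the same set of columns, which is how columns can be added at runtime.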
Columnar database
• Simple Column
Columnar database
• Super Column
• BigTable*
• Cassandra
• HBase
• Hypertable
Columnar database
Name: BigTable
Created: 2006, Google
Implementation: C++
Distributed: Yes
Replication: Multiple Servers (GFS)
CAP: CP
API: C++
Columnar database
Name: Cassandra
Created: 2008, Facebook
Implementation: Java
Distributed: Yes
Replication: Multiple Servers
CAP: AP
API: Thrift, Avro
Columnar database
Name: HBase
Created: 2007, Powerset
Implementation: Java
Distributed: Yes
Replication: Multiple Servers (HDFS)
CAP: CP
API: Thrift, Java, JSON
Columnar database
Name: Hypertable
Created: 2007, Zvents
Implementation: C++
Distributed: Yes
Replication: Multiple Servers
CAP: CP
API: Thrift
Columnar database
• ORDER BY ?
  • “Natural Key Order”
NoSQL Limitations
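One reading of "natural key order" is that rows come back sorted by key and by nothing else, so any other ordering has to be encoded into the key itself. A minimal sketch under that assumption, using the standard `bisect` module; `SortedKeyStore` is a hypothetical class.

```python
import bisect


class SortedKeyStore:
    """Keeps keys sorted so range scans return rows in key order."""

    def __init__(self) -> None:
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def put(self, key: str, value) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start: str, end: str) -> list:
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


if __name__ == "__main__":
    store = SortedKeyStore()
    # The date is placed first in the key so a scan is ordered by date.
    store.put("2023-04-08#sergejusb", "post A")
    store.put("2023-04-07#tdagys", "post B")
    print(store.scan("2023-04-07", "2023-04-09"))
```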
• GROUP BY ?
  • Map / Reduce
NoSQL Limitations
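A single-process sketch of how a GROUP BY with COUNT can be expressed as Map / Reduce. The `map_reduce` helper is hypothetical and runs in memory; a real system would distribute the map and reduce phases across machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer) -> dict:
    """Run the mapper over every record, group by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def count_mapper(record):
    """Emit (category, 1) for each record: the GROUP BY key and a counter."""
    yield record["category"], 1


def sum_reducer(key, values):
    """Add up the counters emitted for one key, like COUNT(*)."""
    return sum(values)


if __name__ == "__main__":
    databases = [
        {"name": "Redis", "category": "key / value"},
        {"name": "Riak", "category": "key / value"},
        {"name": "MongoDB", "category": "document"},
        {"name": "Cassandra", "category": "columnar"},
    ]
    # Roughly: SELECT category, COUNT(*) FROM databases GROUP BY category
    print(map_reduce(databases, count_mapper, sum_reducer))
```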
• JOIN ?
  • Multiple Map / Reduce
NoSQL Limitations
• SELECT * ?
  • Multi-Machine Map / Reduce
NoSQL Limitations
• Maturity
• Tooling
• Specificity
NoSQL Limitations
• Choose the right tool for the task
• You can use BOTH
SQL vs. NoSQL
Q & A