Apache Cassandra - SALSAHPC

Apache Cassandra

Arvind Dwarakanath

Department of Computer Science

Indiana University Bloomington

1. What are NoSQL and Distributed Hash Tables?

NoSQL systems depart from the standard relational database model. Relational databases have struggled with data-intensive applications and with indexing very large numbers of files and documents. Many NoSQL systems have been developed to meet these requirements.

Many of the more popular NoSQL databases are distributed in nature. In such a system the data is stored redundantly across many servers, and its placement is determined by a distributed hash table.

In a distributed hash table, each item of data is stored under a key, and a hash function maps that key to a position in a keyspace. The hashing is typically done with SHA-1. The item is then routed to, and stored on, the node that is responsible for that part of the keyspace. A keyspace partitioning scheme splits ownership of the keyspace among the participating nodes. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace.
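The partitioning idea described above can be sketched in a few lines of Python. This is only an illustrative toy, not Cassandra's actual partitioner: the node names and the `HashRing` class are hypothetical, and the rule "a node owns the arc of the ring that ends at its own token" is an assumption chosen for simplicity.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 160  # SHA-1 produces 160-bit digests


def sha1_token(text: str) -> int:
    """Map a string to a position on the ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


class HashRing:
    """Toy keyspace partitioning: each node owns the arc ending at its token."""

    def __init__(self, node_names):
        # Sort nodes by token so the owner can be found with a binary search.
        self._ring = sorted((sha1_token(name), name) for name in node_names)
        self._tokens = [token for token, _ in self._ring]

    def owner(self, key: str) -> str:
        """Return the first node whose token is >= the key's token, wrapping around."""
        index = bisect.bisect_left(self._tokens, sha1_token(key))
        return self._ring[index % len(self._ring)][1]


# Hypothetical three-node cluster.
ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.owner("adwaraka")
print("key 'adwaraka' is owned by", owner)
```

Because every participant applies the same hash function, any node can compute the owner of a key locally, which is what lets the overlay network route a request without a central directory.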

A very popular NoSQL database built on the concept of a keyspace is Apache Cassandra, the topic of this survey.

2. What is Cassandra?

Cassandra is an open source distributed

database management system. It is an

Apache Software Foundation top-level

project designed to handle very large

amounts of data spread out across many

commodity servers while providing a

highly available service with no single

point of failure. It is a NoSQL system that was initially developed at Facebook, where it powers the Inbox Search feature. A standalone Twitter-like test application called 'Twissandra' has also been created as a demonstration.

Cassandra is fundamentally a columnar database, or more precisely a column-oriented distributed database. The data is stored in the form of columns and is addressed within a 'keyspace'. It can be classified as a 'Cloud DB'.

3. Features of the Cassandra Model

The data model

A table in Cassandra is a distributed multidimensional map indexed by a key. The value is an object, which may be a single element or may be highly structured. The row key in a table is a string with no size restrictions, although it is typically 16 to 36 bytes long. Every operation under a single row key is atomic per replica, no matter how many columns are being read or written.

Columns are grouped together into sets called column families, much as in the Bigtable system.

Cassandra exposes two kinds of column

families: Simple and Super. Super

column families can be visualized as a

column family within a column family.

The top dimension in Cassandra is called

Keyspace.

For instance, usrs['adwaraka'] refers to the row with the identifier 'adwaraka' in a column family of users named usrs. Within that row we can further address individual columns: usrs['adwaraka']['fname'], usrs['adwaraka']['lname'] and usrs['adwaraka']['gender'].
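The nested-map view of the data model can be mimicked with ordinary Python dictionaries. This is only a sketch of the addressing scheme (column family, then row key, then column name); the `insert` helper is a hypothetical name and no Cassandra client is involved.

```python
# Column family -> row key -> column name -> value.
usrs = {}


def insert(column_family, row_key, column_name, value):
    """Create the row on first use, then set one column in it."""
    column_family.setdefault(row_key, {})[column_name] = value


insert(usrs, "adwaraka", "fname", "Arvind")
insert(usrs, "adwaraka", "lname", "Dwarakanath")
insert(usrs, "adwaraka", "gender", "Male")

# Two levels of lookup: first the row key, then the column name.
print(usrs["adwaraka"]["fname"])
print(sorted(usrs["adwaraka"]))
```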

Column and Column Family

As mentioned before, the data model is columnar in nature. The column is the base of the Cassandra data model and its smallest increment of data. It is a tuple (a triplet) that contains a name, a value and a timestamp.

Here are the columns of the row usrs['adwaraka'] represented in JSON notation (timestamps omitted):

{
  "fname": "Arvind",
  "lname": "Dwarakanath",
  "gender": "Male"
}

A column family resembles a table in an

RDBMS. Column families contain rows

and columns. Each row is uniquely

identified by a row key.

Each row has multiple columns, each of

which has a name, value, and a

timestamp.

Unlike a table in an RDBMS, different

rows in the same column family do not

have to share the same set of columns,

and a column may be added to one or

multiple rows at any time. It can be

useful to distinguish between “static”

column families that contain values such

as user data or other object data, and

“dynamic” column families that contain

data such as precalculated query results.
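The two properties described above, columns as (name, value, timestamp) triplets and rows that need not share the same columns, can be illustrated with a small in-memory model. The `Column` and `ColumnFamily` classes are hypothetical names for this sketch, the 'jdoe' row is made-up sample data, and keeping the write with the newer timestamp is an assumption about how competing writes are reconciled.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: object
    timestamp: int  # microseconds since the epoch in this sketch


def now_micros() -> int:
    return int(time.time() * 1_000_000)


class ColumnFamily:
    def __init__(self, name):
        self.name = name
        self.rows = {}  # row key -> {column name -> Column}

    def insert(self, row_key, name, value, timestamp=None):
        ts = now_micros() if timestamp is None else timestamp
        row = self.rows.setdefault(row_key, {})
        current = row.get(name)
        # Assumption: the write carrying the newer timestamp wins.
        if current is None or ts >= current.timestamp:
            row[name] = Column(name, value, ts)

    def get(self, row_key):
        return {n: c.value for n, c in self.rows.get(row_key, {}).items()}


users = ColumnFamily("usrs")
users.insert("adwaraka", "fname", "Arvind")
users.insert("adwaraka", "lname", "Dwarakanath")
users.insert("jdoe", "email", "jdoe@example.org")  # a row with different columns
users.insert("adwaraka", "gender", "Male", timestamp=200)
users.insert("adwaraka", "gender", "stale", timestamp=100)  # older write, ignored

print(users.get("adwaraka"))
print(users.get("jdoe"))
```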

Keyspaces

Keyspaces group column families

together. Typically, there will be one

Keyspace for each application that uses

a Cassandra cluster.

The most important settings that are

defined at the keyspace level are the

replication factor and the replica

placement strategy.

Thus, if you have sets of data that have

different requirements for these settings

(such as different levels of fault-

tolerance), these sets of data should

reside in different keyspaces. A keyspace must be selected before a client API such as Thrift can issue any operations against it.
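To make the role of the replication factor concrete, the sketch below places replicas by walking a hash ring clockwise from the node that owns the key. This is a simplified placement rule assumed for illustration; it is not presented as the strategy a real cluster is configured with. The node names, the `keyspace` dictionary and the `place_replicas` function are all hypothetical.

```python
import bisect
import hashlib


def token(text: str) -> int:
    """Position of a node name or row key on the ring (SHA-1 based)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def place_replicas(row_key, nodes, replication_factor):
    """Pick the owner of row_key and the next distinct nodes clockwise."""
    ring = sorted(nodes, key=token)
    tokens = [token(node) for node in ring]
    start = bisect.bisect_left(tokens, token(row_key))
    # A cluster cannot hold more copies than it has nodes.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


# Keyspace-level setting, as described in the text above.
keyspace = {"name": "Keyspace1", "replication_factor": 3}
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]

replicas = place_replicas("adwaraka", nodes, keyspace["replication_factor"])
print(replicas)
```

Data sets that need a different number of copies would simply use a second keyspace with its own replication factor.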

On the Cassandra CLI, use the 'use <keyspace name>;' command to select the required keyspace. For example:

use Keyspace1;

Super Columns

Super columns are a structure layered on top of columns: a super column is a way to group multiple columns. Every super column must have a different name, just like with regular columns.

Different super columns may hold sub-

columns with the same name. Super

columns are a way to add an extra map

layer to the data model.

Super columns are frequently used to

hold a single record where each field in

the record is represented by a sub-

column. For example, the name of a

super column might be the ID of a

transaction and each sub-column could

hold some attribute of the transaction.

For example, if a transactions row like the one described had two entries, it might look like:

{
  'trans-A': {
    'date': '01/02/2010',
    'amount': 5000,
    'timestamp': <value1> },
  'trans-B': {
    'date': '01/03/2010',
    'amount': 4500,
    'timestamp': <value2> }
}
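The extra map layer that super columns add can be shown with one more level of nesting. The sketch reuses the transaction IDs, dates and amounts from the example above and omits the timestamp fields, whose values are not given; `total_amount` is a hypothetical helper.

```python
# Super column name -> sub-column name -> value, for a single row.
transactions_row = {
    "trans-A": {"date": "01/02/2010", "amount": 5000},
    "trans-B": {"date": "01/03/2010", "amount": 4500},
}


def total_amount(row):
    """Sum the 'amount' sub-column across every super column in the row."""
    return sum(sub_columns["amount"] for sub_columns in row.values())


# Different super columns hold sub-columns with the same names.
print(sorted(transactions_row["trans-A"]) == sorted(transactions_row["trans-B"]))
print(total_amount(transactions_row))
```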

Decentralized

Every node in the cluster is identical. There is no hierarchy among the nodes, so no single node becomes a network bottleneck or a single point of failure.

Elasticity

New nodes can be added without any downtime or disruption to applications, so a cluster can keep growing while the applications that use it continue to run.

Durability

Durability is the property that writes,

once completed, will survive

permanently, even if the server is killed

or crashes or loses power. This requires

calling fsync to tell the OS to flush its

write-behind cache to disk.
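The fsync step can be illustrated with a minimal append-only log in Python. This is a generic sketch of the technique and not Cassandra's commit log code; the file name, the record format and the `durable_append` function are assumptions made for the example.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> None:
    """Append one record and return only after it has been forced to disk."""
    with open(path, "ab") as log:
        log.write(record + b"\n")
        log.flush()             # move the data from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to flush its cache to the device


# Hypothetical log location and record.
log_path = os.path.join(tempfile.mkdtemp(), "commitlog.bin")
durable_append(log_path, b"usrs:adwaraka:fname=Arvind")

with open(log_path, "rb") as log:
    print(log.read())
```

Calling fsync on every write is the simplest way to get this guarantee, at the cost of waiting for the disk each time.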

Fault Tolerant

Data is automatically replicated to multiple nodes for fault tolerance. Replication across multiple data centres is supported. Failed nodes can be replaced with no downtime.

Changeable Consistency

The ability to tune consistency levels per

query is a powerful feature of Cassandra

because it gives the developer complete

control of managing the trade-off of

availability versus consistency. This

means that Cassandra queries can be

configured to exhibit strongly consistent

behaviour (but there is no row-level

locking) if the developer is willing to

sacrifice latency.

The consistency levels offered by Cassandra, which have different meanings for reads and writes, are detailed in the lists below. A 'quorum' of replicas is essentially a majority of replicas, or (replication factor / 2) + 1 with any resulting fraction rounded down.
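The quorum formula above translates directly into integer arithmetic. The function name `quorum` is chosen here for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: (replication factor / 2) + 1, rounded down."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


for rf in (1, 2, 3, 4, 5):
    print(f"replication factor {rf}: quorum of {quorum(rf)}")
```

Note that an even replication factor gains nothing in failure tolerance over the odd number just below it: both 3 and 4 replicas can lose only one replica and still reach a quorum.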

WRITE CONSISTENCY LEVELS

- ALL - All replicas must have received the write; otherwise the operation will fail.

- ANY - Ensure that the write has been written to at least one node (can include hinted handoff recipients). Note that if all replica nodes are down at write time, an ANY write may not be readable until nodes have recovered.

- ONE - Ensure that the write has been written to at least one replica's commit log and memory table before responding to the client.

- QUORUM - Ensure that the write has been written to a quorum of replicas before responding to the client.

- LOCAL_QUORUM - Ensure that the write has been written to a quorum of replicas in the datacenter local to the coordinator before responding to the client. This setting avoids the latency of inter-data center communication.

- EACH_QUORUM - Ensure that the write has been written to a quorum of replicas in each datacenter in the cluster before responding to the client.

READ CONSISTENCY LEVELS

- ALL - Return the record with the most recent timestamp once all replicas have replied, failing the operation if any replicas are unresponsive.

- ONE - Returns the response from the closest replica, as determined by the snitch configured for the cluster. When read_repair is enabled, Cassandra may perform a consistency check in a background thread.

- QUORUM - Returns the record with the most recent timestamp once a quorum of replicas has reported.

- LOCAL_QUORUM - Returns the record with the most recent timestamp once a quorum of replicas in the datacenter local to the coordinator has reported. This setting avoids the latency of inter-data center communication.

- EACH_QUORUM - Returns the record with the most recent timestamp once a quorum of replicas in each datacenter in the cluster has reported.
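The read levels above share one pattern: wait for a required number of replica replies, then return the record with the most recent timestamp. The sketch below models only that pattern. The `resolve_read` function, the reply format and the sample values are hypothetical, and the list order stands in for the order in which replicas answer.

```python
def resolve_read(replies, required):
    """Return the newest value among the first `required` replica replies.

    replies  -- list of (value, timestamp) pairs, in arrival order
    required -- how many replies the consistency level demands
    """
    if len(replies) < required:
        raise RuntimeError("not enough replicas responded")
    considered = replies[:required]
    # The record with the most recent timestamp wins.
    return max(considered, key=lambda reply: reply[1])[0]


# Three replicas hold different versions of the same column.
replies = [("Arvind", 100), ("Arvind D.", 250), ("A.", 50)]

print(resolve_read(replies, required=1))  # ONE: first (closest) replica only
print(resolve_read(replies, required=2))  # QUORUM with three replicas
print(resolve_read(replies, required=3))  # ALL
```

With a level of ONE the stale value from the first replica is returned, which is the consistency cost of the lower latency.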

In choosing the consistency level for

particular operations, developers should

consider the relative importance of

consistency, latency, and availability.

Note that read operations are slower than writes in Cassandra. Cassandra is therefore well suited to write-intensive workloads on a massive scale, such as a blog.

For instance, in a case where availability

is top priority it may make sense to

choose a level of ONE over QUORUM. If

the replication factor for the cluster is 3,

a QUORUM operation tolerates the loss

of only one node (or, one copy of the

data) while ONE allows the operation to

complete even if two nodes are

unavailable.
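The arithmetic in this example can be checked with a short script. The function names are hypothetical. The final check uses the commonly cited rule that reads are guaranteed to see the latest write when the read and write replica counts together exceed the replication factor; that rule is not stated in the text above and is included here as an assumption.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def tolerated_failures(level: str, replication_factor: int) -> int:
    """How many replicas may be down while the operation still completes."""
    return replication_factor - replicas_required(level, replication_factor)


def reads_see_latest_write(read_level, write_level, replication_factor):
    """Assumed rule: read count + write count > replication factor."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


for level in ("ONE", "QUORUM", "ALL"):
    print(level, "tolerates", tolerated_failures(level, 3), "failed replicas")

print(reads_see_latest_write("QUORUM", "QUORUM", 3))
print(reads_see_latest_write("ONE", "ONE", 3))
```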

Clusters spanning multiple data centres

may present further questions regarding

local latency and durability.

4. Major Client Libraries for Cassandra

Thrift

Thrift has been mentioned earlier in the paper. What is Thrift? Thrift is a software framework for scalable cross-language services development. In this context, Thrift is the RPC interface used to communicate with the Cassandra server. It statically generates serialization and interface code in a variety of languages, including C++, Java, Python, PHP, Perl and C#, to name a few. It is this mechanism that allows you to interact with Cassandra from any of these client languages.

Other clients in use include Hector (Java), Pycassa (Python), phpcassa (PHP) and Cassandra (Ruby). These libraries are available on GitHub.

Running Cassandra and basic CLI Commands

At the time of writing, the latest version of Apache Cassandra available for download is 0.7.2. The installation is simple enough. The minimal installation for this study was done on a single local machine running Ubuntu as a virtual OS. To run Cassandra in the foreground, we need to run the following command:

darkprince@ubuntu:~/Desktop/apache-cassandra-0.7.2$ bin/cassandra -f

After a flurry of messages, the output ends with a message that looks somewhat like this:

INFO 10:09:01,633 Listening for thrift clients....

This indicates that Cassandra is all ears

for all the thrift clients and that the

clients can start their operations.

In the absence of any higher-level client, Cassandra ships with its own command-line client, the Cassandra CLI. To run it, open another terminal and run the following command:

darkprince@ubuntu:~/Desktop/apache-cassandra-0.7.2$ bin/cassandra-cli --host localhost

We can see the following on the screen:

Connected to: "Test Cluster" on localhost/9160
Welcome to cassandra CLI.

Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown]

At this prompt, the user can type in the necessary commands. If in doubt, the user can type 'help;' to see the commands that the Cassandra CLI supports.
