Jean Armel Luce Orange France Thursday, June 19 2014 Cassandra Meetup Nice



Summary

Short description of PnS. Why did we choose C*?

Some key features of C*

After the migration …

Analytics with Hadoop/Hive over Cassandra

Some conclusions about the project PnS3.0

Short description of PnS3

PnS – Short description

PnS stands for Profiles and Syndication. It is a highly available service for collecting and serving live data about Orange customers.

End users of PnS are:

– Orange customers (logged in to the www.orange.fr portal)

– Sellers in Orange shops

– Some services within Orange (advertisements, …)

PnS – The Big Picture

[Diagram, with the following labels]

End users: millions of HTTP requests (REST or SOAP); fast and highly available

PnS: web service to get or set data stored by PnS: postProcessing(data1), postProcessing(data2), postProcessing(data3), postProcessing(datax), …

Data providers: thousands of files (CSV or XML); scheduled data injection

Database: DB queries, R/W operations

PnS – Architecture

Until 2012, data were stored in 2 different backends:

– MySQL cluster (for volatile data)

– PostgreSQL "cluster" (sharding and replication)

and web services (reads and writes) for batch updates

2-DC architecture for high availability: Bagnolet and Sophia Antipolis

Timeline – Key dates of PnS 3.0

[Timeline figure: the only date recoverable from the extraction is 04/2014]

PnS – Why did we choose Cassandra?

Cassandra fits our requirements:

– Very high availability (PnS2 = 99.95% availability; we want to improve it!)

– Low latency (20 ms < response time of the PnS2 web service < 150 ms; we want to improve it!)

– Scalability (higher load and higher volume in the coming years? Unpredictable; better scalability brings new business)

And also:

– Ease of use: Cassandra is easy to administer and operate

– Some features that I like (rack awareness, CL per request, …)

– Cassandra is designed to work naturally and plainly in a multi-datacenter architecture

Some key features of C*

Who is Cassandra?

Cassandra is a NoSQL database, developed in Java.

Cassandra was created at Facebook in 2007, was open sourced and then incubated at Apache, and is nowadays a top-level Apache project.

2 distributions of Cassandra:

– Community edition: http://cassandra.apache.org/ (distributed under the Apache License)

– Enterprise distribution: http://www.datastax.com/download (distributed by DataStax)

Jean Armel Luce - Orange-DSIF/SDFY-PnS 3.0

Cassandra: architecture/topology

The main characteristic of Cassandra: all the nodes play the same role. No master, no slave, no configuration server, hence no SPOF.

Rows are sharded among the nodes in the cluster

Rows are replicated among the nodes in the cluster

The replication strategy of a keyspace (SimpleStrategy, NetworkTopologyStrategy) defines how/where its rows are replicated (single datacenter, multiple datacenters, …)
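The sharding and replication described above can be sketched with a toy token ring. This is only an illustration under stated assumptions, not Cassandra's implementation: the hash function (MD5), the single token per node, and the names `Ring`, `token` and `replicas` are all hypothetical. Placing replicas on the next nodes clockwise is similar in spirit to what SimpleStrategy does in a single datacenter.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption of this sketch)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: one token per node, replicas on the next nodes clockwise."""

    def __init__(self, nodes, replication_factor):
        if replication_factor > len(nodes):
            raise ValueError("replication factor cannot exceed the number of nodes")
        self.rf = replication_factor
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, row_key):
        """Return the rf nodes holding row_key, starting at the first token >= the key's token."""
        start = bisect.bisect_left(self.tokens, token(row_key)) % len(self.entries)
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(self.rf)]


ring = Ring(["node1", "node2", "node3", "node4"], replication_factor=3)
for key in ("custid:1", "custid:2"):
    print(key, "->", ring.replicas(key))
```

Every node runs the same logic, so any node can work out where a row lives: there is no master to ask.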

Cassandra: how requests are executed

The application sends a request to one of the nodes (not always the same one, so that the load is balanced among the nodes). This node is called the coordinator

The coordinator routes the query to the datanode(s)

The datanodes execute the query and return the result to the coordinator

The coordinator returns the result to the application


Cassandra: what happens when a node crashes?

Read query, case 1: a READ query is executed while a datanode is down:

the coordinator has already been informed (via Gossip) that the node is down and does not send any request to it

[Diagram: application and replicas 1, 2 and 3]

Cassandra: what happens when a node crashes?

Read query, case 2: the coordinator crashes while a READ query is being executed:

the application receives an error (or times out), then re-sends the request to another node, which acts as the new coordinator
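The client-side behaviour in case 2 can be sketched as a simple failover loop. This is a minimal illustration, not a real driver: `Node`, `NodeDown` and `execute_with_failover` are hypothetical names, and a real client would also handle timeouts and load balancing.

```python
class NodeDown(Exception):
    """Raised by the toy node when it cannot act as coordinator."""


class Node:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def coordinate(self, query):
        if not self.up:
            raise NodeDown(self.name)
        return f"{query} handled by coordinator {self.name}"


def execute_with_failover(nodes, query):
    """Try each node in turn as coordinator and return the first successful result."""
    failed = []
    for node in nodes:
        try:
            return node.coordinate(query)
        except NodeDown as exc:
            failed.append(str(exc))
    raise RuntimeError(f"no coordinator available, tried: {failed}")


nodes = [Node("node1", up=False), Node("node2"), Node("node3")]
print(execute_with_failover(nodes, "SELECT ..."))
```

Re-sending a read is safe because reading twice has no side effect.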

Cassandra: what happens when a node crashes?

Write query, case 1: a WRITE query is executed while a datanode is down and enough replicas are up:

– The write is executed on all the replicas that are available.

– A "Hinted Handoff" is stored on the coordinator, and the write is replayed when the datanode comes back (provided it is back within 3 hours)

[Diagram: application, replicas 1, 2 and 3, and the hint (HH) kept by the coordinator for the replica that is down]

3 mechanisms for keeping the nodes consistent:

– Hinted Handoffs (repair when a node comes back into the ring after a failure)

– Read repairs (automatic repair in the background for 10% of read queries)

– Anti-entropy repairs (manual repair of all data)
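The hinted handoff mechanism can be sketched as follows. It is a toy model under stated assumptions: the `Replica` and `Coordinator` classes are hypothetical, hints are kept in memory, and the 3-hour window and write timestamps are left out.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.rows = {}


class Coordinator:
    """Toy coordinator that keeps hints for replicas that are down."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # (replica name, key, value) tuples waiting to be replayed

    def write(self, key, value):
        """Write to every live replica and store a hint for each replica that is down."""
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.rows[key] = value
                acks += 1
            else:
                self.hints.append((replica.name, key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to the replicas that are back; keep the others."""
        by_name = {r.name: r for r in self.replicas}
        remaining = []
        for name, key, value in self.hints:
            target = by_name[name]
            if target.up:
                target.rows[key] = value
            else:
                remaining.append((name, key, value))
        self.hints = remaining


replicas = [Replica("replica1"), Replica("replica2"), Replica("replica3")]
coordinator = Coordinator(replicas)
replicas[2].up = False                      # replica 3 crashes
acks = coordinator.write("custid:1", "Bill")
replicas[2].up = True                       # replica 3 comes back into the ring
coordinator.replay_hints()
print(acks, replicas[2].rows)
```

The write is acknowledged by the two live replicas, and the third one catches up when the hint is replayed.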

Eventual consistency with Cassandra

We can specify the consistency level (CL) for each read/update/insert/delete request

Most commonly used CLs: LOCAL_ONE, ONE, ANY, LOCAL_QUORUM, QUORUM, ALL, SERIAL

Strong consistency : W + R > RF

Trade-off: weak consistency comes with higher availability and lower latency; strong consistency comes with lower availability and higher latency
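The rule W + R > RF can be checked with a few lines of arithmetic. The sketch below assumes a single datacenter and only covers ONE, QUORUM and ALL; the function names are hypothetical.

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must answer for a consistency level (single DC)."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError(f"level not covered by this sketch: {level}")


def is_strongly_consistent(write_level: str, read_level: str, rf: int) -> bool:
    """Apply the rule W + R > RF."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


rf = 3
for w, r in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL"), ("ALL", "ONE")]:
    print(f"W={w:<6} R={r:<6} strong={is_strongly_consistent(w, r, rf)}")
```

With RF = 3, writing and reading at QUORUM touches 2 + 2 = 4 replicas, so every read overlaps the latest write on at least one replica; ONE/ONE touches only 2 and gives no such guarantee.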

Cassandra: sharding with virtual nodes (versions 1.2+)

Virtual nodes (vnodes) have been available since C* 1.2.

With virtual nodes, adding a new node (or many nodes) to the cluster is easy:

– data are moved from ALL the old nodes to the new node

– little data has to move from each node

– after the data have moved, the cluster is still well balanced

– the procedure is fully automated

Adding a new node to the cluster is a normal operation, done online without interruption of service

When nodes are added, replicas are also moved between nodes

Cassandra: adding a node using vnodes

Example: adding a 5th node to a 4-node cluster

[Diagram: ring with Node 1, Node 2, Node 3, Node 4 and the new Node 5]
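The example of a 5th node joining a 4-node cluster can be simulated with a toy ring. This is a sketch under assumptions, not Cassandra's partitioner: tokens come from MD5, each node gets 256 vnodes, replication is ignored (only the primary owner is tracked), and all names are hypothetical.

```python
import bisect
import hashlib
from collections import Counter

VNODES_PER_NODE = 256  # assumption of this sketch


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption of this sketch)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes):
    """Give each node VNODES_PER_NODE pseudo-random tokens and return the sorted ring."""
    return sorted((token(f"{node}#{i}"), node) for node in nodes for i in range(VNODES_PER_NODE))


def owners(nodes, keys):
    """Map each key to its primary owner: the node of the first token >= the key's token."""
    ring = build_ring(nodes)
    tokens = [t for t, _ in ring]
    return {k: ring[bisect.bisect_left(tokens, token(k)) % len(ring)][1] for k in keys}


keys = [f"custid:{i}" for i in range(10_000)]
old_nodes = ["node1", "node2", "node3", "node4"]
before = owners(old_nodes, keys)
after = owners(old_nodes + ["node5"], keys)

moved = [k for k in keys if before[k] != after[k]]
print("share of rows moved:", len(moved) / len(keys))
print("rows given up per old node:", Counter(before[k] for k in moved))
print("destinations of moved rows:", {after[k] for k in moved})
```

In this model roughly one fifth of the rows move, every old node gives up a small slice, and all moved rows land on the new node, which is why the cluster stays balanced.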

CQL (Cassandra Query Language)

DDL:

CREATE KEYSPACE, CREATE TABLE, CREATE INDEX

ALTER KEYSPACE, ALTER TABLE

DROP KEYSPACE, DROP TABLE, DROP INDEX

DML:

SELECT

INSERT/UPDATE (INSERT is equivalent to UPDATE: both write without reading first, which improves performance)

DELETE (delete a ROW or delete COLUMNS in a ROW)

CQL (Cassandra Query Language)

Example:

jal@jal-VirtualBox:~/cassandra200beta1/apache-cassandra-2.0.0-beta1-src/bin$ ./cqlsh -u cassandra -p cassandra
Connected to Test Cluster at localhost:9160.
[cqlsh 4.0.0 | Cassandra 2.0.0-beta1-SNAPSHOT | CQL spec 3.1.0 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE fr
   ... WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> use fr;

Notes:

– a keyspace is used instead of a database

– 'class' is the replication strategy

– 'replication_factor' is the replication factor

– use fr connects to the keyspace

CQL (Cassandra Query Language)

Example:

cqlsh:fr> CREATE TABLE customer (
      ... custid int,
      ... firstname text,
      ... lastname text,
      ... PRIMARY KEY (custid) );
cqlsh:fr> UPDATE customer SET firstname = 'Bill', lastname = 'Azerty' WHERE custid = 1;
cqlsh:fr> INSERT INTO customer (custid, firstname, lastname) VALUES (2, 'Steve', 'Qwerty');

Note: INSERT is equivalent to UPDATE (the UPDATE above creates the row with custid 1)

CQL (Cassandra Query Language)

Example:

cqlsh:fr> SELECT firstname, lastname FROM customer WHERE custid = 1;

 firstname | lastname
-----------+----------
      Bill |   Azerty

(1 rows)

cqlsh:fr> SELECT * FROM customer WHERE custid = 2;

 custid | firstname | lastname
--------+-----------+----------
      2 |     Steve |   Qwerty

(1 rows)

Note: SELECT with a WHERE clause on the primary key

CQL (Cassandra Query Language)

Example:

cqlsh:fr> SELECT * FROM customer WHERE lastname = 'Azerty';
Bad Request: No indexed columns present in by-columns clause with Equal operator

SELECT rejected: this request requires an index on the column 'lastname'

The column in the WHERE clause must be indexed, and no operator other than = is accepted in the WHERE clause (<, >, != are rejected)
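Why the query on lastname is rejected, and why an index fixes it, can be illustrated with a toy in-memory table. This is a deliberately simplified model, not how Cassandra stores data: the `Table` class and its methods are hypothetical, and the only thing it shares with the slides is the customer data.

```python
class Table:
    """Toy table: rows are found by primary key; any other column needs an index."""

    def __init__(self, primary_key):
        self.primary_key = primary_key
        self.rows = {}      # primary key value -> row
        self.indexes = {}   # column name -> {column value -> set of primary keys}

    def upsert(self, row):
        """INSERT and UPDATE behave the same here: merge the columns into the row."""
        key = row[self.primary_key]
        old = self.rows.get(key, {})
        merged = {**old, **row}
        for column, index in self.indexes.items():
            if column in old:
                index.get(old[column], set()).discard(key)
            if column in merged:
                index.setdefault(merged[column], set()).add(key)
        self.rows[key] = merged

    def create_index(self, column):
        """Build an index over the rows that already exist."""
        index = {}
        for key, row in self.rows.items():
            if column in row:
                index.setdefault(row[column], set()).add(key)
        self.indexes[column] = index

    def select_where_equal(self, column, value):
        """Equality lookup; rejected unless the column is the key or is indexed."""
        if column == self.primary_key:
            return [self.rows[value]] if value in self.rows else []
        if column not in self.indexes:
            raise LookupError(f"no index on column {column!r}")
        return [self.rows[k] for k in sorted(self.indexes[column].get(value, ()))]


customer = Table("custid")
customer.upsert({"custid": 1, "firstname": "Bill", "lastname": "Azerty"})
customer.upsert({"custid": 2, "firstname": "Steve", "lastname": "Qwerty"})
try:
    customer.select_where_equal("lastname", "Azerty")
except LookupError as exc:
    print("rejected:", exc)
customer.create_index("lastname")
print(customer.select_where_equal("lastname", "Azerty"))
```

Without the index, the only way to answer would be to scan every row on every node, which is what the rejection protects against.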

After the migration …

The latency

Comparison before/after migration to Cassandra

The graphs of web service latency speak for themselves:

[Graphs: latency of the "push mail" and "push webxms" services, with the dates of migration to C* marked]

Read and write latencies on the datanodes are now measured in microseconds:

Thanks to and

This latency will be further improved by (tests in progress):

ALTER TABLE syndic WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : ?? };

The availability

• We had a few hardware failures and network outages

• No impact on QoS:

• no error returned by the application

• no real impact on latency

The scalability

• Thanks to vnodes (available since Cassandra 1.2), it is easy to scale out

• With NetworkTopologyStrategy, make sure to distribute the nodes evenly across the racks

Analytics with Hadoop/Hive over Cassandra

Basic architecture of the Cassandra cluster

Cluster without Hadoop : 2 datacenters, 16 nodes in each DC

RF (DC1, DC2) = (3, 3)

Web servers in DC1 send queries to C* nodes in DC1

Web servers in DC2 send queries to C* nodes in DC2

[Diagram: a pool of web servers in front of DC1 and another in front of DC2]

Architecture of the Cassandra cluster with the datacenter for analytics

Cluster with Hadoop : 3 datacenters, 16 nodes in DC1, 16 nodes in DC2, 4 nodes in DC3

RF (DC1, DC2, DC3) = (3, 3, 1)

Because RF = 1 in DC3, we need less storage space in this datacenter

We favor cheaper disks (SATA) in DC3 rather than SSDs or FusionIO cards
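The storage reasoning can be made concrete with a little arithmetic on the topology given above. The sketch assumes that data are spread evenly over the nodes of each datacenter; the function names are hypothetical.

```python
from fractions import Fraction

# Topology from the slides: number of nodes and replication factor per datacenter.
topology = {
    "DC1": {"nodes": 16, "rf": 3},
    "DC2": {"nodes": 16, "rf": 3},
    "DC3": {"nodes": 4, "rf": 1},
}


def share_per_node(nodes: int, rf: int) -> Fraction:
    """Fraction of the dataset held by each node, assuming an even distribution."""
    return Fraction(rf, nodes)


def local_quorum(rf: int) -> int:
    """Replicas that must answer a LOCAL_QUORUM request in one datacenter."""
    return rf // 2 + 1


for dc, cfg in topology.items():
    share = share_per_node(cfg["nodes"], cfg["rf"])
    print(f"{dc}: {cfg['rf']} copies, each node holds {float(share):.2%} of the data, "
          f"LOCAL_QUORUM = {local_quorum(cfg['rf'])}")

total_copies = sum(cfg["rf"] for cfg in topology.values())
print("copies of each row across the cluster:", total_copies)
```

Under these assumptions DC3 stores one copy of the data instead of three, but each of its 4 nodes holds a larger share (1/4) than a node in DC1 or DC2 (3/16), which fits the choice of larger, cheaper disks there.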

[Diagram: DC1 and DC2, each with its pool of web servers, and DC3]

Some conclusions about the project PnS3

Conclusions

With Cassandra, we have improved our QoS:

– Lower response time

– Higher availability

– More flexibility for operations teams

We are able to open our service to new opportunities

There is a large ecosystem around C* (Hadoop, Hive, Pig, Storm, Shark, Titan, …), which offers more capabilities.

The Cassandra community is very active and helps a lot. There are a lot of resources available: mailing lists, blogs, …

Thank you for your attention

Questions

A few answers about hardware, OS version, Java version, Cassandra version and Hadoop version

Hardware:

– 16 nodes in each of DC1 and DC2 at the end of 2013: 2 CPUs with 6 cores each (Intel® Xeon® 2.00 GHz), 64 GB RAM, FusionIO 320 GB MLC

– 4 nodes in DC3: 2 CPUs with 6 cores each (Intel® Xeon® 2.00 GHz), 24 GB RAM, SATA disks (15K rpm)

OS: Ubuntu Precise (12.04 LTS)

Cassandra version: 1.2.13

Hadoop version: CDH 4.5 (with Hive 0.10): Hadoop 2 with MRv1

Hive handler: https://github.com/cscetbon/hive/tree/hive-cdh4-auth

Java version: Java7u45 (CMS GC with the option CMSClassUnloadingEnabled)

A few answers about data and requests

Data:

– Volume: 6 TB at the end of 2013

– Elementary types: boolean, integer, string, date

– Collection types

– Complex types: JSON, XML (between 1 and 20 KB)

Requests:

– 10,000 requests/sec at the end of 2013

– 80% get

– 20% set

Consistency level used by PnS for online queries and batch updates:

– LOCAL_ONE (95% of the queries)

– LOCAL_QUORUM (5% of the queries)