Jean Armel Luce Orange France Thursday, June 19 2014 Cassandra Meetup Nice


Jean Armel Luce

Orange France

Thursday, June 19 2014

Cassandra Meetup Nice


Summary

Short description of PnS. Why did we choose C* ?

Some key features of C*

After the migration …

Analytics with Hadoop/Hive over Cassandra

Some conclusions about the project PnS3.0


Short description of PnS3


PnS – Short description

PnS stands for Profiles and Syndication: a highly available service for collecting and serving live data about Orange customers.

End users of PnS are:

– Orange customers (logged in to the portal www.orange.fr)

– Sellers in Orange shops

– Some services in Orange (advertisements, …)


PnS – The Big Picture

[Figure: End users send millions of HTTP requests (REST or SOAP) to PnS, which must be fast and highly available. Data providers send thousands of files (CSV or XML) through scheduled data injection. PnS exposes a web service to get or set the data it stores, with post-processing per data item (postProcessing(data1), postProcessing(data2), postProcessing(data3), postProcessing(datax), …), and issues read/write queries to the database.]


PnS – Architecture

Until 2012, data were stored in 2 different backends:

– MySQL cluster (for volatile data)

– PostgreSQL "cluster" (sharding and replication)

[Figure labels: web services (reads and writes); batch updates]

2-DC architecture for high availability: Bagnolet and Sophia Antipolis


Timeline – Key dates of PnS 3.0


04/2014


PnS – Why did we choose Cassandra ?

Cassandra fits our requirements:

– Very high availability (PnS2 = 99.95% availability; we want to improve it!)

– Low latency (20 ms < response time of the PnS2 web service < 150 ms; we want to improve it!)

– Scalability (higher load and higher volume in the coming years? Unpredictable; better scalability brings new business)

And also:

– Ease of use: Cassandra is easy to administer and operate

– Some features that I like (rack awareness, CL per request, …)

– Cassandra is designed to work naturally and plainly in a multi-datacenter architecture


Some key features of C*


Who is Cassandra ?

Cassandra is a NoSQL database, developed in Java.

Cassandra was created at Facebook in 2007, was open sourced and then incubated at Apache, and is now an Apache top-level project.

2 distributions of Cassandra:

– Community edition: http://cassandra.apache.org/ (distributed under the Apache License)

– Enterprise distribution: http://www.datastax.com/download (distributed by DataStax)

Jean Armel Luce - Orange-DSIF/SDFY-PnS 3.0


Cassandra : architecture/topology

The main characteristic of Cassandra: all the nodes play the same role. No master, no slave, no configuration server, so no SPOF.

Rows are sharded among the nodes in the cluster.

Rows are replicated among the nodes in the cluster.

The replication strategy of a keyspace (for example SimpleStrategy or NetworkTopologyStrategy) defines how and where its rows are replicated (single datacenter, multi-datacenter, …).
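As a rough illustration of sharding plus replication on a ring, here is a minimal Python sketch. It is a toy model, not Cassandra's implementation: `token`, `ToyRing` and the node names are hypothetical, the MD5-based hash is only a stand-in for a real partitioner, and each node owns a single token. Replicas are simply the next nodes clockwise on the ring, in the spirit of a single-datacenter placement.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Toy hash onto a 32-bit ring (a stand-in, not Cassandra's partitioner)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % 2**32


class ToyRing:
    """Every node plays the same role: each one owns a position on the ring."""

    def __init__(self, nodes, rf):
        self.rf = rf
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, row_key):
        """Return the rf nodes that store row_key, walking the ring clockwise."""
        start = bisect.bisect_right(self.tokens, token(row_key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]


nodes = ["node1", "node2", "node3", "node4"]
ring = ToyRing(nodes, rf=3)
owners = ring.replicas("custid:1")
print(owners)  # three distinct nodes out of the four
```

Any node can compute `replicas()` for any key, which is why no master or configuration server is needed to route a request.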


Cassandra : how requests are executed

The application sends a request to one of the nodes (not always the same one; it tries to balance the load among the nodes). This node is called the coordinator.

The coordinator routes the query to the datanode(s)

The datanodes execute the query and return the result to the coordinator

The coordinator returns the result to the application


Cassandra: what happens when a node crashes?

Read query, case 1: a READ query is executed while a datanode is down:

the coordinator has already been informed (via Gossip) that the node is down and does not send any request to it.

[Figure: Application, Replica 1, Replica 2, Replica 3]


Cassandra: what happens when a node crashes?

Read query, case 2: the coordinator crashes while a READ query is being executed:

the application receives an error (KO) or times out, then re-sends the request to another node, which acts as the new coordinator.
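The client-side behaviour in case 2 can be sketched as follows. This is an illustrative Python toy under stated assumptions: `execute_with_failover`, `CoordinatorDown` and `fake_send` are hypothetical names, and a real client driver would normally handle this through its own retry and load-balancing settings rather than hand-written code.

```python
class CoordinatorDown(Exception):
    """Raised by the toy transport when the chosen coordinator is unreachable."""


def execute_with_failover(coordinators, send):
    """Try each candidate coordinator in turn until one answers."""
    last_error = None
    for node in coordinators:
        try:
            return node, send(node)
        except (CoordinatorDown, TimeoutError) as exc:
            last_error = exc  # KO or timeout: fall through to the next node
    raise RuntimeError("no coordinator could serve the request") from last_error


def fake_send(node):
    """Toy transport: node1 has crashed, every other node answers."""
    if node == "node1":
        raise CoordinatorDown(node)
    return f"result from {node}"


used, result = execute_with_failover(["node1", "node2", "node3"], fake_send)
print(used, result)  # node2 result from node2
```

Because every node can act as coordinator, the retry needs no special recovery step: the next node in the list simply takes over the role.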


Cassandra: what happens when a node crashes?

Write query, case 1: a WRITE query is executed while a datanode is down and enough replicas are up:

– A "Hinted Handoff" (HH) is stored in the coordinator

[Figure: Application, Replica 1, Replica 2, Replica 3, HH]


Cassandra: what happens when a node crashes?

Write query, case 1: a WRITE query is executed while a datanode is down:

– The write is executed on all the replicas that are available.

– A Hinted Handoff is stored in the coordinator, and the write is replayed when the datanode comes back (within 3 hours).

3 mechanisms for keeping nodes consistent:

– Hinted Handoffs (repair when a node comes back into the ring after a failure)

– Read repairs (automatic background repair for 10% of read queries)

– Anti-entropy repairs (manual repair of all data)
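The hinted handoff mechanism can be pictured with a small Python toy. It is only a sketch: `ToyCoordinator` and its methods are hypothetical, storage is a plain dictionary per replica, and the 3-hour hint window mentioned above is not modeled.

```python
class ToyCoordinator:
    """Toy coordinator that keeps hints for replicas that are down."""

    def __init__(self, replicas):
        self.up = {name: True for name in replicas}
        self.data = {name: {} for name in replicas}
        self.hints = []  # pending (target replica, key, value) writes

    def write(self, key, value, required_acks=1):
        alive = [name for name in self.data if self.up[name]]
        if len(alive) < required_acks:
            raise RuntimeError("not enough replicas up for this consistency level")
        for name in self.data:
            if self.up[name]:
                self.data[name][key] = value  # normal write on a live replica
            else:
                self.hints.append((name, key, value))  # remember it for later
        return len(alive)

    def node_back(self, name):
        """Replay the hints kept for a replica that has rejoined the ring."""
        self.up[name] = True
        remaining = []
        for target, key, value in self.hints:
            if target == name:
                self.data[name][key] = value
            else:
                remaining.append((target, key, value))
        self.hints = remaining


coord = ToyCoordinator(["replica1", "replica2", "replica3"])
coord.up["replica1"] = False
acks = coord.write("custid:1", "Bill", required_acks=2)
hints_pending = len(coord.hints)
coord.node_back("replica1")
print(acks, hints_pending, coord.data["replica1"])  # 2 1 {'custid:1': 'Bill'}
```

Hints only cover writes missed during a short outage; read repairs and anti-entropy repairs remain necessary for anything the hints did not catch.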


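Eventual consistency with Cassandra

We can specify the consistency level (CL) for each read/update/insert/delete request.

Most commonly used CLs: LOCAL_ONE, ONE, ANY, LOCAL_QUORUM, QUORUM, ALL, SERIAL

Strong consistency: W + R > RF, where W is the number of replicas that acknowledge a write, R the number of replicas consulted by a read, and RF the replication factor.

Trade-off:

– Weak consistency: higher availability, lower latency

– Strong consistency: lower availability, higher latency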
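The rule W + R > RF is easy to check by hand; the short Python sketch below does it for a few combinations with RF = 3. The helper names `quorum` and `is_strongly_consistent` are hypothetical, and the mapping of consistency levels to replica counts is simplified to a single datacenter.

```python
def quorum(rf: int) -> int:
    """Number of replicas in a quorum: a strict majority of rf."""
    return rf // 2 + 1


def is_strongly_consistent(write_replicas: int, read_replicas: int, rf: int) -> bool:
    """True when every read set overlaps every write set (W + R > RF)."""
    return write_replicas + read_replicas > rf


rf = 3
checks = {
    "write ONE / read ONE": is_strongly_consistent(1, 1, rf),
    "write QUORUM / read QUORUM": is_strongly_consistent(quorum(rf), quorum(rf), rf),
    "write ALL / read ONE": is_strongly_consistent(rf, 1, rf),
}
for name, strong in checks.items():
    print(f"{name}: {'strong' if strong else 'eventual'}")
```

With RF = 3, a quorum is 2 replicas, so QUORUM writes combined with QUORUM reads give 2 + 2 > 3, while ONE/ONE gives 1 + 1 = 2 and stays eventually consistent.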


Cassandra: sharding with virtual nodes (versions 1.2+)

Virtual nodes have been available since C* 1.2.

With virtual nodes, adding a new node (or many nodes) to the cluster is easy:

– data are moved from ALL the old nodes to the new node

– little data to move between nodes

– after the data move, the cluster is still well balanced

– the procedure is fully automated

Adding a new node to the cluster is a normal operation, done online without interruption of service.

When nodes are added, replicas are also moved between nodes.


Cassandra: adding a node using vnodes

Example: adding a 5th node to a 4-node cluster

[Figure: ring with Node 1, Node 2, Node 3, Node 4 and the new Node 5]
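The effect of vnodes when a 5th node joins a 4-node cluster can be simulated with a toy ring in Python. This is a sketch under assumptions, not Cassandra's algorithm: `build_ring`, `owner_map` and the node names are hypothetical, the MD5 hash stands in for a real partitioner, 256 tokens per node is an illustrative choice, and replication is ignored (one owner per key).

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Toy hash used for both vnode tokens and row keys."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def build_ring(nodes, vnodes_per_node=256):
    """Each physical node owns many small token ranges (its virtual nodes)."""
    return sorted((h(f"{node}#{i}"), node) for node in nodes for i in range(vnodes_per_node))


def owner_map(nodes, keys):
    """Map every key to the node owning the first token after the key's hash."""
    ring = build_ring(nodes)
    tokens = [t for t, _ in ring]
    return {k: ring[bisect.bisect_right(tokens, h(k)) % len(ring)][1] for k in keys}


keys = [f"custid:{i}" for i in range(10_000)]
old_nodes = ["node1", "node2", "node3", "node4"]
before = owner_map(old_nodes, keys)
after = owner_map(old_nodes + ["node5"], keys)

moved = [k for k in keys if before[k] != after[k]]
sources = {before[k] for k in moved}       # old nodes that gave up some keys
moved_fraction = len(moved) / len(keys)    # expected to be close to 1/5
print(sorted(sources), round(moved_fraction, 3))
```

In this model every moved key lands on the new node, each old node gives up a small share, and roughly one fifth of the keys move in total, which is why the cluster stays balanced after the operation.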


CQL (Cassandra Query language)

DDL :

CREATE keyspace, CREATE table, CREATE INDEX

ALTER keyspace, ALTER table

DROP keyspace, DROP table, DROP INDEX

DML :

SELECT

INSERT/UPDATE (INSERT is equivalent to UPDATE: both are upserts with no read before the write, which improves performance)

DELETE (delete a ROW or delete COLUMNS in a ROW)


CQL (Cassandra Query Language)

Example:

jal@jal-VirtualBox:~/cassandra200beta1/apache-cassandra-2.0.0-beta1-src/bin$ ./cqlsh -u cassandra -p cassandra
Connected to Test Cluster at localhost:9160.
[cqlsh 4.0.0 | Cassandra 2.0.0-beta1-SNAPSHOT | CQL spec 3.1.0 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE fr
   ... WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> use fr;

Notes:

– A keyspace is used instead of a database

– 'class' sets the replication strategy

– 'replication_factor' sets the replication factor

– use fr; connects to the keyspace


CQL (Cassandra Query Language)

Example:

cqlsh:fr> CREATE TABLE customer (
      ... custid int,
      ... firstname text,
      ... lastname text,
      ... PRIMARY KEY (custid) );
cqlsh:fr> UPDATE customer SET firstname = 'Bill', lastname = 'Azerty' WHERE custid = 1;
cqlsh:fr> INSERT INTO customer (custid, firstname, lastname) VALUES (2, 'Steve', 'Qwerty');

Note: INSERT is equivalent to UPDATE


CQL (Cassandra Query Language)

Example: SELECT with a WHERE clause on the primary key

cqlsh:fr> SELECT firstname, lastname FROM customer WHERE custid = 1;

 firstname | lastname
-----------+----------
      Bill |   Azerty

(1 rows)

cqlsh:fr> SELECT * FROM customer WHERE custid = 2;

 custid | firstname | lastname
--------+-----------+----------
      2 |     Steve |   Qwerty

(1 rows)


CQL (Cassandra Query Language)

Example:

cqlsh:fr> SELECT * FROM customer WHERE lastname = 'Azerty';
Bad Request: No indexed columns present in by-columns clause with Equal operator

The SELECT is rejected: this request requires an index on the column 'lastname'.

– The column in the WHERE clause must be indexed

– No operator other than = is accepted in the WHERE clause (<, >, != are rejected)


After the migration …


The latency

Comparison before/after migration to Cassandra

Some graphs of web service latency speak for themselves:

[Figures: latency graphs for the "push mail" and "push webxms" services, with the dates of migration to C* marked]


The latency

Read and write latencies are now in microseconds in the datanodes.

Thanks to [image] and [image]

This latency will be improved by (tests in progress):

ALTER TABLE syndic WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : ?? };


The availability

• We had a few hardware failures and network outages

• No impact on QoS:

  – no error returned by the application

  – no real impact on latency


The scalability

• Thanks to vnodes (available since Cassandra 1.2), it is easy to scale out

• With NetworkTopologyStrategy, make sure to distribute the nodes evenly across the racks


Analytics with Hadoop/Hive over Cassandra


Basic architecture of the Cassandra cluster

Cluster without Hadoop : 2 datacenters, 16 nodes in each DC

RF (DC1, DC2) = (3, 3)

Web servers in DC1 send queries to C* nodes in DC1

Web servers in DC2 send queries to C* nodes in DC2

[Figure: DC1 and DC2, each with its own pool of web servers]


Architecture of the Cassandra cluster with the datacenter for analytics

Cluster with Hadoop : 3 datacenters, 16 nodes in DC1, 16 nodes in DC2, 4 nodes in DC3

RF (DC1, DC2, DC3) = (3, 3, 1)

Because RF = 1 in DC3, we need less storage space in this datacenter

We favor cheaper disks (SATA) in DC3 rather than SSDs or FusionIO cards.
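The storage remark for RF (DC1, DC2, DC3) = (3, 3, 1) can be checked with simple arithmetic. The Python sketch below is illustrative only: `storage_by_dc` is a hypothetical helper, the unique data set is normalized to 1.0 unit, data is assumed to be spread evenly across nodes, and compression, compaction and snapshot overhead are ignored. The RF values and node counts are those given above.

```python
def storage_by_dc(unique_data, rf_by_dc, nodes_by_dc):
    """Total and per-node storage in each datacenter for a given unique data size."""
    result = {}
    for dc, rf in rf_by_dc.items():
        total = unique_data * rf  # each DC holds rf full copies of the data
        result[dc] = {"total": total, "per_node": total / nodes_by_dc[dc]}
    return result


rf_by_dc = {"DC1": 3, "DC2": 3, "DC3": 1}
nodes_by_dc = {"DC1": 16, "DC2": 16, "DC3": 4}
usage = storage_by_dc(1.0, rf_by_dc, nodes_by_dc)
for dc, figures in usage.items():
    print(dc, figures)
```

Under these assumptions DC3 stores one third of what DC1 stores in total, although each of its 4 nodes carries slightly more than a DC1 node (0.25 versus 0.1875 units), which is consistent with choosing cheaper, larger disks there.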


Architecture of the Cassandra cluster with the datacenter for analytics

[Figure: DC1 and DC2, each with its own pool of web servers, plus DC3]


Some conclusions about the project PnS3


Conclusions

With Cassandra, we have improved our QoS:

– Lower response time

– Higher availability

– More flexibility for operations teams

We are able to open our service to new opportunities.

There is a large ecosystem around C* (Hadoop, Hive, Pig, Storm, Shark, Titan, …), which offers more capabilities.

The Cassandra community is very active and helps a lot. There are many resources available: mailing lists, blogs, …


Thank you for your attention


Questions


A few answers about hardware, OS version, Java version, Cassandra version and Hadoop version

Hardware:

16 nodes in each of DC1 and DC2 at the end of 2013:

– 2 CPUs, 6 cores each, Intel® Xeon® 2.00 GHz

– 64 GB RAM

– FusionIO 320 GB MLC

4 nodes in DC3:

– 2 CPUs, 6 cores each, Intel® Xeon® 2.00 GHz

– 24 GB RAM

– SATA disks, 15K rpm

OS: Ubuntu Precise (12.04 LTS)

Cassandra version: 1.2.13

Hadoop version: CDH 4.5 (with Hive 0.10): Hadoop 2 with MRv1

Hive handler: https://github.com/cscetbon/hive/tree/hive-cdh4-auth

Java version: Java 7u45 (CMS GC with option CMSClassUnloadingEnabled)


A few answers about data and requests

Data:

– Volume: 6 TB at the end of 2013

– Elementary types: boolean, integer, string, date

– Collection types

– Complex types: JSON, XML (between 1 and 20 KB)

Requests:

– 10,000 requests/sec at the end of 2013

– 80% get

– 20% set

Consistency levels used by PnS for online queries and batch updates:

– LOCAL_ONE (95% of the queries)

– LOCAL_QUORUM (5% of the queries)