
Migration from Thrift to CQL (Brij Bhushan Ravat, Ericsson) | Cassandra Summit 2016


Migration from Thrift to CQL

Brij Bhushan Ravat
Chief Architect, Voucher Server - Charging System


Migration from Thrift to CQL | Public | © Ericsson AB 2016 | 2016-09-04 | Page 2

To be continued …

“I didn't come here to tell you how this is going to end. I came here to tell you how it's going to begin.”


1 Why Thrift-to-CQL

2 Impact on data model

3 Approaches

4 Comparison

5 Summary


Context

When everyone thought there was no alternative to an RDBMS, one developer started his journey with Cassandra.

He interfaced the application with Cassandra using the Thrift interface.

Thrift is now deprecated, and CQL is the new interface.


Thrift & CQL

› The CQL interface was introduced in Cassandra in November 2012.

› Since then, Cassandra can be accessed through either the Thrift interface or the CQL interface.


Thrift & CQL

› Development of the Thrift interface has been officially frozen for the last two years.

› Thrift support will be removed completely in Cassandra 4.0.

› Therefore, moving from Thrift to CQL is not just a choice. It is mandatory:
  – in order to leverage new capabilities of Cassandra, and
  – if you want your application to be ready for Cassandra 4.0

› Moreover:
  – From Cassandra 3.0 onwards, CQL performs much better than Thrift, and
  – CQL is easier to use because it is similar to SQL


Few points to ponder

› Moving from Thrift to CQL changes all touch points of an application with Cassandra

› This implies that the application may have to redesign its framework for those operations that work directly on data
  – For example:
    › Atomicity of multiple updates
    › Isolation of a transaction


Impact on data model

› Unlike Thrift, CQL relies heavily on column_metadata

› Therefore, if your table has both fixed and dynamic columns, CQL will be able to read only the fixed columns


Cassandra table

Key      Name       City       State  Zip    Phone
ABC_SJ   ABC Hotel  San Jose   CA     95113  124-543-2244
567_SJ   567 Hotel  San Jose   CA     95113  124-756-1567
XYZ_LV   XYZ Hotel  Las Vegas  NV     89109  311-587-2222

create column family hotels
  with key_validation_class = UTF8Type
  and comparator = UTF8Type
  and column_metadata = [
    {column_name: Name, validation_class: UTF8Type},
    {column_name: City, validation_class: UTF8Type},
    {column_name: State, validation_class: UTF8Type},
    {column_name: Zip, validation_class: IntegerType},
    {column_name: Phone, validation_class: UTF8Type}
  ];


Cassandra table: actual format

Row key  Column  Value         Timestamp
ABC_SJ   Name    ABC Hotel     2005-10-30 T 10:45
         City    San Jose      2005-10-30 T 10:46
         State   CA            2005-10-30 T 10:47
         Zip     95113         2005-10-30 T 10:48
         Phone   124-543-2244  2011-04-16 T 08:15
567_SJ   Name    567 Hotel     2005-11-14 T 15:06
         City    San Jose      2005-11-14 T 15:06
         State   CA            2005-11-14 T 15:06
         Zip     95113         2005-11-14 T 15:06
         Phone   124-756-1567  2005-11-14 T 15:06
XYZ_LV   Name    XYZ Hotel     2005-02-21 T 09:10
         City    Las Vegas     2005-02-21 T 09:10
         State   NV            2005-02-21 T 09:10
         Zip     89109         2005-02-21 T 09:10
         Phone   311-587-2222  2007-12-02 T 14:02


Dynamic column

Row key  Column     Value         Timestamp
ABC_SJ   Name       ABC Hotel     2005-10-30 T 10:45
         City       San Jose      2005-10-30 T 10:46
         State      CA            2005-10-30 T 10:47
         Zip        95113         2005-10-30 T 10:48
         Phone      124-543-2244  2011-04-16 T 08:15
         My Rating  Just average  2011-05-22 T 15:07
567_SJ   Name       567 Hotel     2005-11-14 T 15:06
         City       San Jose      2005-11-14 T 15:06
         State      CA            2005-11-14 T 15:06
         Zip        95113         2005-11-14 T 15:06
         Phone      124-756-1567  2005-11-14 T 15:06
XYZ_LV   Name       XYZ Hotel     2005-02-21 T 09:10
         City       Las Vegas     2005-02-21 T 09:10
         State      NV            2005-02-21 T 09:10
         Zip        89109         2005-02-21 T 09:10
         Phone      311-587-2222  2007-12-02 T 14:02

• If a table has dynamic columns along with fixed columns, CQL will fail to read the dynamic column(s)
• This is because, unlike Thrift, CQL depends more on metadata

column_metadata = [
  column_name: Name,
  column_name: City,
  column_name: State,
  column_name: Zip,
  column_name: Phone
]
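
The effect can be pictured with a toy model in plain Python. This is only a sketch: dictionaries stand in for Cassandra storage, no client library is involved, and the function names (`thrift_style_read`, `cql_style_read`, `hidden_columns`) are made up for illustration.

```python
# Toy model only: plain dicts stand in for Cassandra storage; no driver is used.

# Columns declared in column_metadata (the "fixed" columns from the slide).
COLUMN_METADATA = ["Name", "City", "State", "Zip", "Phone"]

# One stored row: every cell is kept, whether or not it is declared.
stored_row = {
    "Name": "ABC Hotel",
    "City": "San Jose",
    "State": "CA",
    "Zip": 95113,
    "Phone": "124-543-2244",
    "My Rating": "Just average",  # dynamic column, absent from column_metadata
}


def thrift_style_read(row):
    """Return every stored cell, as a Thrift slice over the whole row would."""
    return dict(row)


def cql_style_read(row, metadata):
    """Return only the cells whose names are declared in the metadata."""
    return {name: row[name] for name in metadata if name in row}


def hidden_columns(row, metadata):
    """List the stored columns that a metadata-driven read cannot see."""
    return sorted(set(row) - set(metadata))
```

Here `hidden_columns(stored_row, COLUMN_METADATA)` picks out `My Rating`: the cell is still stored and still visible to the Thrift-style read, but the metadata-driven read never returns it.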


Approaches (for moving to CQL)

(If you have both fixed & dynamic columns)

› Add collections to the schema
› Make the table schema-less


Add collections to the schema

• Replace multiple dynamic columns with one (or more) collections, such as a map

Before (dynamic columns):

ABC_SJ   Name           ABC Hotel
         City           San Jose
         State          CA
         Zip            95113
         Phone          124-543-2244
         My Rating      Just average
567_SJ   Name           567 Hotel
         City           San Jose
         State          CA
         Zip            95113
         Phone          124-756-1567
         Portal Rating  Good
         My Rating      Above average

After (map collection):

ABC_SJ   Name    ABC Hotel
         City    San Jose
         State   CA
         Zip     95113
         Phone   124-543-2244
         Rating  {my_rating: "Just average"}
567_SJ   Name    567 Hotel
         City    San Jose
         State   CA
         Zip     95113
         Phone   124-756-1567
         Rating  {my_rating: "Above average", portal_rating: "Good"}
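
The per-row transformation behind this approach might look like the following Python sketch. It works on plain dictionaries rather than on Cassandra itself, and it assumes that every column outside the fixed set belongs in a single map named `Rating`, as in the hotel example; `to_map_key` and `fold_dynamic_columns` are illustrative names, not part of any library.

```python
# Sketch of the data-migration step on plain dicts (no Cassandra access).

FIXED_COLUMNS = {"Name", "City", "State", "Zip", "Phone"}


def to_map_key(column_name):
    """Turn a dynamic column name such as 'My Rating' into 'my_rating'."""
    return column_name.strip().lower().replace(" ", "_")


def fold_dynamic_columns(row, fixed=FIXED_COLUMNS, collection="Rating"):
    """Keep fixed columns as-is and gather all other columns into one map."""
    migrated = {name: value for name, value in row.items() if name in fixed}
    dynamic = {
        to_map_key(name): value
        for name, value in row.items()
        if name not in fixed
    }
    if dynamic:  # rows without dynamic columns get no map entry at all
        migrated[collection] = dynamic
    return migrated


old_row = {
    "Name": "567 Hotel",
    "City": "San Jose",
    "State": "CA",
    "Zip": 95113,
    "Phone": "124-756-1567",
    "Portal Rating": "Good",
    "My Rating": "Above average",
}
new_row = fold_dynamic_columns(old_row)
```

Because every existing row has to pass through a step like this, the approach implies a data migration, and the rewritten rows no longer have the shape that the old Thrift code expects.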


Make the table schema-less

Row key  column1    value         timestamp
ABC_SJ   Name       ABC Hotel     2005-10-30 T 10:45
         City       San Jose      2005-10-30 T 10:46
         State      CA            2005-10-30 T 10:47
         Zip        95113         2005-10-30 T 10:48
         Phone      124-543-2244  2011-04-16 T 08:15
         My Rating  Just average  2011-05-22 T 15:07

› Drop the entire column_metadata

› This makes it possible to read all the columns of the table (because now all the columns are dynamic).
  – In the absence of column_metadata, CQL makes use of an internal column called 'column1'
  – column1 holds the names of all the columns in the table

update column family hotels
  with key_validation_class = UTF8Type
  and comparator = UTF8Type
  and column_metadata = [];
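
A rough picture of what a CQL reader sees after the metadata is dropped: each stored cell surfaces as its own row of (key, column1, value). The Python sketch below models only that reshaping on a plain dictionary; `schemaless_view` is a hypothetical helper, and the sort assumes ASCII column names so that Python string order matches the UTF8Type comparator.

```python
# Toy model of the schema-less view: one output tuple per stored cell.

def schemaless_view(key, row):
    """Flatten one stored row into (key, column1, value) tuples."""
    # Cells come back ordered by column name, mirroring the comparator.
    return [(key, column1, value) for column1, value in sorted(row.items())]


stored_row = {
    "Name": "ABC Hotel",
    "City": "San Jose",
    "State": "CA",
    "Zip": 95113,
    "Phone": "124-543-2244",
    "My Rating": "Just average",
}
rows = schemaless_view("ABC_SJ", stored_row)
```

In this view the dynamic `My Rating` cell is no different from the formerly fixed ones: it is just another tuple whose `column1` happens to be `My Rating`. The trade-off is that the application now has to reassemble a hotel record from several such tuples.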


Schema-less table: advantage

› Usually, an application performs multiple operations on a table, such as:
  – Adding new data records
  – Seeking a data record and updating it
  – Updating data records in bulk (based on a criterion)
  – Reading data records and generating a report

› Good news
  – APIs that work with the Thrift interface will continue working even without the metadata
  – This gives development the flexibility to migrate functionality from Thrift to CQL one piece at a time


Schema-less table: advantage

[Diagram: four application functions (Add new hotel, Update hotel record, Add a rating to hotel, Report generation) sit above Cassandra. Each function can reach Cassandra either through Hector APIs over the Thrift interface or through CQL APIs over the CQL interface.]
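
One way to organise such a function-by-function migration is a small routing table that records which interface each application function currently uses. The Python sketch below is an assumption about how this could be wired, not something shown in the talk: the handler functions are stubs that only report the route taken, and real code would call Hector or a CQL driver in their place.

```python
# Hypothetical stand-ins for the two client paths. Real code would call
# Hector (Thrift) or a CQL driver here; these stubs only report the route.

def add_hotel_via_thrift(hotel_key):
    return ("thrift", "add_hotel", hotel_key)


def add_hotel_via_cql(hotel_key):
    return ("cql", "add_hotel", hotel_key)


def report_via_thrift():
    return ("thrift", "report", None)


def report_via_cql():
    return ("cql", "report", None)


HANDLERS = {
    "add_hotel": {"thrift": add_hotel_via_thrift, "cql": add_hotel_via_cql},
    "report": {"thrift": report_via_thrift, "cql": report_via_cql},
}

# Migration switchboard: flip one entry at a time as each function is ported.
USE_CQL = {"add_hotel": True, "report": False}


def dispatch(function_name, *args):
    """Route one application function to the interface it currently uses."""
    interface = "cql" if USE_CQL.get(function_name, False) else "thrift"
    return HANDLERS[function_name][interface](*args)
```

With this layout, `dispatch("add_hotel", "ABC_SJ")` already goes through the CQL path while `dispatch("report")` stays on Thrift, and moving the report later is a one-line change in `USE_CQL`.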


Comparison

› Schema with collections
  – Requires data migration
  – Once the schema is changed, all functions must be migrated in one go

› Schema-less
  – No need for data migration
  – Application functions can be migrated in multiple phases

Wait!
  – Don't jump to schema-less just yet. The decision is not that easy.
  – There is one more dimension to evaluate: performance


Performance comparison

Cassandra: 2.0.14

› By moving to CQL
  – There is a marginal improvement in performance for key-based read queries and write operations
  – But there is a major performance drop in the full-table scan scenario with 'Schema with collections'

[Charts: data size 1 million records; data size 50 million records]


“I didn't come here to tell you how this is going to end. I came here to tell you how it's going to begin.”

So, when you move from Thrift to CQL while on Cassandra v 2.0.x, you don't get a significant performance benefit.

The performance benefit comes when you upgrade Cassandra.


Performance across versions

Data size: 50 million records
Scenario: full-table scan
Spark version: 1.2

Schema-less

• At v 2.0.14, schema-less gives the same performance
• But from v 2.1 onwards, its performance drops

Schema with collections

• At v 2.0.14, schema with collections gives almost half the performance
• But from v 2.1 onwards, its performance is always better than schema-less


Summary

› Moving from Thrift to CQL is important for the lifecycle management of a solution

› CQL poses a challenge when both fixed and dynamic columns are present

› There are two approaches for moving to CQL
  – Schema-less
    › Doesn't require data migration. Hence, data remains compatible with Thrift APIs
    › Better performance with Cassandra v 2.0.14
  – Schema with collections
    › Requires data migration. Hence, data is no longer compatible with Thrift APIs
    › Better performance with Cassandra v 2.1.13 and higher

› Performance of CQL (schema with collections) improves with each Cassandra version upgrade and becomes significantly higher after upgrading to Cassandra 3.x
