Pushing Cassandra’s Boundaries
Darshan Rawal, VP Engineering, Openwave Messaging Inc.

C* Summit 2013: (Re)-Building the Social Grid for Global Telcos @ 1/10th the Market Cost by Darshan Rawal

DESCRIPTION

Darshan Rawal is leading the development of hybrid cloud-based messaging products for global Tier 1 telcos. Darshan has been working in Silicon Valley since 2000, building nimble, cost-effective products and services that handle millions of users and billions of transactions per day. Before Openwave Messaging, Darshan held engineering positions at SS8 Networks, Yahoo, DE Shaw, and yp.com, and he holds an M.S. in Software Engineering from Carnegie Mellon University.

© 2013 Openwave Messaging | Confidential #Cassandra13

Agenda

• Introduction
• Our Cassandra Journey
• Spectrum of BIG Data challenges
• Cassandra Pivots
• Typical Cassandra Instance YoY change
• Cassandra Insights
• Conclusion

Openwave Messaging Customers

Universal Messaging Suite

Our Cassandra Journey – 3.5 years

Cassandra Under Fire - A Story

• Customer Emergency
  • Where: Major North American OWM customer
  • When: Q4 2012
  • What: File system corruption in legacy platform
  • Impact: All (~800K) accounts without mail access
• Resolution: A lab system goes live
• Metrics:
  • 20 minutes to upgrade RAM per Cassandra node
  • Maintenance/compaction ran wild; solved via SSDs
  • 100% uptime

Spectrum of BIG Data Challenges

Cassandra Pivots

Atomic Batches – Client Side Impact

Cassandra 1.1.x
  getConnection()
  batch_mutate(…)
  freeConnection()
  getConnection()
  batch_mutate(…)
  freeConnection()
  getConnection()
  batch_mutate(…)
  freeConnection()

Application Optimization
  getConnection()
  batch_mutate(…)
  batch_mutate(…)
  batch_mutate(…)
  freeConnection()

Cassandra 1.2.x
  prepare_batch()
  getConnection()
  atomic_batch_mutate(…)
  freeConnection()
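The three call sequences on this slide can be sketched in code to make the client-side difference concrete. The sketch below is illustrative only: `FakePool` and `FakeConnection` are hypothetical in-memory stand-ins (not a real Cassandra client), the function names are invented for this example, and the pairing of each sequence with its slide label is a reading of the flattened diagram rather than something the source states outright. It counts how many times a connection is checked out in each pattern.

```python
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical connection that only records the calls made on it."""

    def __init__(self, log):
        self.log = log

    def batch_mutate(self, mutations):
        self.log.append(("batch_mutate", len(mutations)))

    def atomic_batch_mutate(self, mutations):
        self.log.append(("atomic_batch_mutate", len(mutations)))


class FakePool:
    """Hypothetical pool that counts getConnection()/freeConnection() pairs."""

    def __init__(self):
        self.checkouts = 0
        self.log = []

    @contextmanager
    def connection(self):
        self.checkouts += 1            # getConnection()
        try:
            yield FakeConnection(self.log)
        finally:
            pass                       # freeConnection() would go here


def connection_per_batch(pool, batches):
    """One checkout per batch_mutate call (read here as the 1.1.x pattern)."""
    for batch in batches:
        with pool.connection() as conn:
            conn.batch_mutate(batch)


def shared_connection(pool, batches):
    """One checkout reused for every batch (the application optimization)."""
    with pool.connection() as conn:
        for batch in batches:
            conn.batch_mutate(batch)


def single_atomic_batch(pool, batches):
    """Merge client-side first, then send one atomic batch (1.2.x pattern)."""
    merged = [m for batch in batches for m in batch]   # prepare_batch()
    with pool.connection() as conn:
        conn.atomic_batch_mutate(merged)
```

With three batches as input, the first pattern checks out a connection three times, while the other two check one out once; the last also collapses three server calls into one.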

Typical Cassandra Instance - YoY Change

Cassandra Journey Insights

• It’s a new paradigm; it will take time and investment
• There is no free lunch; cool features have a price
• Sizing is all about IOPS, and not all IOPS are equal
• Eventual consistency is a double-edged sword
• Adapt paradigms that don’t fit up front

Cassandra Insights

Aspect               Insight
Replication Factor   Ratio of RF / ring size plays a crucial role in throughput. Linear growth as the ratio shrinks.
Tombstones           Needs effective tuning for delete-heavy applications. Refactor application-level soft deletes.
Sizing               Plan for the perfect storm: compaction + N failures + recovery (especially for dense deployments).
Reliable Counters    Utilize client-side affinity.
Super Cols           Best avoided.
Client Interaction   Thundering-herd issues due to backend GC.
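The replication-factor insight can be illustrated with a back-of-the-envelope calculation. The sketch below is not the presenter's model. It assumes evenly distributed data, so that each client write lands on RF of the N nodes, and it ignores coordinator overhead, consistency level, and compaction. The function names and the 1,000 writes/s per-node figure are hypothetical.

```python
def replica_share(rf: int, ring_size: int) -> float:
    """Fraction of all client writes that each node must absorb (RF / N)."""
    if not 1 <= rf <= ring_size:
        raise ValueError("rf must be between 1 and ring_size")
    return rf / ring_size


def estimated_cluster_writes(per_node_writes_per_sec: float,
                             rf: int, ring_size: int) -> float:
    """Client writes/s the ring can accept before a node saturates.

    Every client write costs RF replica writes, so the ceiling is
    per-node capacity divided by the RF / ring-size ratio.
    """
    return per_node_writes_per_sec / replica_share(rf, ring_size)


# Hypothetical numbers: each node sustains 1,000 replica writes/s.
small_ring = estimated_cluster_writes(1000, rf=3, ring_size=6)
large_ring = estimated_cluster_writes(1000, rf=3, ring_size=12)
```

Under these assumptions, doubling the ring from 6 to 12 nodes at RF=3 halves the ratio from 0.5 to 0.25 and doubles the estimated client write ceiling, which is one way to read "linear growth as the ratio shrinks".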

In Retrospect

Current challenges @ Openwave Messaging