20
Anuj Wadehra Architect & Cassandra SME Ericsson R & D Support APACHE Cassandra in Production

Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Anuj Wadehra Architect & Cassandra SME Ericsson R & D

Support APACHE Cassandra in Production

Page 2: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 2

› Building In-House Support › Case Study › Challenges › Best Practices › How We Fixed It

AGENDA

Page 3: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

BUILDING IN-HOUSE SUPPORT

Page 4: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 4

Define Scope – 24*7 On-Call Support – Hot Fixes – Consultancy – Community Contributions

Identify Vision – Faster & Predictable

Turnaround Time – Cost-Effectiveness – Open Source Contributions

BUILDING IN-HOUSE SUPPORT

Page 5: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 5

– How to learn?

› Documentation › Trainings › Community › Product issues › PoCs

Build Competence

– What to learn ? › Cassandra › Programming › Troubleshooting › Product

BUILDING IN-HOUSE SUPPORT CONTD..

Page 6: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 6

Setup Infrastructure – Continuous Integration setup – Test Lab for reproducing defects – R&D Lab for PoCs and new Feature Evaluation – Collaboration tools

› Wiki › Mailing Lists › Forums etc.

BUILDING IN-HOUSE SUPPORT CONTD..

Page 7: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 7

Cassandra In-House Support X Technologies is a product based organization which has decided to replace RDBMS with C* in 3 product offerings. The decision is expected to positively impact scalability, fault-tolerance & availability of these products. Moreover, the organization aims to cut its expenditure on licensing and expensive RDBMS support.

CASE STUDY

Page 8: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 8

CASE STUDY CONTD..

Proposed solution:

Page 9: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Challenges

Page 10: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 10

– Selling NoSQL concepts › Why Repair ? › How deleting data can

increase disk space ? › How RF determines fault-

tolerance ?

Our Biggest Challenges – Keeping pace with Releases – Support for EOL versions – Insufficient/outdated docs – Lengthy upgrades – Compaction strategies

CHALLENGES

Page 11: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 11

– Be Updated

› Mailing Lists › JIRAs

Addressing Challenges – Stability over fancy features – Upgrades aligned with EOL – Well tested Compaction

Strategies – CI Infrastructure for Patches – Well documented

› NoSql concepts › Operational procedures

CHALLENGES CONTD..

Page 12: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 12

Best PRACTICES C* Operations

Page 13: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 13

› Monitoring – Repair – Dropped Mutations – All time blocked – Read/Write Latencies – GC Pauses – Disk Usage – Upgrade sstables

› Daily Maintenance – Backup – Clearing Snapshots

› Repair – Be Selective – Repair once:

› Primary Range › Incremental

BEST PRACTICES

Page 14: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 14

› Security – SSL for JMX – Encryption

› Automate – Scaling/Down Sizing – Multi-DC setup – Upgrades – Backup – Restoration – Clearing Snapshots – Repairs

BEST PRACTICES

Page 15: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 15

HOW WE FIXED IT C* Issues & Solutions

Page 16: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 16

› GC Pauses – Faster Memtable flushing – Aggressive compactions – JVM Tuning

› Slow Table Scans – Spark

HOW WE FIXED IT&

Page 17: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 17

› Hung Repair – Firewall timeout – TCP Keep alive

› Disk Crunch – Aggressive compaction – LCS

HOW WE FIXED IT&

Page 18: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

Support Apache Cassandra in Production | Public | © Ericsson AB 2017 | 2017-04-30 | Page 18

› Performance & OOM – Disable THP/Zone Reclaim – Max map count

› Wide Rows – Divide into buckets – Add buckets when needed

HOW WE FIXED IT&

Page 19: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production

QUESTIONS ?

Page 20: Support Apache Cassandra in Production · Anuj Wadehra . Architect & Cassandra SME . Ericsson R & D . Support APACHE Cassandra in Production