7/24/2019 WP DataStax Enterprise Best Practices http://slidepdf.com/reader/full/wp-datastax-enterprise-best-practices 1/16 Best Practices in Deploying and Managing DataStax Enterprise 


Best Practices in Deploying and Managing DataStax Enterprise


Table of Contents

Abstract
DataStax Enterprise: The Fastest, Most Scalable Distributed Database Technology for the Internet Enterprise
Best Practices for Apache Cassandra
    Foundational Recommendations
    Data Modeling Recommendations
    Hardware Storage Recommendations
    Secondary Index
    Configuration Parameter Recommendations
    General Deployment Considerations
    Selecting a Compaction Strategy
    Benchmarking and Assessing Throughput Capacity
    Recommended Monitoring Practices
    DSE Performance Service
    Nodetool
DSE Search: Best Practices
    General Recommendations for Search Nodes
    Sizing Determinations
    Schema Recommendations
    Monitoring and Troubleshooting Recommendations for Search
DSE Analytics: Best Practices
    General Recommendations for Analytics Nodes
    Monitoring and Tuning Recommendations for Analytics
Security: Best Practices
Multi-Data Center and Cloud: Best Practices
    General Recommendations
    Multi-Data Center and Geographic Considerations
    Cloud Considerations
Backup and Restore: Best Practices
    Backup Recommendations
    Restore Considerations
Software Upgrades: Best Practices
About DataStax


 Abstract

This document provides best practice guidelines for designing, deploying, and managing DataStax Enterprise (DSE) database clusters. Although each application is different and no one-size-fits-all set of recommendations can be applied to every situation, the information contained in this paper summarizes techniques proven to be effective for several DataStax customers who have deployed DataStax Enterprise in large production environments.

This paper does not replace the detailed information found in the DataStax online documentation. For help with general installation, configuration, or other similar tasks, please refer to the most recent documentation for your specific DSE release. Note that information and recommendations for hardware selection, sizing, and architecture design are contained in the DataStax Enterprise Reference Architecture white paper.

DataStax Enterprise: The Fastest, Most Scalable Distributed Database Technology for the Internet Enterprise

DataStax Enterprise (DSE), built on Apache Cassandra™, delivers what Internet Enterprises need to compete in today's high-speed, always-on data economy. With in-memory computing capabilities, enterprise-level security, fast and powerful integrated analytics and enterprise search, visual management, and expert support, DataStax Enterprise is the leading distributed database choice for online applications that require fast performance with no downtime. DSE also includes DataStax OpsCenter, a visual management and monitoring tool for DSE clusters, as well as 24x7x365 expert support and services.

Best Practices for Apache Cassandra

Apache Cassandra is an always-on, massively scalable open source NoSQL database designed to deliver continuous uptime, linear scale performance, and operational simplicity for modern online applications. Cassandra's distributed database capabilities and masterless architecture allow it to easily span multiple data centers and cloud availability zones, which means customers can write and read data anywhere and be guaranteed very fast performance.

Foundational Recommendations 

Success experienced by the vast majority of Cassandra users can be attributed to:

1. A proper data model
2. An appropriate hardware storage infrastructure

With Cassandra, modeling data as you would with an RDBMS, or selecting a SAN or other shared storage device for your data instead of non-shared storage on multiple nodes, greatly increases the chances of failure. In general, in managing a Cassandra cluster, you are continually validating or invalidating its data model and storage architecture selections.

Data Modeling Recommendations

Cassandra is a wide-row store NoSQL database that provides a rich and flexible data model which easily supports modern data types. The importance of a proper data model for an application cannot be overemphasized: success for a new system is nearly always possible if data is modeled correctly and the proper storage hardware infrastructure is selected.

A running instance of Cassandra can have one or more keyspaces within it, which are akin to a database in RDBMSs such as MySQL and Microsoft SQL Server. Typically, a Cassandra cluster has one keyspace per application. A keyspace can contain one or more column families, or tables, which are somewhat similar to an RDBMS table but much more flexible.

A typical best practice is to rarely, if ever, model your data in Cassandra as you have modeled it in an RDBMS. A relational engine is designed for very normalized tables that minimize the number of columns used, whereas the exact reverse is true in Cassandra, where the best practice is to heavily denormalize your data and have tables with many columns.

"The data gathered by NREL comes in different formats, at different rates, from a wide variety of sensors, meters, and control networks. DataStax aligns it within one scalable database."

— Keith Searight, NREL

Each table requires a primary key, as shown below in Figure 1. The primary key is made up of two parts: the partition key and the clustering columns.

Figure 1 – A keyspace housing two tables

Keep in mind that a column value can never exceed 2GB, with a recommended size per column value of no more than single-digit megabytes. Also, a row key cannot exceed the maximum of 64KB.
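As a sketch of the two-part primary key described above (the table and column names are hypothetical, not taken from this paper):

```sql
-- Hypothetical CQL table: sensor_id is the partition key and
-- reading_time the clustering column; together they form the primary key.
CREATE TABLE sensor_readings (
    sensor_id    uuid,
    reading_time timestamp,
    temperature  double,
    humidity     double,
    PRIMARY KEY (sensor_id, reading_time)
);
```

All rows sharing a sensor_id land in the same partition, sorted by reading_time, so a query for one sensor's readings is satisfied from a single partition.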

While RDBMSs are designed in a typical Codd-Date fashion of creating entities, relationships, and attributes that are highly normalized per table, the Cassandra data model differs in that it is based on the application's query needs and user traffic patterns. This oftentimes means that some tables may have redundancies. Cassandra's built-in data compression manages this redundancy to minimize storage overhead and incur no performance penalty.

Cassandra performs best when the data needed to satisfy a given query is located in the same partition. The data model can be planned so that one row in a single table is used to answer each query. Denormalization, via the use of many columns versus rows, helps each query perform as a single read. This sacrifices disk space in order to reduce the number of disk seeks and the amount of network traffic.

Data modeling is a high priority for a Cassandra-based system; hence, DataStax has created a special set of visual tutorials designed to help you understand how modeling data in Cassandra differs from modeling data in an RDBMS. It is recommended that you visit the DataStax data modeling web page to review each tutorial.

Hardware Storage Recommendations

DataStax recommends using SSDs for Cassandra, as they provide extremely low-latency response times for random reads while supplying ample sequential write performance for compaction operations.

SSDs are usually not configured optimally by default in the majority of Linux distributions.

The following steps ensure best practice settings for SSDs:

•  Ensure that the SysFS rotational flag is set to false (zero) as the first step. This overrides any detection the operating system (OS) might do, ensuring the drive is definitely treated as an SSD by the OS. Do the same for any block devices created from SSD storage, such as mdarrays.

•  Set the IO scheduler to either deadline  ornoop. The noop scheduler is an appropriatechoice if the target block device is an arrayof SSDs behind a high-end IO controller thatwill do its own IO optimization. The deadlinescheduler will optimize requests to minimizeIO latency. If in doubt, use the deadlinescheduler.

•  Set the read-ahead value for the block device to 0 (or 8 for slower SSDs). This tells the OS not to read extra bytes, which can increase the time an IO requires and also pollutes the cache with bytes that weren't actually requested by the user.

These settings can be implemented in an /etc/rc.local file so that they are reapplied on every boot.
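The original paper's example table did not survive conversion; the following is a sketch of equivalent /etc/rc.local commands implementing the three steps above (the device name sda is a placeholder for your SSD):

```shell
# Mark the device as non-rotational so the OS definitely treats it as an SSD
echo 0 > /sys/block/sda/queue/rotational

# Use the deadline IO scheduler (or noop behind a high-end IO controller)
echo deadline > /sys/block/sda/queue/scheduler

# Disable read-ahead (use 8 instead of 0 for slower SSDs)
blockdev --setra 0 /dev/sda
```

Repeat the rotational and scheduler settings for any md arrays built from SSDs.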

For more information on proper storage selection, please refer to the DataStax Reference Architecture paper.

Secondary Index

It is best to avoid using Cassandra's built-in secondary indexes where possible. Instead, it is recommended to denormalize data and manually maintain a dynamic table as a form of index rather than using a secondary index. If and when secondary indexes are used, they should be created only on columns containing low-cardinality data (for example, fields with fewer than 1,000 states).


Configuration Parameter Recommendations

This section contains best practice advice for some of the most important Cassandra configuration parameters.

Data Distribution Parameters

Data distribution inside a Cassandra database cluster is determined by the type of partitioner selected. There are different types of partitioners in Cassandra that can be configured via Cassandra's configuration file, cassandra.yaml.

The Murmur3Partitioner   and RandomPartitioner  partitioners uniformly distribute data to each nodeacross the cluster. Read and write requests to thecluster are evenly distributed while using thesepartitioners. Load balancing is further simplified aseach part of the hash range receives an equalnumber of rows on average. The Murmur3Partitioner  is the default partitioner in Cassandra 1.2 andabove. It provides faster hashing and betterperformance compared to RandomPartitioner and isrecommended for your DSE deployments. 

The ByteOrderedPartitioner keeps an ordered distribution of data lexically by key bytes and is not recommended except in special circumstances. It is only useful for key range queries, but note that you will lose the key benefits of load balancing and even distribution of data, so much more manual maintenance may be necessary.

To help automatically maintain a cluster's data distribution balance when nodes are added to or removed from a cluster, a feature called virtual nodes, or vnodes, was introduced in Cassandra 1.2. Enabling vnodes allows each node to own a large number of small data ranges (defined by tokens) distributed throughout the cluster. The recommended value for the vnodes configuration parameter (num_tokens) is 256, and it must be set before starting a cluster for the first time.

Virtual nodes are recommended because they save the time and effort of manually generating and assigning tokens to nodes. Vnodes are enabled by default in Cassandra 2.0 and higher versions. Vnodes should not be used if your DSE cluster contains nodes devoted to analytics (integrated/external Hadoop) or enterprise search, as those node types do not perform well with vnodes.
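In cassandra.yaml, the two settings discussed above look roughly like this (a sketch; num_tokens must be in place before the node starts for the first time):

```yaml
# Default and recommended partitioner in Cassandra 1.2 and above
partitioner: org.apache.cassandra.dht.Murmur3Partitioner

# Enable vnodes with the recommended token count; leave vnodes disabled
# on Solr or analytics nodes, which do not perform well with them
num_tokens: 256
```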

Memory Parameters

General Memory Recommendations

JVM heap settings are a potential source of trouble for Cassandra and DSE deployments. Administrators may be tempted to use large JVM heaps when DRAM is plentiful on the available nodes, but this can lead to serious problems. The JVM uses a garbage collector (GC) to free up unused memory. Large heaps can introduce GC pauses that lead to latency, or even make a Cassandra node appear to have gone offline. Proper heap settings can minimize the impact of GC in the JVM. The minimum recommendation for DRAM is 16GB with the heap set to 8GB, but note that the minimum recommended by DataStax for production deployments is 32GB of DRAM.

Data Caches

Cassandra supports two types of data caches: a key cache and a row cache. The key cache is essentially a cache of the primary key index data for a Cassandra table. It helps save CPU time and memory over relying solely on the OS page cache for key data. Enabling just the key cache will still result in disk (or OS page cache) activity to actually read the requested data rows.

It is best to aim for a hit rate of 95% when using thekey cache. If your hit rate is higher than 95%, thenthe cache might be too large. 

The row cache is more similar to a traditional cache in other databases: when a row is accessed, the entire row is pulled into memory (merged from multiple SSTables, if necessary) and cached so that further reads of that row can be satisfied without accessing disk. It is best used when you have a small subset of data to keep hot and you frequently need most or all of its columns returned. DataStax recommends not using the row cache unless your use case is more than 95% read-oriented.

Note that the row cache can decrease performancein some situations such as when you have tablesthat are not cacheable at all.

General Deployment Considerations 

The following sections contain recommendations forvarious Cassandra deployment scenarios.

Node Addition/Removal Practices

Cassandra provides linear scale performance with elastic expansion capabilities, so adding capacity to a database cluster is straightforward. General guidelines for adding new nodes to an existing cluster:

•  If you are not using virtual nodes (vnodes), set the initial_token value of the new node in the cassandra.yaml file and ensure it balances the ring without token conflicts. This step is not necessary with vnodes.
•  Change the cassandra.yaml setting auto_bootstrap to true.
•  Point the seed list of the new node to the existing seed list used within the cluster.
•  Set the listen_address and broadcast_address parameters to the appropriate addresses of the new node.
•  Start the new Cassandra node(s) two minutes apart. Upon bootstrapping, nodes will join the cluster and start accepting new writes, but not reads, while the data these nodes will be responsible for is streamed to them from the existing nodes.
•  Later, the cleanup command can be run on each node to remove stale data. This step is not immediately required.
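A sketch of the relevant cassandra.yaml entries for a new node (all addresses are placeholders):

```yaml
auto_bootstrap: true

# Point at the cluster's existing seed nodes, not at the new node itself
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"

listen_address: 10.0.0.5       # this node's own address
broadcast_address: 10.0.0.5

# Only when vnodes are not in use: assign a balanced, conflict-free token
# initial_token: <computed token>
```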

In the rare instance that existing nodes are removed and the capacity of other systems is increased to handle the workload (scaling up versus scaling out), the same steps above are required. The only additional step is removing the old nodes with the decommission operation, one at a time, to lessen the impact of the operations on the cluster. When the decommission command is called on a node, the node streams all data for which it is responsible to the new nodes that are taking over responsibility.

Avoiding Downtime by Using Rack Awareness

Cassandra provides the ability to distribute and replicate data among different hardware racks, which helps keep a database cluster online should a particular rack fail while another stays operational. While defining one rack for an entire cluster is the simplest and most common implementation, there are times when data replication between different racks is required to help ensure continuous database availability.

A rack-aware snitch ensures high availability within a single data center. A snitch is configured at the cluster level and controls the placement of replicas within the cluster.

With the use of a rack-aware snitch, Cassandra gains knowledge of the physical location of each node within the cluster. Cassandra leverages this information to ensure replicas of each row exist in multiple server racks, so that rack loss due to fire, mechanical problems, or routine maintenance does not impact availability. Please refer to the DataStax documentation for a list of rack-aware snitches and their configuration information.

The following are some general principles that canbe followed for using multiple racks: 

•  Designate nodes in an alternating and equal pattern across your data centers. Doing so allows you to realize the benefits of Cassandra's rack feature while allowing for quick and fully functional expansions.
•  Once a cluster is deployed, you can swap nodes and make the appropriate moves to ensure that nodes are placed in the ring in an alternating fashion with respect to the racks, as shown in Figure 2 below.

Figure 2 – Multi-data center deployments with multiple racks

Selecting a Compaction Strategy

Compaction is a process in Cassandra whereby multiple data files (SSTables) are combined to improve the performance of partition scans and to reclaim space from deleted data. Compaction issues can degrade Cassandra's performance, so the selection of a proper compaction strategy is important. When considering a compaction strategy, the general rule of thumb is that size-tiered compaction (the default in Cassandra 2.0 and above) is good when write performance is of primary importance, or when rows are always written entirely at once and are never updated. This compaction strategy triggers a minor compaction whenever there are a number of similarly sized SSTables on disk. Read workloads can suffer under it due to factors such as compaction falling behind and multiple compactions running simultaneously.

A leveled compaction strategy is worth considering when read performance is of primary importance, and it is also the recommended strategy if you are using SSDs. It is likewise useful if the following conditions are present in your database cluster:


•  A high read/write ratio
•  Rows are frequently updated
•  Heavy delete workloads, or TTL columns in wide rows

More information on compaction strategies can be found in a technical article on the DataStax Tech Blog.

Benchmarking and Assessing Throughput Capacity

The approach taken to benchmarking and testing the capacity of a new database cluster will depend on the type of application it is targeting. Benchmarks such as YCSB and the Cassandra stress tool can be utilized to gain a general idea of how well your initially configured cluster will perform.

When first beginning to work with and test DSE, ageneral approach is to: 

•  Run a sample workload against 3 nodes.
•  Note the point at which throughput maxes out and latency becomes an issue for the application.
•  Run the same workload against 6 nodes in order to get a second interpolation point.

Since Cassandra scales linearly, assessing the improvements is typically quite easy (the transactions-per-second rate doubles, etc.). For examples of benchmarking DataStax Enterprise using YCSB, either standalone or in comparison to other NoSQL databases, see the DataStax Benchmarking Top NoSQL Databases white paper. For recommendations on how not to benchmark Cassandra, see this article on the DataStax Tech Blog.
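As a rough sketch of that interpolation (the stress invocation, host names, and throughput figure below are illustrative assumptions, not numbers from this paper):

```shell
# Measure throughput with the bundled stress tool, e.g. on the 3-node run:
#   cassandra-stress write n=1000000 -node node1,node2,node3
# Then, since Cassandra scales roughly linearly, extrapolate the measured
# per-node rate to the larger cluster:
measured_ops=30000     # ops/sec observed on the 3-node run (example figure)
measured_nodes=3
target_nodes=6
echo $(( measured_ops / measured_nodes * target_nodes ))   # 60000
```

Compare the extrapolated figure against the second measured run; a large shortfall points at a bottleneck (disk, network, or data model) rather than raw node count.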

Recommended Monitoring Practices

This section contains general tips for monitoring and troubleshooting Cassandra performance.

Key Monitoring Tools and Metrics

The primary tools recommended for monitoring Cassandra performance are:

1. CQL Performance Objects
2. DataStax OpsCenter
3. The nodetool utility's tpstats and cfhistograms options

DSE Performance Service

DataStax Enterprise 4.5 and beyond includes a new performance data dictionary containing various diagnostic objects that can be queried from any CQL-based utility. It provides granular details on user activities and other statistics that help DBAs and operations teams monitor database performance and respond to issues immediately.

There are a number of settings in dse.yaml thatneed to be enabled for the collection of specificstatistics, with the objects being stored in the“dse_perf” keyspace.

For example, enabling the CQL slow log will record in a table any CQL statement that takes longer than a set threshold. A best practice is to monitor this table for slow queries that affect performance, and to take the necessary steps to tune them for better performance.

In addition, there are various performance objects that collect system information, summary statistics, and histograms, and that track latency by user and object.

The "user_io" table, for example, tracks per-node read/write metrics, broken down by client connection and aggregated across all keyspaces and tables, and can be queried like any other table.
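The original paper's sample query was lost in conversion; a minimal equivalent against the performance objects described above might look like this (the relevant collection settings must first be enabled in dse.yaml):

```sql
-- Per-node read/write metrics, broken down by client connection
SELECT * FROM dse_perf.user_io;
```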

DataStax OpsCenter enables the visual management of key metrics and the setting of proactive alerts that notify the operator when performance on a database cluster is beginning to degrade. With OpsCenter, administrators can drill into a node and graphically view information from utilities like tpstats.

Nodetool

While several Cassandra statistics are available, those contained in the tpstats (thread pool status) and cfhistograms utilities provide the shortest route to determining whether a database cluster is performing well. For detailed information on how to use both utilities, please see the online documentation for the version of Cassandra you are using.

When reviewing the output of both utilities, general monitoring rules of thumb to keep in mind include the following:

For tpstats:

•  Any blocks seen for FlushWriter indicate that the disk is falling behind on writes, and that Cassandra is trying to flush memtable data to disk but cannot.

For cfhistograms: 

•  Check how many SSTable seeks were required per read (the first column in the output). Larger numbers after 1 in the Offset column indicate potential read I/O slowdowns, because more SSTables have to be accessed to satisfy a read request.
•  Large numbers of operations far down the read and write latency columns (the Offset column for these statistics is in microseconds, so large Offset values with many operations are usually bad) indicate read and/or write bottlenecks.

These and other key metrics can be monitored visually in OpsCenter. In general, the following characteristics indicate degrading performance for a node or cluster:

•  Increasing number of blocked flush writers
•  Many requests requiring multiple SSTable seeks (as opposed to just one)
•  Increasing number of pending compactions
•  Increasing number and duration of garbage collection operations
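The two nodetool checks above can be run from any node as follows (the keyspace and table names are placeholders):

```shell
# Thread pool statistics: look for non-zero "Blocked" counts,
# especially for FlushWriter
nodetool tpstats

# Per-table histograms: SSTables-per-read counts and read/write latencies
nodetool cfhistograms my_keyspace my_table
```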

Using OpsCenter's built-in auto-failover option (available in version 5.1) is recommended where monitoring and managing critical data without any interruption is a requirement. The auto-failover option enables users to run multiple instances of OpsCenter in an active-passive setup (only one OpsCenter can be active at a time). This addresses HA needs by automatically failing over to the standby OpsCenter should a failure occur in the primary. The failover is transparent to the user, with zero interruption to most services.

Monitoring and Tuning Garbage Collection

The JVM uses a garbage collector (GC) process to free up unused memory. Very large heaps can introduce GC pauses that increase wait times or even make a Cassandra node look like it is down. Additionally, the JVM can throw out-of-memory exceptions due to the heap being too small, or due to GC not being able to keep up with the workload and free memory quickly enough for reuse by Cassandra.

Basic recommendations for monitoring/tuninggarbage collection:

1. Ensure the JVM heap has the correct setting, which in most cases equates to 8GB of RAM.
2. Edit cassandra-env.sh and set the heap_newsize parameter to a minimum of 100MB * number of cores (200MB per core is becoming common).
3. Set up a method to monitor a number of key GC statistics, which can easily be done in DataStax OpsCenter. These statistics include JVM ParNew collection count, JVM CMS collection time, and JVM ParNew collection time.

Tuning involves adjusting the heap_newsize parameter in the following way:

•  If GC collection/pause times are a problem, reduce heap_newsize.
•  If GC events are too frequent and/or CMS collection time is too expensive, increase heap_newsize.

DSE Search: Best Practices

DataStax Enterprise includes an integratedenterprise search component powered by ApacheSolr. This allows you to easily search your line-of-business data stored in Cassandra.

General Recommendations for Search Nodes

Several Cassandra best practices can be applied to nodes in a DSE cluster devoted to search. Key differences for properly configuring and deploying DSE Search/Solr nodes include:

•  Do not enable virtual nodes (vnodes) for a cluster containing Solr nodes.
•  If using traditional spinning disks, set up at least 4 volumes with a set of dedicated heads for the operating system, commit log, SSTables, and Solr data. SSDs are strongly encouraged over spinning disks.


•  Set the JVM heap size to 14GB.
•  Set the heap_newsize parameter to 200MB per CPU core, with a maximum not to exceed 1600MB.
•  Disable the Cassandra row cache, and leave the Cassandra key cache at its default value.
•  Set the Solr soft autocommit max time to 10s.
•  DSE 3.1 and higher uses a custom per-segment filter implementation that uses the filter cache. The filter cache is the only meaningful cache for Solr nodes; all others should be disabled. Start with a filter cache setting of 128. Note that 512 is often too big for a 14GB heap and will cause GC pressure. Don't enable auto-warming unless you have frequently used filters.
•  Ensure the document and query result caches are disabled.
•  Set the merge factor to 10. This is a good compromise value for most situations. For read-only use cases this value may be lowered to 5 or 2, while for write-heavy or balanced use cases leave it at 10.
•  Ensure that the Lucene version is set to the most recent version supported by the release.
•  Set the DSE type mapping to the most recent version supported by the release (currently 1). If this value is changed, the Solr metadata must be removed and Solr must be re-enabled. It is recommended that this value remain unchanged unless absolutely necessary.

Sizing Determinations

To determine whether a proposed DSE Search/Solr configuration is sized properly, run through the following steps:

!  Install DSE and create a cluster with a Solrnode.

!  Create a column family with the Solr schemaand configuration.

!  Load one thousand mock/sample records.!  Get the index size for the Solr core:

(example:

http://localhost:8983/solr/admin/cores?action=STATUS&memory=true).

!  Extrapolate from those numbers the indexsize for the expected total record count.

For example, if the index size for those one thousand records is 1GB, and you expect one million records, then the index size will be 1000GB. The database cluster must be large enough so that the total cluster memory can cache the total index size and the hot dataset, after subtracting the JVM heap and operating system overhead. Assume 1GB of memory for the operating system and 14GB of memory for the JVM heap, or an overhead of 15GB.

A useful sizing equation is:

((Nodes * Memory Per Node) - (15GB * Nodes)) / (Index Size * (Expected Rows / 1000))

If the value is less than 1, you can expect problems (e.g., you won't have enough memory to cache the index, let alone cache the rows; every query will hit the disk multiple times).
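The sizing equation above can be sketched as a small helper. This is an illustrative calculation only; the function and parameter names are not part of DSE:

```python
def search_sizing_ratio(nodes, memory_per_node_gb, index_size_per_1k_gb,
                        expected_rows, overhead_per_node_gb=15):
    """Ratio of usable cluster memory to the extrapolated Solr index size.

    overhead_per_node_gb = 14GB JVM heap + 1GB OS, per the text above.
    A result below 1 means the index cannot be fully cached in memory.
    """
    usable_memory = nodes * memory_per_node_gb - overhead_per_node_gb * nodes
    index_size = index_size_per_1k_gb * (expected_rows / 1000)
    return usable_memory / index_size

# 6 nodes with 64GB each, 1GB of index per 1,000 rows, 250,000 expected rows:
ratio = search_sizing_ratio(6, 64, 1.0, 250_000)
print(round(ratio, 2))  # (6*64 - 15*6) / 250 = 294 / 250 = 1.18 -> adequately sized
```

A value comfortably above 1 leaves headroom for the hot dataset in addition to the index itself.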

Schema Recommendations
General recommendations for schema design and practice:

•  The schema version is defined in the root node of the XML document. Avoid specifying a schema version if possible, and avoid specifying an older version, as it may result in undesirable behavior.

•  Avoid or limit the use of dynamic fields. Lucene allocates memory for each unique field (column) name, which means that if you have a row with columns A, B, C, and another row with D, E, Lucene will allocate 5 chunks of memory. Done for millions of rows, this makes it fairly easy to blow up your heap.

•  Instead of using dynamic fields, copy field contents using the CopyField directive, and perform queries against the combined field.

Monitoring and Troubleshooting Recommendations for Search
With Search/Solr response times, a general rule of thumb is that search execution times of 20ms or less are considered good. Anything less than 10ms is considered very good. Under heavy load with large datasets, it is not uncommon to see search latencies of 100ms to 200ms. The most common causes for slow queries include:

•  The disk/file system is laid out incorrectly.

•  There is not enough memory to cache the Solr index files and the hot dataset. To verify on Linux, check IOwait metrics in top/vmstat/iostat.

•  GC pressure, which can result in high latency variance.

•  Un-tuned queries.

Other troubleshooting advice for Solr includes the following:

•  If slow startup times are experienced, decrease the commit log and/or SSTable sizes.

•  If inconsistent query results are seen, this is likely due to inconsistent data across the nodes. Either script repair jobs or use the DSE automatic repair service to ensure data consistency across the cluster.

•  If manually running repair, and a high system load is experienced during repair operations, then repair is not being run often enough. Switch to using the DSE automatic repair service.

•  If dropped mutations are seen in the tpstats utility, then it is likely that the system load is too high for the configuration, and additional nodes should be added.

•  If read/socket timeouts are experienced, decrease the max_solr_concurrency_per_core parameter in the dse.yaml configuration file to 1 per CPU core.

DSE 4.6 includes a new performance data dictionary for Search containing various diagnostic objects that can be queried from any CQL-based utility. There are a number of settings in dse.yaml that need to be enabled for the collection of specific statistics, with the objects being stored in the "dse_perf" keyspace.

For example, enabling the Solr slow query log will record, in a table, any search statement that takes longer than the set threshold. A best practice is to monitor this table for any slow queries that affect performance and to tune for better performance.

In addition, there are various performance objects that collect query/update/commit/merge latencies, indexing errors, index stats, and so on.
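Monitoring the slow-query table can be automated. The sketch below assumes rows have already been fetched from the dse_perf keyspace with a CQL client; the 'command' and 'elapsed_ms' keys are illustrative stand-ins, not the actual dse_perf column names:

```python
def flag_slow_searches(rows, threshold_ms=100):
    """Return (query, elapsed_ms) pairs exceeding the threshold, slowest first.

    `rows` stands in for rows read from the slow-query table via a CQL
    driver; the dict keys here are assumptions for the sketch.
    """
    slow = [(r["command"], r["elapsed_ms"])
            for r in rows if r["elapsed_ms"] > threshold_ms]
    return sorted(slow, key=lambda pair: pair[1], reverse=True)

rows = [
    {"command": "q=*:*", "elapsed_ms": 250},
    {"command": "q=name:foo", "elapsed_ms": 12},
    {"command": "q=body:bar*", "elapsed_ms": 140},
]
print(flag_slow_searches(rows))  # [('q=*:*', 250), ('q=body:bar*', 140)]
```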

DSE Analytics: Best Practices

DataStax Enterprise delivers three options for running analytics on Cassandra data:

1. Integrated real-time analytics, including in-memory processing, enabled via certified Apache Spark. Included is a Hive-compatible SQL-like interface for Spark.

2. Integrated batch analytics using built-in MapReduce, Hive, Pig, Mahout, and Sqoop.

3. Integration with external Hadoop vendors (Cloudera and Hortonworks), merging operational information in Cassandra with historical information stored in Hadoop using Hive, Pig, etc.

General Recommendations for Analytics Nodes
To properly configure and deploy DSE-Analytics nodes, DataStax makes the following recommendations:

•  Enable vnodes while using Spark/Shark on analytics nodes.

•  Integrated and external Hadoop cannot both be run on the same node.

•  DSE's Spark Streaming capability takes executor slots just like any other Spark application. This means that batch Spark jobs cannot be run on the same Spark cluster that is running streaming unless you manually limit the resources both applications can use. Because of this limitation, it is advised to run streaming applications in a different datacenter (workload isolation) from the datacenter running batch jobs, with all nodes/DCs belonging to the same cluster.

•  The minimum number of streaming nodes is one, although it must have at least two executors (cores) available. If you want redundancy, you'll need a storage option like MEMORY_ONLY_2 that duplicates partitions, and at least two nodes for this type of configuration.

•  It is recommended not to enable vnodes on analytics nodes when using integrated Hadoop or external Hadoop, unless the data size is very large and increased latency is not a concern.

•  Always start nodes one at a time. The first node started is automatically selected as the JobTracker (JT).

•  Set up a reserve JobTracker on a node different from the primary JobTracker (use dsetool movejt for this). If the primary JT node goes down, it will automatically fail over to the reserve JT.

•  Verify settings in dse-mapred-default.xml and dse-core-default.xml. DSE tries its best to auto-detect hardware settings and adapt parameters appropriately, but they might need further tuning. Please refer to the DSE-Hadoop documentation for more information.

DSE Analytics Option          Data Size Range    Performance
Integrated Spark              Terabytes (TB)     High
External Hadoop Integration   Petabytes (PB)     Low
Integrated Hadoop             Terabytes (TB)     Low

•  Avoid putting millions of small files into the Cassandra File System (CFS). CFS is best optimized for storing file sizes of at least 64MB (1 block of default size).

•  Do not lower the consistency level on CFS from the default values. Using CL.ONE for writes or reads won't increase throughput, but may cause inconsistencies and may lead to incorrect results or job failures.

•  Ensure that map tasks are large enough. A single map task should run for at least a few seconds. If necessary, adjust the size of the splits (cassandra.input.split.size), which controls the number of rows read by a single map task. Too many small splits, and too-short map tasks, will cause task scheduling overhead that dominates other work.

•  Running small jobs that process tiny amounts of data, or setting cassandra.input.split.size too high, will spawn only a few mappers and may cause hot spots (e.g., one node executing more mappers than another).

Monitoring and Tuning Recommendations for Analytics
Analytics-enabled nodes (integrated Hadoop) include additional JobTracker and/or TaskTracker threads, and CFS operations can increase memtable usage for column families in the CFS keyspace. Monitor Cassandra JVM memory usage and adjust accordingly if low-memory conditions arise. Also, Map/Reduce tasks run as separate JVM processes, and you'll need to keep some memory in reserve for them as well.

DSE's Spark analytics is much lighter on memory than Hadoop, as it does not use a separate JVM per task. To fully leverage Spark's in-memory caching, allow at least 50% more memory (or even 2x more) for Spark alone than the size of the data you plan to cache. Monitor the heap size, as larger heaps can introduce GC pauses/issues that increase wait times.

Spark needs memory for heavy operators like joins.In case of joins, large amounts of RAM must beallocated to ensure one of the sides of the join fits inmemory. Additionally, if you plan to run severaldifferent Spark applications at the same time, thenallocate enough RAM for all of them.

Shark, a Hive-compatible system built on Spark, can store the result of a query in memory. Administrators must plan to allocate additional memory when this built-in caching is enabled.

DSE's new Spark Streaming can require high CPU or memory depending on the application workload. If the job requires a high amount of processing of a small amount of data, high CPU (16 cores or more) should be the goal. If the job requires holding a large amount of data in memory and high-performance analytics, then high memory (64GB or more) should be the aim. We also recommend a minimum of 8 cores on each node that runs DSE Spark Streaming.

One key thing to note about the Spark Streaming setup in DSE is that it requires a minimum of 2 executors to run streaming applications. It is also important to ensure that no growing backlog of batches waiting to be processed occurs; this can be seen in the Spark UI.

A general guideline is to monitor network bandwidth, because streaming that writes to Cassandra requires a significant amount of network I/O. To lessen this burden, add more nodes so the load can be distributed over more machines. Also monitor the "Processing Time" and "Scheduling Delay" metrics in the Spark web UI. The first metric is the time to process each batch of data, and the second is the time a batch waits in a queue for the processing of previous batches to finish. If the batch processing time is consistently more than the batch interval, and/or the queuing delay keeps increasing, the application is not able to process batches as fast as they are being generated and is falling behind. Reduce the batch processing time in these situations.
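The "falling behind" check described above can be expressed as a simple heuristic over samples of the two Spark UI metrics. This is an illustrative sketch, not part of DSE or Spark:

```python
def streaming_falling_behind(processing_times_s, scheduling_delays_s, batch_interval_s):
    """Heuristic from the guidance above: the app is falling behind if batch
    processing time consistently exceeds the batch interval, or the
    scheduling delay keeps increasing across samples."""
    over_interval = all(t > batch_interval_s for t in processing_times_s)
    delay_growing = all(b > a for a, b in
                        zip(scheduling_delays_s, scheduling_delays_s[1:]))
    return over_interval or delay_growing

# Batch interval of 10s; every batch takes 12-14s to process -> falling behind.
print(streaming_falling_behind([12, 13, 14], [1, 2, 3], 10))  # True
# Fast batches, stable delays -> healthy.
print(streaming_falling_behind([4, 5, 4], [0.2, 0.1, 0.3], 10))  # False
```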

DataStax Enterprise 3.1 and beyond include an auto-tuning capability for common integrated Hadoop configuration parameters for good out-of-the-box performance. Depending on the use case and workload, the following areas can be further tuned.

•  Set dfs.block.size to 128MB or 256MB. The larger the size of the data being processed, the larger the data block size should be. The number of map tasks depends on the input data size and the block size; a larger block size will result in fewer map tasks. A larger block size will also reduce the number of partitions stored in the CFS 'sblocks' column family, resulting in a smaller footprint for partition-dependent structures, e.g., key cache, bloom filter, index.

•  Set mapred.local.dir to a separate physical device (or devices) from the Cassandra data file and commit log directories. Map/Reduce tasks process a lot of data and are I/O intensive, so the location for this data should be isolated as much as possible, especially from the Cassandra commit log and data directories. A list of directories can be specified to further spread the I/O load.

•  Use 80% or higher for the mapred.reduce.slowstart.completed.maps value. With the default value, after 5% of the map tasks have completed, reduce tasks start to copy and shuffle intermediate output; however, the actual reduce won't start until the mappers have all completed, so a low value tends to occupy reducer slots. Higher values decrease the overlap between mappers and reducers and allow reducers to wait less before starting.

Auto-tuned parameters and other information can be found in this DataStax Technical blog article.

Security: Best Practices

Establishing an SSL connection when clients (CQL, Spark, Shark, etc.) connect to DSE nodes (as the username and password will otherwise be sent in plain text), and also for node-to-node communication, is recommended.

DSE 4.6 integrates with industry-standard LDAP and Active Directory servers.

There are certain settings in the dse.yaml config file, under the "LdapAuthenticator" section, that one can tune: the search cache and the credentials cache. The search cache can be kept for a long time, because you wouldn't normally expect a user to be moved within a directory server. The credentials cache settings in DSE are configurable and depend on internal IT/security policies.

Plan to use the PermissiveTransitionalAuthenticator or TransitionalAuthenticator if a directory server admin does not allow the creation of a 'cassandra' user in the directory server. In this case, the DSE admin can use the TransitionalAuthenticator to create a superuser with a name available in the directory server, and after doing this, switch over to using the LdapAuthenticator. (You can't create users in Cassandra if the default AllowAllAuthenticator is being used.)

Ensure passwords are encrypted in the config file by running the command "bin/dsetool encryptconfigvalue", which can encrypt "search_password" and "truststore_password" in the dse.yaml file.

OpsCenter 5.0 offers granular security with user creation and role-based permission control. This is very useful for organizations dealing with compliance and strict internal security requirements.

Multi-Data Center and Cloud: Best Practices

Cassandra delivers strong support for replicating and synchronizing data across multiple data centers and cloud availability zones. Additionally, Cassandra supports active-active deployments. Several DataStax customers deploy Cassandra across 2 or more data centers and/or availability zones on cloud providers like Amazon to ensure (1) data is kept close to customers in various geographic locations, and (2) their database remains online should a particular data center or cloud availability zone fail.

The sections that follow outline best practices when deploying DSE across multiple data centers (DCs) and the cloud.

General Recommendations
Cassandra replication is configured on a per-keyspace and per-DC level. The replication factor (RF) used should not exceed the number of nodes in a data center. If it does, writes can be rejected (depending on the consistency level used), but reads are served as long as the desired consistency level is met.

As a general rule, DataStax recommends a replication factor of 3 within a single data center. This protects against data loss due to single-machine failure once the data has been persisted to at least two machines. When accessing data, a consistency level of QUORUM is typically used in a single-datacenter deployment. The recommended replication strategy for multiple datacenters is NetworkTopologyStrategy.
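The arithmetic behind the RF=3 / QUORUM recommendation can be checked directly (a minimal sketch; the function names are illustrative):

```python
def quorum(rf):
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1

def quorum_overlap(rf):
    """QUORUM reads and writes overlap on at least one replica when
    write quorum + read quorum > RF, which is what makes QUORUM
    reads see QUORUM writes."""
    return 2 * quorum(rf) > rf

# With RF=3, QUORUM is 2 of 3 replicas, so one node can fail while
# QUORUM reads and writes remain available.
print(quorum(3), quorum_overlap(3))  # 2 True
```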

Multi-Data Center and Geographic Considerations
Best practice for serving customers in multiple geographies is to deploy a copy of the data in a data center closest to the customer base. This ensures faster performance for those users.

As an example, consider one DC in the United States and one DC in Europe. This can be either two physical datacenters, each with a virtual data center if required, or two data centers in the cloud, as shown below.

Figure 3 – Using multiple data centers in different geographic regions (cloud)

Best practices for these two data centers:

•  Ensure that LOCAL_QUORUM is being used for requests, and not EACH_QUORUM, since the latter's latency will negatively impact the end user's performance experience.

•  Ensure that the clients to which users connect can only see one data center, based on the list of IPs provided to the client.

•  Run repair operations (without the -pr option) more frequently than the required once per gc_grace_seconds. You could also use the automated repair service available in DataStax Enterprise to do this.

Natural disasters such as hurricanes, typhoons, and earthquakes can shut down data centers and cause rolling power outages in a geographic area. When a data center becomes unavailable, there are a number of tasks you may need to perform.

One of the first things to do is to stop client writes to the downed data center. The next task is to add new nodes to the remaining data centers to handle any increase in load from traffic previously served by the downed data center. When a downed data center is brought back online, a repair operation should be performed to ensure that all data is consistent across all active nodes. If necessary, nodes can be decommissioned and additional nodes added to other DCs to handle the user traffic from the once-downed DC.

Cassandra and DSE are data center aware, meaning that multiple DCs can be used for remote backups or as a failover site in case of a disaster. Note, though, that such an implementation does not guard against situations such as a table mistakenly being dropped or a large amount of data being deleted in error.

An example of using multiple data centers for failover/backup might be the following: an architect may use Amazon EC2 to maintain a primary DC on the U.S. East Coast and a failover DC on the West Coast. The West Coast DC could have a lower RF to save on data storage. As shown in the diagram below, a client application sends requests to EC2's US-East-1 region at a consistency level (CL) of LOCAL_QUORUM. EC2's US-West-1 region serves as a live backup. The replication strategy can reflect a full live backup ({US-East-1: 3, US-West-1: 3}) or a smaller live backup ({US-East-1: 3, US-West-1: 2}) to save costs and disk usage in this regional outage scenario. All clients continue to write to the US-East-1 nodes by ensuring that the clients' pools are restricted to just those nodes, to minimize cross-data-center latency.

Figure 4 – Setting up a multiple DC configuration for backup or disaster recovery

To implement a better cascading fallback, initially the client's connection pool can be restricted to nodes in the US-East-1 region. In the event of client errors, all requests can retry at a CL of LOCAL_QUORUM X times, then decrease to a CL of ONE while escalating the appropriate notifications. If the requests are still unsuccessful, a new connection pool consisting of nodes from the US-West-1 data center can begin contacting US-West-1 at a higher CL, before ultimately dropping down to a CL of ONE. Meanwhile, any writes to US-West-1 can be asynchronously retried on US-East-1 via the client, without waiting for confirmation, logging any errors separately.
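The cascading-fallback order described above can be sketched as an ordered attempt plan. The pool names, consistency levels, and retry count below are illustrative, not a prescribed configuration:

```python
def fallback_plan(retries_per_step=3):
    """Ordered (pool, consistency_level) attempts for the cascading
    fallback described above: exhaust the local DC at LOCAL_QUORUM,
    drop to ONE, then repeat against the remote DC."""
    steps = [
        ("US-East-1", "LOCAL_QUORUM"),
        ("US-East-1", "ONE"),
        ("US-West-1", "LOCAL_QUORUM"),
        ("US-West-1", "ONE"),
    ]
    return [(pool, cl) for pool, cl in steps for _ in range(retries_per_step)]

plan = fallback_plan(retries_per_step=2)
print(plan[0], plan[-1])  # ('US-East-1', 'LOCAL_QUORUM') ('US-West-1', 'ONE')
```

In a real client, each attempt would be driven by the driver's retry/failover policy rather than a flat list, but the ordering is the point: local DC first, stronger consistency first.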

More details on these types of scenarios are available in this DataStax Tech blog article.

Cloud Considerations

This section provides general best practices specifically for the Amazon Web Services (AWS) environment. Other sizing information (data and instance size) for the cloud can be found in the DataStax Reference Architecture.

For Apache Cassandra installations in Amazon Web Services' (AWS) Elastic Compute Cloud (EC2), an m1.large is the smallest instance that DataStax recommends for development purposes. Anything below that will have too small a vCPU count and memory allocation, along with lighter network capacity, all of which are below the recommended Cassandra requirements. Also, m1.xlarge is the minimum recommended for low-end production usage, as the stress on an m1.large machine's Java Virtual Machine (JVM) is high during garbage collection (GC) periods.

Another attribute to consider is the number of CPUs. Multiple CPUs are important if heavy write loads are expected, but ultimately Cassandra is limited by disk contention. Also, one of the most critical attributes to include in your cluster instances is SSDs, and c3.2xlarge would be the best upgrade in such cases. With 2x the vCPUs, approximately 4x the EC2 Compute Units, and 2 x 160GB SSDs, the c3.2xlarge will be more performant than the m1.xlarge. This option is ideal if your data size isn't too large and latency is more important than cost.

Choosing the right AMI, with an official image, up-to-date kernels, and EC2 fixes, is an important factor to consider in a cloud deployment. DataStax recommends using the Amazon Linux AMI, as it is the most accurately configured AMI for EC2. For i2 instances, use the Amazon Linux AMI 2013.09.02 or any Linux AMI with a version 3.8 or newer kernel for the best I/O performance.

The DataStax AMI uses an Ubuntu 12.04 LTS image and can be found at this link: http://cloud-images.ubuntu.com/locator/ec2.

DataStax recommends against EBS devices for storing your data, as Cassandra is a disk-intensive database typically limited by disk I/O. EBS is a service similar to legacy Network Attached Storage (NAS) devices, and is not recommended for Cassandra for the following reasons:

•  Network disks are a single point of failure (SPOF). Within the NAS infrastructure, you can have multiple layers of redundancy and ways of validation, but ultimately you will never really know how reliable they are until disks start failing. With Cassandra, nodes may come and go frequently within the ring, and you must know that your choice for storage redundancy works as expected.

•  If the network goes down, disks will be inaccessible, and thus your database will go offline. Using networked storage for Cassandra data, stored on one device, circumvents the innate, tangible redundancy that a distributed database grants by default. Granted, the network connection between your application and your database can be severed, but this is independent of your database.

•  Local disks are faster than NAS. Eliminating the need to traverse the network and stream data across a data center reduces the latency of each Cassandra operation.

EC2 network performance can be inconsistent, which can lead to flapping nodes: Cassandra's gossip protocol stops recognizing a node as being UP and periodically marks it as DOWN before getting the UP notification. Leave phi_convict_threshold at its default setting, but if you see flapping nodes, increase it to 12 in the cassandra.yaml file.

Clock skew will happen, especially in a cloudenvironment, and cause timestamps to get offset.Because Cassandra relies heavily on reliabletimestamps to resolve overwrites, keeping clocks insync is of utmost importance to your Cassandradeployment. If you’re handling timestamps in theapplication tier as well, keeping it in sync there isalso highly recommended. You can do this byinstalling and ensuring the NTP service is active andrunning successfully.

More details on making good choices for hostingCassandra on the Amazon Web Services (AWS)environment can be found in this DataStax Techblog article.

Backup and Restore: Best Practices

Proper backup and restore practices are a critical part of a database maintenance plan, protecting the database against data loss. Cassandra provides snapshot backup capabilities, while OpsCenter Enterprise supplies visual, scheduled backups, restores (including object-level restores), and monitoring tools. Users can also take a table-level backup, back up to a remote location like Amazon S3, compress the backup files, or restore to a specific point using point-in-time restore capabilities (available in OpsCenter 5.1 and beyond).

Backup Recommendations
Disk space is a key consideration when backing up database systems. It is recommended to have enough disk space on any node to accommodate making snapshots of the data files. Snapshots can cause disk usage to grow more quickly over time, because a snapshot prevents obsolete data files from being deleted.

OpsCenter's capacity planning service helps monitor and forecast disk usage for such situations. For example, the graph below shows 20% disk usage cluster-wide, with plenty of space for backup files. One could predict the disk usage (including backups) for coming months using the Capacity Service's forecasting feature, as shown on the right side of Figure 5.

Figure 5 – An example of using OpsCenter to monitor and forecast used disk space

Old snapshot files should be moved to a separate backup location and then cleared on the node as part of your maintenance process. The nodetool clearsnapshot command removes all existing snapshot files from the snapshot directory of each keyspace. A suggestion is to move/clear old snapshot files before creating new ones.

As with snapshots, Cassandra does not automatically clear incremental backup files. DataStax recommends setting up a process to clear incremental backup hard links each time a new snapshot is created. One way to automate this process is to use OpsCenter, which comes with an option to clean up old backup data and lets you specify how long backups should be kept.
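A scripted purge policy for the move/clear step can be sketched as below. The timestamped snapshot naming convention is an assumption for the example; adapt the selection to however your snapshots are actually named:

```python
def snapshots_to_purge(snapshot_names, keep=2):
    """Given snapshot directory names ending in a sortable timestamp
    (an assumed naming convention, e.g. 'snap-20150102'), return the
    older ones to move off-node and clear, keeping the newest `keep`."""
    ordered = sorted(snapshot_names)  # sortable timestamps: oldest first
    return ordered[:-keep] if keep else ordered

names = ["snap-20150103", "snap-20150101", "snap-20150102"]
print(snapshots_to_purge(names, keep=1))  # ['snap-20150101', 'snap-20150102']
```

The returned names would then be archived to the backup location before running nodetool clearsnapshot and taking the new snapshot.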

Figure 6 – OpsCenter's automatic backup purge utility

Scheduling backups at regular intervals is also a good practice; this can easily be done through OpsCenter. The frequency of backups will depend on how static or dynamic your data is.

An alternative backup storage strategy to consider is Amazon's Simple Storage Service (S3). S3 is an AWS storage service that has configurable security (using Access Control Lists (ACLs)) and tunable redundancy. There are independent S3 services for each region, providing good redundancy if needed. Note that you can customize OpsCenter's backup utility to store backup data on S3.

Restore Considerations
Restoring data from a backup in the event of data loss or other failure can be accomplished in a number of different ways. OpsCenter's visual restore utility makes the process easy and error-free, and it is recommended you use OpsCenter to do either full or object-level restores when possible.

You can now restore data stored in Amazon S3 to the existing or a different cluster (available in OpsCenter 5.1 and beyond).

Outside of OpsCenter, either re-create the schema before restoring a snapshot, or truncate the tables being targeted for restore. Cassandra can only restore data from a snapshot when the table schema exists.

A snapshot can be manually restored in several ways:

•  Use the sstableloader tool.

•  Copy the snapshot SSTable directory (see Taking a snapshot) to the data directory (/var/lib/cassandra/data/keyspace/table/), and then call the JMX method loadNewSSTables() in the column family MBean for each column family through JConsole. Instead of the loadNewSSTables() call, users can also use nodetool refresh.

•  Use the Node Restart Method.

Software Upgrades: Best Practices

The procedures for upgrading an existing software installation depend on several things, such as the current software version used, how far behind it is, and similar considerations. General practices to follow when upgrading an installation include the following:

•  Take a snapshot of all keyspaces before the upgrade. Doing so allows you to roll back to the previous version if necessary. Cassandra is able to read data files created by the previous version, but the inverse is not always true. Taking a snapshot is fast, especially if JNA is installed, and takes effectively zero disk space until you start compacting the live data files again.

•  Check the NEWS.txt and CHANGES.txt of the target software version for any new information about upgrading. Note that NEWS.txt is on the Apache Cassandra GitHub site.

The order of a node upgrade (taking the node down and restarting it) matters. To help ensure success, follow these guidelines:

•  Upgrade an entire datacenter before moving on to a different datacenter.

•  Upgrade DSE Analytics data centers first, then Cassandra data centers, and finally DSE Search datacenters.

•  Upgrade the Spark Master/JobTracker node in all Analytics datacenters first.

•  Upgrade seed nodes before non-seed nodes.

To perform an upgrade with zero downtime, DataStax recommends performing the upgrade as a rolling restart. A rolling upgrade involves upgrading the nodes one at a time without stopping the database. The process for a rolling upgrade on each node is as follows:

1. Back up data by taking a snapshot of the node to be upgraded.

2. Run "nodetool drain" on the node. This puts the node into a read-only state and flushes the memtables to disk in preparation for shutting down operations.

3. Stop the node.

4. Install the new software version.

5. Configure the new software (use the yaml_diff tool, which filters differences between two cassandra.yaml files).

6. Start the node.

7. Check the logs for warnings, errors, and exceptions. Certain specific errors in the logs are expected and may even indicate additional steps that need to be run. Refer to the documentation for the target version to identify these kinds of errors.

8. DataStax recommends waiting at least 10 minutes after starting the node before moving on to the next node. This allows the node to fully start up, receive hints, and generally become a fully participating member of the cluster.

9. Repeat these steps (1-8) for each node in the cluster.

Upgrading from DataStax Community (DSC) or Apache Cassandra to DataStax Enterprise is similar to upgrading from one version to another. Before upgrading to DataStax Enterprise, it is recommended to upgrade to the Cassandra version supported in the DSE release you are targeting. The only additional step is configuring the cluster (step 5 above): the snitch needs to move from being specified in the cassandra.yaml file to being specified in the dse.yaml file. Details can be found here. Frequently, upgrades from major versions of Solr, Hadoop, or Cassandra require special steps to be followed during the upgrade process. Please review the recommendations in the documentation of the version you are targeting.

 About DataStax

DataStax delivers Apache Cassandra, the leading distributed database management system, to the world's most innovative enterprises. DataStax is built to be agile, always-on, and predictably scalable to any size.

DataStax has more than 500 customers in 45countries including leaders such as Netflix,Rackspace, Pearson Education and ConstantContact, and spans verticals including web, financialservices, telecommunications, logistics, andgovernment. Based in Santa Clara, Calif., DataStaxis backed by industry-leading investors includingLightspeed Venture Partners, Meritech Capital, andCrosslink Capital. For more information, visitDataStax.com or follow us @DataStax.