Cassandra 0.7 Friday, December 10, 2010

Cassandra 0.7, Los Angeles High Scalability Group

Uploaded by jbellis


DESCRIPTION

What's new in Cassandra 0.7



Features

• Live schema modification

• Secondary indexes

• Hadoop OutputFormat

• (Very) large rows

  • up to 2 billion columns

• NetworkTopologyStrategy


Operations

• Efficient streaming

• Per-ColumnFamily settings of memtable thresholds

• Much more (optional) metadata about columns


Operations backports

• HH disable (0.6.2)

• compaction priority (0.6.3)

• HH hourly scan (0.6.3)

• JMX metrics for row-level bloom filters (0.6.3)

• Flow control (0.6.4, 5)

• HH paging (0.6.5)

• Dynamic snitch (0.6.5)

• Tombstone removal in minor compaction (0.6.6)


Compatibility

• Fully backwards-compatible with 0.6 data

• Some Thrift API changes

  • String row keys become byte[]

  • keyspace is set once per connection

• Requires drain + cluster restart
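
Since row keys are now raw bytes rather than Strings, the client has to choose an encoding itself. The sketch below is one way to do that in Java, assuming UTF-8 keys and a client that passes binary values as ByteBuffer; the RowKeys helper is hypothetical and is not part of any Cassandra client library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class RowKeys {
    private RowKeys() {}

    // Encode a String key as bytes, naming the charset explicitly
    // instead of relying on the platform default.
    static ByteBuffer toKey(String key) {
        return ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a key that is known to hold UTF-8 text.
    static String fromKey(ByteBuffer key) {
        // duplicate() leaves the caller's buffer position untouched
        return StandardCharsets.UTF_8.decode(key.duplicate()).toString();
    }
}
```

Keys that are not text (for example packed longs or UUIDs) would skip the String step and be written into the buffer directly.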


Features


Data model tradeoffs

• Twitter: “Fifteen months ago, it took two weeks to perform ALTER TABLE on the statuses [tweets] table.”


A static ColumnFamily


A dynamic ColumnFamily


SELECT * FROM tweets
WHERE user_id IN (SELECT follower FROM followers WHERE user_id = ?)

[Diagram labels: followers, tweets, timeline]


SuperColumns = full denormalization
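
One reading of these slides is that the join in the SQL query is replaced by copying each tweet into a per-user timeline at write time, so a timeline read becomes a single row lookup. The sketch below illustrates that idea with in-memory maps standing in for ColumnFamilies. It is an interpretation, not code from the talk, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for two ColumnFamilies; names are hypothetical.
final class TimelineSketch {
    // followers: user -> users who follow that user
    private final Map<String, List<String>> followers = new HashMap<>();
    // timeline: user -> tweets by the people that user follows, oldest first
    private final Map<String, List<String>> timeline = new HashMap<>();

    void follow(String follower, String followed) {
        followers.computeIfAbsent(followed, k -> new ArrayList<>()).add(follower);
    }

    // Write path: copy the tweet into each follower's timeline row.
    void tweet(String author, String body) {
        for (String f : followers.getOrDefault(author, Collections.<String>emptyList())) {
            timeline.computeIfAbsent(f, k -> new ArrayList<>()).add(author + ": " + body);
        }
    }

    // Read path: one row lookup, no join.
    List<String> timelineOf(String user) {
        return timeline.getOrDefault(user, Collections.<String>emptyList());
    }
}
```

The tradeoff is the usual one for denormalization: writes do more work (one copy per follower) so that reads stay cheap.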


A little deeper

• http://twissandra.com

• http://github.com/jhermes/twissjava


Secondary indexes


A static ColumnFamily


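Hadoop OutputFormat

    // Job setup: send reducer output to a Cassandra ColumnFamily.
    job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
    ConfigHelper.setOutputColumnFamily(job.getConfiguration(), KS, CF);
    ...
    // Reducer: outputKey and getMutation() are defined elsewhere (not shown on the slide).
    public void reduce(Text word, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException
    {
        int sum = 0;
        for (IntWritable val : values)
            sum += val.get();
        context.write(outputKey, Collections.singletonList(getMutation(word, sum)));
    }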


Large rows

• 0.6: smaller of {2GB, memory limit}

• 0.7: in_memory_compaction_limit_in_mb


NetworkTopologyStrategy

• RackAwareStrategy is tuned for 3 replicas and 2 data centers

  • renamed to OldNetworkTopologyStrategy

• NTS allows configuring replicas per data center, per Keyspace

  • ignores replication_factor directive
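
To make "replicas per data center" concrete, here is a deliberately simplified sketch of how a per-data-center replica count could drive placement: walk the ring from the key's position and take nodes until each data center has its quota. It is an illustration under stated assumptions and not Cassandra's implementation (the real strategy also takes racks into account, which this ignores). The class, the Node type, and the data center names in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration; ignores racks entirely.
final class PerDcPlacementSketch {
    // A ring position: node name plus the data center it lives in.
    static final class Node {
        final String name;
        final String dc;
        Node(String name, String dc) { this.name = name; this.dc = dc; }
    }

    // Walk the ring clockwise from index `start`, taking nodes until each
    // data center has the number of replicas requested for it.
    static List<String> replicas(List<Node> ring, int start, Map<String, Integer> perDc) {
        Map<String, Integer> remaining = new HashMap<>(perDc);
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < ring.size(); i++) {
            Node n = ring.get((start + i) % ring.size());
            int left = remaining.getOrDefault(n.dc, 0);
            if (left > 0) {
                chosen.add(n.name);
                remaining.put(n.dc, left - 1);
            }
        }
        return chosen;
    }
}
```

With a ring of a1 (DC1), b1 (DC2), a2 (DC1), b2 (DC2), a3 (DC1) and a request for two replicas in DC1 and one in DC2, a walk starting at a1 selects a1, b1, a2. Note that no single replication_factor appears anywhere: the total is just the sum of the per-data-center counts.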


Operations


Efficient Streaming

• The following slides show how, in 0.7, we send only the data portion of the sstables being moved to the new node. That portion is contiguous on disk, so reading it needs no random I/O, and the new node rebuilds the indexes etc. itself

• This minimizes the impact on existing nodes


[Diagram: ring with nodes A, L, T, W and new node F; range (A-L] is labeled.]


[Diagram: ring with nodes A, L, T, W and F; ranges (A-F] and (F-L] are labeled.]
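
The range labels on these two slides follow the usual ring convention that a node owns the range from the previous node's token (exclusive) up to its own token (inclusive). The sketch below shows how inserting F between A and L splits (A-L] into (A-F] and (F-L]. It uses single letters as tokens, as the slides do; the class is hypothetical and only models range ownership, not streaming.

```java
import java.util.TreeSet;

// Models range ownership only; tokens are plain strings as on the slides.
final class RingRangesSketch {
    private final TreeSet<String> tokens = new TreeSet<>();

    void addNode(String token) {
        tokens.add(token);
    }

    // Range owned by the node at `token`, written as (previous-token].
    String rangeOf(String token) {
        String prev = tokens.lower(token);
        if (prev == null) {
            prev = tokens.last(); // wrap around the ring
        }
        return "(" + prev + "-" + token + "]";
    }
}
```

Before F joins, the node at L owns (A-L]. After addNode("F"), F owns (A-F] and L is left with (F-L], so only the (A-F] portion has to be sent to the new node.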


[Diagram: ring with nodes A, L, T, W and F; sstable components Data, Index, Filter.]


[Diagram: ring with nodes A, L, T, W and F; sstable components Index, Filter.]


Per-CF memtable thresholds

• Easier tuning for large numbers of ColumnFamilies


Column Metadata

• 0.6: comparator, subcomparator

• 0.7: default_validation_class, column_metadata


Native code

• JNA introduced in 0.6.5 for mlockall

• Extended to hard links in 0.6.6


Flow Control (0.6.4)

• Replica nodes drop hopeless requests on the floor

• Coordinator node is unaffected

• TimedOutException signals client to back off

• Requires enough memory to buffer RPCTimeout’s worth of requests

• (In the short term, you’re still screwed)
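
The idea of dropping hopeless requests can be sketched as a queue whose consumer checks each request's age before doing any work: if a request has already waited longer than the RPC timeout, the caller has presumably given up, so processing it would be wasted effort. This is a minimal illustration of that idea, not Cassandra's code. The class, its fields, and the 10-second timeout in the usage note are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustration only; times are passed in explicitly to keep it deterministic.
final class DropHopelessSketch {
    static final class Request {
        final String id;
        final long enqueuedAtMillis;
        Request(String id, long enqueuedAtMillis) {
            this.id = id;
            this.enqueuedAtMillis = enqueuedAtMillis;
        }
    }

    private final Deque<Request> queue = new ArrayDeque<>();
    private final long rpcTimeoutMillis;
    int dropped = 0;

    DropHopelessSketch(long rpcTimeoutMillis) {
        this.rpcTimeoutMillis = rpcTimeoutMillis;
    }

    void enqueue(Request r) {
        queue.addLast(r);
    }

    // Drain the queue at time nowMillis; return the ids actually processed.
    List<String> drain(long nowMillis) {
        List<String> processed = new ArrayList<>();
        Request r;
        while ((r = queue.pollFirst()) != null) {
            if (nowMillis - r.enqueuedAtMillis > rpcTimeoutMillis) {
                dropped++; // waited past the timeout: skip the work
                continue;
            }
            processed.add(r.id);
        }
        return processed;
    }
}
```

With a 10,000 ms timeout, a request enqueued at t=0 and one enqueued at t=9,000, draining at t=12,000 drops the first and processes the second. The queue still has to hold a timeout's worth of requests, which is the memory requirement the slide mentions.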


Flow control in 0.5

• Why backpressure doesn’t fit Cassandra


Dynamic snitch

public void sortByProximity(List<InetAddress> addresses);
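
A snitch with this method reorders a list of replicas so the preferred ones come first. As a rough illustration of what "dynamic" could mean, the sketch below sorts by a per-host score, where a lower score stands for a replica that has been responding faster. The scoring model, the class, and its helper methods are assumptions made for this example; the real dynamic snitch computes its scores differently.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical latency-scored snitch; only the method signature is from the slide.
final class LatencySnitchSketch {
    // Lower score = replica has been answering faster (assumed model).
    private final Map<InetAddress, Double> scores = new HashMap<>();

    void recordScore(InetAddress host, double score) {
        scores.put(host, score);
    }

    // Reorder the list in place, best-scoring replica first.
    // Hosts with no recorded score sort last.
    public void sortByProximity(List<InetAddress> addresses) {
        addresses.sort(Comparator.comparingDouble(
                a -> scores.getOrDefault(a, Double.MAX_VALUE)));
    }

    // Convenience for building IPv4 addresses without a DNS lookup.
    static InetAddress ip(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Because the sort is stable, replicas with equal scores keep whatever order the caller (or an underlying static snitch) gave them.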


Everything else


0.7 performance

• Reads roughly 100% faster, thanks largely to removing String creation

• Row-cached reads up to 8x faster after optimizations by tjake and jbellis

• Optimizations for reads of large rows

• 0.7.1? ~15% improvement everywhere from ByteBuffer optimizations


Thrift: the libpq of Cassandra

• OOMs on malformed packets

• Python Unicode string issues

• PHP support is buggy and maintainerless


Client support from Riptano

• Hector

• Building JPA/JDO layer on top

• pycassa

• phpcassa

• Soon: cassandra gem


After 0.7.0

• IndexOperator.GT

• Triggers / plugins

• Entity groups

• On-disk data format improvements (Compression, compound keys?)


Summary
