Cassandra And Spark Dataframes Russell Spitzer Software Engineer @ Datastax

Spark Cassandra Connector Dataframes


Page 1: Spark Cassandra Connector Dataframes

Cassandra And Spark Dataframes

Russell Spitzer Software Engineer @ Datastax

Page 6: Spark Cassandra Connector Dataframes

Tungsten Gives DataFrames Off-Heap Power!

Memory can be compared off-heap and bitwise! Code generation!

Page 7: Spark Cassandra Connector Dataframes

The Core is the Cassandra Source

https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/org/apache/spark/sql/cassandra

/** Implements [[BaseRelation]], [[InsertableRelation]] and [[PrunedFilteredScan]].
 *  It inserts data to and scans a Cassandra table. If filterPushdown is true,
 *  it pushes down some filters to CQL.
 */

[Diagram: a DataFrame backed by the source org.apache.spark.sql.cassandra]

Page 8: Spark Cassandra Connector Dataframes

The Core is the Cassandra Source

https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/org/apache/spark/sql/cassandra

/** Implements [[BaseRelation]], [[InsertableRelation]] and [[PrunedFilteredScan]].
 *  It inserts data to and scans a Cassandra table. If filterPushdown is true,
 *  it pushes down some filters to CQL.
 */

[Diagram: DataFrame -> CassandraSourceRelation -> CassandraTableScanRDD, plus Configuration]

Page 9: Spark Cassandra Connector Dataframes

Configuration Can Be Done on a Per Source Level

Property names follow clusterName:keyspaceName/propertyName. Example: changing cluster/keyspace level properties:

val conf = new SparkConf()
  .set("ClusterOne/spark.cassandra.input.split.size_in_mb", "32")
  .set("default:test/spark.cassandra.input.split.size_in_mb", "128")

val lastdf = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table" -> "words",
    "keyspace" -> "test",
    "cluster" -> "ClusterOne"))
  .load()

Page 10: Spark Cassandra Connector Dataframes

Configuration Can Be Done on a Per Source Level

Property names follow clusterName:keyspaceName/propertyName. Example: changing cluster/keyspace level properties:

val conf = new SparkConf()
  .set("ClusterOne/spark.cassandra.input.split.size_in_mb", "32")
  .set("default:test/spark.cassandra.input.split.size_in_mb", "128")

val lastdf = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table" -> "words",
    "keyspace" -> "test",
    "cluster" -> "ClusterOne"))
  .load()

Namespace: ClusterOne spark.cassandra.input.split.size_in_mb=32

Page 11: Spark Cassandra Connector Dataframes

Configuration Can Be Done on a Per Source Level

Property names follow clusterName:keyspaceName/propertyName. Example: changing cluster/keyspace level properties:

val conf = new SparkConf()
  .set("ClusterOne/spark.cassandra.input.split.size_in_mb", "32")
  .set("default:test/spark.cassandra.input.split.size_in_mb", "128")

val lastdf = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table" -> "words",
    "keyspace" -> "test",
    "cluster" -> "ClusterOne"))
  .load()

Namespace: default Keyspace: test

spark.cassandra.input.split.size_in_mb=128

Namespace: ClusterOne spark.cassandra.input.split.size_in_mb=32

Page 12: Spark Cassandra Connector Dataframes

Configuration Can Be Done on a Per Source Level

Property names follow clusterName:keyspaceName/propertyName. Example: changing cluster/keyspace level properties:

val conf = new SparkConf()
  .set("ClusterOne/spark.cassandra.input.split.size_in_mb", "32")
  .set("default:test/spark.cassandra.input.split.size_in_mb", "128")

val lastdf = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table" -> "words",
    "keyspace" -> "test",
    "cluster" -> "ClusterOne"))
  .load()

Namespace: default Keyspace: test

spark.cassandra.input.split.size_in_mb=128

Namespace: ClusterOne spark.cassandra.input.split.size_in_mb=32

Page 13: Spark Cassandra Connector Dataframes

Configuration Can Be Done on a Per Source Level

Property names follow clusterName:keyspaceName/propertyName. Example: changing cluster/keyspace level properties:

val conf = new SparkConf()
  .set("ClusterOne/spark.cassandra.input.split.size_in_mb", "32")
  .set("default:test/spark.cassandra.input.split.size_in_mb", "128")

val lastdf = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table" -> "words",
    "keyspace" -> "test",
    "cluster" -> "default"))
  .load()

Namespace: default Keyspace: test

spark.cassandra.input.split.size_in_mb=128

Namespace: ClusterOne spark.cassandra.input.split.size_in_mb=32

Page 14: Spark Cassandra Connector Dataframes

Configuration Can Be Done on a Per Source Level

Property names follow clusterName:keyspaceName/propertyName. Example: changing cluster/keyspace level properties:

val conf = new SparkConf()
  .set("ClusterOne/spark.cassandra.input.split.size_in_mb", "32")
  .set("default:test/spark.cassandra.input.split.size_in_mb", "128")

val lastdf = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table" -> "words",
    "keyspace" -> "other",
    "cluster" -> "default"))
  .load()

Namespace: default Keyspace: test

spark.cassandra.input.split.size_in_mb=128

Namespace: ClusterOne spark.cassandra.input.split.size_in_mb=32

Connector Default

Page 15: Spark Cassandra Connector Dataframes

Predicate Pushdown Is Automatic!

SELECT * FROM cassandraTable WHERE clusteringKey > 100

Page 16: Spark Cassandra Connector Dataframes

Predicate Pushdown Is Automatic!

SELECT * FROM cassandraTable WHERE clusteringKey > 100

[Diagram: logical plan: DataFromC* -> Filter(clusteringKey > 100) -> Show]

Page 17: Spark Cassandra Connector Dataframes

Predicate Pushdown Is Automatic!

SELECT * FROM cassandraTable WHERE clusteringKey > 100

[Diagram: Catalyst optimizes the plan: DataFromC* -> Filter(clusteringKey > 100) -> Show]

Page 18: Spark Cassandra Connector Dataframes

Predicate Pushdown Is Automatic!

SELECT * FROM cassandraTable WHERE clusteringKey > 100

[Diagram: Catalyst optimizes the plan: DataFromC* -> Filter(clusteringKey > 100) -> Show]

https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/org/apache/spark/sql/cassandra/PredicatePushDown.scala

Page 19: Spark Cassandra Connector Dataframes

Predicate Pushdown Is Automatic!

SELECT * FROM cassandraTable WHERE clusteringKey > 100

[Diagram: Catalyst pushes the filter into the Cassandra scan, adding the WHERE clause "clusteringKey > 100" to the CQL query]

https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/org/apache/spark/sql/cassandra/PredicatePushDown.scala

Page 20: Spark Cassandra Connector Dataframes

What can be pushed down?

1. Non-partition key column predicates are pushed down only with =, >, <, >=, <= predicates.
2. Primary key column predicates are pushed down only with = or IN predicates.
3. If there are regular columns in the pushdown predicates, they must include at least one EQ expression on an indexed column and no IN predicates.
4. All partition column predicates must be included in the predicates to be pushed down; only the last part of the partition key can be an IN predicate. For each partition column, only one predicate is allowed.
5. For clustering column predicates, only the last predicate can be a non-EQ predicate (including IN), and the preceding column predicates must be EQ predicates. If there is only one clustering column predicate, it can be any non-IN predicate.
6. No predicates are pushed down if there is any OR condition or NOT IN condition.
7. Multiple predicates on the same column cannot be pushed down if any of them is an equality or IN predicate.

Page 21: Spark Cassandra Connector Dataframes

What can be pushed down?

If you could write it in CQL, it will get pushed down.
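
A minimal sketch of what that means in practice (assuming a hypothetical test.words table whose clustering column is count): predicates expressible in CQL are pushed into the Cassandra scan, while everything else is evaluated by Spark after the rows are read. explain() shows where each filter runs.

val df = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "words", "keyspace" -> "test"))
  .load()

// Pushed down: a range predicate on a clustering column is valid CQL
df.filter("count > 100").explain()

// Not pushed down: CQL has no LIKE, so Spark applies this filter itself
df.filter("word LIKE 'c%'").explain()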

Page 22: Spark Cassandra Connector Dataframes

What are we Pushing Down To?

CassandraTableScanRDD

All of the underlying code is the same as with sc.cassandraTable, so everything about reading and writing applies.
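
For reference, a sketch of the equivalent RDD API (again assuming a hypothetical test.words table):

import com.datastax.spark.connector._

val rdd = sc.cassandraTable("test", "words")
  .select("word", "count")
  .where("count > ?", 100)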

Page 23: Spark Cassandra Connector Dataframes

What are we Pushing Down To?

CassandraTableScanRDD

All of the underlying code is the same as with sc.cassandraTable, so everything about reading and writing applies.

https://academy.datastax.com/ Watch me talk about this in the privacy of your own home!

Page 24: Spark Cassandra Connector Dataframes

How the Spark Cassandra Connector Reads Data

Page 25: Spark Cassandra Connector Dataframes

Spark RDDs Represent a Large Amount of Data Partitioned into Chunks

[Diagram: an RDD of partitions 1-9, alongside Nodes 1-4]

Page 26: Spark Cassandra Connector Dataframes

Spark RDDs Represent a Large Amount of Data Partitioned into Chunks

[Diagram: partitions 1-9 distributed across Nodes 1-4]

Page 27: Spark Cassandra Connector Dataframes

Spark RDDs Represent a Large Amount of Data Partitioned into Chunks

[Diagram: partitions 1-9 distributed across Nodes 1-4]

Page 28: Spark Cassandra Connector Dataframes

Cassandra Data is Distributed By Token Range

Page 29: Spark Cassandra Connector Dataframes

Cassandra Data is Distributed By Token Range

[Diagram: token ring with positions 0 and 500 marked]

Page 30: Spark Cassandra Connector Dataframes

Cassandra Data is Distributed By Token Range

[Diagram: token ring with positions 0, 500, and 999 marked]

Page 31: Spark Cassandra Connector Dataframes

Cassandra Data is Distributed By Token Range

[Diagram: token ring divided among Nodes 1-4]

Page 32: Spark Cassandra Connector Dataframes

Cassandra Data is Distributed By Token Range

[Diagram: token ring divided among Nodes 1-4, without vnodes]

Page 33: Spark Cassandra Connector Dataframes

Cassandra Data is Distributed By Token Range

[Diagram: token ring divided among Nodes 1-4, with vnodes]

Page 34: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Node 1 owns token ranges 0-50, 120-220, 300-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

With a 1 MB split size and a reported density of 100 tokens per MB, each Spark partition should cover roughly 100 tokens, so larger ranges are split and smaller ones are grouped together.

Page 35: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partition 1 being formed; token ranges 0-50, 120-220, 300-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 36: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partition 1 being formed; token ranges 0-50, 120-220, 300-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 37: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1 and 2; remaining token ranges 0-50, 300-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 38: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1 and 2; remaining token ranges 0-50, 300-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 39: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1 and 2; the 300-500 range split into 300-400 and 400-500; remaining ranges 0-50, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 40: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1 and 2; remaining token ranges 0-50, 400-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 41: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1, 2, and 3; remaining token ranges 0-50, 400-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 42: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1, 2, and 3; remaining token ranges 0-50, 400-500, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 43: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1, 2, and 3; remaining token ranges 0-50, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 44: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1, 2, 3, and 4; remaining token ranges 0-50, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 45: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: Spark partitions 1, 2, 3, and 4; remaining token ranges 0-50, 780-830]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB

Page 46: Spark Cassandra Connector Dataframes

The Connector Uses Information on the Node to Make Spark Partitions

[Diagram: all of Node 1's token ranges grouped into Spark partitions 1-4]

spark.cassandra.input.split.size_in_mb = 1
Reported density is 100 tokens per MB
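
A quick way to see the result of this grouping (a sketch, assuming the hypothetical test.words table):

import com.datastax.spark.connector._

// Each group of token ranges above becomes one Spark partition
val rdd = sc.cassandraTable("test", "words")
println(rdd.partitions.length)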

Page 47: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

Page 48: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50
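
For illustration, a sketch of issuing one of these token-range queries by hand through the connector's session pool (keyspace.table and pk are the slide's placeholder names):

import com.datastax.spark.connector.cql.CassandraConnector

CassandraConnector(sc.getConf).withSessionDo { session =>
  // Results stream back in pages of spark.cassandra.input.page.row.size rows
  session.execute(
    "SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50")
}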

Page 49: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

Page 50: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: one page of 50 CQL rows returned]

Page 51: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: one page of 50 CQL rows returned]

Page 52: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: two pages of 50 CQL rows returned]

Page 53: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: two pages of 50 CQL rows returned]

Page 54: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: three pages of 50 CQL rows returned]

Page 55: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: three pages of 50 CQL rows returned]

Page 56: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: four pages of 50 CQL rows returned]

Page 57: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: four pages of 50 CQL rows returned]

Page 58: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: five pages of 50 CQL rows returned]

Page 59: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 (token ranges 0-50 and 780-830) on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 780 AND token(pk) <= 830
SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: five pages of 50 CQL rows returned]

Page 60: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 on Node 1; the 780-830 range is finished]

SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: five pages of 50 CQL rows returned so far]

Page 61: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 on Node 1; the 780-830 range is finished]

SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: six pages of 50 CQL rows returned so far]

Page 62: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 on Node 1; the 780-830 range is finished]

SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: six pages of 50 CQL rows returned so far]

Page 63: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: ten pages of 50 CQL rows returned in total]

Page 64: Spark Cassandra Connector Dataframes

Data is Retrieved Using the DataStax Java Driver

spark.cassandra.input.page.row.size = 50

[Diagram: Spark partition 4 on Node 1]

SELECT * FROM keyspace.table WHERE token(pk) > 0 AND token(pk) <= 50

[Diagram: ten pages of 50 CQL rows returned in total]

Page 65: Spark Cassandra Connector Dataframes

How the Spark Cassandra Connector Writes Data

Page 66: Spark Cassandra Connector Dataframes

Spark RDDs Represent a Large Amount of Data Partitioned into Chunks

[Diagram: an RDD of partitions 1-9, alongside Nodes 1-4]

Page 67: Spark Cassandra Connector Dataframes

Spark RDDs Represent a Large Amount of Data Partitioned into Chunks

[Diagram: partitions 1-9 distributed across Nodes 1-4]

Page 68: Spark Cassandra Connector Dataframes

[Diagram: partitions 1-9 distributed across Nodes 1-4]

The Spark Cassandra Connector's saveToCassandra method can be called on almost all RDDs.

rdd.saveToCassandra("Keyspace","Table")
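
A minimal runnable sketch (assuming a hypothetical test.words table with columns word and count):

import com.datastax.spark.connector._

val rows = sc.parallelize(Seq(("cat", 30), ("fox", 40)))
rows.saveToCassandra("test", "words", SomeColumns("word", "count"))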

Page 69: Spark Cassandra Connector Dataframes

A Java Driver connection is made to the local node and a prepared statement is built for the target table.

[Diagram: Spark partition 1 on Node 1, connected to the Java Driver]

Page 70: Spark Cassandra Connector Dataframes

Batches are built from data in Spark partitions.

[Diagram: rows such as 1,1,1 / 1,2,1 / 2,1,1 / 3,8,1 / 1,4,1 / 5,4,1 flowing from the Spark partition into the Java Driver]

Page 71: Spark Cassandra Connector Dataframes

By default these batches only contain CQL rows which share the same partition key.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: rows from the Spark partition buffered in the Java Driver, not yet batched]

Page 72: Spark Cassandra Connector Dataframes

By default these batches only contain CQL rows which share the same partition key.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batch PK=1 started with rows 1,1,1 and 1,2,1]

Page 73: Spark Cassandra Connector Dataframes

When an element is not part of an existing batch, a new batch is started.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batch PK=1 holding rows 1,1,1 and 1,2,1]

Page 74: Spark Cassandra Connector Dataframes

When an element is not part of an existing batch, a new batch is started.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=1 and PK=2 being built]

Page 75: Spark Cassandra Connector Dataframes

When an element is not part of an existing batch, a new batch is started.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=1 and PK=2 being built]

Page 76: Spark Cassandra Connector Dataframes

If a batch reaches batch.size.rows or batch.size.bytes, it is executed by the driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batch PK=3 filling with rows 3,2,1 / 3,4,1 / 3,5,1 / 3,8,1 alongside batches PK=1 and PK=2]

Page 77: Spark Cassandra Connector Dataframes

If a batch reaches batch.size.rows or batch.size.bytes, it is executed by the driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batch PK=3 filling with rows 3,2,1 / 3,4,1 / 3,5,1 / 3,8,1 alongside batches PK=1 and PK=2]

Page 78: Spark Cassandra Connector Dataframes

If a batch reaches batch.size.rows or batch.size.bytes, it is executed by the driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batch PK=3 executed; batches PK=1 and PK=2 remain]

Page 79: Spark Cassandra Connector Dataframes

If a batch reaches batch.size.rows or batch.size.bytes, it is executed by the driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: a new batch PK=3 started alongside PK=1 and PK=2]

Page 80: Spark Cassandra Connector Dataframes

If more than batch.buffer.size batches are currently being made, the largest batch is executed by the Java Driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=1, PK=2, and PK=3 being built]

Page 81: Spark Cassandra Connector Dataframes

If more than batch.buffer.size batches are currently being made, the largest batch is executed by the Java Driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: the largest batch executed; batches PK=2 and PK=3 remain]

Page 82: Spark Cassandra Connector Dataframes

If more than batch.buffer.size batches are currently being made, the largest batch is executed by the Java Driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=2, PK=3, and PK=5 being built]

Page 83: Spark Cassandra Connector Dataframes

If more than batch.buffer.size batches are currently being made, the largest batch is executed by the Java Driver.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=2, PK=3, and PK=5 being built]

Page 84: Spark Cassandra Connector Dataframes

If more batches are currently being executed by the Java Driver than concurrent.writes, we wait until one of the requests has completed.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=2, PK=3, and PK=5 in flight]

Page 85: Spark Cassandra Connector Dataframes

If more batches are currently being executed by the Java Driver than concurrent.writes, we wait until one of the requests has completed.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: the write for batch PK=2 is acknowledged]

Page 86: Spark Cassandra Connector Dataframes

If more batches are currently being executed by the Java Driver than concurrent.writes, we wait until one of the requests has completed.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=2, PK=3, and PK=5]

Page 87: Spark Cassandra Connector Dataframes

If more batches are currently being executed by the Java Driver than concurrent.writes, we wait until one of the requests has completed.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=3 and PK=5 remain]

Page 88: Spark Cassandra Connector Dataframes

If more batches are currently being executed by the Java Driver than concurrent.writes, we wait until one of the requests has completed.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=3, PK=5, and PK=8 being built]

Page 89: Spark Cassandra Connector Dataframes

If more batches are currently being executed by the Java Driver than concurrent.writes, we wait until one of the requests has completed.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=3, PK=5, and PK=8]

Page 90: Spark Cassandra Connector Dataframes

The last parameter, throughput_mb_per_sec, blocks further batches if we have written more than that much in the past second.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=3, PK=5, and PK=8 in flight]

Page 91: Spark Cassandra Connector Dataframes

The last parameter, throughput_mb_per_sec, blocks further batches if we have written more than that much in the past second.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: a write is acknowledged]

Page 92: Spark Cassandra Connector Dataframes

The last parameter, throughput_mb_per_sec, blocks further batches if we have written more than that much in the past second.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=3, PK=5, and PK=8 in flight]

Page 93: Spark Cassandra Connector Dataframes

The last parameter, throughput_mb_per_sec, blocks further batches if we have written more than that much in the past second.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: a write is acknowledged]

Page 94: Spark Cassandra Connector Dataframes

The last parameter, throughput_mb_per_sec, blocks further batches if we have written more than that much in the past second.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: further batches blocked until throughput drops below the limit]

Page 95: Spark Cassandra Connector Dataframes

The last parameter, throughput_mb_per_sec, blocks further batches if we have written more than that much in the past second.

spark.cassandra.output.batch.grouping.key = partition
spark.cassandra.output.batch.size.rows = 4
spark.cassandra.output.batch.buffer.size = 3
spark.cassandra.output.concurrent.writes = 2
spark.cassandra.output.throughput_mb_per_sec = 5

[Diagram: batches PK=3, PK=5, and PK=8 in flight]
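
To recap, a sketch of setting the write-path knobs used in this walkthrough (the values are the slide's demo values, not production recommendations):

val conf = new SparkConf()
  .set("spark.cassandra.output.batch.grouping.key", "partition")
  .set("spark.cassandra.output.batch.size.rows", "4")
  .set("spark.cassandra.output.batch.buffer.size", "3")
  .set("spark.cassandra.output.concurrent.writes", "2")
  .set("spark.cassandra.output.throughput_mb_per_sec", "5")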

Page 96: Spark Cassandra Connector Dataframes

Thanks for coming, and I hope you have a great time at C* Summit!

http://cassandrasummit-datastax.com/agenda/the-spark-cassandra-connector-past-present-and-future/

Also ask these guys really hard questions: Jacek, Piotr, Alex