A (very) quick introduction to using PHP and Cassandra @davegardnerisme

PHP and Cassandra


DESCRIPTION

A very quick introduction to the Cassandra database, plus accessing it from PHP. My slides from my 5-minute "lightning" talk at PHP London on 1/7/2010.


Page 1: PHP and Cassandra

A (very) quick introduction to using

PHP and Cassandra

@davegardnerisme

Page 2: PHP and Cassandra

What is Cassandra?

Highly scalable distributed database

Brings together Dynamo and Bigtable

Schema-less

#nosql!

Page 3: PHP and Cassandra

Why Cassandra?

Horizontally scalable – read and write throughput increase linearly as nodes are added

Fault tolerant – no single point of failure

Hadoop integration for scalable Map Reduce

Good support via community (plus professional support)

Page 4: PHP and Cassandra

Data model

Store your data in the way you want to query it – denormalise

Some people say Cassandra is a 4 or 5 level hash (1)

[KeySpace][ColumnFamily][Key][Column]

[KeySpace][ColumnFamily][Key][SuperColumn][SubColumn]
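
The "4 or 5 level hash" idea can be pictured with nested PHP arrays. This is only a mental model, not how Cassandra stores data, and every name in it (MyKeyspace, UserProfiles, user42, the column names and values) is made up for illustration.

```php
<?php
// Standard column family: [KeySpace][ColumnFamily][Key][Column] => value
$standard = array(
    'MyKeyspace' => array(
        'UrlsVisited' => array(
            'user42' => array(
                'http://example.com/' => '2010-07-01',
            ),
        ),
    ),
);

// Super column family:
// [KeySpace][ColumnFamily][Key][SuperColumn][SubColumn] => value
$super = array(
    'MyKeyspace' => array(
        'UserProfiles' => array(
            'user42' => array(
                'address' => array('country' => 'UK', 'city' => 'London'),
                'contact' => array('twitter' => '@example'),
            ),
        ),
    ),
);

// Four levels deep for a standard column, five for a sub-column.
$visited = $standard['MyKeyspace']['UrlsVisited']['user42']['http://example.com/'];
$city = $super['MyKeyspace']['UserProfiles']['user42']['address']['city'];

// Unlike a PHP array, columns within a row are kept sorted by name.
// ksort() with SORT_STRING imitates a simple byte-wise ordering.
$row = $super['MyKeyspace']['UserProfiles']['user42']['address'];
ksort($row, SORT_STRING);
$columnNames = array_keys($row);
```

Because columns are sorted by name rather than by insertion order, the choice of column names is part of designing for the queries you want to run.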

Page 5: PHP and Cassandra

Configuring – storage-conf.xml

<ColumnFamily Name="Standard1" CompareWith="BytesType"/>

<ColumnFamily Name="Standard2" CompareWith="UTF8Type"/>

<ColumnFamily Name="Super1" ColumnType="Super" CompareWith="BytesType" CompareSubcolumnsWith="BytesType" />

Good example within an easily understandable problem domain: Twissandra! (2)

Page 6: PHP and Cassandra

PHP

Access the core API via Thrift (3)

Higher-level libraries do exist: PHPCassa (4) and Pandra (5)

Compile Thrift, which will generate the PHP libraries for you (6)

Native PHP Thrift extension recommended

Page 7: PHP and Cassandra

// SOME CODE!
// connect
$socket = new TSocket('192.168.1.206', 9160);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient($protocol);
$transport->open();

// fetch single column from single row
$columnPath = new cassandra_ColumnPath();
$columnPath->column_family = 'UrlsVisited';
$columnPath->column = $url;

$userUrl = $client->get(
    $keyspace,            // a bit like the database
    $userId,              // the row key
    $columnPath,          // identifies columns we want
    $consistencyLevelOne
);
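
The result of get() is a ColumnOrSuperColumn wrapper, so the value sits one level down, and (to my understanding of the Thrift API) a missing column is reported as an exception rather than a null. The sketch below shows that unwrapping pattern. So that it runs without a cluster, it uses stand-ins that are not part of any library: FakeNotFound, FakeClient and fetchColumnValue are hypothetical, and the keyspace, key and values are made up. With the generated client you would presumably catch cassandra_NotFoundException instead.

```php
<?php
// Hypothetical stand-in for the "column not found" exception.
class FakeNotFound extends Exception {}

// Hypothetical stand-in client: get() returns an object shaped like
// ColumnOrSuperColumn (a ->column property holding name and value).
class FakeClient
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function get($keyspace, $key, $columnPath, $consistencyLevel)
    {
        $cf = $columnPath->column_family;
        $name = $columnPath->column;
        if (!isset($this->rows[$cf][$key][$name])) {
            throw new FakeNotFound("no column '$name' for key '$key'");
        }
        $column = new stdClass();
        $column->name = $name;
        $column->value = $this->rows[$cf][$key][$name];
        $result = new stdClass();
        $result->column = $column;
        return $result;
    }
}

// Returns the column value, or $default when the column does not exist.
function fetchColumnValue($client, $keyspace, $key, $columnPath, $level, $default = null)
{
    try {
        $result = $client->get($keyspace, $key, $columnPath, $level);
        return $result->column->value; // unwrap ColumnOrSuperColumn
    } catch (FakeNotFound $e) {
        return $default;
    }
}

$client = new FakeClient(array(
    'UrlsVisited' => array('user42' => array('http://example.com/' => '3')),
));
$path = new stdClass();
$path->column_family = 'UrlsVisited';
$path->column = 'http://example.com/';

$found = fetchColumnValue($client, 'MyKeyspace', 'user42', $path, 1);

$path->column = 'http://example.org/';
$missing = fetchColumnValue($client, 'MyKeyspace', 'user42', $path, 1, 'never');
```

Wrapping the lookup in one small function keeps the exception handling out of the calling code.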

Page 8: PHP and Cassandra

// Inserting multiple columns for a single row
// (bit like populating one row of MySQL)
$key = 'UniqueRowKey';
$columnFamily = 'ResponsePersonality';
$mutationMap = array(
    $key => array($columnFamily => array())
);

// add our first column:
$column = new cassandra_Column(array(
    'name' => 'howMuchWork',
    'value' => 'quiteABit',
    'timestamp' => time()
));
$columnOrSupercolumn = new cassandra_ColumnOrSuperColumn(
    array('column' => $column)
);
$mutationMap[$key][$columnFamily][] = new cassandra_Mutation(
    array('column_or_supercolumn' => $columnOrSupercolumn)
);

Page 9: PHP and Cassandra

// add our second column!:
$column = new cassandra_Column(array(
    'name' => 'nextColumnName',
    'value' => 'wow',
    'timestamp' => time()
));
$columnOrSupercolumn = new cassandra_ColumnOrSuperColumn(
    array('column' => $column)
);
$mutationMap[$key][$columnFamily][] = new cassandra_Mutation(
    array('column_or_supercolumn' => $columnOrSupercolumn)
);

// repeat with other columns ...

// finally we call batch_mutate to add!
$client->batch_mutate(
    $keyspace,
    $mutationMap,
    $consistencyLevelZero
);
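
The per-column boilerplate above repeats the same three steps, so it can be folded into a loop. The sketch below is one possible way to do that, not part of any library: buildMutationMap and microTimestamp are hypothetical helpers, and plain arrays stand in for the Thrift objects so the example runs on its own. It also assumes microsecond timestamps, which is a common client convention; the slides use time(), which has one-second resolution.

```php
<?php
// Microsecond timestamp (assumed convention; timestamps only need to be
// comparable between writers).
function microTimestamp()
{
    return (int) round(microtime(true) * 1000000);
}

// Build [rowKey][columnFamily][] = mutation from a name => value array.
// $makeMutation turns (name, value, timestamp) into one mutation entry.
function buildMutationMap($key, $columnFamily, array $columns, $makeMutation)
{
    $timestamp = microTimestamp(); // one timestamp for the whole row
    $map = array($key => array($columnFamily => array()));
    foreach ($columns as $name => $value) {
        $map[$key][$columnFamily][] =
            call_user_func($makeMutation, $name, $value, $timestamp);
    }
    return $map;
}

// Plain-array stand-in that mirrors the nesting of
// cassandra_Mutation > cassandra_ColumnOrSuperColumn > cassandra_Column.
$asArray = function ($name, $value, $timestamp) {
    return array('column_or_supercolumn' => array('column' => array(
        'name' => $name,
        'value' => $value,
        'timestamp' => $timestamp,
    )));
};

$mutationMap = buildMutationMap(
    'UniqueRowKey',
    'ResponsePersonality',
    array('howMuchWork' => 'quiteABit', 'nextColumnName' => 'wow'),
    $asArray
);
```

With the generated Thrift classes loaded, the closure would instead build the cassandra_Column, cassandra_ColumnOrSuperColumn and cassandra_Mutation objects exactly as in the slides, and the resulting map would be passed to batch_mutate.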

Page 10: PHP and Cassandra

Finally… Hadoop integration

Support for creating Hadoop jobs in Java (7)

Support for PIG (higher level language) (7)

Cassandra 0.7 will include output support (8)

No support for Hive yet! (9)

(SQL-like syntax for creating Map Reduce jobs)

Page 11: PHP and Cassandra

References/links

1. Cassandra is a four or five level hash: https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/

2. Twissandra: http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/

3. Thrift API: http://wiki.apache.org/cassandra/API

4. PHPCassa: http://github.com/hoan/phpcassa

5. Pandra: http://github.com/mjpearson/Pandra

Page 12: PHP and Cassandra

References/links

6. Using Cassandra with PHP (installing Thrift): https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP

7. Hadoop Support in Cassandra: http://wiki.apache.org/cassandra/HadoopSupport

8. Output support in Cassandra: https://issues.apache.org/jira/browse/CASSANDRA-1101

9. Hive support (feature request!): https://issues.apache.org/jira/browse/CASSANDRA-913