Hailo and NoSQL
David Gardner, Architect at Hailo
JAXLONDON2013
What this talk is about
1. Why choose NoSQL
2. A whistle-stop tour of Cassandra
3. Adoption of Cassandra at Hailo
What is Hailo?
Hailo is The Taxi Magnet. Use Hailo to get a cab wherever you are, whenever you want.
Facts and figures
• The world’s highest-rated taxi app – over 11,000 five-star reviews
• Over 500,000 registered passengers
• A Hailo hail is accepted around the world every 4 seconds
• Hailo operates in 15 cities on 3 continents from Tokyo to Toronto in nearly 2 years of operation
Hailo is growing
• Hailo is a marketplace that facilitates over $100M in run-rate transactions and is making the world a better place for passengers and drivers
• Hailo has raised over $50M in financing from the world's best investors including Union Square Ventures, Accel, the founder of Skype (via Atomico), Wellington Partners (Spotify), Sir Richard Branson, and our CEO's mother, Janice
Why choose NoSQL?
“NoSQL DBs trade off traditional features to better support new and emerging use cases”
Andy Gross, Riak
http://www.slideshare.net/argv0/riak-use-cases-dissecting-the-solutions-to-hard-problems
What are we trading off?
• More widely used, tested and documented software
• Ad-hoc querying
• Talent pool with direct experience
What do we get back in return?
• High availability
• Scalability
• Operational simplicity
Cassandra 101
Amazon Dynamo + Google Big Table
Amazon Dynamo:
• Consistent hashing
• Vector clocks *
• Gossip protocol
• Hinted handoff
• Read repair
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
Google Big Table:
• Columnar SSTable storage
• Append-only
• Memtable
• Compaction
http://labs.google.com/papers/bigtable-osdi06.pdf
coordinator node
Client
tokens are integers from 0 to 2^127
three replicas (RF=3)
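The ring on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than Cassandra's implementation: it assumes an MD5-based token reduced into the 0 to 2^127 range, evenly spaced tokens, and hypothetical node names (`node0` to `node5`). Real replica placement also depends on the replication strategy and on the data centre and rack layout.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space quoted on the slide


def token_for(key: str) -> int:
    """Hash a row key onto the ring (assumption: MD5 reduced into the token space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(key: str, ring: list[tuple[int, str]], rf: int = 3) -> list[str]:
    """Walk clockwise from the key's token and collect `rf` distinct nodes."""
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token; wrap around past the last node.
    index = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    wanted = min(rf, len({node for _, node in ring}))
    owners: list[str] = []
    while len(owners) < wanted:
        node = ring[index % len(ring)][1]
        if node not in owners:
            owners.append(node)
        index += 1
    return owners


# Six hypothetical nodes with evenly spaced tokens, sorted by token.
ring = sorted((i * RING_SIZE // 6, f"node{i}") for i in range(6))
print(replicas_for("126007613634425612", ring))
```

Any node can act as coordinator for a request: it computes the same replica list and forwards the read or write to those nodes.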
Consistency level (CL)
How many replicas must respond to declare success?
Level          Description
ONE            1st response
QUORUM         N/2 + 1 replicas
LOCAL_QUORUM   N/2 + 1 replicas in local data centre
EACH_QUORUM    N/2 + 1 replicas in each data centre
ALL            All replicas
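As a worked example of the quorum arithmetic in the table, the sketch below computes N/2 + 1 with integer division and checks the usual overlap rule (reads plus writes must exceed the replication factor for a read to be guaranteed to touch a replica that saw the latest write). The function names are illustrative, not part of any Cassandra client API.

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM: N/2 + 1 using integer division."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when every read replica set intersects every write replica set."""
    return read_acks + write_acks > rf


rf = 3
print(quorum(rf))                                        # 2
print(read_overlaps_write(rf, quorum(rf), quorum(rf)))   # True: QUORUM writes, QUORUM reads
print(read_overlaps_write(rf, 1, 1))                     # False: ONE writes, ONE reads
```

With RF=3, QUORUM tolerates one replica being down for both reads and writes, while ONE/ONE trades that overlap guarantee for lower latency.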
Big Table
• Sparse column based data model
• SSTable disk storage
• Append-only commit log
• Memtable (buffer and sort)
• Immutable SSTable files
• Compaction
http://research.google.com/archive/bigtable-osdi06.pdf
http://www.slideshare.net/geminimobile/bigtable-4820829
Column: Name, Value
Plus timestamp, used for Last Write Wins (LWW) conflict resolution
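Last Write Wins can be illustrated with a small merge function. This is a sketch under stated assumptions: the `Column` type and `merge_lww` helper are hypothetical, timestamps are writer-supplied integers, and a tie simply keeps the first replica's value (a simplification, not Cassandra's tie-breaking rule).

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int  # supplied by the writer, e.g. microseconds since the epoch


def merge_lww(a: dict[str, Column], b: dict[str, Column]) -> dict[str, Column]:
    """Merge two replicas' views of a row, keeping the newest version of each column."""
    merged = dict(a)
    for name, column in b.items():
        current = merged.get(name)
        if current is None or column.timestamp > current.timestamp:
            merged[name] = column
    return merged


replica_1 = {"email": Column("email", "old@example.com", 100)}
replica_2 = {
    "email": Column("email", "new@example.com", 200),
    "phone": Column("phone", "+447911111111", 150),
}
print(merge_lww(replica_1, replica_2)["email"].value)  # new@example.com
```

Because resolution is per column, two writers updating different columns of the same row do not conflict at all.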
Column (Name, Value) | Column (Name, Value) | Column (Name, Value) | ...
we can have millions of columns
* theoretically up to 2 billion!
Row
Row Key: Column (Name, Value) | Column (Name, Value) | Column (Name, Value)
Column Family
Row Key: Column | Column | Column
Row Key: Column | Column | Column
Row Key: Column | Column | Column
we can have billions of rows
Write path
Memory: Memtable (buffers writes and sorts data)
Disk: Commit Log, SSTable, SSTable (immutable)
Memtable is flushed to an SSTable on a time or size trigger
Cassandra at Hailo
Hailo launched in London in November 2011
• Launched on AWS
• Two PHP/MySQL web apps plus a Java backend
• Mostly built by a team of 3 or 4 backend engineers
• MySQL multi-master for single AZ resilience
Why Cassandra?
• A desire for greater resilience – “become a utility”
  Cassandra is designed for high availability
• Plans for international expansion around a single consumer app
  Cassandra is good at global replication
• Expected growth
  Cassandra scales linearly for both reads and writes
• Prior experience
  I had experience with Cassandra and could recommend it
The path to adoption
• Largely unilateral decision by developers – a result of a startup culture
• Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store
• Launched into production in September 2012 – originally just powering North American expansion, before gradually switching over Dublin and London
One year on...
• Further breakdown of functionality into Go/Java SOA
• Migrating all online databases to Cassandra
Development perspective
“Cassandra just works”
Dom W, Senior Engineer
Use cases
1. Entity storage
2. Time series data
CF = customers
126007613634425612:
  createdTimestamp: 1370465412
  email: [email protected]
  givenName: Dave
  familyName: Gardner
  locale: en_GB
  phone: +447911111111
Considerations for entity storage
• Do not read the entire entity, update one property and then write back a mutation containing every column
• Only mutate columns that have been set
• This avoids read-before-write race conditions
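One way to follow these considerations is for the entity object to track which properties were set and to send only those columns. The sketch below assumes a hypothetical `Entity` class; the actual write to Cassandra is left out, since the slides do not show the client library in use.

```python
class Entity:
    """Tracks only the columns that were explicitly set since the last save."""

    def __init__(self, row_key: str) -> None:
        self.row_key = row_key
        self._dirty: dict[str, str] = {}

    def set(self, column: str, value: str) -> None:
        self._dirty[column] = value

    def mutation(self) -> dict[str, str]:
        """Return the columns to write and reset the tracking state."""
        pending, self._dirty = self._dirty, {}
        return pending


customer = Entity("126007613634425612")
customer.set("phone", "+447911111111")
print(customer.mutation())  # {'phone': '+447911111111'}
```

Because the mutation carries only `phone`, a concurrent writer changing `locale` on the same row is not overwritten with a stale value.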
CF = stats_db
2013-06-01:
  55374fa0-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
  a48bd800-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
  b0e15850-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
  bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
CF = stats_db
LON123456:
  13b247f0-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
  20f70a40-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
  2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
  338a22f0-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
Considerations for time series storage
• Choose row key carefully, since this partitions the records
• Think about how many records you want in a single row
• Denormalise on write into many indexes
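The two `stats_db` layouts shown earlier (one row per day, one row per driver) suggest how denormalising on write might look. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the column name is a version 1 (time-based) UUID generated from the current clock, and the writes are returned as tuples instead of being sent to a cluster.

```python
import uuid
from datetime import datetime, timezone


def index_rows(event_time: datetime, driver_id: str) -> list[str]:
    """Row keys an event is stored under: one per day and one per driver."""
    return [event_time.strftime("%Y-%m-%d"), driver_id]


def denormalised_writes(
    event_time: datetime, driver_id: str, payload: str
) -> list[tuple[str, str, str]]:
    """Build (row key, column name, value) tuples, one per index row."""
    # uuid1 is time-based, so columns sort chronologically within a row.
    # Note: it uses the current clock, not event_time.
    column_name = str(uuid.uuid1())
    return [(row_key, column_name, payload) for row_key in index_rows(event_time, driver_id)]


when = datetime(2013, 6, 1, 12, 0, tzinfo=timezone.utc)
for write in denormalised_writes(when, "LON123456", '{"action":"example"}'):
    print(write)
```

Bucketing by day bounds the size of each row; a finer or coarser bucket is the lever for controlling how many records land in a single row.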
Analytics
• With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY
• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates
• It is backed by Cassandra and therefore highly available, resilient and globally distributed
• Integration is straightforward (HTTP POST)
NSQ Acunu C* events
AQL
SELECT SUM(accepted), SUM(ignored), SUM(declined), SUM(withdrawn)
FROM Allocations
WHERE timestamp BETWEEN '1 week ago' AND 'now'
AND driver='LON123456789'
GROUP BY timestamp(day)
Get a picture of driver supply
SELECT COUNT DISTINCT(driverId)
FROM driverLocs
WHERE timestamp BETWEEN '1 day ago' AND 'now'
GROUP BY timestamp(hour)

SELECT COUNT
FROM driverLocs
WHERE timestamp BETWEEN '1 day ago' AND 'now'
GROUP BY latitude(0.01), longitude(0.01)
Operational perspective
“Allows a team of 2 to achieve things they wouldn’t have considered before Cassandra existed”
Chris H, Operations Engineer
3 clusters
6 machines per region
3 regions (stats cluster is a long story)
Operational Cluster: ap-southeast-1, us-east-1, eu-west-1
Stats Cluster: us-east-1, eu-west-1
Each region: 2 machines in each of AZ1, AZ2 and AZ3
AWS VPCs with Open VPN links
3 AZs per region
m1.large machines
Provisioned IOPS EBS
Operational Cluster
Stats Cluster
~ 1TB/node
~ 200GB/node
Backups
• SSTable snapshot
• We used to upload snapshots to S3, but this was taking >6 hours and consuming all our network bandwidth
• Now take EBS snapshot of the data volumes
Encryption
• Requirement for NYC launch
• We use dmcrypt to encrypt the entire EBS volume
• Chose dmcrypt because it is uncomplicated
• Our tests show a 1% hit to disk performance, which agrees with what Amazon suggests
DataStax OpsCenter is a quick win
Multi DC
• Something that Cassandra makes trivial
• Would have been very difficult to accomplish active-active inter-DC replication with a team of 2 without Cassandra
• Rolling repair needed to make it safe (we use LOCAL_QUORUM)
• We schedule “narrow repairs” on different nodes in our cluster each night
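The nightly rotation can be sketched as a simple round-robin over the node list. This is an assumption about how such a schedule could be built, not a description of Hailo's tooling: the node names and the `node_for_night` helper are hypothetical, and the repair command itself is deliberately left out.

```python
from datetime import date


def node_for_night(nodes: list[str], night: date) -> str:
    """Pick one node per night, cycling through the cluster in order."""
    return nodes[night.toordinal() % len(nodes)]


nodes = [f"node{i}" for i in range(6)]  # hypothetical six-node region
print(node_for_night(nodes, date(2013, 10, 29)))
```

With six nodes, every node is repaired once every six nights, so the full cycle needs to fit comfortably inside the tombstone grace period configured for the cluster.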
Compression
• Our stats cluster was running at ~1.5TB per node
• We didn’t want to add more nodes
• With compression, we are now back to ~600GB
• Easy to accomplish
• `nodetool upgradesstables` on a rolling schedule
Management perspective
“The days of the quick and dirty are over”
Simon V, EVP Operations
Technically, everything is fine…
• Our COO feels that C* is “technically good and beautiful”, a “perfectly good option”
• Our EVPO says that C* reminds him of a time series database in use at Goldman Sachs that had “very good performance”
…but there are concerns
People who can attempt to query MySQL
People who can attempt to query Cassandra
Lessons learned
There might be a gulf in experience
Average years experience per team member: MySQL vs Cassandra (chart; axis marked at 10)
Lessons learned
• Have an advocate - get someone who will sell the vision internally
• Learn the theory - teach each team member the fundamentals
• Make an effort to get everyone on board
Things can drift into failure
Lessons learned
• Be pro-active with Cassandra, even if it seems to be running smoothly
• Peer-review data models, take time to think about them
• Big rows are bad - use cfstats to look for them
• Mixed workloads can cause problems - use cfhistograms and look out for signs of data modeling problems
• Think about the compaction strategy for each CF
EBS is terrible
Lessons learned
• EBS is nearly always the cause of Amazon outages
• EBS is a single point of failure (it will fail everywhere in your cluster)
• EBS is slow
• EBS is expensive
• EBS is unnecessary!
Management need to know the trade offs
Lessons learned
• Keep the business informed – explain the tradeoffs in simple terms
• Sing from the same hymn sheet
• Make sure there are solutions in place for every use case from the beginning
People who can attempt to query MySQL
People who can attempt to query Cassandra
Conclusions
We like Cassandra
• Solid design
• HA characteristics
• Easy multi-DC setup
• Simplicity of operation
Lessons for successful adoption
• Have an advocate, sell the dream
• Learn the fundamentals, get the best out of Cassandra
• Invest in tools to make life easier
• Keep management in the loop, explain the trade offs
The future
• We will continue to invest in Cassandra as we expand globally
• We will hire people with experience running Cassandra
• We will focus on expanding our reporting facilities
• We aspire to extend our network (1M consumer installs, wallet) beyond cabs
• We will continue to hire the best engineers in London, NYC and Asia
Questions?