Hailo, the taxi app, has served more than 5 million passengers in 15 cities and has taken fares of $100 million this year. I'm going to talk about how that rapid growth has been powered by a platform based on Cassandra, with operational analytics and insights provided by Acunu Analytics. I'll cover some challenges and lessons learned from scaling fast!
ALL YOUR BASE 2013
Acunu Analytics and Cassandra at Hailo
Tim Moreton, CTO at Acunu
David Gardner, Architect at Hailo
Dave
What is Hailo?
Hailo is The Taxi Magnet. Use Hailo to get a cab wherever you are, whenever you want.
• The world’s highest-rated taxi app – over 11,000 five-star reviews
• Over 500,000 registered passengers
• A Hailo hail is accepted around the world every 4 seconds
• Hailo operates in 15 cities on 3 continents, from Tokyo to Toronto, after nearly 2 years of operation
The history
The story behind Cassandra and Acunu adoption at Hailo
Hailo launched in London in November 2011
• Launched on AWS
• Two PHP/MySQL web apps plus a Java backend
• Mostly built by a team of 3 or 4 backend engineers
• MySQL multi-master for single AZ resilience
• Get/create/update entity
• Analytics
• Text search
Why Cassandra?
• A desire for greater resilience – “become a utility”
  Cassandra is designed for high availability
• Plans for international expansion around a single consumer app
  Cassandra is good at global replication
• Expected growth
  Cassandra scales linearly for both reads and writes
• Prior experience
  I had experience with Cassandra and could recommend it
The path to adoption
• Largely unilateral decision by developers – a result of a startup culture
• Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store
• Launched into production in September 2012 – originally just powering North American expansion, before gradually switching over Dublin and London
One year on...
• Further decomposing functionality into a Go/Java SOA
• Migrating:
  Entity databases to Cassandra
  Analytics to Acunu
  Search into Elasticsearch
Cassandra
We like Cassandra
• Solid design
• HA characteristics
• Easy multi-DC setup
• Simplicity of operation
“Cassandra just works”
Dom W, Senior Engineer
CF = customers
126007613634425612:
  createdTimestamp: 1370465412
  email: [email protected]
  givenName: Dave
  familyName: Gardner
  locale: en_GB
  phone: +447911111111
Considerations for entity storage
• Do not read the entire entity, update one property and then write back a mutation containing every column
• Only mutate columns that have been set
• This avoids read-before-write race conditions
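The "only mutate columns that have been set" rule can be illustrated with a minimal Python sketch. It is not Hailo's code: the `EntityUpdate` helper, the `write` function and the in-memory `customers` dict are hypothetical stand-ins for a Cassandra client and the customers column family, assumed here purely for illustration.

```python
from typing import Any, Dict

# Stand-in for the customers column family: row key -> {column: value}.
# A real Cassandra write would go through a client library instead.
customers: Dict[str, Dict[str, Any]] = {
    "126007613634425612": {"givenName": "Dave", "locale": "en_GB"},
}


class EntityUpdate:
    """Hypothetical helper that records only the columns a caller sets."""

    def __init__(self, row_key: str) -> None:
        self.row_key = row_key
        self._changed: Dict[str, Any] = {}

    def set(self, column: str, value: Any) -> "EntityUpdate":
        self._changed[column] = value
        return self

    def mutation(self) -> Dict[str, Any]:
        # Only the columns that were set; untouched columns are absent.
        return dict(self._changed)


def write(update: EntityUpdate) -> None:
    # Column-level merge: columns not named in the mutation are left alone.
    customers.setdefault(update.row_key, {}).update(update.mutation())


# Two writers change different properties of the same customer.
write(EntityUpdate("126007613634425612").set("locale", "en_US"))
write(EntityUpdate("126007613634425612").set("phone", "+447911111111"))
row = customers["126007613634425612"]
```

Because neither writer sends columns it did not touch, neither can overwrite the other's change with a stale copy of the whole entity.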
CF = stats_db
2013-06-01:
  55374fa0-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
  a48bd800-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
  b0e15850-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
  bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {“action”:”…
CF = stats_db
LON123456:
  13b247f0-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
  20f70a40-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
  2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
  338a22f0-ce2c-11e2-8b8b-0800200c9a66: {“action”:”…
Considerations for time series storage
• Choose row key carefully, since this partitions the records
• Think about how many records you want in a single row
• Denormalise on write into many indexes/views
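As a rough sketch of denormalising on write, the Python below stores each event under two row keys, the day and the driver ID, mirroring the two stats_db rows shown earlier. The `record_event` function, the in-memory `stats_db` dict and the `"hail"` action value are assumptions made for illustration; a real system would issue these writes through a Cassandra client and would normally derive the time-based UUID from the event's own timestamp rather than the current clock.

```python
import json
import uuid
from collections import defaultdict
from datetime import datetime, timezone

# Stand-in for the stats_db column family:
# row key -> {column name (time-based UUID) -> JSON value}.
stats_db = defaultdict(dict)


def record_event(driver_id, event, when):
    """Write one event into every row (index/view) that should contain it."""
    # uuid1() is a version 1 UUID built from the current clock time,
    # used here as the column name so columns are unique per event.
    column = uuid.uuid1()
    value = json.dumps(event)
    # One write per view: by day and by driver.
    for row_key in (when.strftime("%Y-%m-%d"), driver_id):
        stats_db[row_key][column] = value
    return column


col = record_event(
    "LON123456",
    {"action": "hail"},  # hypothetical payload
    datetime(2013, 6, 1, 12, 0, tzinfo=timezone.utc),
)
```

Each additional view costs one more write per event, and the choice of row key (a day, a driver) bounds how many records end up in a single row.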
[Chart: average years' experience per team member, MySQL vs Cassandra; the only surviving value label is 10]
[Diagram: people who can attempt to query MySQL vs people who can attempt to query Cassandra]
Acunu Analytics
Analytics
• With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY
• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates
• It is backed by Cassandra and therefore highly available, resilient and globally distributed
• Integration is straightforward
[Architecture diagram, built up over several slides. Labels: Events, NSQ, Alerts]
• Analytics turns events and SQL-like queries into C* operations
• Cassandra stores raw events and intermediate results
• Acunu Dashboards provides real-time visualization
[Diagram, built up over several slides. Example cubes: count by day, count by hour of day, uniques by hashtag; alongside raw events]
1 Define aggregate cubes
CREATE CUBE APPROX TOP(keyword) WHERE browser, time GROUP BY time
2 New events update cubes
3 Rich instant queries over cubes
SELECT TOP(keyword) FROM table WHERE browser = 'chrome' AND time BETWEEN.. GROUP BY d1, d2, ... JOIN ... HAVING.. ORDER BY ..
4 Drilldown to raw events
5 Backfill new cubes using historic data
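The cube workflow can be illustrated with a toy Python sketch. It is not how Acunu Analytics is implemented: the `TopKeywordCube` class is hypothetical, it keeps exact in-memory counts rather than the APPROX TOP structure named in the CREATE CUBE statement, and the sample events are invented for the example.

```python
from collections import Counter
from datetime import datetime


class TopKeywordCube:
    """Toy pre-aggregated cube: one keyword counter per (browser, day)."""

    def __init__(self):
        self.buckets = {}  # (browser, "YYYY-MM-DD") -> Counter of keywords

    def update(self, event):
        # Step 2: each new event increments exactly one bucket.
        day = event["time"].strftime("%Y-%m-%d")
        bucket = self.buckets.setdefault((event["browser"], day), Counter())
        bucket[event["keyword"]] += 1

    def top(self, browser, days, n=3):
        # Step 3: a query merges a few buckets instead of scanning events.
        total = Counter()
        for day in days:
            total.update(self.buckets.get((browser, day), Counter()))
        return total.most_common(n)


# Raw events are kept as well, so they can be replayed later.
raw_events = [
    {"browser": "chrome", "keyword": "taxi", "time": datetime(2013, 6, 1, 9)},
    {"browser": "chrome", "keyword": "taxi", "time": datetime(2013, 6, 1, 10)},
    {"browser": "chrome", "keyword": "cab", "time": datetime(2013, 6, 2, 9)},
    {"browser": "firefox", "keyword": "cab", "time": datetime(2013, 6, 1, 9)},
]

cube = TopKeywordCube()
for e in raw_events:
    cube.update(e)

chrome_top = cube.top("chrome", ["2013-06-01", "2013-06-02"])

# Step 5: a cube defined later is backfilled by replaying stored events.
late_cube = TopKeywordCube()
for e in raw_events:
    late_cube.update(e)
```

The trade-off sketched here is that work moves from query time to write time, which is why the queries have to be planned as templates in advance.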
AQL
SELECT SUM(accepted), SUM(ignored), SUM(declined), SUM(withdrawn)
FROM Allocations
WHERE timestamp BETWEEN '1 week ago' AND 'now'
  AND driver='LON123456789'
GROUP BY timestamp(day)
Use Cases
• Infrastructure and application monitoring
• Real-time A/B testing of app layout and incentives
• Real-time geo-view of supply/demand for drivers
• Several more in the pipeline!
Conclusions
We like Cassandra and Acunu
• Solid design
• HA characteristics
• Easy multi-DC setup
• Simplicity of operation
• With Acunu, rich queries again, easier denormalization
Lessons for successful adoption
• Have an advocate, sell the dream
• Learn the fundamentals, get the best out of Cassandra
• Invest in tools to make life easier
• Keep management in the loop, explain the trade-offs
Questions?