How Hailo fuels its growth using NoSQL Storage and Analytics David Gardner, Architect @ Hailo #NoSQLNow

DESCRIPTION

Hailo is building the world's best taxi app -- we're already in 9 cities worldwide, have 300,000 registered passengers, and are growing (30%+) every month. Of course, that presents a serious infrastructure challenge. I'll explain how we've built our service around tools that have three key NoSQL characteristics -- they're all distributed, resilient and operationally simple. The particular goals we set ourselves were around making it easy to replicate our architecture as we launch in new cities, to scale as we grow in each city, while all the time being able to coordinate that setup in a straightforward way.

Page 1: How Hailo fuels its growth using NoSQL Storage and Analytics

How Hailo fuels its growth using NoSQL Storage and Analytics

David Gardner, Architect @ Hailo

#NoSQLNow

Page 5: How Hailo fuels its growth using NoSQL Storage and Analytics

What is Hailo?

• The world’s highest-rated taxi app – over 10,000 five-star reviews

• Over 500,000 registered passengers

• A Hailo e-hail is accepted by a driver every four seconds around the world

• Hailo operates in ten cities, from Tokyo to Toronto, after just over eighteen months of operation

Page 6: How Hailo fuels its growth using NoSQL Storage and Analytics

Hailo is growing

• Hailo is a marketplace that facilitates over $100M in run-rate transactions and is making the world a better place for passengers and drivers

• Hailo has raised over $50M in financing from the world's best investors including Union Square Ventures, Accel, the founder of Skype (via Atomico), Wellington Partners (Spotify), Sir Richard Branson, and our CEO's mother, Janice

Page 7: How Hailo fuels its growth using NoSQL Storage and Analytics

What this talk is about

• Why Hailo are using NoSQL

• How we use Cassandra

• How we use Acunu Analytics

• Challenges of NoSQL

Page 8: How Hailo fuels its growth using NoSQL Storage and Analytics

Why choose NoSQL?

Page 9: How Hailo fuels its growth using NoSQL Storage and Analytics

“NoSQL DBs trade off traditional features to better support new and emerging use cases”

Andy Gross, Riak

http://www.slideshare.net/argv0/riak-use-cases-dissecting-the-solutions-to-hard-problems

Page 10: How Hailo fuels its growth using NoSQL Storage and Analytics

What are we trading off?

• More widely used, tested and documented software

• Ad-hoc querying

• Talent pool with direct experience

Page 11: How Hailo fuels its growth using NoSQL Storage and Analytics

What do we get back in return?

• High availability

• Scalability

• Operational simplicity

Page 12: How Hailo fuels its growth using NoSQL Storage and Analytics

The path to adoption at Hailo

Page 13: How Hailo fuels its growth using NoSQL Storage and Analytics

Hailo launched in London in November 2011

• Launched on AWS

• Two PHP/MySQL web apps plus a Java backend

• Mostly built by a team of 3 or 4 backend engineers

• MySQL multi-master for single AZ resilience

Page 14: How Hailo fuels its growth using NoSQL Storage and Analytics

Why Cassandra?

• A desire for greater resilience – “become a utility”
  Cassandra is designed for high availability

• Plans for international expansion around a single consumer app
  Cassandra is good at global replication

• Expected growth
  Cassandra scales linearly for both reads and writes

• Prior experience
  I had experience with Cassandra and could recommend it

Page 15: How Hailo fuels its growth using NoSQL Storage and Analytics

The path to adoption

• Largely unilateral decision by developers – a result of a startup culture

• Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store

• Launched into production in September 2012 – originally just powering North American expansion, before gradually switching over Dublin and London

Page 16: How Hailo fuels its growth using NoSQL Storage and Analytics

Cassandra at Hailo

Page 17: How Hailo fuels its growth using NoSQL Storage and Analytics

“Cassandra just works”

Dom W, Senior Engineer

Page 18: How Hailo fuels its growth using NoSQL Storage and Analytics

Use cases

1. Entity storage

2. Time series data

Page 19: How Hailo fuels its growth using NoSQL Storage and Analytics

CF = customers

126007613634425612:
  createdTimestamp: 1370465412
  email: [email protected]: Dave
  familyName: Gardner
  locale: en_GB
  phone: +447911111111

Page 20: How Hailo fuels its growth using NoSQL Storage and Analytics

Considerations for entity storage

• Do not read the entire entity, update one property and then write back a mutation containing every column

• Only mutate columns that have been set

• This avoids read-before-write race conditions
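
The considerations above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Hailo's actual code: the `Entity` class, the `apply_mutation` helper and the in-memory `store` dictionary are all hypothetical stand-ins for a real client library and column family. The point is that a write carries only the columns that were explicitly set, so two writers that read the same row and change different properties do not overwrite each other.

```python
class Entity:
    """Tracks which columns were changed so only those are written back."""

    def __init__(self, row_key, columns):
        self.row_key = row_key
        self._columns = dict(columns)  # values as last read from the store
        self._dirty = set()            # names of columns set since the read

    def set(self, name, value):
        self._columns[name] = value
        self._dirty.add(name)

    def mutation(self):
        """Return only the columns that were explicitly set."""
        return {name: self._columns[name] for name in sorted(self._dirty)}


def apply_mutation(store, row_key, mutation):
    """Hypothetical column-level write: merges the given columns into the row."""
    store.setdefault(row_key, {}).update(mutation)
```

If one writer sets only `phone` and another sets only `locale`, each mutation holds a single column and both changes survive, whichever write lands last. Writing back every column from a stale read would instead silently revert the other writer's change.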

Page 21: How Hailo fuels its growth using NoSQL Storage and Analytics

CF = comms

2013-06-01:
  55374fa0-ce2b-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  a48bd800-ce2b-11e2-8b8b-0800200c9a66: {"to":"foo@ex…
  b0e15850-ce2b-11e2-8b8b-0800200c9a66: {"to":"bar@ho…
  bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {"to":"baz@fo…

Page 22: How Hailo fuels its growth using NoSQL Storage and Analytics

CF = comms

[email protected]:
  13b247f0-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  20f70a40-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  338a22f0-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…

Page 23: How Hailo fuels its growth using NoSQL Storage and Analytics

Considerations for time series storage

• Choose row key carefully, since this partitions the records

• Think about how many records you want in a single row

• Denormalise on write into many indexes
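
A rough sketch of these considerations, using an in-memory dictionary as a stand-in for the comms column family. The function names, the example addresses in the usage note and the choice of one row per day are assumptions for illustration only. Each message is written twice, once into a row keyed by day and once into a row keyed by recipient, with a time-based UUID as the column name.

```python
import uuid
from collections import defaultdict
from datetime import datetime

# In-memory stand-in for a column family: row key -> {column name: value}.
comms = defaultdict(dict)


def record_comm(sent_at, recipient, payload):
    """Write one message into two wide rows: one per day, one per recipient."""
    # uuid1() stamps the current clock time; a real client would normally
    # derive the time-based UUID from sent_at instead.
    column = uuid.uuid1()
    day_key = sent_at.strftime("%Y-%m-%d")  # row key bounds the row to one day
    for row_key in (day_key, recipient):    # denormalise on write
        comms[row_key][column] = payload
    return column


def read_row(row_key):
    """Return a row's values sorted by the timestamp inside each column name."""
    row = comms.get(row_key, {})
    return [row[c] for c in sorted(row, key=lambda c: c.time)]
```

Calling `record_comm(datetime(2013, 6, 1, 12, 0), "alice@example.com", ...)` makes the message readable both from the `2013-06-01` row and from the `alice@example.com` row. A busier system might need a narrower bucket than a day to keep the number of records per row manageable.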

Page 24: How Hailo fuels its growth using NoSQL Storage and Analytics

Client libraries

• Astyanax (Java)

• phpcassa (PHP)

• github.com/carloscm/gossie (Go)

Page 26: How Hailo fuels its growth using NoSQL Storage and Analytics

2 clusters

6 machines per region

3 regions

Operational Cluster: ap-southeast-1, us-east-1, eu-west-1

Stats Cluster: us-east-1, eu-west-1 (pending addition of third DC)

Page 27: How Hailo fuels its growth using NoSQL Storage and Analytics

AWS VPCs with OpenVPN links

3 AZs per region

m1.large machines

Provisioned IOPS EBS

Operational Cluster / Stats Cluster

~600GB/node / ~100GB/node

Page 28: How Hailo fuels its growth using NoSQL Storage and Analytics

Multi DC

• Something that Cassandra makes trivial

• Would have been very difficult to accomplish active-active inter-DC replication with a team of 2 without Cassandra

• Rolling repair needed to make it safe (we use LOCAL_QUORUM)

• We schedule “narrow repairs” on different nodes in our cluster each night
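
The arithmetic behind LOCAL_QUORUM, and the idea of repairing a different node each night, can be sketched as below. The replication factor of 3 per region is an assumption (the slides do not state it), and `node_for_night` is a hypothetical scheduling helper, not a Cassandra tool. LOCAL_QUORUM waits only for a majority of replicas in the coordinator's own data centre, so replicas in other regions may briefly miss a write. Regular repair is what brings them back in line.

```python
def quorum(replicas):
    """A strict majority of the given number of replicas."""
    return replicas // 2 + 1


# Assumed replication settings: 3 replicas in each of 3 regions.
replication = {"eu-west-1": 3, "us-east-1": 3, "ap-southeast-1": 3}

# LOCAL_QUORUM: majority of the local region's replicas only.
local = quorum(replication["eu-west-1"])

# QUORUM: majority of all replicas, which forces cross-region round trips.
cluster_wide = quorum(sum(replication.values()))


def node_for_night(nodes, night):
    """Pick the node whose turn it is to be repaired, cycling through the list."""
    return nodes[night % len(nodes)]
```

Under these assumptions a LOCAL_QUORUM request needs 2 acknowledgements from in-region replicas, whereas a cluster-wide QUORUM would need 5 of 9.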

Page 30: How Hailo fuels its growth using NoSQL Storage and Analytics

Acunu Analytics at Hailo

Page 31: How Hailo fuels its growth using NoSQL Storage and Analytics

Analytics

• With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY

• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates

• It is backed by Cassandra and therefore highly available, resilient and globally distributed

• Integration is straightforward

Page 32: How Hailo fuels its growth using NoSQL Storage and Analytics

[Diagram: events, NSQ, Acunu, C*]

Page 33: How Hailo fuels its growth using NoSQL Storage and Analytics

AQL

SELECT SUM(accepted), SUM(ignored), SUM(declined), SUM(withdrawn)
FROM Allocations
WHERE timestamp BETWEEN '1 week ago' AND 'now'
AND driver='LON123456789'
GROUP BY timestamp(day)
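
One way to picture how a query like the one above can be answered in real time from a pre-planned template is to maintain counters as events arrive, rather than scanning raw events at query time. The sketch below is an illustration of that general idea only and makes no claim about how Acunu Analytics is implemented. The `ingest` and `weekly_report` functions and the in-memory `totals` structure are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

OUTCOMES = ("accepted", "ignored", "declined", "withdrawn")

# (driver, day) -> Counter of outcomes, updated as each event arrives.
totals = defaultdict(Counter)


def ingest(driver, when, outcome):
    """Increment the pre-aggregated counter for this driver and day."""
    totals[(driver, when.date())][outcome] += 1


def weekly_report(driver, now):
    """Per-day sums from one week ago until now, at whole-day granularity."""
    day = (now - timedelta(days=7)).date()
    report = {}
    while day <= now.date():
        if (driver, day) in totals:
            counts = totals[(driver, day)]
            report[day] = tuple(counts[o] for o in OUTCOMES)
        day += timedelta(days=1)
    return report
```

The cost of this design is that only the dimensions chosen up front (here driver and day) can be queried, which matches the trade-off of losing ad-hoc querying.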

Page 36: How Hailo fuels its growth using NoSQL Storage and Analytics

Challenges

Page 37: How Hailo fuels its growth using NoSQL Storage and Analytics

[Chart: average years of experience per team member, MySQL vs Cassandra; axis tick at 10]

Page 38: How Hailo fuels its growth using NoSQL Storage and Analytics

[Diagram: people who can attempt to query MySQL, compared with people who can attempt to query Cassandra]

Page 40: How Hailo fuels its growth using NoSQL Storage and Analytics

Lessons learned

• Have an advocate: get someone who will sell the vision internally

• Teach team members the fundamentals of how the solution works

• Don’t cause yourself a “big data” problem unnecessarily

• Explain trade-offs in choosing NoSQL to all parts of the business

• Provide solutions!

Page 41: How Hailo fuels its growth using NoSQL Storage and Analytics

[Diagram: people who can attempt to query MySQL, compared with people who can attempt to query Cassandra]

Page 42: How Hailo fuels its growth using NoSQL Storage and Analytics

Conclusion

Page 43: How Hailo fuels its growth using NoSQL Storage and Analytics

We like Cassandra

• Solid design

• HA characteristics

• Easy multi-DC setup

• Simplicity of operation

Page 44: How Hailo fuels its growth using NoSQL Storage and Analytics

The future

• We will continue to invest in Cassandra as we expand globally

• We will hire people with experience running Cassandra

• We will focus on expanding our reporting facilities

• We aspire to extend our network (1M consumer installs, wallet) beyond cabs

• We will continue to hire the best engineers in London, NYC and Asia

Page 45: How Hailo fuels its growth using NoSQL Storage and Analytics

Thank you

Come and work with NoSQL full time: jobs.hailocab.com