Scaling Dubsmash's backend from 0 to 100+ million users PYCON.DE Munich - Daniel Taschik – 10/29/2016

Page 1: Scaling Dubsmash's backend from 0 to 100+ million users

Scaling Dubsmash's backend from 0 to 100+ million users

PYCON.DE Munich - Daniel Taschik – 10/29/2016

Page 2: Scaling Dubsmash's backend from 0 to 100+ million users

We hit a nerve.

>100M Users

192 Countries

1.5B Videos

Page 3: Scaling Dubsmash's backend from 0 to 100+ million users

Dubsmash

Connect Create Communicate

Page 4: Scaling Dubsmash's backend from 0 to 100+ million users

The Start

Page 5: Scaling Dubsmash's backend from 0 to 100+ million users

The Start

 

Backend
• Django-powered backend for content management
• web-based Dubloader to add sounds
• deployed on Heroku

Content Delivery
• sound files in S3
• meta information in JSON file in S3
• files served via Cloudfront CDN

Metrics
• Dubloader with < 100 req/min
• > 500 TB (!) of traffic in January 2015
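The content delivery setup above (sound files plus a JSON metadata file, both in S3 and served through the CDN) can be illustrated with a minimal sketch. The CDN host, the field names, the `.m4a` extension, and the `build_catalogue` helper are all assumptions made for illustration; the slides do not show the actual file layout.

```python
import json

# Hypothetical CDN host; the real distribution domain is not given in the slides.
CDN_BASE_URL = "https://cdn.example.com"


def build_catalogue(sounds):
    """Return a JSON document that clients could download from the CDN.

    The structure (version, sounds, id/name/url) is an assumed layout.
    """
    entries = []
    for sound in sounds:
        entries.append({
            "id": sound["id"],
            "name": sound["name"],
            # Clients fetch audio from the CDN, so the backend serves no files.
            "url": f"{CDN_BASE_URL}/sounds/{sound['id']}.m4a",
        })
    return json.dumps({"version": 1, "sounds": entries}, sort_keys=True)


# Example input with made-up names; a content tool would supply real data.
catalogue = build_catalogue([
    {"id": "Dzdcjc", "name": "Example sound A"},
    {"id": "3jGYzH", "name": "Example sound B"},
])
print(catalogue)
```

Under this kind of design the backend only handles content management, which would be consistent with the low Dubloader request rate next to the very high CDN traffic.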

Page 6: Scaling Dubsmash's backend from 0 to 100+ million users

Dubsmash Service Landscape

 

Backend

Router

S3 sound storage

Cloudfront CDN

Page 7: Scaling Dubsmash's backend from 0 to 100+ million users
Page 8: Scaling Dubsmash's backend from 0 to 100+ million users

New Features: Registration & Search

 

User registration
• API based on REST framework
• Django user model
• store user’s most liked sounds
• push notifications for new content

Server-side Sound search
• new Django-based service
• search via Elasticsearch using Haystack
• Celery-based indexing on RQ

Metrics
• 100,000 registrations within first 24h
• > 20,000 requests per minute on search service
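To give an idea of what a server-side sound search sends to Elasticsearch, here is a rough sketch of a request body using the standard `match` query. Haystack would normally build such queries itself, so this bypasses it on purpose. The field name `title`, the fuzziness setting, and the `build_sound_search_body` helper are assumptions, not taken from the talk.

```python
import json


def build_sound_search_body(term, size=20):
    """Build an Elasticsearch request body for a full-text sound search.

    "title" is an assumed field name; the real index mapping is not shown
    in the slides.
    """
    cleaned = term.strip()
    if not cleaned:
        raise ValueError("search term must not be empty")
    return {
        "size": size,  # maximum number of hits to return
        "query": {
            "match": {
                # "AUTO" fuzziness tolerates small typos in the search term.
                "title": {"query": cleaned, "fuzziness": "AUTO"},
            }
        },
    }


body = build_sound_search_body("  happy birthday ")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be posted to an index's `_search` endpoint by any HTTP or Elasticsearch client.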

Page 9: Scaling Dubsmash's backend from 0 to 100+ million users

caching

Cloudfront CDN

S3 sound storage

Dubsmash Service Landscape

Search

Router

main API

Page 10: Scaling Dubsmash's backend from 0 to 100+ million users

DubTalk

 

Social Graph Service
• friend relations on platform
• Django
• TitanDB on Cassandra
• later DynamoDB

DubTalk Service
• group & video management
• Django

Service Communication
• Async via Celery on RabbitMQ
• Sync via internal HTTPS API

Metrics
• > 50,000 requests per min on both
• > 150,000,000 videos stored
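The two communication styles listed above can be contrasted in a small standard-library sketch. It does not use Celery, RabbitMQ, or HTTPS: a `queue.Queue` with a worker thread stands in for the broker and task worker, and a plain function call stands in for the internal HTTPS API. All names and data here are hypothetical.

```python
import queue
import threading

task_queue = queue.Queue()          # stand-in for the message broker
friends = {"daniel3": ["sarah"]}    # hypothetical social-graph data
notifications = []                  # results produced by the worker


def get_friends(username):
    """Synchronous style: the caller blocks until it has the answer."""
    return friends.get(username, [])


def worker():
    """Asynchronous style: tasks are consumed independently of the producer."""
    while True:
        task = task_queue.get()
        if task is None:            # sentinel value: stop the worker
            task_queue.task_done()
            break
        sender, video_id = task
        # The worker itself makes a synchronous lookup for the friend list.
        for friend in get_friends(sender):
            notifications.append((friend, video_id))
        task_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# The producer enqueues work and returns immediately ("fire and forget").
task_queue.put(("daniel3", "video-1"))
task_queue.put(None)

task_queue.join()                   # wait only so the result can be printed
thread.join()
print(notifications)                # [('sarah', 'video-1')]
```

The general trade-off is that queued tasks keep request latency low and absorb load spikes, while synchronous calls are simpler when the caller needs the result right away.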

Page 11: Scaling Dubsmash's backend from 0 to 100+ million users

NoSQL

Cloudfront CDN

caching

S3 sound storage

Dubsmash Service Landscape

 

Graph DubTalk

Router

Monolith

relational DB

Page 12: Scaling Dubsmash's backend from 0 to 100+ million users

Large Scale Problems

 

favorited sounds outgrew our PostgreSQL

• > 1,000,000,000 favorited sounds
• simple data model & access pattern
• Premium-7 instance: 120 GB RAM, 1 TB disk

dtaschik@unic0rn:~/dubsmash$ heroku pg:table-size -a dubsmash
name       | size
-----------+--------
users_favs | 158 GB

dtaschik@unic0rn:~/dubsmash$ heroku pg:index-size -a dubsmash
name                    | size
------------------------+--------
users_favs_username_key | 132 GB

ID | username | sound_id
1  | daniel3  | Dzdcjc
2  | sarah    | 3jGYzH

Let’s make it a new service!
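A quick back-of-the-envelope calculation puts the numbers from this slide in relation to each other. It assumes 1 GB = 10^9 bytes and exactly one billion rows; since the slide only gives a lower bound for the row count, the per-row figures are upper bounds. The variable names are illustrative.

```python
ROWS = 1_000_000_000           # more than one billion favorited sounds (lower bound)
TABLE_BYTES = 158 * 10**9      # users_favs table size, assuming 1 GB = 10^9 bytes
INDEX_BYTES = 132 * 10**9      # users_favs_username_key index size
RAM_BYTES = 120 * 10**9        # RAM of the database instance

table_bytes_per_row = TABLE_BYTES / ROWS
index_bytes_per_row = INDEX_BYTES / ROWS
total_bytes = TABLE_BYTES + INDEX_BYTES
index_share = INDEX_BYTES / total_bytes
fits_in_ram = total_bytes <= RAM_BYTES

print(f"table: at most {table_bytes_per_row:.0f} bytes per row")
print(f"index: at most {index_bytes_per_row:.0f} bytes per row")
print(f"index share of total size: {index_share:.1%}")
print(f"table plus index fit in RAM: {fits_in_ram}")
```

Under these assumptions the single index is almost as large as the table itself, and the two together (290 GB) are well beyond the 120 GB of RAM, which may be part of why this table was moved out into its own service.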

Page 13: Scaling Dubsmash's backend from 0 to 100+ million users

Cloudfront CDN

S3 sound storage

Dubsmash Service Landscape

 

Auth

Graph DubTalk

Favs

Router

Monolith

relational DB

caching

NoSQL

many more

Page 14: Scaling Dubsmash's backend from 0 to 100+ million users

Our Goal

Interested? Come and join! 😜

Building the largest mobile video communication platform!

Page 15: Scaling Dubsmash's backend from 0 to 100+ million users

Questions?

[email protected] | daniel3 | @dtaschik

Page 16: Scaling Dubsmash's backend from 0 to 100+ million users

Let’s say it with video!

Thank you!

[email protected] | daniel3 | @dtaschik