30
MAY 2016

MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming World - The Story of MongoDB and Saavn



Sri Manjunath, CTO, Saavn

Saavn
India’s Music Streaming Service


India = Mobile
• 2nd biggest smartphone market
• $4 billion in VC funding in 2015
• 2 M phones sold per month

…what does it look like?


Saturday phone shopping

By the numbers…
• 18 M Global MAUs
• 300 M streams a month
• 14 M India MAUs
• 25+ M tracks on Saavn
• 9x DAU growth in 24 months

[Chart: Global MAUs by platform (Web, iOS, Android, Mobile Web). Values shown: 18%, 64%, 8%, 11%.]

• 64% on Android
• 63% streams on mobile data
• ~50% of registrations via Phone


90% of our Users are on Mobile



Top Music Streaming Apps by Active Users, iPhone & Android, 2015

Source: App Annie, 2015

MongoDB’s role


Goal: System to target users
Push notifications, emails, and ads


Targetable features

• Geo
• Artists and genres
• Time band
• Fitness enthusiast
• Gender
• Devices
• Operating system
• International travelers
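One way to read the feature list above is as a set of optional predicates that combine into a single cohort query. The sketch below builds a MongoDB-style filter document in Python. The helper `cohort_filter` and the field names `geo.country`, `device.os`, and `top_genres` are hypothetical; the slides do not show Saavn's actual user schema.

```python
def cohort_filter(country=None, os_name=None, genres=None):
    """Build a MongoDB-style filter; criteria left as None are omitted."""
    query = {}
    if country is not None:
        query["geo.country"] = country          # hypothetical field name
    if os_name is not None:
        query["device.os"] = os_name            # hypothetical field name
    if genres:
        # $in matches users whose genre list contains any of the given values.
        query["top_genres"] = {"$in": list(genres)}  # hypothetical field name
    return query


android_bollywood_fans = cohort_filter(
    country="IN", os_name="Android", genres=["bollywood"]
)
```

Keeping each feature as an independent predicate means a campaign can target on any subset of features without a separate code path per combination.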

Push notifications
• Primary growth driver for mobile apps
• Send 30+ million every day
• Targeted notifications get 3x higher CTR

System characteristics
• Dispatch push notifications AND store them in an inbox
• Deliver at the user's local time
• Context is key - delivery should be fast!
• Update millions of notification inboxes while serving traffic
• Identify cohorts in real time
• Delete ‘expired’ notifications
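The "deliver at the user's local time" requirement comes down to converting a wall-clock hour in the user's timezone into a UTC dispatch instant. The following is a minimal sketch, assuming each user record carries a fixed UTC offset in minutes; the function name and the offset representation are illustrative, not taken from the talk, and a production system would more likely store a named timezone to handle daylight saving.

```python
from datetime import datetime, time, timedelta, timezone


def next_dispatch_utc(now_utc, utc_offset_minutes, local_hour):
    """Return the next UTC instant at which the user's clock reads local_hour:00."""
    user_tz = timezone(timedelta(minutes=utc_offset_minutes))
    local_now = now_utc.astimezone(user_tz)
    target = datetime.combine(local_now.date(), time(local_hour), tzinfo=user_tz)
    if target <= local_now:
        # The slot has already passed today, so schedule it for tomorrow.
        target += timedelta(days=1)
    return target.astimezone(timezone.utc)


# India Standard Time is UTC+5:30, i.e. an offset of 330 minutes.
now = datetime(2016, 5, 30, 12, 0, tzinfo=timezone.utc)
evening_slot = next_dispatch_utc(now, 330, 18)
```

Grouping users by offset lets a scheduler compute one dispatch instant per timezone rather than one per user.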

Architecture


Need granular locks
• Move to WiredTiger
• Upgrade was surprisingly simple
• Supports mixed engine clusters

Inbox Schema
Pluses
• 1 document per user
• Use push/slice to add messages to the document
• Delete expired messages at the app layer

Minuses
• Horrible performance on WiredTiger
• Too many updates

{
  _id: "",
  uid: "",
  messages: [
    { mid: "mid_1", text: "Message 1", deeplink: "saavn://message_1", image: "...." },
    { mid: "mid_2", text: "Message 2", deeplink: "saavn://message_2", image: "...." },
    { mid: "mid_3", text: "Message 3", deeplink: "saavn://message_3", image: "...." },
    ......
  ]
}
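For reference, the push/slice approach described above corresponds to a `$push` update with the `$each` and `$slice` modifiers. The sketch below only builds the filter and update documents; the helper name `build_inbox_push` and the cap of 50 messages are assumptions, since the slides do not state the inbox size.

```python
def build_inbox_push(uid, message, max_messages=50):
    """Return (filter, update) documents that append one message to a user's inbox."""
    inbox_filter = {"uid": uid}
    update = {
        "$push": {
            "messages": {
                "$each": [message],       # $slice is only valid together with $each
                "$slice": -max_messages,  # a negative value keeps the newest N entries
            }
        }
    }
    return inbox_filter, update


new_message = {"mid": "mid_4", "text": "Message 4", "deeplink": "saavn://message_4"}
flt, upd = build_inbox_push("user_1", new_message)
# With pymongo, the pair would be passed as: inbox.update_one(flt, upd, upsert=True)
```

Every notification rewrites the user's whole inbox document, which is why this design produces so many updates when a campaign fans out to millions of users.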

Take 2 - Referencing
Pluses:
• No updates
• TTLs can delete older records

Minuses:
• Two queries per user

Inbox entries
{ _id: "Obj_1", mid: "mid_1", text: "Message 1", deeplink: "saavn://message_1", image: "...." },
{ _id: "Obj_2", mid: "mid_2", text: "Message 2", deeplink: "saavn://message_2", image: "...." },
.....

Inbox map
{ _id: "", uid: "", mid_obj: "Obj_1" },
{ _id: "", uid: "", mid_obj: "Obj_2" },
.....
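The referencing design can be sketched as insert-only fan-out plus a two-step read. In the sketch below, the `expire_at` field, the helper names, and the 7-day lifetime are assumptions added for illustration; the slides only say that TTLs delete older records.

```python
from datetime import datetime, timedelta, timezone


def fan_out(message, entry_id, uids, ttl, now=None):
    """Build one inbox entry plus one map document per recipient (inserts only)."""
    now = now or datetime.now(timezone.utc)
    expire_at = now + ttl  # assumed field; a TTL index would key on it
    entry = dict(message, _id=entry_id, expire_at=expire_at)
    maps = [{"uid": uid, "mid_obj": entry_id, "expire_at": expire_at} for uid in uids]
    return entry, maps


def map_query(uid):
    """First query: find the map documents for one user."""
    return {"uid": uid}


def entries_query(map_docs):
    """Second query: fetch the entries those map documents point to."""
    return {"_id": {"$in": [m["mid_obj"] for m in map_docs]}}


# A TTL index would be created once per collection, e.g. with pymongo:
#   collection.create_index("expire_at", expireAfterSeconds=0)
entry, maps = fan_out(
    {"mid": "mid_1", "text": "Message 1", "deeplink": "saavn://message_1"},
    "Obj_1",
    ["user_1", "user_2"],
    timedelta(days=7),
    now=datetime(2016, 5, 30, tzinfo=timezone.utc),
)
```

Message bodies are written once and never modified, and expiry is handled by the server, at the cost of the second lookup on every inbox read.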


Key Learnings
• Upgrade to WiredTiger. Document locks are worth it.
• Ungroup data: store one document per message instead of one growing array per user.
• TTLs are a boon to temporary data.
• WiredTiger uses copy-on-write.
• WiredTiger updates are expensive - avoid push/slice operations.
• Optimize indexes based on the access pattern.
• Use Provisioned IOPS on AWS.

Architecture

User activity store
• 18+ M MAU, 40 min avg session time => a lot of data!
• Measuring listening activity is critical
• Bursty writes and frequent reads
• Grows linearly with the number of users
• {1:N} ratio of users to artists, songs, genres, etc.
• DB size > 500 GB => Should we shard?

Standalone Architecture

User activity store

Sharding criteria
Should we shard?
• Indexes could no longer fit in memory.

What’s a good shard key?
• Uniformly distributed
• Hash based on day of registration is a bad shard key
• Random device id can work

How do we migrate 500+ GB of data to a sharded cluster?
• Pre-shard
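The shard key comparison above can be illustrated with a small simulation. This is a sketch under simplifying assumptions: SHA-256 modulo the shard count stands in for MongoDB's own hashed-key placement (which uses a different hash and assigns ranges of hash values to chunks), and the user counts, dates, and device ids are made up. One plausible reading of the slide is that registration day has too few distinct values, so hashing it cannot spread users evenly.

```python
import hashlib
import random
from collections import Counter


def shard_for(key, num_shards):
    """Map a key to a shard with a stand-in hash; not MongoDB's actual hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys, num_shards):
    """Count how many keys land on each shard."""
    counts = Counter(shard_for(k, num_shards) for k in keys)
    return [counts[s] for s in range(num_shards)]


rng = random.Random(42)  # seeded so the run is repeatable
NUM_USERS, NUM_SHARDS = 10_000, 8
signup_days = ["2016-05-28", "2016-05-29", "2016-05-30"]  # made-up dates

day_keys = [rng.choice(signup_days) for _ in range(NUM_USERS)]
device_keys = [f"{rng.getrandbits(128):032x}" for _ in range(NUM_USERS)]

by_day = distribution(day_keys, NUM_SHARDS)
by_device = distribution(device_keys, NUM_SHARDS)
print("registration day:", by_day)
print("device id:       ", by_device)
```

With only three distinct registration days, at most three of the eight shards receive any users, while the random device ids spread across all shards.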


Tools

Cloud Manager
• Great for monitoring, backups, and upgrades.
• Easy to set up and works with Slack.
• Good at managing restarts.
• Adding new machines to the cluster or transitioning to XFS is not straightforward.

https://www.mongodb.com/cloud

mTools
• Excellent suite of tools to debug performance issues
• Helps detect and fix bottlenecks
• Set up cron jobs to run mloginfo and mplotqueries on a regular basis.
https://github.com/rueckstiess/mtools

What’s next?
• Anticipate behavior - the best products can “mind read” aka Saavn A.I.
• Intermediate ML models in MongoDB
• Move frequently accessed map/reduce data to MongoDB

Thank you!
@eagleshack