• 2nd biggest smartphone market
• $4 billion in VC funding in 2015
• 2 M phones sold per month
India = Mobile
…what does it look like?
By the numbers…
• 18 M Global MAUs
• 300 M streams a month
• 14 M India MAUs
• 25+ M tracks on Saavn
• 9x DAU growth in 24 months
[Pie chart: Global MAUs by platform (Web, iOS, Android, Mobile Web); slice values 18%, 64%, 8%, 11%]
• 64% on Android
• 63% streams on mobile data
• ~50% of registrations via phone
90% of our Users are on Mobile
Top Music Streaming Apps by Active Users, iPhone & Android, 2015
Source: App Annie, 2015
Targetable features
• Geo
• Artists and genres
• Time band
• Fitness enthusiast
• Gender
• Devices
• Operating system
• International travelers
Push notifications
• Primary growth driver for mobile apps
• Send 30+ million every day
• Targeted notifications get 3x more CTRs
System characteristics
• Dispatch push notifications AND store them in an inbox
• Deliver at the user's local time
• Context is key - delivery should be fast!
• Update millions of notification inboxes while serving traffic
• Identify cohorts in real time
• Delete ‘expired’ notifications
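The "deliver at the user's local time" requirement can be illustrated with a small scheduling helper. This is only a sketch under stated assumptions: the function name `next_local_delivery`, the idea of storing a per-user UTC offset in minutes, and the 18:00 delivery hour are all hypothetical and not taken from the talk. Fixed offsets ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def next_local_delivery(now_utc, utc_offset_minutes, local_hour):
    """Return the next UTC instant at which it is local_hour:00 for the user.

    utc_offset_minutes is a hypothetical per-user field (e.g. 330 for India).
    """
    user_tz = timezone(timedelta(minutes=utc_offset_minutes))
    now_local = now_utc.astimezone(user_tz)
    target_local = now_local.replace(hour=local_hour, minute=0, second=0, microsecond=0)
    if target_local <= now_local:
        # The slot has already passed today in the user's zone: use tomorrow.
        target_local += timedelta(days=1)
    return target_local.astimezone(timezone.utc)


now = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)
india = next_local_delivery(now, 330, 18)       # user at UTC+5:30
new_york = next_local_delivery(now, -300, 18)   # user at UTC-5:00
later = next_local_delivery(now + timedelta(hours=2), 330, 18)  # slot already passed
```

Grouping users by the resulting UTC instant is one way a dispatcher could batch sends per time band.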
• Move to WiredTiger
• Upgrade was surprisingly simple
• Supports mixed engine clusters
Need granular locks
Inbox Schema
Pluses
• 1 document per user
• Use push/slice to add messages to the document
• Delete expired messages at the app layer
Minuses
• Horrible performance on WiredTiger
• Too many updates
{
  _id: "",
  uid: "",
  messages: [
    { mid: "mid_1", text: "Message 1", deeplink: "saavn://message_1", image: "...." },
    { mid: "mid_2", text: "Message 2", deeplink: "saavn://message_2", image: "...." },
    { mid: "mid_3", text: "Message 3", deeplink: "saavn://message_3", image: "...." },
    ......
  ]
}
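The push/slice approach for this one-document-per-user schema can be sketched as follows. The update document uses MongoDB's `$push` with the `$each` and `$slice` modifiers; the cap of 2 messages, the `expires_at` field, and all function names are assumptions made for illustration. The `apply_push_update` function is an in-memory model of the operator semantics, so the example runs without a database.

```python
from copy import deepcopy


def build_push_update(message, max_size):
    """Update document that appends one message and keeps the newest max_size."""
    return {"$push": {"messages": {"$each": [message], "$slice": -max_size}}}


def apply_push_update(inbox, update):
    """In-memory model of $push with $each and a negative $slice."""
    spec = update["$push"]["messages"]
    result = deepcopy(inbox)
    combined = result.get("messages", []) + spec["$each"]
    result["messages"] = combined[spec["$slice"]:]  # keep only the last N
    return result


def drop_expired(inbox, now):
    """App-layer cleanup; expires_at is a hypothetical epoch-seconds field."""
    result = deepcopy(inbox)
    result["messages"] = [m for m in result["messages"] if m["expires_at"] > now]
    return result


inbox = {"uid": "user_1", "messages": [
    {"mid": "mid_1", "text": "Message 1", "expires_at": 100},
    {"mid": "mid_2", "text": "Message 2", "expires_at": 300},
]}
update = build_push_update({"mid": "mid_3", "text": "Message 3", "expires_at": 400}, max_size=2)
trimmed = apply_push_update(inbox, update)
fresh = drop_expired(inbox, now=200)
```

With a driver such as pymongo, the same update document would typically be passed to `update_one` with a filter on `uid`. Every new notification rewrites each targeted user's document, which is consistent with the "too many updates" minus listed above.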
Take 2 - Referencing
Pluses:
• No updates
• TTLs can delete older records
Minuses:
• Two queries per user
Inbox entries
{ _id: "Obj_1", mid: "mid_1", text: "Message 1", deeplink: "saavn://message_1", image: "...." },
{ _id: "Obj_2", mid: "mid_2", text: "Message 2", deeplink: "saavn://message_2", image: "...." },
.....
Inbox map
{ _id: "", uid: "", mid_obj: "Obj_1" },
{ _id: "", uid: "", mid_obj: "Obj_2" },
.....
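The "two queries per user" cost of the referencing schema can be sketched with in-memory stand-ins for the two collections. The uid values, the `load_inbox` name, and the reading that one entry is shared by many users are assumptions for illustration; the equivalent MongoDB filters are noted in comments.

```python
# In-memory stand-ins for the two collections (sample values are hypothetical).
inbox_entries = {
    "Obj_1": {"_id": "Obj_1", "mid": "mid_1", "text": "Message 1",
              "deeplink": "saavn://message_1"},
    "Obj_2": {"_id": "Obj_2", "mid": "mid_2", "text": "Message 2",
              "deeplink": "saavn://message_2"},
}
inbox_map = [
    {"uid": "user_a", "mid_obj": "Obj_1"},
    {"uid": "user_a", "mid_obj": "Obj_2"},
    {"uid": "user_b", "mid_obj": "Obj_2"},
]


def load_inbox(uid, map_rows, entries):
    """Resolve a user's inbox with two lookups."""
    # Query 1: map rows for this user, i.e. filter {"uid": uid}.
    entry_ids = [row["mid_obj"] for row in map_rows if row["uid"] == uid]
    # Query 2: the referenced entries, i.e. filter {"_id": {"$in": entry_ids}}.
    # Entries that no longer exist (for example, removed by a TTL index) are skipped.
    return [entries[i] for i in entry_ids if i in entries]


inbox_a = load_inbox("user_a", inbox_map, inbox_entries)
inbox_b = load_inbox("user_b", inbox_map, inbox_entries)

# Simulate an entry expiring: no per-user document has to be rewritten.
remaining = {k: v for k, v in inbox_entries.items() if k != "Obj_2"}
inbox_a_after_expiry = load_inbox("user_a", inbox_map, remaining)
```

Under this reading, sending a message inserts one entry plus one small map row per recipient, and expiry is a delete rather than an in-place update.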
Key Learnings
• Upgrade to WiredTiger. Document locks are worth it.
• Ungroup data.
• TTLs are a boon to temporary data.
• WiredTiger uses Copy On Write.
• WiredTiger updates are expensive - avoid push/slice operation.
• Optimize indexes based on the access pattern.
• Use Provisioned IOPS on AWS.
User activity store
• 18+ M MAU, 40 min avg session time => a lot of data!
• Measuring listening activity is critical
• Bursty writes and frequent reads
• Grows linearly with the number of users
• {1:N} ratio of users to artists, songs, genres etc.
• DB size > 500 GB => Should we shard?
Sharding criteria
Should we shard?
• Indexes could no longer fit in memory.
What’s a good shard key?
• Uniformly distributed
• Hash based on day of registration is a bad shard key
• Random device id can work
How do we migrate 500+ GB of data to a shard?
• Pre-shard
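The shard-key comparison above can be illustrated with a small simulation. The talk does not say why the day of registration is a bad key; one plausible reason, assumed here, is that it has few distinct values, so hashing cannot spread it out. The MD5-modulo placement below is a simplified stand-in and not how MongoDB assigns hashed chunks, and the shard count, user count, and sample dates are made up.

```python
import hashlib
import random
from collections import Counter

NUM_SHARDS = 8      # hypothetical cluster size
NUM_USERS = 10_000  # hypothetical sample size


def shard_for(key, num_shards=NUM_SHARDS):
    """Simplified placement: hash the key and take it modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys):
    """Count how many keys land on each shard."""
    return Counter(shard_for(k) for k in keys)


rng = random.Random(42)  # fixed seed so the run is repeatable
device_ids = [f"{rng.getrandbits(64):016x}" for _ in range(NUM_USERS)]
# Bursty sign-ups: every user registered on one of three days.
signup_days = [rng.choice(["2015-11-01", "2015-11-02", "2015-11-03"])
               for _ in range(NUM_USERS)]

by_device = distribution(device_ids)
by_day = distribution(signup_days)

print("shards used with device id key:", len(by_device))
print("shards used with signup day key:", len(by_day))
```

With only three distinct days, at most three shards ever receive data, while random device ids spread across all of them.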
Cloud Manager
• Great for monitoring, backups and upgrades.
• Easy to set up and works with Slack.
• Good at managing restarts.
• Adding new machines to the cluster or transitioning to XFS is not straightforward.
https://www.mongodb.com/cloud
mTools
• Excellent suite of tools to debug performance issues
• Helps detect and fix bottlenecks
• Set up cronjobs to use mloginfo and mplotqueries on a regular basis.
https://github.com/rueckstiess/mtools
• Anticipate behavior - the best products can “mind read”, aka Saavn A.I.
• Intermediate ML models in MongoDB
• Move frequently accessed map/reduce data to MongoDB
What’s next?