Scaling Instagram AirBnB Tech Talk 2012 Mike Krieger Instagram

Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram


on TechCrunch http://techcrunch.com/2012/04/12/how-to-scale-a-1-billion-startup-a-guide-from-instagram-co-founder-mike-krieger


Page 1: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

Scaling Instagram
AirBnB Tech Talk 2012

Mike Krieger
Instagram

Page 2: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

me

- Co-founder, Instagram

- Previously: UX & Front-end @ Meebo

- Stanford HCI BS/MS

- @mikeyk on everything

Page 3: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 4: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 5: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 6: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

communicating and sharing in the real world

Page 7: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

30+ million users in less than 2 years

Page 8: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

the story of how we scaled it

Page 9: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

a brief tangent

Page 10: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

the beginning

Page 11: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram


Page 12: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 product guys

Page 13: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

no real back-end experience

Page 14: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

analytics & python @ meebo

Page 15: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

CouchDB

Page 16: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

CrimeDesk SF

Page 17: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 18: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

let’s get hacking

Page 19: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

good components in place early on

Page 20: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

...but were hosted on a single machine somewhere in LA

Page 21: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 22: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

less powerful than my MacBook Pro

Page 23: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

okay, we launched. now what?

Page 24: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

25k signups in the first day

Page 25: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

everything is on fire!

Page 26: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

best & worst day of our lives so far

Page 27: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

load was through the roof

Page 28: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

first culprit?

Page 29: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 30: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

favicon.ico

Page 31: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

404-ing on Django, causing tons of errors

Page 32: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

lesson #1: don’t forget your favicon

Page 33: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

real lesson #1: most of your initial scaling problems won’t be glamorous

Page 34: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

favicon

Page 35: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

ulimit -n

Page 36: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

memcached -t 4

Page 37: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

prefork/postfork

Page 38: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

friday rolls around

Page 39: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

not slowing down

Page 40: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

let’s move to EC2.

Page 41: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 42: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 43: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

scaling = replacing all components of a car while driving it at 100mph

Page 44: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

since...

Page 45: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“canonical [architecture] of an early stage startup in this era.” (HighScalability.com)

Page 46: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

Nginx & Redis & Postgres & Django.

Page 47: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

Nginx & HAProxy & Redis & Memcached & Postgres & Gearman & Django.

Page 48: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

24h Ops

Page 49: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 50: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 51: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

our philosophy

Page 52: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 simplicity

Page 53: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 optimize for minimal operational burden

Page 54: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

3 instrument everything

Page 55: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

walkthrough:
1 scaling the database
2 choosing technology
3 staying nimble
4 scaling for android

Page 56: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 scaling the db

Page 57: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

early days

Page 58: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

django ORM, postgresql

Page 59: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

why pg? postgis.

Page 60: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

moved db to its own machine

Page 61: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

but photos kept growing and growing...

Page 62: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

...and only 68GB of RAM on biggest machine in EC2

Page 63: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

so what now?

Page 64: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

vertical partitioning

Page 65: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

django db routers make it pretty easy

Page 66: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

def db_for_read(self, model, **hints):
    # route every model in the 'photos' app to the photo database
    if model._meta.app_label == 'photos':
        return 'photodb'
    return None
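
The slide abbreviates the router method. A fuller sketch might look like the following, assuming Django's database-router protocol (routers are plain classes, so no Django import is needed here) and assuming `'photodb'` is an alias defined in `DATABASES`. The class name `PhotoRouter` and the helper `_route` are hypothetical.

```python
class PhotoRouter:
    """Send every model in the 'photos' app to its own database."""

    photo_apps = {"photos"}   # app labels that live on the photo DB
    photo_db = "photodb"      # alias assumed to exist in settings.DATABASES

    def _route(self, model):
        if model._meta.app_label in self.photo_apps:
            return self.photo_db
        return None  # None means "no opinion"; Django falls back to 'default'

    def db_for_read(self, model, **hints):
        return self._route(model)

    def db_for_write(self, model, **hints):
        return self._route(model)

    def allow_relation(self, obj1, obj2, **hints):
        # Only allow relations between objects stored in the same database,
        # which is why cross-database foreign keys must be untangled first.
        return self._route(type(obj1)) == self._route(type(obj2))
```

A router like this would be registered through the `DATABASE_ROUTERS` setting.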

Page 67: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

...once you untangle all your foreign key relationships

Page 68: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

a few months later...

Page 69: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

photosdb > 60GB

Page 70: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

what now?

Page 71: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

horizontal partitioning!

Page 72: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

aka: sharding

Page 73: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“surely we’ll have hired someone experienced before we actually need to shard”

Page 74: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

you don’t get to choose when scaling challenges come up

Page 75: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

evaluated solutions

Page 76: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

at the time, none were up to the task of being our primary DB

Page 77: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

did in Postgres itself

Page 78: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

what’s painful about sharding?

Page 79: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 data retrieval

Page 80: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

hard to know what your primary access patterns will be w/out any usage

Page 81: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

in most cases, user ID

Page 82: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 what happens if one of your shards gets too big?

Page 83: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

in range-based schemes (like MongoDB), you split

Page 84: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

A-H: shard0
I-Z: shard1

Page 85: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

A-D: shard0
E-H: shard2
I-P: shard1
Q-Z: shard3

Page 86: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

downsides (especially on EC2): disk IO

Page 87: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

instead, we pre-split

Page 88: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

many many many (thousands) of logical shards

Page 89: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

that map to fewer physical ones

Page 90: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

// 8 logical shards on 2 machines

user_id % 8 = logical shard

logical shards -> physical shard map

{ 0: A, 1: A, 2: A, 3: A, 4: B, 5: B, 6: B, 7: B}

Page 91: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

// 8 logical shards on 4 machines

user_id % 8 = logical shard

logical shards -> physical shard map

{ 0: A, 1: A, 2: C, 3: C, 4: B, 5: B, 6: D, 7: D}
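
The two maps above can be sketched in a few lines of Python. This is a minimal illustration, not Instagram's code: the function names (`logical_shard`, `machine_for`, `move_shards`) are hypothetical, and only the modulo rule and the map contents come from the slides.

```python
NUM_LOGICAL_SHARDS = 8

# logical shard -> physical machine, as on the 2-machine slide
shard_map = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B", 7: "B"}

def logical_shard(user_id):
    # The logical shard never changes for a given user
    return user_id % NUM_LOGICAL_SHARDS

def machine_for(user_id, mapping):
    # Only this lookup changes when shards move between machines
    return mapping[logical_shard(user_id)]

def move_shards(mapping, shards, new_machine):
    """Return a new map with the given logical shards reassigned."""
    updated = dict(mapping)
    for shard in shards:
        updated[shard] = new_machine
    return updated

# Going from 2 to 4 machines: half of A's shards go to C, half of B's to D
four_machines = move_shards(shard_map, [2, 3], "C")
four_machines = move_shards(four_machines, [6, 7], "D")
```

Because `user_id % 8` stays fixed, adding machines only rewrites the map; no user is rehashed to a different logical shard.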

Page 92: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

little known but awesome PG feature: schemas

Page 93: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

not “columns” schema

Page 94: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

- database:
  - schema:
    - table:
      - columns

Page 95: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

machineA:
  shard0
    photos_by_user
  shard1
    photos_by_user
  shard2
    photos_by_user
  shard3
    photos_by_user

Page 96: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

machineA:
  shard0
    photos_by_user
  shard1
    photos_by_user
  shard2
    photos_by_user
  shard3
    photos_by_user

machineA’:
  shard0
    photos_by_user
  shard1
    photos_by_user
  shard2
    photos_by_user
  shard3
    photos_by_user

Page 97: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

machineA:
  shard0
    photos_by_user
  shard1
    photos_by_user
  shard2
    photos_by_user
  shard3
    photos_by_user

machineC:
  shard0
    photos_by_user
  shard1
    photos_by_user
  shard2
    photos_by_user
  shard3
    photos_by_user

Page 98: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

can do this as long as you have more logical shards than physical ones
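
Since each logical shard is a Postgres schema, addressing a shard comes down to qualifying the table name. A rough sketch, assuming four logical shards as in the machineA example and a `user_id` column on `photos_by_user` (the column and all function names are assumptions, not taken from the talk):

```python
def qualified_table(logical_shard, table="photos_by_user"):
    # In Postgres a table can be addressed as schema.table
    return f"shard{logical_shard}.{table}"

def create_schema_sql(logical_shard):
    # One schema per logical shard; each holds its own copy of every table
    return f"CREATE SCHEMA shard{logical_shard};"

def select_photos_sql(user_id, num_shards=4):
    table = qualified_table(user_id % num_shards)
    # The %s placeholder is left for the database driver to fill in safely
    return f"SELECT * FROM {table} WHERE user_id = %s"
```

Moving a logical shard to another machine then means moving one schema, and the queries stay the same.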

Page 99: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

lesson: take tech/tools you know and try first to adapt them into a simple solution

Page 100: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 which tools where?

Page 101: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

where to cache / otherwise denormalize data

Page 102: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

we <3 redis

Page 103: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

what happens when a user posts a photo?

Page 104: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 user uploads photo with (optional) caption and location

Page 105: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 synchronous write to the media database for that user

Page 106: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

3 queues!

Page 107: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

3a if geotagged, async worker POSTs to Solr

Page 108: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

3b follower delivery

Page 109: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

can’t have every user who loads their timeline look up everyone they follow and then their photos

Page 110: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

instead, everyone gets their own list in Redis

Page 111: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

media ID is pushed onto a list for every person who’s following this user

Page 112: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

Redis is awesome for this; rapid insert, rapid subsets

Page 113: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

when time to render a feed, we take small # of IDs, go look up info in memcached
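
The fan-out-on-write flow described above can be sketched with plain Python structures standing in for Redis and memcached. Everything here is illustrative: `FEED_LIMIT`, the sample follower data, and the function names are assumptions, and the list operations only mimic what Redis `LPUSH`, `LTRIM` and `LRANGE` would do.

```python
from collections import defaultdict

FEED_LIMIT = 3  # assumed cap; keeps every feed list bounded

feeds = defaultdict(list)                # follower -> newest-first media IDs
followers = {"alice": ["bob", "carol"]}  # poster -> followers (sample data)
media_cache = {}                         # media ID -> details (memcached stand-in)

def post_photo(user, media_id, details):
    media_cache[media_id] = details
    for follower in followers.get(user, []):
        feed = feeds[follower]
        feed.insert(0, media_id)   # like LPUSH
        del feed[FEED_LIMIT:]      # like LTRIM 0 FEED_LIMIT-1

def render_feed(user, count=2):
    ids = feeds[user][:count]      # like LRANGE 0 count-1
    # Feed lists hold only IDs; the details come from the cache
    return [media_cache[i] for i in ids if i in media_cache]
```

The work happens once at post time, so loading a feed is a single short list read plus cache lookups.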

Page 114: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

Redis is great for...

Page 115: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

data structures that are relatively bounded

Page 116: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

(don’t tie yourself to a solution where your in-memory DB is your main data store)

Page 117: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

caching complex objects where you want to do more than GET

Page 118: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

ex: counting, sub-ranges, testing membership

Page 119: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

especially when Taylor Swift posts live from the CMAs

Page 120: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

follow graph

Page 121: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

v1: simple DB table (source_id, target_id, status)

Page 122: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

who do I follow?
who follows me?
do I follow X?
does X follow me?
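
The four questions map directly onto the v1 table. A small sketch using an in-memory SQLite database as a stand-in for Postgres; the table name `follows`, the index, and the `'active'` status value are assumptions, and only the three column names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE follows (source_id INTEGER, target_id INTEGER, status TEXT,"
    " PRIMARY KEY (source_id, target_id))"
)
# Second index so "who follows me?" does not scan the whole table
conn.execute("CREATE INDEX follows_by_target ON follows (target_id, source_id)")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?, 'active')",
    [(1, 2), (1, 3), (2, 1)],
)

def following(user):
    """who do I follow?"""
    rows = conn.execute(
        "SELECT target_id FROM follows WHERE source_id = ? AND status = 'active'"
        " ORDER BY target_id", (user,))
    return [r[0] for r in rows]

def followers_of(user):
    """who follows me?"""
    rows = conn.execute(
        "SELECT source_id FROM follows WHERE target_id = ? AND status = 'active'"
        " ORDER BY source_id", (user,))
    return [r[0] for r in rows]

def follows(a, b):
    """do I follow X? is follows(me, x); does X follow me? is follows(x, me)"""
    row = conn.execute(
        "SELECT 1 FROM follows WHERE source_id = ? AND target_id = ?"
        " AND status = 'active'", (a, b)).fetchone()
    return row is not None
```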

Page 123: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

DB was busy, so we started storing parallel version in Redis

Page 124: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

follow_all(300 item list)

Page 125: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

inconsistency

Page 126: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

extra logic

Page 127: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

so much extra logic

Page 128: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

exposing your support team to the idea of cache invalidation

Page 129: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 130: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

redesign took a page from twitter’s book

Page 131: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

PG can handle tens of thousands of requests, very light memcached caching

Page 132: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

two takeaways

Page 133: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 have a versatile complement to your core data storage (like Redis)

Page 134: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 try not to have two tools trying to do the same job

Page 135: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

3 staying nimble

Page 136: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2010: 2 engineers

Page 137: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2011: 3 engineers

Page 138: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2012: 5 engineers

Page 139: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

scarcity -> focus

Page 140: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

engineer solutions that you’re not constantly returning to because they broke

Page 141: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 extensive unit-tests and functional tests

Page 142: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 keep it DRY

Page 143: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

3 loose coupling using notifications / signals

Page 144: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

4 do most of our work in Python, drop to C when necessary

Page 145: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

5 frequent code reviews, pull requests to keep things in the ‘shared brain’

Page 146: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

6 extensive monitoring

Page 147: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

munin

Page 148: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

statsd
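
For readers unfamiliar with statsd, instrumenting code amounts to sending short text lines over UDP. A minimal sketch of the wire format for counters and timers; the function names and metric names are hypothetical, and the host and port defaults assume a statsd daemon on its usual port 8125.

```python
import socket

def statsd_counter(name, value=1):
    # Counter line: <name>:<value>|c
    return f"{name}:{value}|c"

def statsd_timer(name, millis):
    # Timer line: <name>:<milliseconds>|ms
    return f"{name}:{millis}|ms"

def send_metric(line, host="127.0.0.1", port=8125):
    # Fire-and-forget UDP: a lost packet never slows the request down
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()
```

For example, `send_metric(statsd_counter("signups"))` would record one signup, and `send_metric(statsd_timer("feed.render", 42))` a 42 ms feed render.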

Page 149: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram
Page 150: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“how is the system right now?”

Page 151: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“how does this compare to historical trends?”

Page 152: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

4 scaling for android

Page 153: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

1 million new users in 12 hours

Page 154: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

great tools that enable easy read scalability

Page 155: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

redis: slaveof <host> <port>

Page 156: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

our Redis framework assumes 0+ read slaves
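
One way to read "assumes 0+" is that the client code never cares whether replicas exist. A hypothetical sketch of such a wrapper (the class and method names are illustrative, not Instagram's framework; connections are represented by plain values):

```python
import random

class ReplicatedStore:
    """Route writes to the master and reads to a replica when one exists."""

    def __init__(self, master, read_slaves=()):
        self.master = master
        self.read_slaves = list(read_slaves)  # may be empty: "0+"

    def reader(self):
        # With zero replicas, reads simply fall back to the master
        if self.read_slaves:
            return random.choice(self.read_slaves)
        return self.master

    def writer(self):
        return self.master
```

Adding read capacity then becomes a configuration change: start a replica with `slaveof`, add it to the list, and callers of `reader()` pick it up.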

Page 157: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

tight iteration loops

Page 158: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

statsd & pgfouine

Page 159: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

know where you can shed load if needed

Page 160: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

(e.g. shorter feeds)
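
A load-shedding switch of this kind can be very small. In the sketch below, the function name, the feed lengths and the load threshold are all made-up illustrative values:

```python
def feed_length(load, normal=30, reduced=10, threshold=0.8):
    """Pick how many feed items to serve, given load in the range 0..1."""
    # Under heavy load, serve a shorter feed instead of failing requests
    return reduced if load >= threshold else normal
```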

Page 161: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

if you’re tempted to reinvent the wheel...

Page 162: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

don’t.

Page 163: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“our app servers sometimes kernel panic under load”

Page 164: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

...

Page 165: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“what if we write a monitoring daemon...”

Page 166: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

wait! this is exactly what HAProxy is great at

Page 167: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

surround yourself with awesome advisors

Page 168: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

culture of openness around engineering

Page 169: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

give back; e.g. node2dm

Page 170: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

focus on making what you have better

Page 171: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“fast, beautiful photo sharing”

Page 172: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

“can we make all of our requests take 50% of the time?”

Page 173: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

staying nimble = remind yourself of what’s important

Page 174: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

your users around the world don’t care that you wrote your own DB

Page 175: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

wrapping up

Page 176: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

unprecedented times

Page 177: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

2 backend engineers can scale a system to 30+ million users

Page 178: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

key word = simplicity

Page 179: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

cleanest solution with as few moving parts as possible

Page 180: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

don’t over-optimize or expect to know ahead of time how site will scale

Page 181: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

don’t think “someone else will join & take care of this”

Page 182: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

will happen sooner than you think; surround yourself with great advisors

Page 183: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

when adding software to stack: only if you have to, optimizing for operational simplicity

Page 184: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

few, if any, unsolvable scaling challenges for a social startup

Page 185: Mike Krieger, Instagram at the Airbnb tech talk, on Scaling Instagram

have fun