SCALING INSTAGRAM INFRA

Lisa Guo, March 7th, 2017


INSTAGRAM HISTORY

• 2010
• 2012/4/9: joined Facebook
• 2014/1
• 2017: 600M users/month

INSTAGRAM EVERYDAY

400 Million Users

4+ Billion likes

100 Million photo/video uploads

Top account: 110 Million followers


SCALING MEANS

Scale out

Scale up

Scale dev team


SCALE OUT


“Let’s all pray that Amazon gets everything sorted out in short order.”


INSTAGRAM STACK

Tuesday, June 25th, 2013

memcache

RabbitMQ

PostgreSQL

Cassandra

Celery

Other Services

Django

STORAGE VS. COMPUTING

• Storage: needs to be consistent across data centers
• Computing: driven by user traffic, on an as-needed basis

SCALE OUT: STORAGE

user, media, friendship etc

• One Master and two Replicas, spread across DC1, DC2 and DC3
• Django: Write goes to the Master, Read goes to a Replica
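The write-to-master, read-from-replica split on this slide can be sketched as a Django database router. This is a minimal illustration under assumptions, not Instagram's actual code: the class name and the aliases `master` and `replica_local` are hypothetical, and a real project would also need matching `DATABASES` entries and the router listed in `DATABASE_ROUTERS`.

```python
class MasterReplicaRouter:
    """Route writes to the master and reads to a replica (hypothetical aliases)."""

    WRITE_ALIAS = "master"         # assumed alias of the single master
    READ_ALIAS = "replica_local"   # assumed alias of the replica in this data center

    def db_for_read(self, model, **hints):
        # Every read is served by the local replica.
        return self.READ_ALIAS

    def db_for_write(self, model, **hints):
        # Every write goes to the master, wherever it lives.
        return self.WRITE_ALIAS

    def allow_relation(self, obj1, obj2, **hints):
        # Master and replicas hold the same data, so relations are always allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema changes run on the master only; replicas receive them by replication.
        return db == self.WRITE_ALIAS
```

Because the router is a plain class, it can be exercised without a Django project, which makes the routing rule easy to unit test.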

SCALE OUT: STORAGE

user feeds, activities etc

• Replica, Replica, Replica
• Write - 2, Read - 1
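One way to read "Write - 2, Read - 1" is as the number of replicas, out of the three shown, that must acknowledge each operation. That reading is an assumption, as is the function name below. Under it, the usual quorum rule says a read is guaranteed to overlap the latest write only when read acks plus write acks exceed the replica count:

```python
def read_overlaps_latest_write(n_replicas: int, write_acks: int, read_acks: int) -> bool:
    """Quorum overlap rule: R + W > N means every read set meets every write set."""
    return read_acks + write_acks > n_replicas


# Three replicas, write to 2, read from 1: 1 + 2 is not greater than 3,
# so a read may land on the one replica that has not seen the write yet.
slide_setting = read_overlaps_latest_write(3, write_acks=2, read_acks=1)

# Reading from 2 instead would force an overlap, at the cost of slower reads.
stricter_setting = read_overlaps_latest_write(3, write_acks=2, read_acks=2)

print(slide_setting, stricter_setting)
```

Under this reading the slide's setting trades read-your-writes guarantees for cheaper reads, which is a common choice for feed and activity data.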

COMPUTING

DC1: Django, RabbitMQ, PostgreSQL, Cassandra, Celery, memcache

DC2: Django, RabbitMQ, PostgreSQL, Cassandra, Celery, memcache

MEMCACHE

• High performance key-value store in memory
• Millions of reads/writes per second
• Sensitive to network condition
• Cross region operation is prohibitive

No global consistency

DC1

• User C posts a comment: Django does an insert into PostgreSQL and a set in memcache
• User R loads feed: Django does a get from memcache

DC1 and DC2, with PostgreSQL replication between them

• DC1, User C posts a comment: Django does an insert into PostgreSQL and a set in memcache
• DC2, User R loads feed: Django does a get from the DC2 memcache, which never received the set

DC1 and DC2, with PostgreSQL replication between them

• DC1, User C posts a comment: Django does an insert into PostgreSQL and a set in memcache
• Cache invalidate: issued against memcache in both DC1 and DC2
• DC2, User R loads feed: Django does a get from memcache
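The difference between replicating only the database and also invalidating the remote cache can be shown with a toy simulation. Everything here is an illustrative assumption: plain dicts stand in for PostgreSQL and memcache, replication is instantaneous, and the names `DataCenter` and `write` are made up for the sketch.

```python
class DataCenter:
    """One region with its own database copy and its own cache."""

    def __init__(self, name):
        self.name = name
        self.db = {}      # stands in for the local PostgreSQL copy
        self.cache = {}   # stands in for the local memcache

    def read(self, key):
        # Read-through: fill the cache from the local database on a miss.
        if key not in self.cache:
            self.cache[key] = self.db.get(key)
        return self.cache[key]


def write(origin, others, key, value, invalidate=True):
    """Insert in the origin region, replicate, and optionally invalidate remote caches."""
    origin.db[key] = value
    origin.cache[key] = value            # the "insert" plus "set" in the origin region
    for dc in others:
        dc.db[key] = value               # database replication
        if invalidate:
            dc.cache.pop(key, None)      # cache invalidate in the remote region


dc1, dc2 = DataCenter("DC1"), DataCenter("DC2")
key = "media:12345:comment_count"

write(dc1, [dc2], key, 1)
first = dc2.read(key)                    # DC2 fills its cache with 1

write(dc1, [dc2], key, 2, invalidate=False)
stale = dc2.read(key)                    # DC2 still serves 1 from its cache

write(dc1, [dc2], key, 3)
fresh = dc2.read(key)                    # invalidation forces a refill: 3

print(first, stale, fresh)
```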

COUNTERS

select count(*) from user_likes_media where media_id=12345;

100s of ms

COUNTER

select count from media_likes where media_id=12345;

10s of µs
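The two queries differ because the second reads a precomputed counter instead of scanning like rows. A small sketch with SQLite shows one way such a counter could be kept in step with the likes table. The table names come from the slides, while the column layout, the `like` helper and the use of SQLite are assumptions made to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_likes_media (user_id INTEGER, media_id INTEGER);
    CREATE TABLE media_likes (media_id INTEGER PRIMARY KEY, count INTEGER NOT NULL);
""")


def like(user_id, media_id):
    """Record a like and bump the denormalized counter in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO user_likes_media VALUES (?, ?)", (user_id, media_id))
        updated = conn.execute(
            "UPDATE media_likes SET count = count + 1 WHERE media_id = ?", (media_id,)
        ).rowcount
        if updated == 0:
            # First like for this media: create the counter row.
            conn.execute("INSERT INTO media_likes VALUES (?, 1)", (media_id,))


for user_id in range(1000):
    like(user_id, 12345)

# Scans every like row for the media.
scanned = conn.execute(
    "SELECT count(*) FROM user_likes_media WHERE media_id = 12345"
).fetchone()[0]

# Reads a single row by primary key.
counter = conn.execute(
    "SELECT count FROM media_likes WHERE media_id = 12345"
).fetchone()[0]

print(scanned, counter)
```

The counter trades a little extra work on every write for a constant-time read, which pays off when reads far outnumber writes.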

Cache invalidated

All Djangos try to access DB

MEMCACHE LEASE

Timeline across d1, d2, memcache and db:

• d1: lease-get, memcache answers fill
• d2: lease-get, memcache answers wait or use stale
• d1: read from DB
• d1: lease-set
• d2: lease-get, memcache answers hit
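The lease sequence can be sketched as a toy in-process cache. This is an illustration under assumptions, not a memcached client: stock memcached clients do not expose `lease-get` or `lease-set`, and the class, method names and return values below are hypothetical.

```python
import itertools


class LeaseCache:
    """Toy cache where only one caller at a time may refill a missing key."""

    def __init__(self):
        self._data = {}      # live values
        self._stale = {}     # last value seen before invalidation
        self._leases = {}    # key -> outstanding lease token
        self._tokens = itertools.count(1)

    def invalidate(self, key):
        if key in self._data:
            self._stale[key] = self._data.pop(key)
        self._leases.pop(key, None)   # a lease issued before the invalidation is void

    def lease_get(self, key):
        if key in self._data:
            return ("hit", self._data[key])
        if key not in self._leases:
            token = next(self._tokens)
            self._leases[key] = token
            return ("fill", token)              # this caller should read the DB
        return ("wait_or_stale", self._stale.get(key))

    def lease_set(self, key, token, value):
        if self._leases.get(key) != token:
            return False                        # lease expired or was voided
        del self._leases[key]
        self._data[key] = value
        self._stale.pop(key, None)
        return True


db = {"likes:12345": 42}
cache = LeaseCache()
_, token = cache.lease_get("likes:12345")
cache.lease_set("likes:12345", token, 41)       # warm the cache with an older value
cache.invalidate("likes:12345")

d1 = cache.lease_get("likes:12345")             # d1 is told to fill
d2 = cache.lease_get("likes:12345")             # d2 waits or uses the stale value
stored = cache.lease_set("likes:12345", d1[1], db["likes:12345"])
d2_retry = cache.lease_get("likes:12345")       # now a hit

print(d1[0], d2, stored, d2_retry)
```

Only d1 touches the database, which is the point of the lease: an invalidation no longer sends every Django server to the DB at once.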

INSTAGRAM STACK - MULTI REGION

DC1: Django, RabbitMQ, PostgreSQL, Cassandra, Celery, memcache

DC2: Django, RabbitMQ, PostgreSQL, Cassandra, Celery, memcache

SCALING OUT

• Capacity
• Reliability
• Regional failure ready

SCALING OUT - CHALLENGES, OPPORTUNITIES

• Beyond North America
• More localized social network
• Direct messaging
• Live streaming

[Chart: User growth and Server growth]

“Don’t count the servers, make the servers count”


SCALE UP


SCALE UP

Use as few CPU instructions as possible

Use as few servers as possible


CPU

Monitor

Optimize

Analyze


COLLECT

struct perf_event_attr pe;

memset(&pe, 0, sizeof(pe));             /* zero every field before setting any */
pe.type = PERF_TYPE_HARDWARE;
pe.size = sizeof(pe);
pe.config = PERF_COUNT_HW_INSTRUCTIONS; /* count retired instructions */

/* glibc has no perf_event_open() wrapper; this is assumed to be a thin
   helper around syscall(SYS_perf_event_open, ...) */
fd = perf_event_open(&pe, 0, -1, -1, 0);

ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
<code you want to measure>
ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
read(fd, &count, sizeof(long long));

DYNOSTATS

[Chart: Follow, Feed, Explore]

REGRESSION

[Chart: With new feature vs. Without new feature]


PYTHON CPROFILE

import cProfile, pstats, io

pr = cProfile.Profile()
pr.enable()
# ... do something ...
pr.disable()

s = io.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())

CPU - ANALYZE

continuous profiling

generate_profile explore --start <start-time> --duration <minutes>

[Chart: Caller, Callee, Callee]


igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s300x300/12345678_1234567890_987654321_a.jpg


igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s150x150/12345678_1234567890_987654321_a.jpg

igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s400x600/12345678_1234567890_987654321_a.jpg

igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s200x200/12345678_1234567890_987654321_a.jpg

igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s300x300/12345678_1234567890_987654321_a.jpg


CPU - OPTIMIZE


igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s300x300/12345678_1234567890_987654321_a.jpg

150x150

400x600

200x200


CPU - OPTIMIZE

C is really faster

• Candidate functions:
  • Used extensively
  • Stable
• Cython or C/C++


ONE WEB SERVER

Process 1

Shared Memory

Private Memory

Process N

SCALE UP: MEMORY

Reduce code

• Run in optimized mode (-O)
• Remove dead code

SCALE UP: MEMORY

Share more

• Move configuration into shared memory
• Disable garbage collection
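Disabling garbage collection helps sharing because CPython's cyclic collector writes to bookkeeping fields on tracked objects, and each such write can force a copy of a page that forked worker processes would otherwise share. A minimal sketch of the idea follows. The hook name `prepare_for_fork` is hypothetical, and calling `gc.freeze()` where available is an assumption about how one might do this on newer Python versions rather than something shown on the slide.

```python
import gc


def prepare_for_fork():
    """Call once in the parent process, after imports and before forking workers."""
    gc.collect()     # clear startup garbage while there is still only one process
    gc.disable()     # no automatic cyclic collection from here on
    if hasattr(gc, "freeze"):
        # Python 3.7+: park surviving objects in a permanent generation so that
        # any later manual collection leaves them untouched.
        gc.freeze()


prepare_for_fork()
print(gc.isenabled())
```

Reference counting still frees most objects with the collector off. Reference cycles are no longer reclaimed, so this setup suits workers that are restarted periodically.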

SCALE UP: MEMORY

20+% capacity increase


SCALE UP: NETWORK LATENCY

Synchronous processing model with long latency

===> Worker starvation and fewer CPU instructions executed

ASYNC IO

Django requests Feed, Stories and Suggested Users
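With a synchronous model a worker waits for each backend call in turn. With async IO the waits overlap. The sketch below uses `asyncio` with sleeps standing in for network calls. The service names follow the slide, while the latencies and function names are made up for illustration.

```python
import asyncio


async def fetch(service: str, latency: float) -> str:
    """Stand-in for a network call to a backend service."""
    await asyncio.sleep(latency)
    return f"{service} data"


async def build_home_response() -> list:
    # All three requests are in flight at once, so the total wait is roughly
    # the slowest call rather than the sum of all three.
    return await asyncio.gather(
        fetch("feed", 0.05),
        fetch("stories", 0.05),
        fetch("suggested_users", 0.05),
    )


results = asyncio.run(build_home_response())
print(results)
```

`asyncio.gather` returns results in the order the awaitables were passed in, regardless of which call finishes first.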


SCALE UP: CHALLENGES, OPPORTUNITIES

• Faster Python run-time
• Async web framework
• Better memory analysis
• etc.

SCALE DEV TEAM


SCALING TEAM

30% of engineers joined in the last 6 months

Bootcampers - 1 week

Hack-A-Month - 4 weeks

Intern - 12 weeks


Comment Filtering

Self-harm Prevention

Windows App

Multiple media in one post

Video View Notification

Saved Posts

First Story Notification

Instagram Live

Instagram Stories


Which server?

New Table or New Column?

What Index?

Should I cache it?

Will I lock up DB?

Will I bring down Instagram?

WHAT WE WANT

• Automatically handle cache
• Define relations, not worry about implementations
• Self service by product engineers
• Infra focuses on scale

TAO

Nodes: USER1, USER2, USER3, media

Edges, each paired with its inverse:

• posted / posted by
• likes / liked by
• likes / liked by
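The graph on this slide, with objects as nodes and typed relations that each have an inverse, can be sketched with a toy in-memory store. This is only an illustration of the data model: the class is hypothetical, the method names loosely echo association-style APIs, and which user posted the media is an assumption, since the flattened diagram does not say.

```python
from collections import defaultdict

# Each relation type and the inverse that is written alongside it.
INVERSE = {
    "posted": "posted_by",
    "posted_by": "posted",
    "likes": "liked_by",
    "liked_by": "likes",
}


class Graph:
    """Toy store of typed, directed associations between named objects."""

    def __init__(self):
        self._assocs = defaultdict(list)

    def assoc_add(self, src, assoc_type, dst):
        # Writing one edge also writes its inverse, so both directions can be
        # queried without the product engineer maintaining two tables.
        self._assocs[(src, assoc_type)].append(dst)
        self._assocs[(dst, INVERSE[assoc_type])].append(src)

    def assoc_get(self, src, assoc_type):
        return list(self._assocs.get((src, assoc_type), []))

    def assoc_count(self, src, assoc_type):
        return len(self._assocs.get((src, assoc_type), []))


g = Graph()
g.assoc_add("USER1", "posted", "media")   # assumed: USER1 is the poster
g.assoc_add("USER2", "likes", "media")
g.assoc_add("USER3", "likes", "media")

print(g.assoc_get("media", "posted_by"))
print(g.assoc_get("media", "liked_by"))
print(g.assoc_count("media", "liked_by"))
```

Product code states the relation (who likes what) and leaves storage, indexing and caching decisions to the layer underneath, which is the split the "WHAT WE WANT" slide asks for.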


SOURCE CONTROL

Master

Live

Direct


SOURCE CONTROL

With branches

• Context switching
• Code sync/merge overhead
• Surprises
• Refactor/major upgrade
• Performance tracking harder


SOURCE CONTROL

Master Live Direct


SOURCE CONTROL

No branches

• Continuous integration
• Collaborate easily
• Fast bisect and revert
• Continuous performance monitoring

FEATURE LAUNCH

Engineers

Employees

Dogfooder

Some demographics

World

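With all code on one branch, a staged launch like this is usually driven by a runtime gate rather than by which code is deployed. The sketch below is a minimal illustration under assumptions: the stage identifiers mirror the slide, while the function, the group model and the idea that a user carries a set of group labels are hypothetical.

```python
# Launch stages in the order the audience widens.
STAGES = ["engineers", "employees", "dogfooders", "some_demographics", "world"]


def is_enabled(current_stage: str, user_groups: set) -> bool:
    """True if the user belongs to any audience at or before the current stage."""
    if current_stage == "world":
        return True                      # fully launched: everyone sees the feature
    reached = STAGES[: STAGES.index(current_stage) + 1]
    return any(stage in user_groups for stage in reached)


# Feature currently rolled out to employees.
print(is_enabled("employees", {"engineers"}))    # engineers were an earlier stage
print(is_enabled("employees", {"dogfooders"}))   # dogfooders come later
print(is_enabled("world", set()))
```

Moving the gate forward or back is a configuration change, so a feature can be widened or pulled without a revert on the shared branch.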

FEATURE LOAD TEST


Once a week? A day? A diff?!!

40-60 rollouts per day

CHECKS AND BALANCES

• Code review, unittest
• Code accepted, committed
• Canary
• To the Wild


TAKEAWAYS

Scaling is everybody’s responsibility

Scaling is a continuous effort

Scaling is multi-dimensional


QUESTIONS?
