timelines at scale, @raffi, qcon sf 2012

Timelines at scale

DESCRIPTION

Raffi Krikorian explains the architecture used by Twitter to deal with 300K queries per second - tweets, social graph mutations, and direct messages

Page 1: Timelines at scale

timelines at scale

@raffi, qcon sf 2012

Page 4: Timelines at scale

            Pull                              Push
Targeted    twitter.com, home_timeline API    User / Site Streams, Mobile Push (SMS, etc.)
Queried     Search API                        Track / Follow Streams

Page 5: Timelines at scale

the challenge

⇢ > 150M worldwide active users

⇢ > 300K QPS for timelines

⇢ naïve timeline “materialization” can be slow

Page 6: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 7: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Social Graph Service, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 8: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Social Graph Service, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

insert

⇢ keyed off “recipient”

⇢ pipelined 4k “destinations” at a time

⇢ replicated
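
The insert step above can be sketched as a batched fanout. This is a minimal illustration, not Twitter's implementation: the in-memory `timelines` dict stands in for the Redis timeline cache, and the names `batches` and `fanout` are hypothetical. Only the batch size of 4k and the keying by recipient come from the slide; replication is not modeled.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

BATCH_SIZE = 4000  # "4k destinations at a time", per the slide


def batches(items: Iterable[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield successive chunks of at most `size` recipient IDs."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def fanout(tweet_id: int, follower_ids: Iterable[int],
           timelines: Dict[int, List[int]]) -> int:
    """Insert tweet_id into every recipient's timeline, one batch at a time.

    Returns the number of batches processed. In a real system each batch
    would presumably be one pipelined round trip; here it is a plain loop.
    """
    batch_count = 0
    for chunk in batches(follower_ids):
        for recipient in chunk:
            # Keyed off the recipient: each follower has their own list.
            timelines.setdefault(recipient, []).append(tweet_id)
        batch_count += 1
    return batch_count
```

With 10,000 followers this makes three batches (4,000 + 4,000 + 2,000).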

Page 9: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

using redis

⇢ native list structure

[Diagram: timeline entry layout: Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes).]
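
A 20-byte entry like the one on this slide could be packed as follows. This is a sketch under assumptions: the field order (Tweet ID, User ID, Bits), the big-endian byte order, and the `RETWEET_FLAG` bit are all hypothetical; the slide only gives the three field names and their sizes.

```python
import struct

# Assumed layout: big-endian, Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes).
ENTRY = struct.Struct(">QQI")

RETWEET_FLAG = 0x1  # hypothetical flag; the slide does not say what the bits mean


def pack_entry(tweet_id: int, user_id: int, bits: int = 0) -> bytes:
    """Serialize one timeline entry into 20 bytes."""
    return ENTRY.pack(tweet_id, user_id, bits)


def unpack_entry(raw: bytes) -> tuple:
    """Return (tweet_id, user_id, bits) from a 20-byte entry."""
    return ENTRY.unpack(raw)
```

Storing fixed-size IDs rather than full tweets keeps each list element small; the tweet and user data would have to be looked up elsewhere at read time.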

Page 10: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

using redis

⇢ native list structure

⇢ RPUSHX to only add to cached timelines

[Diagram: a timeline shown as a list of entries, each labeled Tweet ID, User ID, Bits; a few entries are labeled Tweet ID only.]
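
The point of RPUSHX is that it appends only when the key already exists, so fanout never creates a timeline for a user whose timeline is not cached. The sketch below mimics that behavior in plain Python so it runs without a Redis server; the `TimelineCache` class and its `warm` method are hypothetical names, not part of any real client library.

```python
from typing import Dict, Iterable, List


class TimelineCache:
    """In-memory stand-in for a Redis instance holding timeline lists."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[bytes]] = {}

    def warm(self, key: str, entries: Iterable[bytes]) -> None:
        """Hypothetical cache fill, e.g. after a cache miss on read."""
        items = list(entries)
        if items:  # Redis has no empty lists, so an empty fill creates nothing
            self._lists[key] = items

    def rpushx(self, key: str, value: bytes) -> int:
        """Mimic Redis RPUSHX: append only if the list exists.

        Returns the new list length, or 0 if the key is absent.
        """
        lst = self._lists.get(key)
        if lst is None:
            return 0
        lst.append(value)
        return len(lst)

    def lrange_all(self, key: str) -> List[bytes]:
        """Return a copy of the whole list (empty if the key is absent)."""
        return list(self._lists.get(key, []))
```

A write to an uncached timeline is a no-op, which keeps memory reserved for timelines that have actually been populated.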

Page 11: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 12: Timelines at scale

[Diagram: timeline read/write path. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, TweetyPie, Gizmoduck.]

Page 13: Timelines at scale

            Pull                              Push
Targeted    twitter.com, home_timeline API    User / Site Streams, Mobile Push (SMS, etc.)
Queried     Search API                        Track / Follow Streams

Page 14: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Cache, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 15: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Index, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

blender

⇢ queries one replica of all indexes

⇢ merges & ranks results
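
The two blender steps above amount to a scatter-gather: ask one replica of every index partition, then merge and rank the combined hits. The sketch below is illustrative only. The `Hit` tuple, the callable replicas, and ranking by a single score are assumptions; the slide does not describe Blender's actual interfaces or ranking.

```python
import heapq
import random
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[float, int]                 # (score, tweet_id), an assumed shape
Replica = Callable[[str], List[Hit]]    # a replica answers a query with hits


def blend(partitions: Sequence[Sequence[Replica]], query: str,
          limit: int = 10) -> List[Hit]:
    """Query one replica per partition, then merge and rank the results."""
    hits: List[Hit] = []
    for replicas in partitions:
        replica = random.choice(replicas)  # any one replica of this partition
        hits.extend(replica(query))
    # Rank by score, highest first, and keep the top `limit` hits.
    return heapq.nlargest(limit, hits)
```

Every partition must be consulted because a matching tweet could live in any of them; replicas only add capacity and redundancy.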

Page 16: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Index, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 17: Timelines at scale

[Diagram: two storage paths, each shown with a Write API, a Read API, and an API Cache.]

Redis path        ⇢ O(n) write    ⇢ O(1) read
Earlybird path    ⇢ O(1) write    ⇢ O(n) read
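
The trade-off above can be made concrete with two toy models. This is a sketch under assumptions: it takes n to mean the number of followers on the Redis path and the number of index shards on the Earlybird path, which the slide does not spell out, and the class names and modulo sharding are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class RedisPath:
    """Fanout on write: O(n) write in the follower count, O(1) read."""

    def __init__(self, followers: Dict[str, List[str]]) -> None:
        self.followers = followers
        self.timelines: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        # One insert per follower: the cost grows with the follower count.
        for recipient in self.followers.get(author, []):
            self.timelines[recipient].append(tweet_id)

    def read(self, user: str) -> List[int]:
        # A single lookup of a precomputed list.
        return list(self.timelines.get(user, []))


class EarlybirdPath:
    """Index on write: O(1) write to one shard, O(n) read across shards."""

    def __init__(self, shard_count: int) -> None:
        self.shards: List[List[Tuple[str, int]]] = [[] for _ in range(shard_count)]

    def write(self, author: str, tweet_id: int) -> None:
        # One insert into one shard, regardless of follower count.
        self.shards[tweet_id % len(self.shards)].append((author, tweet_id))

    def read(self, authors: Set[str]) -> List[int]:
        # Every shard must be scanned, then results are merged newest first.
        hits = [tid for shard in self.shards for (a, tid) in shard if a in authors]
        return sorted(hits, reverse=True)
```

The first model pays at write time so reads are cheap; the second pays at read time so writes are cheap.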

Page 18: Timelines at scale

the challenge (part #2)

⇢ fanout can be really slow!

⇢ ...especially for high follower counts

Page 20: Timelines at scale

@barackobama     23 million followers

@ladygaga        31 million followers

@katyperry       28 million followers

@justinbieber    28 million followers

@raffi           0.019 million followers

Page 21: Timelines at scale

there are over 400 million tweets a day

Page 22: Timelines at scale

4600 tweets a second ≈ 0.2 ms a tweet
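
The arithmetic behind these two figures, worked from the 400 million tweets a day on the previous slide. The variable names are illustrative.

```python
TWEETS_PER_DAY = 400_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Average arrival rate, roughly 4,600 tweets per second.
tweets_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY

# Average time available per tweet if tweets were handled one at a time,
# roughly 0.2 milliseconds.
ms_per_tweet = 1000 / tweets_per_second
```

Any per-tweet work that takes longer than this on average has to be spread across machines or done in parallel.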

Page 24: Timelines at scale

[Diagram: write path. Labeled components: Write API, Ingester, Fanout, Search Index, Earlybird, Timeline Cache, Redis.]

search index ⇢[‘hello’,‘world’]

fanout index ⇢[@danadanger, ...]

Page 25: Timelines at scale

User Intent                 Query Expansion
“Hello, world”              “Hello” AND “world”
@raffi’s home timeline      home_timeline:raffi

Page 26: Timelines at scale

User Intent                 Query Expansion
“Hello, world”              “Hello” AND “world”
@raffi’s home timeline      user_timeline:nelson OR user_timeline:danadanger

Page 27: Timelines at scale

User Intent                 Query Expansion
“Hello, world”              “Hello” AND “world”
@raffi’s home timeline      home_timeline:raffi

Page 28: Timelines at scale

User Intent                 Query Expansion
“Hello, world”              “Hello” AND “world”
@raffi’s home timeline      home_timeline:raffi OR user_timeline:taylorswift13
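
The last expansion combines a precomputed home timeline with the user timeline of a high-follower account, so that account's tweets need not be fanned out. A minimal sketch of that idea follows. The function names, the `high_follower_accounts` set, and merging by descending tweet ID are assumptions; the slides show only the expanded query strings.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Set


def expand_home_timeline(user: str, following: Iterable[str],
                         high_follower_accounts: Set[str]) -> str:
    """Build the expanded query: the cached home timeline, OR'ed with the
    user timelines of followed accounts that are skipped at fanout time."""
    terms = [f"home_timeline:{user}"]
    terms += [f"user_timeline:{account}"
              for account in sorted(following)
              if account in high_follower_accounts]
    return " OR ".join(terms)


def merge_timelines(*timelines: List[int], limit: int = 20) -> List[int]:
    """Merge timelines that are each sorted newest first (descending ID)."""
    merged = heapq.merge(*timelines, reverse=True)
    return list(islice(merged, limit))
```

For example, `expand_home_timeline("raffi", {"nelson", "taylorswift13"}, {"taylorswift13"})` yields the query string shown on the slide.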

Page 29: Timelines at scale

[Diagram: system architecture. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Index, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 30: Timelines at scale

[Diagram: system architecture annotated with three paths: Synchronous Path, Asynchronous Path, Query Path. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Index, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 31: Timelines at scale

[Diagram: system architecture annotated with three paths: Synchronous Path, Asynchronous Path, Query Path. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Index, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 32: Timelines at scale

[Diagram: system architecture annotated with three paths: Synchronous Path, Asynchronous Path, Query Path. Labeled components: Write API, Fanout, Timeline Cache, Redis, Timeline Service, Ingester, Search Index, Earlybird, Blender, Push Compute, HTTP Push, Mobile Push, Batch Compute, Hadoop.]

Page 33: Timelines at scale

timeline query statistics

⇢ > 150m active users worldwide

⇢ > 300k qps poll-based timelines @ 1ms p50 / 4ms p99

⇢ > 30k qps search-based timelines

Page 34: Timelines at scale

tweet input

⇢ ~400m tweets per day

⇢ ~5K/sec daily average

⇢ ~7K/sec daily peak

⇢ > 12K/sec during large events

Page 35: Timelines at scale

timeline delivery statistics

⇢ 30b deliveries / day (~21m / min)

⇢ 3.5 seconds @ p50 to deliver to 1m

⇢ ~300k deliveries / sec
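
The delivery figures can be cross-checked with a few lines of arithmetic. The average fanout per tweet is a derived number, not one stated in the talk; it assumes the ~400m tweets per day from the tweet input slide.

```python
DELIVERIES_PER_DAY = 30_000_000_000
TWEETS_PER_DAY = 400_000_000  # from the tweet input slide

deliveries_per_minute = DELIVERIES_PER_DAY / (24 * 60)       # about 21 million
deliveries_per_second = DELIVERIES_PER_DAY / (24 * 60 * 60)  # about 347 thousand

# Derived, not from the slides: average timelines each tweet is delivered to.
average_fanout = DELIVERIES_PER_DAY / TWEETS_PER_DAY         # 75
```

The per-second result lands somewhat above the slide's rounded "~300k deliveries / sec", and the per-minute result matches "~21m / min".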

Page 42: Timelines at scale

thanks!