How to Get a Job 35 Million Times a Day Using RabbitMQ Ketan Gangatirkar and Cameron Davison

[@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ


DESCRIPTION

@IndeedEng March: Wednesday, March 27th
Video available: http://www.youtube.com/watch?v=MeRHetCMiHg

The goal of Indeed's aggregation engine is to find and retrieve every job in the world, as quickly and accurately as possible. As we described in our previous tech talk, we strive to build products that are simple, fast, comprehensive, and relevant. The world's most comprehensive job search site is fueled by the more than 35 million job postings we process every day, which we deliver to jobseekers within minutes of discovery.

Our original aggregation architecture was implemented using standard patterns. Our growth required levels of scalability, performance, and resilience that this architecture simply could not handle. In a case study of scaling for the web, we will discuss how we tackled this problem. We will cover the issues we saw with our original architecture, how we analyzed our options to guide a solution, how we used RabbitMQ as a key component in the new architecture, and the benchmarks we used to evaluate how successful we were.

Speaker Ketan Gangatirkar is the development manager responsible for Indeed's continuous deployment infrastructure as well as its aggregation system.

Speaker Cameron Davison is a software engineer on the aggregation team at Indeed and a graduate of UT Austin. He re-architected Indeed's aggregation pipeline using RabbitMQ to sustain high write volumes, and continues to improve products in the aggregation system to make it run more efficiently.


Page 1: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

How to Get a Job 35 Million Times a Day Using RabbitMQ

Ketan Gangatirkar and Cameron Davison

Page 2: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

One search. All jobs.

Page 3: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation gets jobs

Page 4: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation gets jobs so jobseekers get jobs

Page 5: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation != Spidering

Spiders see pages.

Aggregation sees jobs.

Page 6: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

How spiders see job sites

[Diagram: a flat collection of boxes, each labeled "Page"]

Page 7: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

How Indeed sees job sites

[Diagram: a site map with nodes labeled Start, Navigation, Job List, and Job]

Page 8: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation != Spidering

Job sites have structure

Job pages have semantics

Navigation is more than following links

Page 9: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ
Page 10: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Remember this

Page 11: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Agg every job

Page 12: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ
Page 13: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

{
Url: http://www.applytracking.com/track.aspx/3VYzR
Title: Senior Erlang Engineer
Company: Machine Zone
Location: Palo Alto,CA,US, 94301
Source Type: Employer
Job Type: Full-time
...
Description: The Senior Erlang Engineer is an integral ...
...
Createdate: 2013-02-05 23:18:05
...
}

What's in a job

Page 14: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

location

description

Company

Title

Page 15: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Title

salary

location

job type

description

Company

Page 16: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ
Page 17: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

How we build products

simple

fast

comprehensive

relevant

Page 18: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Simple

Tough problems, simple solutions

Page 19: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Fast

Discover the jobs quickly

Get them to jobseekers in minutes

Page 20: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

10% of jobseekers sort by date

Page 21: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Do you want only new jobs?

Page 22: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

20% of jobseekers want only new jobs

Page 23: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Daily new job emails

Page 24: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Speed matters

Page 25: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Comprehensive

Get every job

Page 26: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Relevant

Semantic extraction

The job is still available

Ignore non-jobs

Page 27: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

This is a hard problem

Flaky sites

Site redesigns

JavaScript

Missing or bad information

Page 28: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Big N makes it even harder

Examine 38M jobs every day

Page 29: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Do this in minutes

[Diagram: Employers, Job Boards, Staffing firms, Recruiters; Aggregation; Search; 100M Jobseekers]

Page 30: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Strawman* architecture

[Diagram: six Engines, one per Job site, in Datacenter A and Datacenter B; MySQL in the Primary Datacenter]

Page 31: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Limitations

Page 32: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

N connections

[Diagram: six Engines, one per Job site, in Datacenter A and Datacenter B; MySQL in the Primary Datacenter]

Page 33: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

N concurrent writers

[Diagram: six Engines, one per Job site, in Datacenter A and Datacenter B; MySQL in the Primary Datacenter]

Page 34: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

High latency

[Diagram: six Engines, one per Job site, in Datacenter A and Datacenter B; MySQL in the Primary Datacenter]

Page 35: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Limitation: failure points

[Diagram: six Engines, one per Job site, in Datacenter A and Datacenter B; MySQL in the Primary Datacenter; two components are marked X]

Page 36: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Scaling Patterns

What has worked for us so far?

Page 37: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Service-Oriented Architecture

[Diagram: three Engines, Job Write Service, MySQL; Remote Datacenter and Primary Datacenter]

see http://go.indeed.com/boxcar

Page 38: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Standard Service Interaction

Client Service Database

Page 39: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Our Interaction

Client Service Database

Page 40: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Does this do what we need?

● Lots of workers...

● Sending lots of results...

● Over a long distance...

● That need to get processed fast...

● Reliably?

Page 41: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Engine Failure

[Diagram: three Engines, Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 42: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Engine failure fix: Buffer to disk

[Diagram: three Engines, each with a disk; Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 43: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Network Failure

[Diagram: three Engines, Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 44: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Network failure fix: Disks solve that too

[Diagram: three Engines, each with a disk; Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 45: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Write Service Failure

[Diagram: three Engines, Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 46: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Write Service Failure fix: Disks solve that too

[Diagram: three Engines, each with a disk; Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 47: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Write Service Failure fix: Redundancy

[Diagram: three Engines, three Job Write Service instances, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 48: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Database Failure

[Diagram: three Engines, Job Write Service, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 49: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Database Failure fix: Buffer to disk

[Diagram: three Engines, Job Write Service, one disk, MySQL; Remote Datacenter and Primary Datacenter; X marks the failure]

Page 50: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Our new architecture

[Diagram: three Engines, three Job Write Service instances, six disks, MySQL; Remote Datacenter and Primary Datacenter]

Page 51: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

We could build this...

[Diagram: three Engines, three Job Write Service instances, six disks, MySQL; Remote Datacenter and Primary Datacenter]

Page 52: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

... maybe someone already has

[Diagram: three Engines, three Job Write Service instances, six disks, MySQL; Remote Datacenter and Primary Datacenter]

Page 53: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

We should use a message queue

Page 54: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Cameron Davison

Page 55: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation Requirements

● Durable

● Multi-Data Center (latency)

● 38 million jobs a day

● 2KB average job size

  ○ 76 GB a day

● Target peaks of 1000 jobs / second

● Programming language agnostic
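The volume figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes "2KB" means 2,000 bytes and that GB and MB are decimal units, neither of which the slide states.

```python
# Figures taken from the requirements slide.
JOBS_PER_DAY = 38_000_000
AVG_JOB_BYTES = 2_000          # assumption: "2KB" read as 2,000 bytes
TARGET_PEAK_PER_SEC = 1000
SECONDS_PER_DAY = 24 * 60 * 60

# Total daily payload in decimal gigabytes.
daily_gb = JOBS_PER_DAY * AVG_JOB_BYTES / 1e9

# Average arrival rate if jobs were spread evenly over the day.
avg_jobs_per_sec = JOBS_PER_DAY / SECONDS_PER_DAY

# Bandwidth needed to sustain the target peak rate.
peak_mb_per_sec = TARGET_PEAK_PER_SEC * AVG_JOB_BYTES / 1e6

print(f"daily volume: {daily_gb:.0f} GB")
print(f"average rate: {avg_jobs_per_sec:.0f} jobs/second")
print(f"peak bandwidth: {peak_mb_per_sec:.1f} MB/s")
```

Under these assumptions the 76 GB a day figure follows directly, and the 1000 jobs / second target is a bit more than twice the even-spread average.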

Page 56: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Selection

Page 57: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

What we found

High Availability

Open Source/Free

Self-hosted

Performant

Page 58: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Out-of-the-box Experience

Page 59: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Advanced Message Queuing Protocol (AMQP)

● Open Standard

● Wire protocol

● Existing Clients in Multiple Languages

Page 60: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Concepts

● Confirmation and Ack

● At least once

● Asynchronous Confirms

● Persistent

● Clustering

Page 61: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Confirmation and Ack

[Diagram: Producer, MQ, Consumer; four numbered steps labeled msg, confirm, msg, ack]

Page 62: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

At least once

[Diagram: MQ, Message, Consumer, Ack]

At most once

[Diagram: MQ, Message, Consumer, Auto Ack]
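The difference between the two delivery modes can be illustrated with a toy in-memory broker. This is not RabbitMQ client code; the `deliver` function and its `crash_on` parameter are hypothetical names invented for this sketch. It assumes that with explicit acks an unacknowledged message is requeued, and that with auto ack the broker forgets a message as soon as it is delivered.

```python
from collections import deque


def deliver(messages, auto_ack, crash_on):
    """Toy broker loop.

    `crash_on` is a set of messages whose first processing attempt fails
    before the consumer can finish (and, without auto ack, before it acks).
    Returns the messages the consumer successfully processed, in order.
    """
    queue = deque(messages)
    processed = []
    already_crashed = set()
    while queue:
        msg = queue.popleft()
        if msg in crash_on and msg not in already_crashed:
            already_crashed.add(msg)
            if not auto_ack:
                # No ack was sent, so the broker puts the message back.
                queue.append(msg)
            # With auto ack the broker already considers it delivered: lost.
            continue
        processed.append(msg)
    return processed


at_least_once = deliver(["a", "b", "c"], auto_ack=False, crash_on={"b"})
at_most_once = deliver(["a", "b", "c"], auto_ack=True, crash_on={"b"})
print(at_least_once)
print(at_most_once)
```

With explicit acks every message is eventually processed, although a consumer that crashes after doing the work but before acking would see a duplicate, which is why the guarantee is "at least" once. With auto ack nothing is duplicated, but the message that hit the crash is gone.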

Page 63: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Asynchronous Confirms

[Diagram: a Producer streams messages 1 to 16; the broker returns confirm #6]
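One way a producer can keep publishing without waiting is to remember which sequence numbers are still unconfirmed. The sketch below is an illustration, not Indeed's implementation: `ConfirmTracker` is a hypothetical class, and it assumes the confirm for #6 carries the `multiple` flag, meaning it covers every message up to and including 6. The slide does not say whether that is the case.

```python
class ConfirmTracker:
    """Tracks published messages until the broker confirms them."""

    def __init__(self):
        self.next_seq = 1
        self.unconfirmed = {}  # sequence number -> message body

    def publish(self, body):
        # Record the message before "sending" so it can be resent if needed.
        seq = self.next_seq
        self.unconfirmed[seq] = body
        self.next_seq += 1
        return seq

    def on_confirm(self, seq, multiple=False):
        if multiple:
            # A cumulative confirm covers everything up to and including seq.
            for s in [s for s in self.unconfirmed if s <= seq]:
                del self.unconfirmed[s]
        else:
            self.unconfirmed.pop(seq, None)


tracker = ConfirmTracker()
for n in range(1, 17):
    tracker.publish(f"job-{n}")

tracker.on_confirm(6, multiple=True)
remaining = sorted(tracker.unconfirmed)
print(remaining)
```

Anything left in `unconfirmed` after a failure is exactly the set of messages the producer would need to resend.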

Page 64: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Persistent

MQ

Producer Consumer

Page 65: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Persistent

MQ

Producer Consumer

Page 66: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Persistent

MQ

Producer Consumer

X

Page 67: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Persistent

MQ

Producer Consumer

Page 68: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Persistent

MQ

Producer Consumer

Page 69: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Clustering

[Diagram: Producer, Master, Slave; four numbered steps]

Page 70: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Testing

Page 71: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test RabbitMQ

● Send millions of 2KB messages

● 20 producers and 20 consumers

● 1000 messages / second

● Simulate multiple failures

Page 72: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test Consistency

Producers

RabbitMQ

RabbitMQ

Consumers

Slave

Master

Page 73: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test Consistency

Producers

RabbitMQ

RabbitMQ

Consumers

Master

Slave

Page 74: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test Consistency

Producers

RabbitMQ

RabbitMQ

Consumers

Master

Slave

Page 75: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test Consistency

Producers

RabbitMQ

RabbitMQ

Consumers

X

Master

Page 76: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test Consistency

Producers

RabbitMQ

RabbitMQ

Consumers

X

Master

Page 77: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Test Consistency

Producers

RabbitMQ

RabbitMQ

Consumers

Master

Slave

Page 78: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 79: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 80: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

X

Page 81: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

X

Page 82: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 83: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 84: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 85: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 86: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 87: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 88: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

X

Page 89: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 90: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 91: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 92: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

X

Page 93: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

XX

Page 94: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

X

Page 95: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master

X

Page 96: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Clustering

Master Slave

Page 97: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Non-persistent

15990 Messages / Second

30 MB/s

Page 98: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Persistent

2781 Messages / Second

5.5 MB/s

Page 99: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Clustered and Persistent

1262 Messages / Second

2.5 MB/s
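The three benchmark results can be compared against the 1000 messages / second target from the requirements slide. The snippet below is a rough sketch; it assumes MB means 10^6 bytes, which the slides do not specify.

```python
TARGET_PEAK = 1000  # messages per second, from the requirements slide

# (messages per second, MB per second) as reported on the benchmark slides
benchmarks = {
    "non-persistent": (15990, 30.0),
    "persistent": (2781, 5.5),
    "clustered and persistent": (1262, 2.5),
}

headroom = {}
for name, (msgs_per_sec, mb_per_sec) in benchmarks.items():
    # How many times over the target this configuration can go.
    headroom[name] = msgs_per_sec / TARGET_PEAK
    # Message size implied by the two reported numbers (decimal MB assumed).
    implied_bytes = mb_per_sec * 1e6 / msgs_per_sec
    print(f"{name}: {headroom[name]:.2f}x target, "
          f"~{implied_bytes:.0f} bytes per message")
```

Every configuration clears the target, but the clustered and persistent setup does so with the least margin, and the implied message sizes are all close to the 2KB used in the test.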

Page 100: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Applying RabbitMQ

Page 101: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Unreliable High Latency Connections

Engine

Engine

Engine

Job Write Service

Remote DC Primary DC

MySQL

Page 102: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

Remote DC Primary DC

MySQL

Page 103: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

Remote DC Primary DC

Page 104: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

Remote DC Primary DC

Page 105: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

Remote DC Primary DC

RabbitMQ

Page 106: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

Remote DC Primary DC

RabbitMQ

Page 107: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Rabbit can talk to Rabbit

Shovel Plugin

[Diagram: Producer, RabbitMQ 1, RabbitMQ 2, Consumer]
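The store-and-forward idea behind shoveling can be modeled in a few lines. This is a toy simulation, not the Shovel plugin's configuration or API: the `Shovel` class, its `link_up` flag, and `run_once` are hypothetical names used only to show how a local broker can absorb messages while the link to the other broker is down.

```python
from collections import deque


class Shovel:
    """Toy model: moves messages from a source queue to a destination queue."""

    def __init__(self, source, destination):
        self.source = source
        self.destination = destination
        self.link_up = True

    def run_once(self):
        """Forward as many messages as possible; return how many moved."""
        moved = 0
        while self.link_up and self.source:
            # Copy to the destination first, then remove locally, so a
            # failure between the two steps cannot lose the message.
            self.destination.append(self.source[0])
            self.source.popleft()
            moved += 1
        return moved


local, remote = deque(), deque()
shovel = Shovel(local, remote)

# The link between datacenters goes down while the producer keeps publishing.
shovel.link_up = False
local.extend(["job-1", "job-2", "job-3"])
moved_while_down = shovel.run_once()

# Once the link returns, the backlog drains in order.
shovel.link_up = True
moved_after_recovery = shovel.run_once()
print(moved_while_down, moved_after_recovery, list(remote))
```

The producer only ever talks to the nearby queue, so a slow or broken long-distance link delays delivery without blocking publishing.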

Page 108: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Remote DC Primary DC

Page 109: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Primary DC

RabbitMQ

Remote DC

Page 110: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Parallelize Job Write Service

RabbitMQ

Job Write Service

Job Write Service

Job Write Service

Job A

Job B

Job C
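Several Job Write Service instances reading from one queue is the competing-consumers pattern. The sketch below imitates it with Python's standard `queue` and `threading` modules in place of RabbitMQ; `run_workers` and the `writer-N` names are hypothetical, chosen for illustration.

```python
import queue
import threading


def run_workers(jobs, worker_count=3):
    """Let several workers pull from one shared queue.

    Returns a list of (worker name, job) pairs in completion order.
    """
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    written = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # Each job is handed to exactly one worker.
                job = work.get_nowait()
            except queue.Empty:
                return
            with lock:
                written.append((name, job))

    threads = [
        threading.Thread(target=worker, args=(f"writer-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written


written = run_workers(["Job A", "Job B", "Job C"])
print(sorted(job for _, job in written))
```

Which worker handles which job varies from run to run, but each job is written once, so adding workers raises throughput without any coordination between them.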

Page 111: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Primary DC

RabbitMQ

Job Write Service

Remote DC

Page 112: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Replaced with RabbitMQ

Engine

Engine

Engine

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Primary DC

RabbitMQ

Job Write Service

Page 113: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Message Flow

Page 114: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Message Flow

Engine

Engine

Engine

Job Write Service

Primary DC

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Page 115: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Message Flow

Engine

Engine

Engine

Job Write Service

Primary DC

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Page 116: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Message Flow

Engine

Engine

Engine

Job Write Service

Primary DC

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Page 117: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Message Flow

Engine

Engine

Engine

Job Write Service

Primary DC

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Page 118: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Jobs/minute

Page 119: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Jobs/minute from one site

220,000 jobs in 6 hours

611 jobs / minute

Page 120: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Jobs/minute from one site

251,000 jobs in 20 minutes

12,550 jobs / minute
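The two per-site rates can be reproduced and compared directly. The variable names below are illustrative, and reading the first slide as "before" and the second as "after" the RabbitMQ change is an assumption; the slides themselves do not label them.

```python
# Slide figures: jobs processed and elapsed time for one site.
slow_jobs, slow_minutes = 220_000, 6 * 60
fast_jobs, fast_minutes = 251_000, 20

slow_rate = slow_jobs / slow_minutes   # jobs per minute
fast_rate = fast_jobs / fast_minutes   # jobs per minute
speedup = fast_rate / slow_rate

print(f"{slow_rate:.0f} jobs/minute vs {fast_rate:.0f} jobs/minute")
print(f"roughly {speedup:.1f}x faster")
```

The rates match the slides, and the ratio between them is a little over twenty.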

Page 121: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ

Horizontal Scale

Engine

Engine

Engine

Job Write Service

RabbitMQ

Job Write Service

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

RabbitMQ

Job Write Service

Job Write Service

Page 122: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Horizontal Scale

Page 123: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Horizontal Scale

Page 124: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Today 1000 messages / second

Page 125: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ 3

2486 Messages / Second

5 MB/s

Page 126: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

RabbitMQ Configuration

● Confirmations - Fire and Forget

● Persistent Messages - Durable

● Shoveling - Multi-Data Center

● Mirrored Queues in Cluster - High Reliability

Page 127: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Can we do more with RabbitMQ?

Page 128: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation Viewer

Real-time browser-based view of job stream

Page 129: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Aggregation Viewer Architecture

● Almost real-time

● Exclusive queue

● Transient messages

[Diagram: Agg Jobs RabbitMQ Cluster, Shovel*, Agg Viewer RabbitMQ, Subscribe, Agg Viewer, HTTP, Browser; the links carry Jobs]

Page 130: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Resume Contacts Billing

Pay-per-contact: limited budget

Page 131: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ
Page 132: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Resume Contacts Billing: Original Path

[Diagram: Asia DC and US DC separated by the Pacific; Resume Search, Log repo, MySQL]

see http://go.indeed.com/logrepo

Page 133: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Resume Contacts Billing: Fast Path

[Diagram: Asia DC and US DC separated by the Pacific; Resume Search, two RabbitMQ nodes, Log repo, MySQL; one path is marked X]

Page 134: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Company Page Edits

User-contributed content about companies

Page 135: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Company Page

Page 136: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Company Page Edits: Implementation

Writing data AND reading it back

Page 137: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Company Page Edits: Single Datacenter

Browser

Web Server MySQL

Page 138: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Company Page Serving

Browser

Web Server

LSM Tree

Asia Datacenter

Memcached

see http://go.indeed.com/lsmtree

Page 139: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Pacific

Company Page Edits

Browser

Web Server

RabbitMQ RabbitMQ MySQL

Primary US Datacenter

Asia Datacenter EU Datacenter

Atlantic

[Et cetera]

Memcached

Page 140: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Pacific

Company Page Reads

MySQL

LSM Tree Builder

LSM Tree

Primary US Datacenter

Asia Datacenter

LSM Tree

EU Datacenter

Atlantic

[Et cetera]

Page 141: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Memcached

Pacific

Company Pages System

Browser

Web Server

RabbitMQ RabbitMQ MySQL

LSM Tree Builder

LSM Tree

Primary US Datacenter

Asia Datacenter

LSM Tree

EU Datacenter

Atlantic

[Et cetera]

Page 142: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Other applications

Page 143: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Company Pages

Page 144: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ
Page 145: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ
Page 146: [@IndeedEng] How to Get a Job 35 Million Times a Day Using RabbitMQ

Recap: The jobs must flow

● Durability

● High throughput

● Low latency

● Partition-tolerance

● Efficient use of the database

● Minimal points of failure