Distribute the workload Helgi Þormar Þorbjörnsson PHP Barcelona, 29th of October 2011 Saturday, 29 October 11

Page 1: Distribute the workload, PHP Barcelona 2011

Distribute the workload

Helgi Þormar Þorbjörnsson

PHP Barcelona, 29th of October 2011

Saturday, 29 October 11


Who am I?


Co-founded Orchestra.io

Developer at PEAR

From Iceland

@h on Twitter

Helgi


Why Distribute?

Budget

Efficiency

Perception


Efficiency

10 small servers > 1 big


Budget

Spend wisely

Commodity servers

Cloud Computing (EC2)


Perception

Defer intensive processes

Give instant feedback

Users keep on browsing
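
A minimal sketch of deferring an intensive process while giving instant feedback, using only the Python standard library. The names `jobs`, `results`, `worker` and `handle_request` are illustrative assumptions, and the in-process queue stands in for a real queue system.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a real job queue (hypothetical name)
results = {}           # job id -> finished result, filled in by the worker

def worker():
    """Pull jobs off the queue and do the slow part in the background."""
    while True:
        job_id, payload = jobs.get()
        try:
            if job_id is None:      # sentinel: stop the worker
                return
            results[job_id] = payload.upper()   # placeholder for intensive work
        finally:
            jobs.task_done()

def handle_request(job_id, payload):
    """Enqueue the work and answer the user straight away."""
    jobs.put((job_id, payload))
    return {"status": "accepted", "job": job_id}
```

The request handler returns as soon as the job is queued, so the user can keep browsing while a worker thread (or, in a real setup, another machine) finishes the work.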


Ant Colonies


Teamwork

When faced with a problem they will solve the problem as one.


Architect for Distribution


Characteristics

Decoupling

Elasticity

High Availability

Concurrency


Decoupling


[Diagram: Application, DB, API, Cache, FE]


[Diagram: Application, DB, API, Cache, FE, plus an additional Cache box and two additional API boxes]


Elasticity


Cloud Computing


Load Balancing


HAProxy

Nginx

My Favourite


Monitoring


When do I need more servers?


Needs to be around from the start!


Keep records


Spot trends


Different types

Hardware Performance

Software Performance

Availability

Resourcing


Applications

New Relic

CloudKick

ScoutApp

Nagios

Cacti

Circonus


Automation


Plug into your monitoring


Bringing together Monitoring and Elastic behaviour into one beautiful whole!


Add some intelligence to add / remove servers as needed based on current information.
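
One way to picture that intelligence is a small decision function fed by monitoring data. This is only a sketch: the function name, the thresholds (30% and 75% average CPU) and the server limits are assumptions chosen for illustration, not values from the talk.

```python
def desired_servers(current, cpu_samples, low=0.30, high=0.75,
                    minimum=2, maximum=20):
    """Return how many servers to run, given recent CPU utilisation
    samples expressed as fractions between 0.0 and 1.0."""
    if not cpu_samples:
        return current                  # no monitoring data: change nothing
    average = sum(cpu_samples) / len(cpu_samples)
    if average > high:
        target = current + 1            # busy: add a server
    elif average < low:
        target = current - 1            # idle: remove a server
    else:
        target = current
    # Never go below the minimum or above the maximum (hypothetical limits).
    return max(minimum, min(maximum, target))
```

Changing the pool by one server at a time and clamping to fixed limits is a deliberately cautious choice, so that a burst of bad readings cannot grow or shrink the pool without bound.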


Just make sure it doesn’t turn into...


Skynet!!


High Availability


Get a highly available and resilient setup by following a few of these recommendations


Remember, even Google has outages


What to avoid


Local Sessions


Solution

Store sessions in DB / Memcache
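
Keeping sessions in a shared store rather than on one web server can be sketched as below. `SharedSessionStore` and `FakeBackend` are hypothetical names; the in-memory backend only stands in for a networked Memcache or database client, which would expose its own API.

```python
import json

class FakeBackend:
    """In-memory stand-in for a networked key/value store (assumption)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

class SharedSessionStore:
    """Session access that goes through a shared backend, not local disk."""
    def __init__(self, backend):
        self.backend = backend          # any object with get(key) / set(key, value)

    def save(self, session_id, data):
        self.backend.set("session:" + session_id, json.dumps(data))

    def load(self, session_id):
        raw = self.backend.get("session:" + session_id)
        return json.loads(raw) if raw is not None else {}
```

Because every web server talks to the same backend, a request that lands on a different server after load balancing still finds the user's session.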


Local Memory


Solution

Networked Memcache


Local Files


Local Uploads


Writing to /tmp


Solution

Store on S3 or a networked FS


Solution

Serve up static files from CDNs


Servers can vanish at any given time


Internal APIs


Application

S3

GFS

FS

Internal Storage API
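
The idea of an internal storage API can be sketched as one small interface with interchangeable backends. All class and function names here are hypothetical, and `InMemoryStorage` merely stands in for a remote backend such as S3; no real S3 client calls are shown.

```python
import os

class Storage:
    """Interface the application codes against."""
    def put(self, name, data):
        raise NotImplementedError

    def get(self, name):
        raise NotImplementedError

class LocalDiskStorage(Storage):
    """Backend that writes files under a root directory."""
    def __init__(self, root):
        self.root = root

    def put(self, name, data):
        with open(os.path.join(self.root, name), "wb") as fh:
            fh.write(data)

    def get(self, name):
        with open(os.path.join(self.root, name), "rb") as fh:
            return fh.read()

class InMemoryStorage(Storage):
    """Stand-in for a remote backend (assumption, for illustration)."""
    def __init__(self):
        self._blobs = {}

    def put(self, name, data):
        self._blobs[name] = data

    def get(self, name):
        return self._blobs[name]

def save_upload(storage, name, data):
    """Application code: it never needs to know which backend is in use."""
    storage.put(name, data)
    return storage.get(name)
```

Swapping the backend then means changing which `Storage` object is handed to the application, not the application code itself.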


Application

MySQL

Mongo

Cache

Internal DB API


SOA


Service Oriented Architecture


Sort of :-)


Eventually Consistent


CAP Theorem


Consistency

Availability

Partition Tolerance


Consistency

All nodes see the same data at the same time


Availability

Node failures do not prevent survivors from continuing to operate


Partition Tolerance

The system continues to operate despite arbitrary message loss


Consistency

Availability

Partition Tolerance


Queue Systems


Good for

Image Processing

Distributed Logs

Data Mining

Mass Emails

Intensive transformation

Search


Common Tools

Gearman

Hadoop

ZeroMQ

RabbitMQ

And many others!


New York Times

4TB of TIFF files

Needed to get 11 million PDF versions

Used Hadoop and EC2

100 machines took 24 hours
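
To get a feel for those numbers, here is the arithmetic as a short script. It uses only the figures above; the single-machine estimate assumes the work scales linearly, which is an assumption rather than something the slide states.

```python
pdfs = 11_000_000      # PDF versions needed
machines = 100
hours = 24

# Work handled by each machine over the whole run.
per_machine = pdfs / machines

# Average throughput each machine had to sustain.
per_machine_per_second = per_machine / (hours * 3600)

# Rough duration on one machine, assuming linear scaling (assumption).
single_machine_days = machines * hours / 24

print(per_machine, round(per_machine_per_second, 2), single_machine_days)
```

Each machine handled about 110,000 PDFs, a little over one per second, and under the linear-scaling assumption a single machine would have needed roughly 100 days.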


Map/Reduce


Map

Master gets a problem to solve

Breaks into multiple sub-problems

Distributed to multiple workers

A worker can repeat the same steps, splitting its sub-problem further

Answer passed back to Master


Reduce

Takes in answers from the map workers

Combines together to get an answer

There can be multiple reducers
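
The map and reduce steps can be sketched with a word count, the usual teaching example. This runs in a single process for clarity; the function names are illustrative, and in a real system each call to `map_worker` would run on a separate worker.

```python
from collections import Counter
from functools import reduce

def split_problem(text, parts):
    """Master: break the problem into sub-problems (here, groups of lines)."""
    lines = text.splitlines()
    return [lines[i::parts] for i in range(parts)]

def map_worker(lines):
    """Worker: count the words in its own share of the lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts

def reduce_counts(left, right):
    """Reducer: combine two partial answers into one."""
    return left + right

def word_count(text, parts=3):
    chunks = split_problem(text, parts)
    partials = [map_worker(chunk) for chunk in chunks]   # the map phase
    return reduce(reduce_counts, partials, Counter())    # the reduce phase
```

Because combining counts is associative, the partial answers could equally be merged by several reducers in any order.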


Process petabytes of data in a few hours on a commodity server farm


CouchDB


CouchDB

Highly Concurrent

Schema free, document based

RESTful API

Map/Reduce Views

Easy Replication


Gearman


Your Client Code

Gearman Client API (C, PHP, Perl, MySQL UDF, ...)

Gearman Job Server (gearmand)

Gearman Worker API (C, PHP, Perl, Python, ...)

Your Worker Code

[Diagram labels: Your App, Gearman]
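
The layering above (client code, job server, worker code) can be illustrated with a toy in-process model. This is not the Gearman API: `JobServer`, `register` and `submit` are hypothetical names, and a real deployment would talk to gearmand over the network through one of the client and worker libraries.

```python
class JobServer:
    """Toy stand-in for a job server: maps function names to workers."""
    def __init__(self):
        self.workers = {}

    def register(self, name, func):
        """Worker side: announce that this worker can run jobs called `name`."""
        self.workers.setdefault(name, []).append(func)

    def submit(self, name, workload):
        """Client side: hand a job to the next worker registered for `name`."""
        if name not in self.workers:
            raise LookupError("no worker registered for " + name)
        pool = self.workers[name]
        func = pool.pop(0)      # take the first worker...
        pool.append(func)       # ...and move it to the back (round robin)
        return func(workload)

# Worker code registers a function; client code only knows the job name.
server = JobServer()
server.register("reverse", lambda text: text[::-1])
print(server.submit("reverse", "hello"))
```

The point of the indirection is that client code and worker code never reference each other directly, only the job name, so workers can be added, removed or rewritten in another language without touching the client.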


pear.php.net/net_gearman


A Story!


Financial Software


3000+ Clients


Each one has 5 external data sources


Each data source is a web service


Ran every 6 hours every day
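
The scale of that workload, and why it suits fan-out to workers, can be sketched as follows. The call counts use the lower bound of 3000 clients; `fetch` is a hypothetical placeholder for one web service call, and the thread pool merely stands in for a pool of queue workers.

```python
from concurrent.futures import ThreadPoolExecutor

CLIENTS = 3000                 # lower bound, the slide says 3000+
SOURCES_PER_CLIENT = 5
RUNS_PER_DAY = 24 // 6         # one run every 6 hours

calls_per_run = CLIENTS * SOURCES_PER_CLIENT
calls_per_day = calls_per_run * RUNS_PER_DAY

def fetch(client_id, source_id):
    """Placeholder for one web service call (hypothetical)."""
    return (client_id, source_id, "ok")

def run_batch(client_ids, sources=SOURCES_PER_CLIENT, workers=8):
    """Fan the independent calls out over a pool of workers."""
    tasks = [(c, s) for c in client_ids for s in range(1, sources + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda task: fetch(*task), tasks))
```

That is at least 15,000 web service calls per run and 60,000 per day. Each call is independent of the others, which is what makes the job easy to split across workers.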


[Diagram: Cron, Gearman, Jobs 1 to 5, Web Services 1 to 5, Processing]


But! That wasn’t enough


Job kicked off on login


Supervisord


j.mp/supervisord


Another Story!


CloudSplit


Near Real Time Cloud Analytics


Clients install logging agent locally


syslogd


Public API


Multiple Persistent Gearman Servers


Internal DB API


[Diagram: Agent, syslogd, API, Gearman (x2), Worker (x3), Internal API, CouchDB; labels: Load Balanced (x2), Persistent]


CouchDB Setup


Write vs Read


Writes

Multi Master setup

Replicated

Deals with writes only


Reads

Multi Master setup

Replicated from write cluster

Slaves handle website requests
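
Splitting writes from reads can be sketched as a small router that picks a node by HTTP method, which fits a database with a RESTful API. The `Router` class and the node names are hypothetical; replication between the two clusters is left to the database and is not modelled here.

```python
import itertools

class Router:
    """Send reads to the read cluster and everything else to the write cluster."""
    def __init__(self, write_nodes, read_nodes):
        self._writers = itertools.cycle(write_nodes)   # round robin over writers
        self._readers = itertools.cycle(read_nodes)    # round robin over readers

    def node_for(self, method):
        # Treat GET and HEAD as reads (an assumption of this sketch).
        if method.upper() in ("GET", "HEAD"):
            return next(self._readers)
        return next(self._writers)

router = Router(["write-1", "write-2"], ["read-1", "read-2", "read-3"])
print(router.node_for("GET"), router.node_for("PUT"))
```

Keeping the two paths separate means heavy write traffic does not slow down the nodes that serve website requests, at the cost of reads being only as fresh as the last replication.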


Heavy Map/Reduce usage for data


Questions?

@[email protected]

Joind.in: http://joind.in/4326
