A tale of scalability: From one node to multiple DCs


Page 1: A tale of scalability

A tale of scalability
From one node to multiple DCs

Page 2: A tale of scalability

Who?

2014 FIFA World Cup Brazil
- 450K simultaneous users (ARG vs SWE)
- 580 Gbps (ARG vs SWE)
- 1659 years watched (all games)
- 7 x 1 (GER vs BRA)
bbb, sportv, off, pfc, combate, gnews ...

Page 3: A tale of scalability

Agenda

1. What this presentation is not about!
2. Basic glossary
3. The story
4. Questions

PS1: this presentation was not made by someone good with graphs/drawing/diagrams.

PS2: this reflects only my own unique personal individual stupid opinion, not my employer's.

Page 4: A tale of scalability

What this presentation is not about!

Byzantine faults, 2PC, Paxos, Raft, threading, locks, leader election, ZAB, the consensus problem, CRDTs, CALM, the CAP theorem. To sum up: this is not a deep distributed systems presentation.

Page 5: A tale of scalability

Basic glossary

Scalability: the ability of a system to grow to accommodate increased demand. (why?)

Page 6: A tale of scalability

Basic glossary

Availability: the proportion of time a system is in a functioning condition. (why?)

Page 7: A tale of scalability

Basic glossary

Fault tolerance: the ability to continue operating properly in the event of a failure. (why?)

Failover systems: software with automatic fault tolerance.

Page 8: A tale of scalability

The story :: BananaApp

Page 9: A tale of scalability

1st Solution :: 1 Server

(diagram: a single server running both the app and the database, serving clients from BRA, US, CAN and the Lost World)

Page 10: A tale of scalability

1st Solution :: problems

More users hit the app and it becomes slower :(
- CPU load was 1.3 (NOK) [top/htop, w]
- I/O utilization was high (NOK) [iostat, vmstat]
- RAM usage 45% (OK) [free -m]
- Disk space 5% (OK) [df -h]


Page 11: A tale of scalability

2nd Solution :: 2 Servers

(diagram: the app server and the database now on separate servers, clients from BRA, US, CAN and the Lost World)

Page 12: A tale of scalability

2nd Solution :: good parts

- Distributed the load
- We can fine-tune each server separately

Page 13: A tale of scalability

2nd Solution :: problems

More users hit the app and it becomes slower :(
- APP's CPU load was 1.3 (NOK)
- DB's CPU load was 0.3 (OK)
- I/O utilization was normal (OK)
- Introduced network latency and a new point of failure

Page 14: A tale of scalability

2nd Solution :: new concepts

A point of failure (or single point of failure [SPoF]) is a part of a system that, if it fails, will stop the entire system from working.
Examples: our database and app server.

When a solution is free from SPoFs, we can say it's a failover system.

Page 15: A tale of scalability

3rd Solution :: 4 Servers

(diagram: people → load balancer → app1 / app2 → database)

Page 16: A tale of scalability

3rd Solution :: new concepts

"Load balancer (LB) is a device/software that distributes network or application traffic across a number of servers (reals)." (F5)

1. How does it choose which server to send a request to? round-robin, least conn, weighted...
2. How does it know about a dead node? health checks (a /health page, tcp:80...)
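To make those two selection strategies concrete, here is a minimal Python sketch (my addition, not from the deck; the server names and connection counts are hypothetical) of round-robin and least-connections picking:

```python
import itertools

servers = ["srv1.example.com", "srv2.example.com"]

# Round-robin: hand out servers in a fixed rotating order.
_rr = itertools.cycle(servers)

def pick_round_robin():
    return next(_rr)

# Least connections: pick the server currently handling the fewest
# requests (the counts here are made up for illustration).
active_conns = {"srv1.example.com": 12, "srv2.example.com": 3}

def pick_least_conn():
    return min(active_conns, key=active_conns.get)
```

A real LB also feeds the health-check results back into this choice, removing dead nodes from the candidate list.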

Page 17: A tale of scalability

3rd Solution :: LB examples

http {
    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://myapp1;
            health_check uri=/health;
        }
    }
}

listen appname 0.0.0.0:80
    mode http
    stats enable
    balance leastconn
    option httpclose
    option forwardfor
    option httpchk HEAD /health HTTP/1.1
    server srv1 srv1.example.com:80 check
    server srv2 srv2.example.com:80 check

NGINX HAProxy

Page 18: A tale of scalability

3rd Solution :: problems

Users are getting signed out "randomly". The fix is known as session persistence or session stickiness.

NGINX: sticky cookie srv_id expires=1h domain=.example.com path=/;

HAProxy: cookie srv_id insert indirect nocache
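What those two directives do can be sketched in a few lines of Python (illustrative only; server names are hypothetical): route by the srv_id cookie when present, otherwise balance normally and set the cookie so the user sticks to one server.

```python
import itertools

servers = ["srv1", "srv2"]
_rr = itertools.cycle(servers)

def route(cookies):
    """Return (chosen server, response cookies) for one request."""
    srv = cookies.get("srv_id")
    if srv in servers:          # returning user: honor the sticky cookie
        return srv, cookies
    srv = next(_rr)             # new user: pick by round-robin...
    return srv, {**cookies, "srv_id": srv}   # ...and pin them to it
```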

Page 19: A tale of scalability

4th Solution :: 5 Servers

(diagram: people → load balancer → app1 / app2 → database, plus a shared session store: memcached, redis ...)

Page 20: A tale of scalability

4th Solution :: +problems

We now have 3 SPoFs: LB, memcache and database.

Page 21: A tale of scalability

5th Solution :: LB (float/virtual ip)

/etc/sysctl.conf
    net.ipv4.ip_nonlocal_bind = 1

/etc/ha.d/haresources
    lb1 192.168.0.10

Page 22: A tale of scalability

5th Solution :: Database

Partition and Replication

(diagram: the data set ABCD partitioned around a ring of nodes; each node also replicates its neighbors' partitions: B (1,2), C (2,3), A (3,0), D (0,1))
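A common way to implement a ring like this is consistent hashing. A toy Python sketch (my addition, not from the deck; node names A-D match the diagram, and a replication factor of 2 is assumed):

```python
import hashlib
from bisect import bisect

def _h(key):
    # Hash any string onto the ring [0, 2**32).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**32

nodes = ["A", "B", "C", "D"]
ring = sorted((_h(n), n) for n in nodes)
points = [p for p, _ in ring]

def owners(key, replicas=2):
    """The node owning `key` plus the next replicas-1 nodes clockwise
    on the ring, which hold the replica copies."""
    i = bisect(points, _h(key)) % len(ring)
    return [ring[(i + j) % len(ring)][1] for j in range(replicas)]
```

Adding or removing a node only moves the keys between that node and its neighbors, which is why systems like Cassandra use this scheme.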

Page 23: A tale of scalability

5th Solution :: mongo (master/bkp)

Page 24: A tale of scalability

5th Solution :: cassandra (cluster)

Page 25: A tale of scalability

5th Solution :: you got the idea

(diagram: people → lb1 / lb2 → app1 .. appn → DB cluster db1-db4 and session servers s1, s2)

Page 26: A tale of scalability

5th Solution :: + caching

(diagram: people → lb1 / lb2 → app1 .. appn → DB cluster db1-db4, session servers s1, s2, and caching servers c1-c3)

Page 27: A tale of scalability

5th Solution :: Caching

http {
    proxy_cache_path /data/nginx/cache keys_zone=one:10m;

    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com;
    }

    server {
        listen 80;
        proxy_cache one;

        location / {
            proxy_cache_valid any 1m;
            proxy_pass http://myapp1;
        }
    }
}

NGINX
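The effect of `proxy_cache_valid any 1m` can be mimicked with a tiny TTL cache. A Python sketch (my addition, not how nginx is actually implemented):

```python
import time

class TTLCache:
    def __init__(self, ttl=60):            # 60s, matching "1m" above
        self.ttl = ttl
        self.store = {}                    # key -> (expires_at, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.store.get(key)
        if entry and entry[0] > now:
            return entry[1]                # fresh hit
        return None                        # miss or expired

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.store[key] = (now + self.ttl, value)
```

On a miss, the proxy forwards to the upstream and then put()s the response, so repeated requests within the TTL never touch the app servers.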

Page 28: A tale of scalability

5th :: + Microservice Architecture

(diagram: the application's APIs split into microservices, each with its own nodes and caches: Core → mongodb, Search → elasticsearch, Recommendation → spark/hadoop, Social → neo4j)

Page 29: A tale of scalability

5th :: Single datacenter (still a SPoF)

Page 30: A tale of scalability

6th solution :: multihoming

Page 31: A tale of scalability

6th solution :: models of replication

- master / backup
- master / master
- 2PC
- Paxos

Page 32: A tale of scalability

6th solution :: database

Cassandra can help you

Page 33: A tale of scalability

6th :: DNS round robin

$ dig a www.youtube.com
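What a round-robin DNS server does with multiple A records can be sketched like this (Python, my addition; the 203.0.113.x addresses are documentation placeholders, not YouTube's real records):

```python
from itertools import cycle

a_records = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]
_rot = cycle(range(len(a_records)))

def answer():
    # Rotate the record order on every query; clients that take the
    # first address then spread across the three servers.
    k = next(_rot)
    return a_records[k:] + a_records[:k]
```

This is cheap but crude: the DNS server knows nothing about load or dead nodes, which is why the next slides move on to anycast and GSLB.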

Page 34: A tale of scalability

6th solution :: anycast

Border Gateway Protocol (BGP) makes routing decisions based on paths, network policies, or rule-sets configured by a network administrator, and is involved in making core routing decisions.

DNS resolves www.example.com to 1.1.1.1.

Clients from Colorado mostly will be routed to Colorado's DC.

Clients from California mostly will be routed to California's DC.

(diagram: two datacenters, both announcing 1.1.1.1 via BGP)

Page 35: A tale of scalability

6th solution :: subdomain per client

Page 36: A tale of scalability

6th :: GSLB (Global Server Load Balancing)

Page 37: A tale of scalability

6th :: GSLB multiple A records (BR)

$ dig a www.youtube.com

Page 38: A tale of scalability

6th :: GSLB multiple A records (DE)

Page 39: A tale of scalability

7th day: you shall rest

Page 40: A tale of scalability

Summarizing

(diagram: three datacenters, BR-DKC101, US-DKC102 and JP-DKC103, each running the full stack: lb1 / lb2 → app1 .. appn → DB cluster db1-db4, session servers s1, s2, and caching servers c1-c3)

Page 41: A tale of scalability

Bonus - Vagrant

Page 42: A tale of scalability

Bonus - Docker (docker-compose)

Page 43: A tale of scalability

Bonus - don’t blindly trust vendors

Page 44: A tale of scalability

Link to this presentation

slideshare.net/leandro_moreira

Page 45: A tale of scalability

Questions?

leandromoreira.com.br

Page 46: A tale of scalability

References
- https://f5.com/glossary/load-balancer
- http://leandromoreira.com.br/2014/11/20/how-to-start-to-learn-high-scalability/
- http://nginx.org/en/docs/http/load_balancing.html
- https://www.digitalocean.com/community/tutorials/how-to-use-haproxy-to-set-up-http-load-balancing-on-an-ubuntu-vps
- https://academy.datastax.com/courses/
- http://en.wikipedia.org/wiki/Single_point_of_failure
- http://book.mixu.net/distsys/single-page.html
- https://www.howtoforge.com/high-availability-load-balancer-haproxy-heartbeat-debian-etch-p2
- http://docs.mongodb.org/manual/core/sharding-introduction/
- http://docs.mongodb.org/manual/core/replication-introduction/
- http://nginx.com/resources/admin-guide/caching/
- https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching?hl=en
- http://martinfowler.com/articles/microservices.html
- http://www.netflix.com/WiMovie/70140358?trkid=12244757
- http://highscalability.com/blog/2009/8/24/how-google-serves-data-from-multiple-datacenters.html
- http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
- http://docs.mongodb.org/manual/tutorial/deploy-shard-cluster/
- http://tech.3scale.net/2014/06/18/redis-sentinel-failover-no-downtime-the-hard-way/
- http://www.slideshare.net/gear6memcached/implementing-high-availability-services-for-memcached-1911077
- http://docs.couchbase.com/moxi-manual-1.8/
- http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35590.pdf
- http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
- http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/
- http://nil.csail.mit.edu/6.824/2015/papers/paxos-simple.pdf
- http://the-paper-trail.org/blog/consensus-protocols-paxos/
- donkeykong.com
- http://backreference.org/2010/02/01/geolocation-aware-dns-with-bind/
- http://www.tenereillo.com/GSLBPageOfShame.htm