Scalability Erik Schultink International Week of Tech Innovation – 21 Apr, 2010

Page 1: Scalability -

Scalability

Erik Schultink

International Week of Tech Innovation – 21 Apr, 2010

Page 2: Scalability -

What is Tuenti.com?

Page 3: Scalability -

Tuenti.com

• Started 2007
• 1:6 pages, 1:10 minutes
• Based in Madrid
• ~130 employees, 60 engineers

Page 4: Scalability -
Page 5: Scalability -

INTRO

What is a scalable system?

Page 6: Scalability -

Scalability is throughput, not response time

Page 7: Scalability -

What is a scalable system?

[Figure: response time vs. requests/second]

Page 8: Scalability -

The Problem: Concurrency

25k pageviews/second at peak

Page 9: Scalability -

What is a scalable system?

[Figure: response time vs. requests/second]

Variables:
1. code / architecture
2. machines

Page 10: Scalability -

What is a scalable system?

[Figure: response time vs. requests/second]

Variables:
1. code / architecture
2. machines

Page 11: Scalability -

What is a scalable system?

[Figure: response time vs. requests/second, with curves for x machines and 2x machines]

Page 12: Scalability -
Page 13: Scalability -

THE DATABASE TIER

Page 14: Scalability -

The Solution: Partition

Page 15: Scalability -

The Solution: Partition

Page 16: Scalability -

The Solution: Partition

Page 17: Scalability -

Technologies

• MySQL
  – simple RDBMS
  – InnoDB
• Memcache
• Lighttpd
• PHP

Page 18: Scalability -

The Solution: Partition

• Work must be structured such that each resource can complete it independently

• Overhead to divide workload

Page 19: Scalability -

Data architecture

• Look at queries you perform.
• Divide data such that each query can be answered by querying no more than 1 partition.

Page 20: Scalability -

Comments on a profile

Comments (user_id, author_id, comment)

• Post a comment on a user’s profile
• Get list of comments on a user’s profile
• Delete a comment from a user’s profile

Give up for now:
• Comments written by a user

Page 21: Scalability -

Comments on a profile

Partition by user

Costs:
• Determining partition of a user
  – constant
• Consistency check on access that author still exists
  – linear on number of comments to display
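
The partition-by-user scheme above can be sketched in a few lines. This is a minimal in-memory illustration, not Tuenti's implementation: the modulo mapping, the partition count, and the names `partition_for` and `CommentStore` are all assumptions made for the example. It shows the two costs listed: a constant-time partition lookup and a consistency check that is linear in the number of comments displayed.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(user_id: int) -> int:
    """Constant-time lookup of the partition that owns a user's profile."""
    return user_id % NUM_PARTITIONS


class CommentStore:
    """In-memory stand-in for Comments(user_id, author_id, comment)."""

    def __init__(self, existing_users):
        self.existing_users = set(existing_users)
        # One dict per partition: user_id -> list of (author_id, comment)
        self.partitions = [defaultdict(list) for _ in range(NUM_PARTITIONS)]

    def post(self, user_id, author_id, comment):
        # Touches exactly one partition: the profile owner's.
        self.partitions[partition_for(user_id)][user_id].append((author_id, comment))

    def list_for_profile(self, user_id):
        rows = self.partitions[partition_for(user_id)].get(user_id, [])
        # Consistency check on access: drop comments whose author no longer
        # exists. Linear in the number of comments to display.
        return [(a, c) for a, c in rows if a in self.existing_users]

    def delete(self, user_id, author_id, comment):
        rows = self.partitions[partition_for(user_id)].get(user_id, [])
        if (author_id, comment) in rows:
            rows.remove((author_id, comment))
```

Each of the three supported operations reads or writes a single partition. "Comments written by a user" would have to scan every partition, which is why it is given up for now.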

Page 22: Scalability -

The Solution: Partition

Constant overhead

Page 23: Scalability -

Alternative Solution

Partition by user, duplicate by author

Comments(user_id, author_id, comment)

AuthoredComments(author_id, user_id, comment_id)

Page 24: Scalability -

Alternative Solution

Comments(user_id, author_id, comment)

AuthoredComments(author_id, user_id, comment_id)

Costs:
• double writes
• extra storage
• delete by author still very expensive
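
A rough sketch of the duplicated layout, assuming plain dictionaries stand in for the two tables and that `Comments` is partitioned by `user_id` while `AuthoredComments` is partitioned by `author_id`. The class name, the `writes` counter, and the generated `comment_id` are illustrative additions, not part of the original schema description.

```python
class DualIndexComments:
    """Toy model of Comments plus AuthoredComments."""

    def __init__(self):
        # Comments: keyed by (user_id, comment_id), lives in the user's partition.
        self.comments = {}
        # AuthoredComments: author_id -> set of (user_id, comment_id),
        # lives in the author's partition.
        self.authored = {}
        self.writes = 0
        self._next_id = 0

    def post(self, user_id, author_id, comment):
        comment_id = self._next_id
        self._next_id += 1
        # Double write: one row in each table.
        self.comments[(user_id, comment_id)] = (author_id, comment)
        self.writes += 1
        self.authored.setdefault(author_id, set()).add((user_id, comment_id))
        self.writes += 1
        return comment_id

    def comments_by_author(self, author_id):
        # Now answerable from a single (author) partition.
        return sorted(self.authored.get(author_id, set()))

    def delete_by_author(self, author_id):
        """Delete everything an author wrote; return user partitions touched."""
        refs = self.authored.pop(author_id, set())
        touched_users = set()
        for user_id, comment_id in refs:
            self.comments.pop((user_id, comment_id), None)
            touched_users.add(user_id)
        return len(touched_users)
```

The return value of `delete_by_author` makes the last cost visible: the delete fans out to one partition per profile the author ever commented on.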

Page 25: Scalability -

THE WEB SERVER TIER

Page 26: Scalability -

Traditional Systems Architecture

[Diagram: www.tuenti.com served through a single load balancer in front of three web server farms]

Page 27: Scalability -

Concurrency

Page 28: Scalability -

The Solution: Partition

Page 29: Scalability -

Traditional Systems Architecture

[Diagram: www.tuenti.com with two load balancers (12.45.34.178 and 12.45.34.179), each in front of web server farms]

Page 30: Scalability -

AJAX

• What is AJAX?
  – “Asynchronous JavaScript and XML”
• Paradigm for client-server interaction
• Change state on client, without loading a complete HTML page

Page 31: Scalability -

Traditional HTML Browsing

1. User clicks link
2. Browser sends request
3. Server receives, parses request, generates response
4. Browser receives response and begins rendering
5. Dependent objects (images, js, css) load and render
6. Page appears

Page 32: Scalability -

AJAX Browsing

1. User clicks link
2. Browser sends request
3. Server receives, parses request, generates response
4. Browser receives response and begins rendering
5. Dependent objects (images, js, css) load and render
6. Page appears

Page 33: Scalability -

How does Tuenti use AJAX?

• Only pageloads are login and home page

• Loader pulls in all JS/CSS
• Afterwards stay within one HTML page, rotating canvas area content

Page 34: Scalability -

Balancing Load

• Top-level requests to www.tuenti.com
• Each request tells client which farm it should be using, based on a mapping
• Mapping can be changed to balance load, perform maintenance, etc.
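
One way to picture the mapping is a small bucket table that the top-level response consults. The sketch below is an assumption-laden illustration: the bucket scheme, the `FarmMapping` class, and the response shape are invented for the example. Only the farm hostnames come from the slides that follow.

```python
FARMS = [
    "wwwb1.tuenti.com",
    "wwwb2.tuenti.com",
    "wwwb3.tuenti.com",
    "wwwb4.tuenti.com",
]


class FarmMapping:
    """Hypothetical bucket -> farm table consulted on top-level requests."""

    def __init__(self, farms, buckets=8):
        self.buckets = buckets
        self.bucket_to_farm = {b: farms[b % len(farms)] for b in range(buckets)}

    def farm_for(self, user_id):
        return self.bucket_to_farm[user_id % self.buckets]

    def drain(self, farm, replacement):
        """Remap every bucket on `farm`, e.g. before maintenance."""
        for bucket in list(self.bucket_to_farm):
            if self.bucket_to_farm[bucket] == farm:
                self.bucket_to_farm[bucket] = replacement


def top_level_response(mapping, user_id):
    """What www.tuenti.com might tell the client: which farm to use next."""
    return {"user_id": user_id, "farm": mapping.farm_for(user_id)}
```

Because the client picks up the farm from the top-level response, changing the table is enough to shift load; no request has to pass through a central balancer afterwards.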

Page 35: Scalability -

Client-side Routing

[Diagram: www.tuenti.com points clients at four farms, wwwb1.tuenti.com to wwwb4.tuenti.com, each with its own load balancer and web server farm]

Linearly scalable …

Page 36: Scalability -

Client-side Routing

[Diagram: www.tuenti.com points clients at four farms, wwwb1.tuenti.com to wwwb4.tuenti.com, each with its own load balancer and web server farm]

Linearly scalable … except for top level

Page 37: Scalability -

Client-side Routing

[Diagram: www.tuenti.com points clients at four farms, wwwb1.tuenti.com to wwwb4.tuenti.com, each with its own load balancer and web server farm]

lots of content creation = lots of dynamic data

Page 38: Scalability -

Client-side Routing

[Diagram: www.tuenti.com points clients at four farms, wwwb1.tuenti.com to wwwb4.tuenti.com, each with its own load balancer and web server farm, plus a cache farm]

lots of dynamic data = lots of cache = internal network traffic

Page 39: Scalability -

Client-side Routing

[Diagram: www.tuenti.com points clients at four farms, wwwb1.tuenti.com to wwwb4.tuenti.com, each with its own load balancer and web server farm, and four cache farms]

Partition cache
Route requests to a farm near cache needed to respond
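
The effect of routing near the cache can be illustrated with a toy count of cross-farm cache fetches. The sketch assumes one cache partition per web farm and data partitioned by the id of the user whose data is needed; both assumptions, and all names here, are made up for the example.

```python
import random

NUM_FARMS = 4  # one web farm paired with one cache farm (assumption)


def cache_partition(data_owner_id: int) -> int:
    """Cache farm holding the data needed to answer the request."""
    return data_owner_id % NUM_FARMS


def cross_farm_fetches(requests, route):
    """Count requests served by a farm that is not next to the needed cache."""
    return sum(1 for owner in requests if route(owner) != cache_partition(owner))


requests = list(range(1000))

rng = random.Random(0)  # fixed seed so the comparison is repeatable


def random_route(owner):
    # Balancer that ignores data location.
    return rng.randrange(NUM_FARMS)


def affinity_route(owner):
    # Client-side routing to the farm next to the needed cache.
    return cache_partition(owner)
```

With `affinity_route` every request lands next to its cache, so no fetch crosses farms; with `random_route` most do, since only about one farm in `NUM_FARMS` is the right one by chance.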

Page 40: Scalability -

Internal network savings

Page 41: Scalability -

SERVER-SIDE GAIN?

Page 42: Scalability -
Page 43: Scalability -
Page 44: Scalability -
Page 45: Scalability -
Page 46: Scalability -

CONTENT DELIVERY

Page 47: Scalability -

Image Serving

• Tuenti serves ~2.5 billion images/day
• At peak, this is >6 Gbps and >70k hits/sec
• We use CDNs
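
A quick back-of-the-envelope check on these figures. The inputs are the numbers from the slide; the peak values are lower bounds there, and the calculation assumes the bandwidth peak and the hit-rate peak coincide, so the derived values are rough estimates only.

```python
IMAGES_PER_DAY = 2.5e9        # ~2.5 billion images/day (from the slide)
PEAK_HITS_PER_SEC = 70_000    # >70k hits/sec at peak (lower bound)
PEAK_BITS_PER_SEC = 6e9       # >6 Gbps at peak (lower bound)
SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate over a whole day.
avg_hits_per_sec = IMAGES_PER_DAY / SECONDS_PER_DAY

# How much busier the peak is than the daily average.
peak_to_avg = PEAK_HITS_PER_SEC / avg_hits_per_sec

# Implied average image size at peak, assuming both peaks coincide.
avg_bytes_per_image = PEAK_BITS_PER_SEC / 8 / PEAK_HITS_PER_SEC

print(f"average hits/sec: {avg_hits_per_sec:,.0f}")
print(f"peak / average:   {peak_to_avg:.2f}")
print(f"bytes per image:  {avg_bytes_per_image:,.0f}")
```

This works out to roughly 29k hits/sec on average, a peak about 2.4 times the average, and on the order of 10 KB per image.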

Page 48: Scalability -

What is a CDN?

Content Delivery Network

Page 49: Scalability -

What is a CDN?

• Examples: Akamai, Limelight
  – also dozens more, including Amazon
• Big distributed, object cache
• Pay per use
  – either per request, per TB transfer, or per peak Mbps
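
The three billing models can be compared with simple helpers. The functions below are a sketch; the pricing units (per 10,000 requests, per decimal terabyte, per Mbps of peak) are assumptions for illustration, and no real CDN price is implied.

```python
def cost_per_request(requests: float, price_per_10k: float) -> float:
    """Bill by number of requests served."""
    return requests / 10_000 * price_per_10k


def cost_per_tb(bytes_transferred: float, price_per_tb: float) -> float:
    """Bill by volume transferred (decimal terabytes)."""
    return bytes_transferred / 1e12 * price_per_tb


def cost_per_peak_mbps(peak_bits_per_sec: float, price_per_mbps: float) -> float:
    """Bill by peak bandwidth over the billing period."""
    return peak_bits_per_sec / 1e6 * price_per_mbps
```

Which model is cheapest depends on the traffic shape: many small objects push up a per-request bill, large objects push up a per-TB bill, and spiky traffic pushes up a peak-Mbps bill.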

Page 50: Scalability -

What is a CDN?

• Advantages:
  – Outsource dev and infrastructure
  – Geographically distributed
  – Economies of scale
• Disadvantages:
  – High cost
  – Less control and transparency
  – Commitments

Page 51: Scalability -

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Page 52: Scalability -

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Page 53: Scalability -
Page 54: Scalability -

Monitor Performance from Client

• Closer to performance experienced by end-user

• Only way to get view of network issues faced by users (i.e. last mile)

Page 55: Scalability -
Page 56: Scalability -

How to fix slow ISP?

• Choose better transit provider
• Set-up peering (or get CDN too)
• Traffic management

Page 57: Scalability -

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Page 58: Scalability -
Page 59: Scalability -
Page 60: Scalability -

Quality of End-User Experience vs. Cost

Page 61: Scalability -

We use multiple CDNs, and shift content based on price/performance.

Page 62: Scalability -

Know your content

Page 63: Scalability -

Know your content

Page 64: Scalability -

Know your content

Page 65: Scalability -

Know your content

30

200

75

Page 66: Scalability -

Know your content

600

Page 67: Scalability -

Know your content

120

Page 68: Scalability -

Know your content

Page 69: Scalability -

Pre-fetch Content

• Exploit predictable user behavior
• Ex: clicking to next photo in an album
• Simple solution – load next image hidden
• Client browser will cache it (next response < 100 ms)
• Increase tolerance for slow response time
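
The simple solution can be modeled outside the browser to show why it helps. In the sketch below, a dictionary stands in for the browser cache and `fetch` stands in for a network request; in a real page this would be a hidden image element rather than Python, and the class and method names are invented for the example.

```python
class AlbumViewer:
    """Toy model of a photo viewer that pre-fetches the next image."""

    def __init__(self, photo_urls, fetch):
        self.photo_urls = photo_urls
        self.fetch = fetch      # callable: url -> image bytes (the slow part)
        self.cache = {}         # stands in for the browser cache
        self.cache_hits = 0

    def _load(self, index):
        url = self.photo_urls[index]
        if url not in self.cache:
            self.cache[url] = self.fetch(url)
        return self.cache[url]

    def show(self, index):
        # A hit means the user does not wait on the network for this photo.
        if self.photo_urls[index] in self.cache:
            self.cache_hits += 1
        image = self._load(index)
        # Pre-fetch the next photo "hidden" so clicking Next is served
        # from cache.
        if index + 1 < len(self.photo_urls):
            self._load(index + 1)
        return image
```

When a user clicks straight through an album, every photo after the first is already cached by the time it is shown, at the price of one extra fetch if the user stops early.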

Page 70: Scalability -

Pre-fetch Content

• More complex solution
  – Pre-fetch next canvas (full html), render in background – rotate in on Next
• Even more complex
  – Instantiate HTML template w/ data on client
  – Pre-fetch data X photos in advance, render Y templates in advance with this data

Page 71: Scalability -

Pre-fetch Content

Problems:
• Rendering still takes time
• Increases browser load
• Need to set cache headers correctly

Page 72: Scalability -

Image delivery

Small images: High request, low volume
  – Most cost-effective to cache in memory

Large images: High volume, low requests, greater tolerance for latency

Page 73: Scalability -

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Page 74: Scalability -

Monitor Performance from Client

cold servers online

Page 75: Scalability -

More

• jobs.tuenti.com
• dev.tuenti.com

Page 76: Scalability -

Q & A