Scalability

Erik Schultink

International Week of Tech Innovation – 21 Apr, 2010

What is Tuenti.com?

Tuenti.com

• Started 2007
• 1:6 pages, 1:10 minutes
• Based in Madrid
• ~130 employees, 60 engineers

INTRO

What is a scalable system?

Scalability is throughput, not response time

[Figure: response time plotted against requests/second]

The Problem: Concurrency

25k pageviews/second at peak

Variables:
1. code / architecture
2. machines

[Figure: response time plotted against requests/second, comparing x machines with 2x machines]

THE DATABASE TIER

The Solution: Partition

Technologies

• MySQL
  – simple RDBMS
  – InnoDB
• Memcache
• Lighttpd
• PHP

The Solution: Partition

• Work must be structured such that each resource can complete it independently

• Overhead to divide workload

Data architecture

• Look at the queries you perform.
• Divide data such that each query can be answered by querying no more than 1 partition.

Comments on a profile

Comments (user_id, author_id, comment)

• Post a comment on a user’s profile
• Get list of comments on a user’s profile
• Delete a comment from a user’s profile

Give up for now:
• Comments written by a user

Comments on a profile

Partition by user

Costs:
• Determining partition of a user
  – constant
• Consistency check on access that author still exists
  – linear on number of comments to display
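A minimal sketch of the partition-by-user layout, in Python with in-memory dictionaries standing in for database partitions. The partition count, the modulo rule in `partition_for`, and the `author_exists` callback are assumptions made for illustration; the slides only state that finding a user's partition costs constant time and that the author check is linear in the comments displayed.

```python
NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(user_id: int) -> int:
    """Constant-time choice of the partition that owns a user's profile (assumed rule)."""
    return user_id % NUM_PARTITIONS


class CommentStore:
    """In-memory stand-in for Comments(user_id, author_id, comment), split by user_id."""

    def __init__(self) -> None:
        # One dict per partition: user_id -> list of (author_id, comment) rows.
        self.partitions = [{} for _ in range(NUM_PARTITIONS)]

    def post(self, user_id, author_id, comment):
        rows = self.partitions[partition_for(user_id)].setdefault(user_id, [])
        rows.append((author_id, comment))

    def list_for_profile(self, user_id, author_exists):
        rows = self.partitions[partition_for(user_id)].get(user_id, [])
        # Consistency check: one author lookup per comment shown (linear cost).
        return [(a, c) for a, c in rows if author_exists(a)]

    def delete(self, user_id, author_id, comment):
        self.partitions[partition_for(user_id)][user_id].remove((author_id, comment))
```

All three supported operations touch exactly one partition. Listing the comments written by a given author would have to scan every partition, which is the query given up above.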

The Solution: Partition

Constant overhead

Alternative Solution

Partition by user, duplicate by author

Comments(user_id, author_id, comment)

AuthoredComments(author_id, user_id, comment_id)

Costs:
• double writes
• extra storage
• delete by author still very expensive
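The costs can be seen in a small Python sketch of the two tables. The dictionaries stand in for partitioned tables, and the generated `comment_id` is an assumption: the slides show the column in AuthoredComments but not how it is produced.

```python
from itertools import count


class DualIndexComments:
    """Comments partitioned by user_id, duplicated into AuthoredComments by author_id."""

    def __init__(self) -> None:
        self._ids = count(1)  # hypothetical comment_id generator
        self.comments = {}    # user_id -> {comment_id: (author_id, comment)}
        self.authored = {}    # author_id -> {comment_id: user_id}

    def post(self, user_id, author_id, comment):
        comment_id = next(self._ids)
        # Double write: one row per table, usually on different partitions.
        self.comments.setdefault(user_id, {})[comment_id] = (author_id, comment)
        self.authored.setdefault(author_id, {})[comment_id] = user_id
        return comment_id

    def written_by(self, author_id):
        """Comments written by a user: one index lookup, then one read per comment."""
        index = self.authored.get(author_id, {})
        return [(uid, self.comments[uid][cid][1]) for cid, uid in sorted(index.items())]

    def delete_by_author(self, author_id):
        """Delete everything an author wrote; returns the profiles that had to be touched."""
        touched = set()
        for cid, uid in self.authored.pop(author_id, {}).items():
            del self.comments[uid][cid]
            touched.add(uid)
        return touched
```

`delete_by_author` still visits the partition of every profile the author commented on, which is why the slide lists it as very expensive.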

THE WEB SERVER TIER

Traditional Systems Architecture

[Diagram: www.tuenti.com, a load balancer, and the web server farm behind it]

Concurrency

The Solution: Partition

Traditional Systems Architecture

[Diagram: www.tuenti.com served through two load balancers (12.45.34.178 and 12.45.34.179), each in front of a web server farm]

AJAX

• What is AJAX?
  – “Asynchronous JavaScript and XML”
• Paradigm for client-server interaction
• Change state on client, without loading a complete HTML page

Traditional HTML Browsing
1. User clicks link
2. Browser sends request
3. Server receives, parses request, generates response
4. Browser receives response and begins rendering
5. Dependent objects (images, js, css) load and render
6. Page appears

AJAX Browsing
1. User clicks link
2. Browser sends request
3. Server receives, parses request, generates response
4. Browser receives response and begins rendering
5. Dependent objects (images, js, css) load and render
6. Page appears

How does Tuenti use AJAX?

• The only full page loads are the login and home pages
• Loader pulls in all JS/CSS
• Afterwards stay within one HTML page, rotating canvas area content

Balancing Load

• Top-level requests to www.tuenti.com
• Each request tells client which farm it should be using, based on a mapping
• Mapping can be changed to balance load, perform maintenance, etc.
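One way such a mapping could look, sketched in Python. The bucket scheme (user id modulo a fixed number of buckets, each bucket assigned to a farm) is an assumption for illustration, not the mapping Tuenti describes; only the wwwb1 to wwwb4 hostnames come from the slides.

```python
FARMS = [
    "wwwb1.tuenti.com",
    "wwwb2.tuenti.com",
    "wwwb3.tuenti.com",
    "wwwb4.tuenti.com",
]


class FarmMapping:
    """Maps users to farms through buckets, so load can be moved one bucket at a time."""

    def __init__(self, farms, buckets=8):
        self.buckets = buckets
        # bucket -> farm hostname, spread round-robin to start with (assumed policy).
        self.bucket_to_farm = {b: farms[b % len(farms)] for b in range(buckets)}

    def farm_for(self, user_id):
        """What a top-level request would tell the client to use for later requests."""
        return self.bucket_to_farm[user_id % self.buckets]

    def drain(self, farm, replacement):
        """Point every bucket on `farm` at `replacement`, e.g. for maintenance."""
        for bucket, current in list(self.bucket_to_farm.items()):
            if current == farm:
                self.bucket_to_farm[bucket] = replacement
```

Because the client learns its farm from the top-level response, changing the mapping is enough to move traffic; no farm-level configuration has to change.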

Client-side Routing

[Diagram: www.tuenti.com directs clients to four farms, each a load balancer in front of a web server farm: wwwb1.tuenti.com, wwwb2.tuenti.com, wwwb3.tuenti.com, wwwb4.tuenti.com]

Linearly scalable …

Linearly scalable … except for top level

lots of content creation = lots of dynamic data

lots of dynamic data = lots of cache = internal network traffic

[Diagram: the web server farms share one cache farm]

Partition cache

Route requests to a farm near the cache needed to respond

[Diagram: each web server farm paired with its own cache farm]

Internal network savings
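A rough Python illustration of where the savings come from. It assumes each request needs the cached data of a single profile and that cache partitions are assigned by profile id modulo the number of farms; both are simplifications chosen for the sketch, and the function names are hypothetical.

```python
NUM_FARMS = 4  # one cache farm per web server farm, as in the diagram


def cache_partition(profile_id):
    """Farm whose local cache holds this profile's data (assumed rule)."""
    return profile_id % NUM_FARMS


def route_by_viewer(viewer_id, profile_id):
    """Baseline: pick the farm from the viewer, ignoring where the data is cached."""
    return viewer_id % NUM_FARMS


def route_by_cache(viewer_id, profile_id):
    """Cache-aware: send the request to the farm next to the cache it needs."""
    return cache_partition(profile_id)


def cross_farm_fetches(requests, route):
    """Count requests served by a farm other than the one holding the cached data."""
    return sum(
        1
        for viewer_id, profile_id in requests
        if route(viewer_id, profile_id) != cache_partition(profile_id)
    )
```

For the requests `[(1, 2), (2, 2), (3, 4), (5, 6)]`, given as (viewer, profile) pairs, the baseline makes three cross-farm cache fetches and the cache-aware route makes none.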

SERVER-SIDE GAIN?

CONTENT DELIVERY

Image Serving

• Tuenti serves ~2.5 billion images/day
• At peak, this is >6 Gbps and >70k hits/sec
• We use CDNs
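The figures above imply a few derived numbers, worked out here in Python. The slide gives the peak values as lower bounds, so the results are rough; the sketch also assumes decimal units (1 Gbps = 10^9 bits per second).

```python
IMAGES_PER_DAY = 2.5e9      # ~2.5 billion images/day (from the slide)
PEAK_HITS_PER_SEC = 70_000  # >70k hits/sec at peak (lower bound)
PEAK_BITS_PER_SEC = 6e9     # >6 Gbps at peak (lower bound)

SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate over a whole day.
avg_hits_per_sec = IMAGES_PER_DAY / SECONDS_PER_DAY

# How much busier the peak is than the daily average.
peak_to_avg = PEAK_HITS_PER_SEC / avg_hits_per_sec

# Average bytes delivered per image request at peak.
avg_bytes_per_image = PEAK_BITS_PER_SEC / 8 / PEAK_HITS_PER_SEC

print(f"average: {avg_hits_per_sec:,.0f} hits/sec")
print(f"peak is about {peak_to_avg:.1f}x the average")
print(f"about {avg_bytes_per_image / 1000:.1f} KB per image at peak")
```

The daily average is roughly 29k hits/sec, the peak is about 2.4 times that, and the average response at peak is on the order of 10 KB, which suggests most requests are for small images.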

What is a CDN?

Content Delivery Network

What is a CDN?

• Examples: Akamai, Limelight
  – also dozens more, including Amazon
• Big, distributed object cache
• Pay per use
  – either per request, per TB transfer, or per peak Mbps

What is a CDN?

• Advantages:
  – Outsource dev and infrastructure
  – Geographically distributed
  – Economies of scale
• Disadvantages:
  – High cost
  – Less control and transparency
  – Commitments

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Monitor Performance from Client

• Closer to the performance experienced by the end user

• Only way to get a view of the network issues faced by users (i.e. the last mile)

How to fix slow ISP?

• Choose better transit provider
• Set up peering (or get CDN to)
• Traffic management

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Quality of End-User Experience vs. Cost

We use multiple CDNs, and shift content based on price/performance.
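A hedged Python sketch of what a price/performance rule might look like. The policy (cheapest CDN that meets a latency target, otherwise the fastest one), the `CdnQuote` fields, and every name and number in it are hypothetical; the slides do not say how Tuenti makes this decision.

```python
from dataclasses import dataclass


@dataclass
class CdnQuote:
    name: str           # hypothetical CDN label
    cost_per_tb: float  # price per TB transferred (made-up unit and value)
    p95_ms: float       # 95th percentile response time measured from clients


def pick_cdn(quotes, max_p95_ms):
    """Cheapest CDN that meets the latency target, else the fastest available."""
    eligible = [q for q in quotes if q.p95_ms <= max_p95_ms]
    if eligible:
        return min(eligible, key=lambda q: q.cost_per_tb)
    return min(quotes, key=lambda q: q.p95_ms)


# Illustrative values only, not measurements.
quotes = [
    CdnQuote("cdn_a", cost_per_tb=40.0, p95_ms=120.0),
    CdnQuote("cdn_b", cost_per_tb=25.0, p95_ms=180.0),
    CdnQuote("cdn_c", cost_per_tb=30.0, p95_ms=140.0),
]
```

Loosening the latency target lets cheaper CDNs qualify, which is the quality-versus-cost trade-off in miniature.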

Know your content

[Figures: example images labeled 30, 200, 75, 600, and 120; units not stated]

Pre-fetch Content

• Exploit predictable user behavior
• Ex: clicking to next photo in an album
• Simple solution – load next image hidden
• Client browser will cache it (next response < 100 ms)
• Increase tolerance for slow response time
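The simple solution can be simulated in a few lines of Python. In production this logic would run in the browser; here a dictionary stands in for the browser cache and the `fetch` callable stands in for a network request, both assumptions made for the sketch.

```python
class PrefetchingViewer:
    """Simulates an album viewer that loads the next photo while the current one is shown."""

    def __init__(self, album, fetch):
        self.album = album      # ordered list of image URLs
        self.fetch = fetch      # callable url -> bytes (stand-in for the network)
        self.cache = {}         # stand-in for the browser cache
        self.network_waits = 0  # times the user had to wait on the network

    def _load(self, url):
        if url not in self.cache:
            self.cache[url] = self.fetch(url)

    def show(self, index):
        url = self.album[index]
        if url not in self.cache:
            # Cache miss: the user sees the full network latency.
            self.network_waits += 1
            self._load(url)
        # Load the next image hidden, so clicking Next is served from cache.
        if index + 1 < len(self.album):
            self._load(self.album[index + 1])
        return self.cache[url]
```

Paging through a three-photo album in order, only the first photo makes the user wait; the other two are already cached when requested.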

Pre-fetch Content

• More complex solution
  – Pre-fetch next canvas (full html), render in background – rotate in on Next
• Even more complex
  – Instantiate HTML template w/ data on client
  – Pre-fetch data X photos in advance, render Y templates in advance with this data

Pre-fetch Content

Problems:
• Rendering still takes time
• Increases browser load
• Need to set cache headers correctly

Image delivery

Small images: High request, low volume
  – Most cost-effective to cache in memory

Large images: High volume, low requests, greater tolerance for latency

What affects image load time?

• Client internet connection
• Response time of CDN
• CDN cache hit rate

Monitor Performance from Client

[Figure: client-side performance graph annotated "cold servers online"]

More

• jobs.tuenti.com
• dev.tuenti.com

Q & A