The Web Scale

Tuenti architecture to withstand 1500+ million pageviews / day

Guillermo Pérez - bisho@tuenti.com Security & Backend Architecture Tech Lead

What is a scalable system?

What is scalability?

Some Tuenti stats

Tuenti Stats

13M users
REALLY ACTIVE

50%+ active weekly
>1h browsing per DAY!

Tuenti Stats

- Each month, over:
  - 40,000 M pageviews
  - 50,000 M requests
  - 100 M new photos
  - 2,000+ TB of photos served

- On peaks:
  - 1,600 million pageviews/day
  - 35,000 requests/second
  - 6,000 million served photos/day
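As a rough sanity check on the peak figures above, the daily totals can be converted to average per-second rates. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the result is an average over a 24-hour day, so the real peaks within that day would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Peak-day figures quoted on the slide.
peak_pageviews_per_day = 1_600_000_000
peak_photos_served_per_day = 6_000_000_000


def average_per_second(daily_total: int) -> float:
    """Average rate over a full day; intra-day peaks will exceed this."""
    return daily_total / SECONDS_PER_DAY


avg_pageviews_per_second = average_per_second(peak_pageviews_per_day)
avg_photos_per_second = average_per_second(peak_photos_served_per_day)
```

Averaged over the day, that is roughly 18,500 pageviews and 69,400 served photos every second.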

Tuenti Stats

- 1200+ servers
  - ~500 FEs (frontends)
  - ~300 DBs
  - ~100 MCs (Memcache servers)
  - ~100 image servers
  - Others: Chat, HBase, Queues, Processors...

How to scale?

No silver bullet

Monitor
Know your tools
Evolve, iterate
Learn

Monitoring

- Your crystal ball!
  - Glimpse of the future
  - Answer questions
- Detect bottlenecks
- Detect what needs to be optimized
  - The 90/10 rule
  - No premature optimization
- Detect bad usages
- Detect browser patterns
- Detect changes, issues

Know your tools

- Stop reading blogs
- Read internals documentation
- Test software
- Test hardware
- Experiment

Know your tools

- MySQL (InnoDB) IS fast
  - photos table (photo_id, user_id, ...)
    - Option 1: PK photo_id, KEY user_id
    - Option 2: PK user_id, photo_id, KEY photo_id
  - Usage: select * from photos where user_id = X
    - sorting
    - covering index
  - Even NoSQL :)
  - Hardware limits, replication
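The second key layout on the slide above can be tried out in a small sketch. InnoDB stores rows in primary-key order, so with `PRIMARY KEY (user_id, photo_id)` all photos of one user sit together and already sorted by `photo_id`. The example below uses Python's built-in `sqlite3` purely as a stand-in for MySQL (an assumption made so the sketch runs anywhere); `WITHOUT ROWID` is SQLite's way of clustering a table by its primary key. The `title` column, the sample rows, and the function names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE photos (
    user_id  INTEGER NOT NULL,
    photo_id INTEGER NOT NULL,
    title    TEXT,                       -- hypothetical extra column
    PRIMARY KEY (user_id, photo_id)      -- rows of one user are stored together
) WITHOUT ROWID;                         -- SQLite: cluster rows by primary key
CREATE INDEX photos_by_photo_id ON photos (photo_id);
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [(2, 10, "b"), (1, 11, "c"), (1, 7, "a"), (2, 3, "d")]
    conn.executemany(
        "INSERT INTO photos (user_id, photo_id, title) VALUES (?, ?, ?)", rows
    )
    return conn


def photos_of(conn: sqlite3.Connection, user_id: int) -> list:
    """The common query: all photos of one user, ordered by photo_id."""
    cur = conn.execute(
        "SELECT photo_id, title FROM photos WHERE user_id = ? ORDER BY photo_id",
        (user_id,),
    )
    return cur.fetchall()


def photo_by_id(conn: sqlite3.Connection, photo_id: int):
    """The rarer lookup by photo_id goes through the secondary index."""
    cur = conn.execute(
        "SELECT user_id, title FROM photos WHERE photo_id = ?", (photo_id,)
    )
    return cur.fetchone()
```

The design choice is to make the most frequent query (photos of a user) a single range scan on the primary key, and to pay for the secondary index only on the less frequent lookup by `photo_id`.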

Know your tools

- Memcache
  - Tons of persistent TCP conns eat your RAM
    - UDP performance issues
    - Single thread for UDP
    - Multiport patch
    - Proxies
  - Stresses the network to the max
    - Driver issues, configuration
    - Variable performance with net devices

Know your tools

- NoSQL
  - Not magic!
  - Good for heavy write loads
  - Good for data processing
  - Still needs tweaking: partitioning, schemas

Evolve, iterate

- All architectures scale up to a certain point
- Then you must rethink everything
  - Then, and only then!
  - Remember premature optimization?
  - Scale != efficient
  - Future is hard to predict

  

Learn

Learn from:
- Experience
- Failure
- Others

Architecture

- Basic rules:
  - Static: Add layers (easy caching)
  - Dynamic: Move responsibility to edges
  - General: Decentralize, redundancy

 

Architecture

- Design for failure:
  - Support disabling
  - Graceful degradation, fallbacks
  - Controlled launches
- Test with dark launches
- Think about storage operations
- Be able to migrate live
- Focus on your core, use CDNs
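The "design for failure" points above (support disabling, fallbacks, controlled launches) could be sketched as a per-feature switch. This is a hypothetical illustration, not Tuenti's actual mechanism: `FeatureSwitch`, `call_with_fallback`, and the percentage-based rollout are all assumed names and choices.

```python
import zlib


class FeatureSwitch:
    """Hypothetical per-feature switch: kill switch plus percentage rollout."""

    def __init__(self, name: str, enabled: bool = True, rollout_percent: int = 100):
        self.name = name
        self.enabled = enabled                  # "support disabling"
        self.rollout_percent = rollout_percent  # "controlled launches"

    def is_on_for(self, user_id: int) -> bool:
        if not self.enabled:
            return False
        # Stable bucket in [0, 100): the same user always gets the same answer.
        bucket = zlib.crc32(f"{self.name}:{user_id}".encode()) % 100
        return bucket < self.rollout_percent


def call_with_fallback(switch: FeatureSwitch, user_id: int, primary, fallback):
    """Run `primary` if the feature is on; use `fallback` otherwise or on error."""
    if not switch.is_on_for(user_id):
        return fallback()
    try:
        return primary()
    except Exception:
        # Degrade instead of failing the whole page.
        return fallback()
```

For example, a page could call `call_with_fallback(chat_switch, user_id, render_chat, render_chat_unavailable)` (both render functions being placeholders), so that turning the switch off, or an outage in the chat service, degrades one widget instead of the whole page.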

Architecture

- Move work to the browser:
  - Request routing
  - Templates
  - Cache
  - Prefetch
- Move remaining to your FEs:
  - Data relations
  - Consistency
  - Privacy, access check
  - Live migrations
  - Knowledge of the storage infrastructure

Architecture

- All teams involved
  - Frontend: Good JS, templating, caching, prefetching
  - Backend: Data design, parallelization, optimizations
  - Systems: Iron benchmarks, tuning, networking

Dynamic site example

Scaling a website

- Setup: 1 server
- Bottleneck: CPU
- Solution: Add frontends
- Changes: Share sessions

Scaling a website

- Setup: N frontends, 1 DB
- Bottleneck: DB Reads
- Solution: Add DB slaves
- Changes: Split reads to slaves or DB proxy
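The "split reads to slaves" change above can be sketched as a small router that picks a connection per statement. This is a minimal sketch under assumptions: `ReadWriteRouter` is a hypothetical name, connections are represented by plain strings, and classifying a statement by its leading `SELECT` is a simplification.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no slaves are configured.
        self._read_pool = itertools.cycle(slaves or [master])

    def connection_for(self, sql: str):
        """Pick a connection: round-robin slave for SELECT, master otherwise."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_pool) if is_read else self.master
```

Because replication is asynchronous, a read sent to a slave right after a write may return stale data, so reads that must see the latest write still need to go to the master.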

Scaling a website

- Setup: N frontends, 1 DB Master + N Slaves
- Bottleneck: Limited # of slaves, so DB Reads
- Solution: Chain replication / Add cache layer
- Changes: Big ones!
  - Some caches in certain places is easy
  - But for a dynamic app, Memcache as storage
  - Makes your DB not relational
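One common way to add the cache layer mentioned above is the cache-aside pattern: read from the cache first, fall back to the database on a miss, and invalidate the cached entry on every write. The sketch below is an assumption about how this might look, not the deck's implementation; plain dictionaries stand in for Memcache and the database, and `CachedUserStore` is a hypothetical name.

```python
class CachedUserStore:
    """Cache-aside access to user rows; dicts stand in for Memcache and the DB."""

    def __init__(self, db: dict, cache: dict):
        self.db = db
        self.cache = cache
        self.db_reads = 0  # counter to make the effect of the cache visible

    def get(self, user_id: int):
        key = f"user:{user_id}"
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        row = self.db.get(user_id)
        if row is not None:
            self.cache[key] = row
        return row

    def update(self, user_id: int, row: dict) -> None:
        self.db[user_id] = row
        # Invalidate instead of updating, so the next read reloads fresh data.
        self.cache.pop(f"user:{user_id}", None)
```

Note how every access goes through a key (`user:<id>`): once most reads are served this way, joins and ad hoc queries no longer fit, which is the sense in which the cache makes the data model less relational.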

Scaling a website

- Setup: N FEs, 1 DB Master + N Slaves, Caches
- Bottleneck: DB Writes
- Solution: Split tables into DB clusters
- Changes: Add some DB abstraction

Scaling a website

- Setup: N FEs, N DB clusters, Caches
- Bottleneck: DB Writes on certain table
- Solution: Partition tables
- Changes: DB abstraction and big changes
  - DB no longer relational, more key based
  - Partition key limits queries
  - Denormalization, duplication
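The point that the partition key limits queries can be made concrete with a toy example. In this sketch (all names hypothetical, dictionaries standing in for database partitions, and modulo as an assumed partitioning function), photos are partitioned by `user_id`: a query by user touches one partition, while a query by `photo_id` alone has to ask every partition.

```python
class PartitionedPhotos:
    """Toy photos table partitioned by user_id across in-memory shards."""

    def __init__(self, shard_count: int):
        # Each shard maps (user_id, photo_id) -> row.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, user_id: int) -> int:
        return user_id % len(self.shards)

    def insert(self, user_id: int, photo_id: int, row: dict) -> None:
        self.shards[self.shard_for(user_id)][(user_id, photo_id)] = row

    def photos_of_user(self, user_id: int) -> list:
        """Query by partition key: exactly one shard is consulted."""
        shard = self.shards[self.shard_for(user_id)]
        return sorted(pid for (uid, pid) in shard if uid == user_id)

    def find_photo(self, photo_id: int):
        """Query without the partition key: every shard must be scanned."""
        for index, shard in enumerate(self.shards):
            for (uid, pid), row in shard.items():
                if pid == photo_id:
                    return index, uid, row
        return None
```

Avoiding the fan-out in `find_photo` typically means keeping a second, duplicated mapping from `photo_id` to `user_id`, which is the denormalization the slide refers to.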

Scaling a website

- Setup: N FEs, N partitioned DBs, Caches
- Bottleneck: Disk space, DB cost
- Solution: Archive tables
- Changes: DB abstraction + migration scripts

Scaling a website

- Setup: N FEs, N partition+archive DBs, Cache
- Bottleneck: Internal network traffic
- Solution: 2 level caches, split services, cache affinity
- Changes: Cache abstraction, browsers
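A two-level cache, as named in the solution above, might look like the following sketch: a small in-process cache on each frontend sits in front of the shared cache, so repeated lookups of hot keys never cross the network. The class name, the dictionary standing in for the shared cache, and the simple oldest-first eviction are assumptions for illustration.

```python
class TwoLevelCache:
    """Small local cache (level 1) in front of a shared cache (level 2)."""

    def __init__(self, shared_cache: dict, local_capacity: int = 1000):
        self.local = {}             # per-process, no network round trip
        self.shared = shared_cache  # stand-in for a remote Memcache pool
        self.local_capacity = local_capacity
        self.shared_lookups = 0     # counts simulated network round trips

    def get(self, key):
        if key in self.local:
            return self.local[key]
        self.shared_lookups += 1
        value = self.shared.get(key)
        if value is not None:
            self._remember(key, value)
        return value

    def _remember(self, key, value) -> None:
        if len(self.local) >= self.local_capacity:
            # Evict the oldest inserted entry (dicts preserve insertion order).
            self.local.pop(next(iter(self.local)))
        self.local[key] = value
```

The trade-off is staleness: a local copy does not see later changes in the shared cache, so a real setup would need short expiry times or explicit invalidation for the local level.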

Scaling a website

- Setup: N FEs, N partition+archive DBs, multilayered Cache, services
- Bottleneck: Datacenter
- Solution:
  - Split services
  - Partition user data
- Changes: Big ones!
  - Greater replication lags, inconsistencies

The Tuenti Backend Framework

Backend Framework

- Our mission: Provide an easy-to-use, productive, easy-to-debug, testable, fast, extensible, customizable, deterministic, reusable, instrumented (stats) framework and tools to ease developers' daily work and manage the infrastructure.

Backend Framework

- From request routing to storage
- Simple layers, clean responsibilities
- Clean, organized codebase
- Using:
  - convention over configuration
  - configuration over coding
- Queuing system for async execution
- Gathering stats from all levels

Backend Framework

- Request routing:
  - Multiple entry points
  - Fast request parsers route to Agents
  - Data centric agents
  - Printers

Backend Framework

- Domain API:
  - Expose top-level business actions
  - Clean, semantic API
  - No state, no magic, all data in params
  - Check privacy (the right place!)

Backend Framework

- Domain Backend:
  - Implement public/internal business actions
  - Clean, semantic API
  - No state, no magic, all data in params
  - Coordinate transactions
  - No privacy
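The split between the two layers above (the API checks privacy, the backend performs the action without privacy checks) could look roughly like this. It is a hypothetical Python sketch, not the Tuenti framework itself: the class names, the in-memory data, and the "owner or friend may view" rule are all assumptions.

```python
class PermissionDenied(Exception):
    pass


class PhotoBackend:
    """Backend layer: performs the action and assumes privacy was checked."""

    def __init__(self, photos: dict):
        self._photos = photos  # {photo_id: {"owner_id": ..., "title": ...}}

    def get_photo(self, photo_id: int) -> dict:
        return self._photos[photo_id]


class PhotoApi:
    """API layer: no per-request state, all inputs as params, checks privacy."""

    def __init__(self, backend: PhotoBackend, friendships: set):
        self._backend = backend
        self._friendships = friendships  # set of frozenset({user_a, user_b})

    def get_photo(self, viewer_id: int, photo_id: int) -> dict:
        photo = self._backend.get_photo(photo_id)
        owner_id = photo["owner_id"]
        # Assumed privacy rule: only the owner or a friend may view a photo.
        is_friend = frozenset((viewer_id, owner_id)) in self._friendships
        if viewer_id != owner_id and not is_friend:
            raise PermissionDenied(f"user {viewer_id} cannot view photo {photo_id}")
        return photo
```

Keeping the check in one layer means internal callers can reuse the backend freely, while everything exposed to a request passes through exactly one place that enforces privacy.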

Backend Framework

- Domain Storages (ORM like)
  - Configure storage access for a table
    - Fields, validation, partitioning, primary key, caching techniques, custom queries.
  - Provide access to storage via standard APIs:
    - CRUD actions
    - Cached Lists
    - Cached Queries
    - + Custom
  - Data container

 

Backend Framework

- Storage Strategies
  - CRUD
  - Cached Lists
  - Cached Queries
  - CUD Observers for custom actions

  

Backend Framework

- Storage Service
  - Provides access to the different storage services:
    - mysql, memcache, hbase...
  - Coordinates transactions
  - Abstract the infrastructure complexities:
    - partitioning, read/write, weights, hosts
  - Handles transactions

Backend Framework

- Storage Services (concrete ones)
  - Abstract the infrastructure complexities:
    - partitioning, read/write, weights, hosts
  - API close to real one:
    - Memcache: set, get, cas...
    - Mysql: insert, select, update...
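Of the Memcache operations listed above, `cas` (check-and-set) is the least obvious: it stores a value only if the key has not changed since it was read. The sketch below shows the usual read, modify, retry loop. `FakeMemcache` is an in-memory stand-in written for this example, not a real client library, and `increment_with_cas` is a hypothetical helper.

```python
class FakeMemcache:
    """In-memory stand-in exposing set / gets / cas with version tokens."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def set(self, key, value) -> None:
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def gets(self, key):
        """Return (value, cas_token), or (None, None) if the key is missing."""
        return self._data.get(key, (None, None))

    def cas(self, key, value, token) -> bool:
        """Store only if nobody changed the key since `token` was read."""
        current = self._data.get(key)
        if current is None or current[1] != token:
            return False
        self._data[key] = (value, token + 1)
        return True


def increment_with_cas(cache, key, max_retries: int = 5) -> int:
    """Increment a counter without locks: read, modify, cas, retry on conflict."""
    for _ in range(max_retries):
        value, token = cache.gets(key)
        if token is None:
            cache.set(key, 0)  # initialize a missing counter, then retry
            continue
        if cache.cas(key, value + 1, token):
            return value + 1
    raise RuntimeError(f"too much contention on {key!r}")
```

If another writer changes the key between `gets` and `cas`, the `cas` call returns `False` and the loop simply reads the new value and tries again.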

Backend Framework

- Storage Drivers (concrete ones)
  - Read config
  - Manage PHP drivers
  - Enhance API

Love challenges?

We are hiring!
http://jobs.tuenti.com

And... stay tuned for our Tuenti Challenge 2!
http://contest.tuenti.net

Thanks!

Questions?

Guillermo Pérez - bisho@tuenti.com
Security & Backend Architecture Tech Lead

Images Creative Commons from flickr: heydanielle, eschipul, deanfotos66, nrbelex, mikolski, fdecomite, guldfisken
