
Scaling Drupal - The Worth Point Web Site


Scaling the WorthPoint web site

To 3,000,000 page views a day

DRAFT


Starting Point

• WorthPoint is built on the Drupal Content Management System

• Drupal is
– Open source
– LAMP (Linux, Apache, MySQL, PHP)
– Used by The Onion, MTV-UK, and LifetimeTV, among many others
• Professional support available from Acquia starting in the latter half of 2008


Background

• Drupal is an open source product that is being positioned to compete with enterprise class CMS products

• The Drupal community is working on multiple performance and scaling tasks

• WorthPoint is planning for 5M unique web pages and 3M page views (30/70 mix of dynamic/static) a day by the end of 2008
– This positions WorthPoint as one of the larger Drupal web sites
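To put the planning target in per-second terms, here is a rough back-of-the-envelope sketch. The 3M page views and the 30/70 dynamic/static mix come from the slide; the function name and the even spread over 24 hours are assumptions made for illustration, and real peak traffic will be higher than the daily average.

```python
# Back-of-the-envelope breakdown of the end-of-2008 traffic target.
PAGE_VIEWS_PER_DAY = 3_000_000   # target from the slide
DYNAMIC_SHARE = 0.30             # 30/70 dynamic/static mix from the slide
SECONDS_PER_DAY = 24 * 60 * 60


def traffic_breakdown(views_per_day, dynamic_share):
    """Split daily page views into dynamic/static and a daily average rate.

    Assumes traffic is spread evenly over the day, which understates peaks.
    """
    dynamic = round(views_per_day * dynamic_share)
    static = views_per_day - dynamic
    return {
        "dynamic_per_day": dynamic,
        "static_per_day": static,
        "avg_views_per_second": views_per_day / SECONDS_PER_DAY,
    }


breakdown = traffic_breakdown(PAGE_VIEWS_PER_DAY, DYNAMIC_SHARE)
print(breakdown)
```

On these assumptions the target works out to 900,000 dynamic and 2,100,000 static page views a day, or an average of just under 35 page views per second.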


Solution Clusters

• Drupal core / module rewrites & updates
• WorthPoint specific module development
• Database server scaling
• Application server scaling
• Image server scaling
• Search server scaling

• NOTE: These solutions are for a single data center based system


Drupal Core & Modules

• WorthPoint uses 35 core modules and 225 community contributed modules

• WorthPoint currently uses Drupal v5.1
– V6.0 has been released but the community has not updated many of the modules used by WorthPoint
– Acquia is working on “Carbon”, a fully tested and certified version of the 35 core modules
• Many Drupal v5.1 modules need tweaks to work at the current level of WorthPoint content and traffic


WorthPoint Module Development

• A vast majority of WorthPoint content and page views are in the Worthopedia, Auctions, Classified, and Taxonomy areas

• Ground up modules designed and developed by WorthPoint for these four areas would significantly reduce the load on the WorthPoint servers

• Initial design work is in progress
– Database?
– Language: C, Java, PHP?


Database Server Scaling

• MySQL reliably supports Master-Slave replication
– Master is the INSERT, UPDATE, DELETE database; Slaves are SELECT-only
– Current WorthPoint code allows one database slave to support 50,000 page views a day (1); end of 2008 goal is 100,000 page views a day per DB slave (2)
– At the goal rate, this means WorthPoint will have roughly 30 DB slaves at 3M page views a day with the current Drupal code (roughly 60 at today's 50,000 per slave)
• Beginning to partition / shard the database

(1) At the current mix of 20% dynamic and 80% static page views
(2) At an anticipated mix of 30% dynamic and 70% static page views
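The slave count and the master/slave split can be sketched in a few lines. This is a minimal illustration, not WorthPoint's code: `slaves_needed` and `ReplicatedRouter` are hypothetical names, the server names are placeholders, and the router ignores replication lag (a SELECT issued right after a write may see stale data on a slave).

```python
import itertools
import math


def slaves_needed(views_per_day, views_per_slave):
    """Number of read slaves needed to carry the daily page views."""
    return math.ceil(views_per_day / views_per_slave)


class ReplicatedRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def pick_server(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in self.WRITE_VERBS or self._slaves is None:
            return self.master
        return next(self._slaves)


# Capacity figures from the slide: 50,000 (current) and 100,000 (goal) per slave.
print(slaves_needed(3_000_000, 50_000))
print(slaves_needed(3_000_000, 100_000))

router = ReplicatedRouter("db-master", ["db-slave-1", "db-slave-2"])
print(router.pick_server("SELECT title FROM node WHERE nid = 1"))
print(router.pick_server("UPDATE node SET title = 'x' WHERE nid = 1"))
```

Routing on the first SQL keyword is the simplest possible rule; a production setup would also need to send transactions and lag-sensitive reads to the master.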


Application Server Scaling

• Load balance multiple application servers
• Move from Zend to Quercus
• Simple web objects cached with Squid
• Complex web objects cached with memcached
• Misc
– User sessions moved to memory
– Database connection pooling
– Use content delivery network for CSS & JavaScript
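Caching complex web objects typically follows a cache-aside pattern: look in the cache first, and only build the object (hitting the database) on a miss. The sketch below illustrates that pattern under stated assumptions: `TTLCache` is an in-process stand-in for a memcached client, not a real client library, and `cached_render`, the key format, and the 300-second lifetime are all hypothetical.

```python
import time


class TTLCache:
    """In-process stand-in for a memcached client (assumption for this sketch)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has outlived its time-to-live; drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def cached_render(cache, key, render, ttl_seconds=300):
    """Cache-aside: return the cached object, or build and store it on a miss."""
    value = cache.get(key)
    if value is None:
        value = render()
        cache.set(key, value, ttl_seconds)
    return value


# Usage: the expensive render function runs once; the second call is a cache hit.
render_calls = []


def render_item_page():
    render_calls.append(1)
    return "<div>rendered item page</div>"


page_cache = TTLCache()
first = cached_render(page_cache, "item:42", render_item_page)
second = cached_render(page_cache, "item:42", render_item_page)
print(first == second, len(render_calls))
```

The time-to-live bounds how stale a cached object can get; choosing it is a trade-off between database load and content freshness.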


Image Server Scaling

• Images on a SAN
• Two load balanced image servers
• Use a content delivery network
– Limelight (http://www.limelightnetworks.com/)
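On the application side, moving images to a content delivery network mostly means emitting image URLs that point at the CDN rather than at the origin image servers. A minimal sketch, assuming placeholder hostnames (`cdn.example.com`, `images.example.com`) and a hypothetical `image_url` helper; none of these names come from WorthPoint or Limelight:

```python
from urllib.parse import urljoin

CDN_BASE = "http://cdn.example.com/"        # hypothetical CDN hostname
ORIGIN_BASE = "http://images.example.com/"  # hypothetical load-balanced origin


def image_url(path, use_cdn=True):
    """Build the public URL for an image path, via the CDN or the origin."""
    base = CDN_BASE if use_cdn else ORIGIN_BASE
    # Strip the leading slash so the path is appended to the base URL.
    return urljoin(base, path.lstrip("/"))


print(image_url("/files/worthopedia/vase.jpg"))
print(image_url("/files/worthopedia/vase.jpg", use_cdn=False))
```

Keeping the switch in one helper makes it easy to fall back to the origin servers if the CDN is unavailable.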


Search Server Scaling

• SOLR
– Replication


Wildcards that may help with performance

• Cloud based solutions
• Falcon Storage Engine for MySQL, which may be ready for prime time before the end of the year
• A major computer industry player makes a commitment to Drupal and brings significant resources to bear


Summary

• While there is some additional analysis and work to be done, WorthPoint’s current software will scale to millions of page views a day with the addition of hardware

• The current challenge is to cost effectively scale
• The next challenge is to replicate and load balance the WorthPoint systems across multiple geographically distributed data centers, e.g. U.S. East Coast, U.S. West Coast, Europe, the Far East, etc.


Contact

Andy Forbes, [email protected]

Marc Benton, Director of Product [email protected]

Arman Anwar, Director of Systems [email protected]
