Scalability at GROU.PS Emre Sokullu



DESCRIPTION

How does GROU.PS scale to serving 1PB of assets each month? memcache, nginx, gearman, tornado, libevent, kqueue, epoll, mysql, sharding, replication, memcached, tokyo cabinet


Page 1: Scalability at GROU.PS

Scalability at GROU.PS

Emre Sokullu

Page 2: Scalability at GROU.PS

Disclaimer

• We’re not fully there yet
• We hire: [email protected]

Page 3: Scalability at GROU.PS

Challenges @ GROU.PS

• 3M unique visitors per month
• 120M page views
• 1PB assets to be served every month
– Video, Photos, Files
• Support for 5Gbit/s
• Very dynamic pages:
– With social networks; p(u,t) = HTML
– p(g,u,t) = HTML -> WHERE group_id = ? AND …
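As a rough sanity check on the figures above, the monthly totals can be converted into average rates. This is a back-of-the-envelope sketch that assumes a 30-day month and decimal units (1 PB = 10^15 bytes); neither assumption comes from the slides, and the helper names are made up for illustration.

```python
# Assumption (not from the slides): a 30-day month and decimal units.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def average_gbit_per_s(bytes_per_month: float) -> float:
    """Average throughput in Gbit/s needed to move this many bytes in a month."""
    return bytes_per_month * 8 / SECONDS_PER_MONTH / 1e9


def average_per_second(events_per_month: float) -> float:
    """Average events per second for a monthly total."""
    return events_per_month / SECONDS_PER_MONTH


asset_rate = average_gbit_per_s(1e15)        # 1 PB of assets per month
page_view_rate = average_per_second(120e6)   # 120M page views per month

print(f"{asset_rate:.2f} Gbit/s average, {page_view_rate:.0f} page views/s average")
```

Under these assumptions the average works out to roughly 3 Gbit/s and a few dozen page views per second. Real traffic is bursty, so peaks sit well above the average, which is consistent with planning for 5Gbit/s.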

Page 4: Scalability at GROU.PS

What is GROU.PS?

Page 5: Scalability at GROU.PS
Page 6: Scalability at GROU.PS
Page 7: Scalability at GROU.PS
Page 8: Scalability at GROU.PS
Page 9: Scalability at GROU.PS

Distributed Architecture

25+ servers, S3 cloud, EdgeCast CDN
4 cores +
All Linux: Red Hat
Some Debian, Ubuntu, CentOS

Page 10: Scalability at GROU.PS

Amazon Technologies

• S3
• CloudFront
• EC2 (elastic IP and persistent storage)
• SimpleDB
• Queue technologies, distributed hadoop and more…

Page 11: Scalability at GROU.PS

Amazon Technologies

• Downside:
– Not so cheap
– Bad database performance

Page 12: Scalability at GROU.PS

Serving Content?

• Use MogileFS
– Distributed file serving
• Use CDN
– Hot content served from local servers
• Sysctl tunings needed!

Page 13: Scalability at GROU.PS

Our typical sysctl additions

net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_synack_retries = 2
## Emre edited
# http://www.oracle-base.com/articles/11g/OracleDB11gR1InstallationOnFedora8.php
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
# semaphores: semmsl, semmns, semopm, semmni
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default=4194304
#net.core.rmem_max=4194304
net.core.wmem_default=262144
#net.core.wmem_max=262144
fs.file-max=5049800
vm.swappiness=10
## Emre edited
# from http://forums.softlayer.com/showthread.php?t=3252
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_window_scaling = 1
net.ipv4.ip_nonlocal_bind=1
# http://rackerhacker.com/2007/08/24/apache-no-space-left-on-device-couldnt-create-accept-lock/
kernel.msgmni = 1024
kernel.sem = 250 256000 32 1024
net.ipv4.ip_conntrack_max = 524288
net.ipv4.netfilter.ip_conntrack_max = 524288
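Note that the list above assigns kernel.sem twice. sysctl applies such a file from top to bottom, so the later assignment is the one left in effect. The sketch below parses sysctl-style lines and reports which keys get overridden; the helper name `effective_settings` and the shortened sample are illustrative, not part of the original setup.

```python
def effective_settings(text):
    """Return (final key -> value mapping, keys assigned more than once)."""
    settings = {}
    overridden = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (sysctl.conf allows '#' and ';').
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        key = key.strip()
        value = " ".join(value.split())  # normalise whitespace
        if key in settings and key not in overridden:
            overridden.append(key)
        settings[key] = value  # a later assignment replaces an earlier one
    return settings, overridden


# Shortened sample taken from the list above.
sample = """\
kernel.sem = 250 32000 100 128
#net.core.rmem_max=4194304
net.core.rmem_max = 8388608
kernel.sem = 250 256000 32 1024
"""

settings, overridden = effective_settings(sample)
print(overridden, settings["kernel.sem"])
```

For the sample, only kernel.sem is reported as overridden, and its effective value is the second assignment.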

Page 14: Scalability at GROU.PS

MySQL

• Load off via memcache
– $memcache->set("group_by_name.jtpd", 1122, false, 0);
– $memcache->set("home_module_html.1122", …, true, 30);
– function getGroupID($group_name) {
      global $memcache;
      if (!isset($memcache) || ($res = $memcache->get("group_by_name.{$group_name}")) === false) {
          // get it from mysql and memcache
      } else {
          return $res; // serve from memcache
      }
  }
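The lookup above follows the cache-aside pattern: try the cache first, fall back to the database on a miss, then store the result for the next request. A minimal sketch of that flow follows. `TTLCache` is an in-memory stand-in for a memcached client and `load_group_id_from_db` is a placeholder for the MySQL query; both are hypothetical and exist only to make the example runnable.

```python
import time


class TTLCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # entry expired
            return None
        return value

    def set(self, key, value, ttl=0):
        # ttl=0 means "never expire", mirroring the 0 used on the slide.
        expires_at = time.monotonic() + ttl if ttl else None
        self._data[key] = (value, expires_at)


db_calls = []  # records every simulated database hit


def load_group_id_from_db(group_name):
    """Placeholder for the MySQL lookup."""
    db_calls.append(group_name)
    return {"jtpd": 1122}.get(group_name)


def get_group_id(cache, group_name):
    key = f"group_by_name.{group_name}"
    group_id = cache.get(key)
    if group_id is None:                      # cache miss
        group_id = load_group_id_from_db(group_name)
        if group_id is not None:
            cache.set(key, group_id, ttl=0)   # populate for next time
    return group_id


cache = TTLCache()
first = get_group_id(cache, "jtpd")   # goes to the "database"
second = get_group_id(cache, "jtpd")  # served from the cache
print(first, second, db_calls)
```

The second call never reaches the database, which is the "load off" effect the slide describes.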

Page 15: Scalability at GROU.PS

MySQL

• Replication easy
• Split Reads
• What about writes?
• That’s where sharding comes into play
– Vertical Sharding
– Horizontal Sharding
• MMM
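To make the read split and horizontal sharding concrete, here is a small routing sketch. It assumes rows are partitioned by group_id modulo the number of shards and that each shard has one master plus read replicas; the host names, the `ShardRouter` class and the modulo scheme are all illustrative assumptions, since the slides do not describe the actual layout.

```python
import itertools


class ShardRouter:
    """Route writes to a shard's master and spread reads over its replicas."""

    def __init__(self, shards):
        # shards: list of {"master": str, "replicas": [str, ...]}
        self._shards = shards
        # Fall back to the master when a shard has no replicas.
        self._replica_cycles = [
            itertools.cycle(shard["replicas"] or [shard["master"]])
            for shard in shards
        ]

    def shard_index(self, group_id):
        # Horizontal sharding: the same table exists on every shard,
        # and the key decides which shard holds a given row.
        return group_id % len(self._shards)

    def for_write(self, group_id):
        return self._shards[self.shard_index(group_id)]["master"]

    def for_read(self, group_id):
        # Round-robin over the replicas of the owning shard.
        return next(self._replica_cycles[self.shard_index(group_id)])


router = ShardRouter([
    {"master": "db0-master", "replicas": ["db0-replica-a", "db0-replica-b"]},
    {"master": "db1-master", "replicas": ["db1-replica-a"]},
])

write_host = router.for_write(1122)
reads = [router.for_read(1122) for _ in range(3)]
print(write_host, reads)
```

Modulo routing is the simplest option, but changing the shard count moves most keys, so a lookup table or consistent hashing is often preferred once resharding becomes likely. Vertical sharding, by contrast, places different tables or features on different servers rather than splitting one table by key.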

Page 16: Scalability at GROU.PS

MySQL

• Runs poorly on multi-cores
• query_cache_size = 0 # on master
• query_cache_type = 0 # on master
• thread_concurrency = 8 # total cores
• max_connections = 750 # shouldn’t exceed that
• innodb_buffer_pool_size = 10G # a little less than the total amount

Page 17: Scalability at GROU.PS

MySQL Query Optimization

• INDEX group, user
• WHERE group = ? AND user = ?
• Not WHERE user = ? AND group = ?
• B-tree
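The underlying idea is the leftmost-prefix rule of a composite B-tree index: an index on (group, user) can serve lookups that constrain group, or group and user, but not user alone. The sketch below demonstrates this with SQLite's query planner as a stand-in, since it ships with Python; MySQL's optimizer differs in detail, and the table and index names are made up. In this run the written order of the AND-ed conditions does not change the plan, so one reading of the third bullet is as a reminder that a lookup on user alone cannot use this index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE membership (group_id INTEGER, user_id INTEGER, role TEXT)"
)
# Composite index with group_id as the leading column.
conn.execute("CREATE INDEX idx_group_user ON membership (group_id, user_id)")


def plan(sql):
    """Return SQLite's query plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


base = "SELECT role FROM membership WHERE "
both = plan(base + "group_id = 1 AND user_id = 2")
swapped = plan(base + "user_id = 2 AND group_id = 1")
group_only = plan(base + "group_id = 1")
user_only = plan(base + "user_id = 2")

for label, text in [("both", both), ("swapped", swapped),
                    ("group_only", group_only), ("user_only", user_only)]:
    print(f"{label:>10}: {text}")
```

The first three queries are planned as index searches on idx_group_user, while the user-only query falls back to a full table scan. If lookups by user alone are common, a second index with user as the leading column is the usual remedy.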

Page 18: Scalability at GROU.PS

MySQL Query Optimization

• SHOW PROCESSLIST
• Maatkit, mk-query-digest
• Percona builds

Page 19: Scalability at GROU.PS

NOSQL

• Voldemort, LinkedIn
• Cassandra, Facebook
• Tokyo Cabinet, mixi

Page 20: Scalability at GROU.PS

Logging

• Database logging is not the solution
• File system is expensive too
• A legal necessity

Page 21: Scalability at GROU.PS

Logging

• Solution:
• Scribe & Thrift
• By Facebook
• Eventually consistent

Page 22: Scalability at GROU.PS

Nginx & libevent

Page 23: Scalability at GROU.PS

Nginx & libevent

• Handles 10000 connections
• 5 Gbit/s
• Rambler
• Wordpress
• Grou.ps
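Servers in this family reach high connection counts by multiplexing many sockets in one thread through an OS readiness API (epoll on Linux, kqueue on BSD) instead of dedicating a thread or process to each connection. The following is a toy illustration of that model using Python's standard `selectors` module; it is not how nginx or libevent are implemented, and `echo_upper_once` is a made-up helper.

```python
import selectors
import socket


def echo_upper_once(server_socks):
    """Answer one message per socket from a single-threaded event loop."""
    # DefaultSelector picks epoll on Linux and kqueue on BSD/macOS
    # when available.
    sel = selectors.DefaultSelector()
    for sock in server_socks:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    remaining = len(server_socks)
    while remaining:
        ready = sel.select(timeout=1.0)
        if not ready:
            break  # nothing arrived in time; stop instead of spinning
        for key, _mask in ready:
            sock = key.fileobj
            data = sock.recv(1024)      # will not block: socket is readable
            sock.sendall(data.upper())
            sel.unregister(sock)
            remaining -= 1
    sel.close()


# Three in-process "connections" built from socket pairs.
pairs = [socket.socketpair() for _ in range(3)]
for i, (client, _server) in enumerate(pairs):
    client.sendall(f"hello {i}".encode())

echo_upper_once([server for _client, server in pairs])
replies = [client.recv(1024) for client, _server in pairs]

for client, server in pairs:
    client.close()
    server.close()

print(replies)
```

One loop services every ready socket in turn, so an idle connection costs only a registered file descriptor rather than a blocked thread.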

Page 24: Scalability at GROU.PS

Postfix

• Run multiple instances
• Spam Clusters

Page 25: Scalability at GROU.PS

Monitoring

• Munin + monit
• Other alternatives:
– Cacti
– Nagios
– Hyperic – VMware

Page 26: Scalability at GROU.PS

PHP

Page 27: Scalability at GROU.PS

More to come on my blog

• http://emresokullu.com
• More fine tuning tips
• Become a member of my community
• Love grou.ps ;)
• Convert to PHP
• We’re hiring: [email protected]