
Magento scalability from the trenches (Meet Magento Sweden 2016)

Page 1: Magento scalability from the trenches (Meet Magento Sweden 2016)

MAGENTO SCALABILITY

from the trenches

Piotr Karwatka

Page 2: Magento scalability from the trenches (Meet Magento Sweden 2016)

AGENDA

1. General scalability rules
2. Action plan: scalability framework
3. Magento B2B case
   1. EAV and indexes
   2. Cache
   3. Replication
   4. Fine-tuning
4. Magento 2.0

Page 3: Magento scalability from the trenches (Meet Magento Sweden 2016)

THE CHALLENGE

- Good architecture is a rare good,
- There is no holy grail of scalability,
- Always take a custom approach: measure before optimizing,
- Start “cheap”, scale fast (risky),
- Process-driven over improvisation,
- Redundancy: scalability goes with availability,
- Divide and conquer using layers,
- Measure and examine bottlenecks,
- Scale only the overloaded layers,
- Good news: Magento is scalable by design.

[Diagram: layers: middleware, cache, storage, app, db]

Page 4: Magento scalability from the trenches (Meet Magento Sweden 2016)

HARDWARE APPROACH

At start: optimize code and use cache (New Relic and collectd to catch bottlenecks); try HHVM, nginx, OpCache.

Vertical: more RAM, more CPUs
+ no code changes required, fast gain,
- technology barriers,
- very expensive at some point.

Horizontal: more cheap servers
+ high availability when done right,
+ cloud ready,
- often requires code refactoring,
- challenging configuration and dev-ops.

[Chart: cost at scale]

Page 5: Magento scalability from the trenches (Meet Magento Sweden 2016)

ACTION PLAN

Step 1
- use vertical scaling as far as it’s reasonable,
- optimize code to avoid bottlenecks,
- use caching where it’s possible,
- separate database server,
- separate static files and/or use a CDN.

Step 2
- add additional app servers,
- establish a cache cluster,
- use a reverse proxy (Varnish).

Step 3
- use database replication,
- scale up using horizontal scaling.

First go vertical, then go horizontal.

Page 6: Magento scalability from the trenches (Meet Magento Sweden 2016)

MAGENTO CASE – THE CHALLENGE

TIM.PL: the largest B2B site in Poland. About 100,000,000 EUR / year.
Platform for customers: offers/inquiries, bulk orders, near real-time CRM/WMS integration.

- B2B e-commerce site with external integrations (CRM, PIM, ERP, WMS),
- Up to 1.5M SKUs,
- Up to 2K active concurrent users, average session time: 4h+,
- About 6000 attributes,
- About 2189 attribute sets,
- 1M+ website calls / day,
- Challenging read/write ratio: 50/50,
- B2B features, site used as a tool/platform; browse/checkout scenario.
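As a rough sanity check on the figures above, the daily call volume can be turned into an average request rate. The sketch below uses only the "1M+ website calls / day" figure from the slide; the peak factor is a purely hypothetical assumption added for illustration and is not a number from the talk.

```python
# Back-of-the-envelope request rate from "1M+ website calls / day".
CALLS_PER_DAY = 1_000_000          # from the slide (a lower bound)
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400 seconds

average_rps = CALLS_PER_DAY / SECONDS_PER_DAY

# Hypothetical assumption: B2B traffic is concentrated in business hours,
# so the peak is some multiple of the daily average.
ASSUMED_PEAK_FACTOR = 5

peak_rps = average_rps * ASSUMED_PEAK_FACTOR

print(f"average: {average_rps:.1f} req/s")
print(f"assumed peak: {peak_rps:.1f} req/s")
```

The average is modest; what makes the case hard is the combination with long sessions, a 50/50 read/write ratio and the attribute count, which is why the later slides focus on EAV, indexing and locking rather than on raw request throughput.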

Page 7: Magento scalability from the trenches (Meet Magento Sweden 2016)

We called it an MVP. It worked well up to a point...

Page 8: Magento scalability from the trenches (Meet Magento Sweden 2016)

FIRST APPROACH – 3 years ago

- Cache for blocks enabled,
- FLAT enabled, but at 5000+ attributes we hit InnoDB limits,
- The code was optimized quite well (we’ve used Ivan’s tips: http://www.slideshare.net/ivanchepurnyi/making-magento-flying-like-a-rocket-a-set-of-valuable-tips-for-developers),
- Separated DB server + master-master replication (backup purposes),
- SSD disks (APP + DB), lots of RAM (16GB / server): the vertical scaling approach,
- MySQL tuning (IO buffers, InnoDB buffers),
- Apache tuning (connection limits, FPM),
- HHVM tested: about +50% boost, but no profiling.

Page 9: Magento scalability from the trenches (Meet Magento Sweden 2016)

OPTIMIZE AND PROFILE!

Always measure the impact of a change before deploying it to production:

- JMeter: we used it to emulate throughput and run load tests after each change,
- New Relic: to analyze application speed and track slow queries and method calls; it can be used on production servers as well because of its near-zero overhead,
- Collectd: installed on both app and db servers; with it we discovered bottlenecks on IO and db-locking during Magento’s product indexation,
- Logs: we used ELK (Kibana) and a custom New Relic integration to diagnose web-service response times,
- htop, iotop: during IO problems they help to find what exactly generates the problem,
- Xdebug/XHProf profiler: on stage servers, to debug and profile code and discover cache gaps.

[Diagram labels: optimize one piece at a time; JMeter 2h load tests; fine tuning; JMeter 24h load tests]
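The "measure after each change" loop can be sketched as a comparison of latency percentiles between two load-test runs. This is a minimal Python sketch, not the JMeter tooling used in the talk; the sample latencies and the `p95` and `compare_runs` helpers are made up for illustration.

```python
import statistics


def p95(latencies_ms):
    """Return the 95th percentile of a list of response times (ms)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    # method="inclusive" keeps the result within the observed min/max.
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]


def compare_runs(before_ms, after_ms):
    """Summarise whether a change improved median and p95 latency."""
    return {
        "median_before": statistics.median(before_ms),
        "median_after": statistics.median(after_ms),
        "p95_before": p95(before_ms),
        "p95_after": p95(after_ms),
        "improved": p95(after_ms) < p95(before_ms),
    }


# Hypothetical samples from two load-test runs (milliseconds).
before = [120, 130, 125, 400, 135, 128, 900, 131, 127, 133]
after = [90, 95, 92, 210, 97, 93, 350, 96, 94, 98]

summary = compare_runs(before, after)
print(summary)
```

Looking at a high percentile as well as the median matters here: problems such as IO stalls or index locking tend to show up in the slowest requests first.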

Page 10: Magento scalability from the trenches (Meet Magento Sweden 2016)

High availability is crucial: we switched to a 2N model.

[Diagram: HAProxy + Varnish (load balancing, plus reverse proxy for caching and static files) in front of two master-master app servers with GlusterFS; both servers can handle user requests.]

Page 11: Magento scalability from the trenches (Meet Magento Sweden 2016)

APP & CACHE

- Redis is faster than memcached as a cache backend,
- Varnish (with ESI) is a must for both static files and page caching (we used Turpentine and Phoenix on some projects; both are fine); VCL can be challenging,
- We managed to use HAProxy as a load balancer (using automatic failover),
- We’ve added cache to Mage_Catalog_Model_Product::load,
- Consider adding cache to Mage_Eav_Model_Entity_Abstract to avoid EAV altogether; we couldn’t use FLAT because of the attribute count,
- We turned on FLAT for the 900 most frequently used attributes (InnoDB limits),
- Sessions were moved to Redis,
- We discovered a lot of queries to core_url_rewrite; cache should help here,
- We used the Fast-Async Reindexing module with Magento 1.x to avoid database locking,
- GlusterFS is used to handle uploads and replication.

Page 12: Magento scalability from the trenches (Meet Magento Sweden 2016)

VARNISH IMPACT


Page 13: Magento scalability from the trenches (Meet Magento Sweden 2016)

APP & CACHE

Remarks:
- GlusterFS / network file systems: stat() and open() without local caching exhaust IO,
- We had some issues with APC on PHP 5.4 (segfaults); now everybody uses OpCache ☺
- At some point we switched from Apache to nginx + php-fpm to gain req/s throughput and lower memory usage (read more here: http://info.magento.com/rs/magentocommerce/images/MagentoECG-PoweringMagentowithNgnixandPHP-FPM.pdf),
- We had problems with the Magento API (really slow responses: 0.5s); optimizations brought it to 0.2s and HHVM to 0.1s; next step: a fast-responding façade without Magento overhead (http://divante.co/blog/magento-1-9-1-0-page-load-time-0-3s/),
- We had problems with Redis getting clogged with cache keys (http://divante.co/blog/magento-clogged-redis-cache/).

Page 14: Magento scalability from the trenches (Meet Magento Sweden 2016)

HHVM IMPACT


Page 15: Magento scalability from the trenches (Meet Magento Sweden 2016)

THE HARD WAY

- Most challenging issues: EAV and indexing,
- It would be great to use a NoSQL DB (MongoDB, SOLR),
- At this point we use only model-level cache,
- We’ve disabled Magento logs and reports: fewer queries, less useless data to store,
- Small configuration tips make a big difference:
  - query_cache_size: up to 128MB works well; beyond that, cache cleaning can be really, REALLY slow,
  - innodb_thread_concurrency: setting it to 0 prevents MySQL from clogging worker threads (it looks like locking but it isn’t),
- We switched from MySQL to PerconaDB/XtraDB:
  - Great performance gain on peaks (queries count vs. response time): up to +275%,
  - No code / SQL changes required, 100% compatible with MySQL,
- MemSQL looks really promising; not tested yet.
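For reference, the two MySQL settings mentioned above could look like this in a server option file. Only the two values come from the slide; the `[mysqld]` section placement and the `query_cache_type` line are assumptions about a typical setup. Note that the query cache was removed in MySQL 8.0, so this fragment applies to the older MySQL/Percona versions the talk refers to.

```ini
[mysqld]
# Query cache: the slide reports that up to 128MB works well
# and warns that cache cleaning can become very slow.
query_cache_type = 1          # assumption: query cache switched on
query_cache_size = 128M

# 0 means no limit on concurrently running InnoDB threads.
innodb_thread_concurrency = 0
```

As with every change in this deck, such values should be verified with a load test before going to production.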

Page 16: Magento scalability from the trenches (Meet Magento Sweden 2016)

DATABASE CAVEATS

Without FLAT in place there are a lot of EAV-related queries, and also a lot of URL-redirect-related queries. Those queries are unnecessary.

Page 17: Magento scalability from the trenches (Meet Magento Sweden 2016)

HOW TO DISABLE EAV?

If you cannot use FLAT (categories + products are a must) because it’s too slow or you have too many attributes:

- it would be great if we could switch to a NoSQL DB (like MongoDB, SOLR, Sphinx Search),
- one can override the EAV->FLAT indexers, but it’s extremely hard (relations, some modules work on raw SQL),
- suggestions:
  - Add cache to the Product::load method; invalidation is extremely important (you can use the modification date in the cache key or an observer-based mechanism to clear it),
  - Add cache to the loading of EAV attributes, for products and product categories,
  - Override/refactor Mage_Catalog for searching and browsing products; some search modules do this partially,
  - Great knowledge base about EAV: http://www.solvingmagento.com/magento-eav-system/
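The "modification date in the cache key" idea can be sketched as follows. This is a language-neutral illustration in Python, not Magento code: the `ProductCache` class, its callbacks and the in-memory dictionary standing in for Redis are all hypothetical names introduced for the example.

```python
import hashlib


class ProductCache:
    """Read-through cache keyed by product id plus modification date."""

    def __init__(self, load_product, get_updated_at):
        self._load_product = load_product      # expensive full (EAV) load
        self._get_updated_at = get_updated_at  # cheap modification-date lookup
        self._store = {}                       # stand-in for Redis
        self.misses = 0

    def _key(self, product_id):
        # The modification date is part of the key, so an updated product
        # gets a new key and the stale entry is simply never read again.
        updated_at = self._get_updated_at(product_id)
        raw = f"product:{product_id}:{updated_at}"
        return hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def load(self, product_id):
        key = self._key(product_id)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._load_product(product_id)
        return self._store[key]


# Hypothetical in-memory "database" for the demo.
db = {42: {"name": "Cable 3x2.5", "updated_at": "2016-10-01 10:00:00"}}

cache = ProductCache(
    load_product=lambda pid: dict(db[pid]),
    get_updated_at=lambda pid: db[pid]["updated_at"],
)

first = cache.load(42)    # miss: loads from the "database"
second = cache.load(42)   # hit: same key, no reload
db[42].update(name="Cable 3x4", updated_at="2016-10-02 09:30:00")
third = cache.load(42)    # miss: the modification date changed the key
```

The trade-off of this variant is that old entries are never deleted explicitly, so the cache backend needs a TTL or an eviction policy; the observer-based alternative from the slide deletes entries on save instead.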

Page 18: Magento scalability from the trenches (Meet Magento Sweden 2016)

DATABASE SCALABILITY - REPLICATION

With replicas one gets: high availability, more req/s.
It doesn’t fit all cases:

- Caution: replication lag,
- It’s possible to move selected tables to external servers (like product catalogs),
- Always consider using cache first!

[Diagrams, marked :-) and :-( on the slide: master-slave; master-master; separate masters holding selected tables (TB: users, TB: photos)]
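The replication-lag caution above can be illustrated with a small read/write router. This is a minimal sketch under stated assumptions, not how Magento routes connections: the `ReplicaAwareRouter` class, the server names and the one-second tolerance are hypothetical. In a real MySQL setup the lag probe could be backed by the `Seconds_Behind_Master` value from `SHOW SLAVE STATUS`.

```python
class ReplicaAwareRouter:
    """Send writes to the master and reads to a replica unless it lags."""

    def __init__(self, master, replicas, get_lag_seconds, max_lag_seconds=1.0):
        self.master = master
        self.replicas = list(replicas)
        self.get_lag_seconds = get_lag_seconds   # hypothetical lag probe
        self.max_lag_seconds = max_lag_seconds   # assumed tolerance
        self._next = 0

    def for_write(self):
        return self.master

    def for_read(self, needs_fresh_data=False):
        # Reads that must see a just-committed write go to the master.
        if needs_fresh_data:
            return self.master
        # Round-robin over replicas, skipping those that lag too far behind.
        for _ in range(len(self.replicas)):
            candidate = self.replicas[self._next % len(self.replicas)]
            self._next += 1
            if self.get_lag_seconds(candidate) <= self.max_lag_seconds:
                return candidate
        # Every replica is lagging (or there are none): use the master.
        return self.master


# Hypothetical lag measurements in seconds.
lag = {"replica-1": 0.2, "replica-2": 7.5}
router = ReplicaAwareRouter("master", ["replica-1", "replica-2"], lag.get)

print(router.for_write())                      # master
print(router.for_read())                       # replica-1
print(router.for_read(needs_fresh_data=True))  # master
```

With a 50/50 read/write ratio, as in this case, a large share of reads may need fresh data, which is one reason replication does not fit all cases and cache should be considered first.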

Page 19: Magento scalability from the trenches (Meet Magento Sweden 2016)

INDEXATION VS. REPLICATION

- Master-slave replication should help with the db-locking issue,
- MySQL replicates only data-changing operations (UPDATE/INSERT etc.) using binlogs; this is extremely fast and doesn’t lock replicas.

    public function processEntityAction(Varien_Object $entity, $entityType, $eventType)
    ...
        $resourceModel = Mage::getResourceSingleton('index/process');
        // The whole index event runs inside a single transaction
        $resourceModel->beginTransaction();
        $this->_allowTableChanges = false;
        try {
            $this->indexEvent($event);
            $resourceModel->commit();
        } catch (Exception $e) {
            // Undo the partially processed event before rethrowing
            $resourceModel->rollBack();
            // $allowTableChanges is not defined in the part shown on the slide
            if ($allowTableChanges) {
                $this->_allowTableChanges = true;
                $this->_changeKeyStatus(true);
                $this->_currentEvent = null;
            }
            throw $e;
        }

Page 20: Magento scalability from the trenches (Meet Magento Sweden 2016)

DATABASE – NEXT STEPS

- We’ve tested app-local master-slave replication to avoid network latency and database locking:
  - Magento supports this kind of replication out of the box,
  - Next step: move the catalog database to a separate server,
  - Route Admin panel requests to separate servers (using the multi-master Magento 2 feature).

[Diagram: HAProxy & Varnish (load balancer + proxy) in front of master-master app servers with GlusterFS + PerconaDB and local db slaves for read access; each server can handle user requests. Labels: indexing, updates, imports, RDBM.]

Page 21: Magento scalability from the trenches (Meet Magento Sweden 2016)

INTEGRATIONS

- We use queuing to avoid bottlenecks,
- On each app server there are Gearman workers (PHP processes) responsible for getting prices and stocks and for transferring orders,
- Workers exchange data with CRM, WMS, ERP and PIM in both async and sync modes, using priorities,
- We used the Command/Task design pattern,
- We log everything using ELK, and use Kibana and New Relic in particular to analyze external systems,
- The Magento API can be very challenging (it’s extremely slow).
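The combination of queuing, priorities and the Command/Task pattern described above can be sketched with an in-memory priority queue. This is an illustration in Python only: it does not use Gearman's client API, and the `Task` and `PriorityTaskQueue` classes, the task names and the priority levels are hypothetical.

```python
import heapq
import itertools


class Task:
    """Command object: carries everything a worker needs to run one job."""

    def __init__(self, name, action):
        self.name = name
        self.action = action

    def execute(self):
        return self.action()


class PriorityTaskQueue:
    """In-memory stand-in for a job queue with priorities (lower runs first)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a priority

    def submit(self, task, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def run_all(self):
        results = []
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            results.append((task.name, task.execute()))
        return results


HIGH, NORMAL, LOW = 0, 1, 2  # hypothetical priority levels

queue = PriorityTaskQueue()
queue.submit(Task("sync-stock", lambda: "stock updated"), LOW)
queue.submit(Task("transfer-order", lambda: "order sent to ERP"), HIGH)
queue.submit(Task("fetch-prices", lambda: "prices fetched"), NORMAL)

executed = [name for name, _ in queue.run_all()]
print(executed)
```

Wrapping each integration call in a task object keeps the web request decoupled from slow external systems, and the priority decides which jobs run first when the queue backs up.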

Page 22: Magento scalability from the trenches (Meet Magento Sweden 2016)

MONITORING

- We use Kibana (ELK stack) and custom New Relic metrics to monitor real-time integrations (CRM, WMS, ERP),
- Zabbix with Selenium scripts is used to monitor website availability and raise alerts.

Page 23: Magento scalability from the trenches (Meet Magento Sweden 2016)

FINAL ARCHITECTURE

[Diagram: web requests and API calls enter through HAProxy & Varnish (load balancer + proxy) and reach master-master app servers with GlusterFS + PerconaDB and local db slaves for read access; each server can handle user requests. Gearman queue workers handle background jobs and external integrations (external system calls).]

Page 24: Magento scalability from the trenches (Meet Magento Sweden 2016)

WHAT I’VE MISSED + MAGENTO 2

- Search: we used FactFinder / SOLR,
- Details about Varnish and HHVM,
- Life is going to be easier. What excites me in Magento 2?
  - Materialized views engine: smarter indexation,
  - Full page caching in the community edition,
  - Multi-master DB contexts,
  - Checkout optimizations.

Page 25: Magento scalability from the trenches (Meet Magento Sweden 2016)

THANK YOU! QUESTIONS?

Technical or scalability challenges? Contact me to consult your case for free!

Piotr Karwatka, Divante, http://divante.co