meet-magento-italy
About me
• Project Manager @ Webformat
• Magento and TYPO3 projects
• Requirements analysis
• Planning of development and support activities
Once upon a time..
• The project began as a migration from a proprietary platform to Magento 1 Community
• Shoes and accessories e-commerce
• We developed the integration with their management software, which handled the product master data, the warehouse stock and the order records
• Integration with Amazon and eBay
Products Database
• The original products database contained around 150k products
• Configurable products
  • On average, 10 simple products for each configurable
• Virtual products
Continued Growth
• In just one year we reached 700k products stored in Magento
• 66k configurable products
Challenges faced
Alignment between catalog and management software
Updating warehouse
Reindexing
Generating images
Server response time
Backoffice operations
Third-party modules integration
Export of feed for Google shopping & Co.
Marketplace synchronization
Disk space
Updating the catalog (1/2)
• Initially, with 150k products, this is what we planned:
  • Massive initial import
  • Frequent updates during the day via webservice
• When the catalog started growing, the volume of data exchanged via webservice became unsustainable. The exchange procedure needed a redesign.
Updating the catalog (2/2)
• Today we have 700k products
  • Based on Magmi and CSV file exchange (product master data)
  • Nighttime update: only the DIFF
  • Whole-catalog update only in exceptional cases
  • The client accepted that new products are published with a delay of 1 day
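The nighttime DIFF can be pictured with a minimal Python sketch. It assumes two CSV exports (the previous one and the current one) keyed by a `sku` column; the column names and the helper functions are hypothetical, not the procedure actually used in the project, and deleted products are not handled.

```python
import csv
import io


def read_products(csv_text, key="sku"):
    """Index the rows of a product CSV export by their key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def catalog_diff(previous, current):
    """Return the rows that are new or changed since the previous export."""
    return [row for sku, row in current.items() if previous.get(sku) != row]


# Illustrative data only: one price change and one new product.
previous = read_products("sku,name,price\nA1,Boot,90\nA2,Sandal,40\n")
current = read_products("sku,name,price\nA1,Boot,85\nA2,Sandal,40\nA3,Sneaker,60\n")
changed = catalog_diff(previous, current)
```

Only the rows in `changed` would then be written to the CSV file handed to the import tool, which keeps the nightly file small compared with a whole-catalog export.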
Warehouse update (1/2)
• No warehouse fully dedicated to web
• Shared with the offline shops
• Updating the stock only at night and relying on those figures during the day is not possible
• Frequent updates
Warehouse update (2/2)
• Update from the management software every 15 minutes, loading only the DIFF
• Only stock update
• Via CSV file, written directly to the database (Magmi)
Reindex (1/2)
• The bigger the catalog, the slower the reindex
• Initially, the reindex was launched after each update (every 15 minutes)
• After a while, the reindex became too time-consuming: a new update cycle was starting while the reindex from the previous update was still running.
Reindex (2/2)
• Solution:
  • All the reindexes have been disabled, except for the stock reindex
  • All reindexes are now performed after the nighttime import
• Today a full reindex takes around 75 minutes and generates a heavy load on the database
Catalog_url_rewrite (1/2)
• Magento 1 has a critical point in the URL rewrite process:
  • All product URLs are rewritten, including simple products that are «Not visible individually» and exist only to be associated with a configurable.
• With a 700k-product catalog, this meant:
  • Creating millions of rows in the catalog_url_rewrite table
  • A URL rewrite process that takes hours to complete
Catalog_url_rewrite (2/2)
• A patch has been installed to skip URL generation for simple products that are not visible individually
• Module Dnd_Patchindexurl: https://www.magentocommerce.com/magento-connect/dn-d-patch-index-url-1.html
• Now the reindex process takes around 20 minutes
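The idea behind the patch can be illustrated with a small Python sketch. It is not the code of the Dnd_Patchindexurl module: it only shows the filtering rule, assuming products are plain dictionaries with a `visibility` field (Magento 1 uses 1 for «Not visible individually» and 4 for «Catalog, Search»); the SKUs are made up.

```python
# Magento 1 stores "Not Visible Individually" as visibility value 1.
NOT_VISIBLE_INDIVIDUALLY = 1


def needs_url_rewrite(product):
    """A product needs its own URL only if shoppers can open it directly."""
    return product["visibility"] != NOT_VISIBLE_INDIVIDUALLY


def rewrite_candidates(products):
    """Return the SKUs for which URL rewrites would still be generated."""
    return [p["sku"] for p in products if needs_url_rewrite(p)]


# Hypothetical data: one configurable with its 10 hidden simple products.
products = [{"sku": "SHOE-1", "visibility": 4}]
products += [{"sku": f"SHOE-1-{size}", "visibility": 1} for size in range(36, 46)]
candidates = rewrite_candidates(products)  # only the configurable remains
```

With roughly 10 hidden simple products per configurable, skipping them removes most of the rows the URL index would otherwise have to write.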
Images generation (1/2)
• One of the main problems we had to face was the generation of product thumbnails, done by ImageMagick
• Every day hundreds of products are published
• We verified that the frontend CPUs were often under stress because of the ImageMagick processes and the write operations on the database
Images generation (2/2)
• Our solution was to generate the thumbnails during the massive import, so that ImageMagick runs as part of the import procedure
• The images are generated at night and saved on a dedicated server, without interfering with user navigation
• Today we have around 881k images saved
Server response time
• With such a huge catalog, some categories hold hundreds of products
• The first load of these pages, when they are not cached, is slow
• We activated caching on Redis and Varnish
• That was not enough: the first load was still too slow
Solutions 1/2
• Moved the cache clearing process to the night
• At 8 in the morning, the website navigation was starting to suffer
• We scheduled a job to pre-cache all the critical pages
• Minimized cache invalidation
  • Clear the cache only for products whose stock quantity was updated via WS
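A pre-caching job of this kind can be sketched in a few lines of Python. This is an assumption about how such a job might look, not the job used in the project: it simply requests each critical URL once, so that the cache in front of Magento stores the page before the first visitor arrives. The function names are hypothetical, and the `fetch` parameter exists only so the logic can be tried without a live site.

```python
import urllib.request


def fetch_status(url, timeout=10):
    """Request a page and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def warm_cache(urls, fetch=fetch_status):
    """Request every URL once; map each URL to its status, or None on failure."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError:
            # urllib's URLError and HTTPError are both subclasses of OSError.
            results[url] = None
    return results
```

Running it from a scheduler right after the nighttime cache clearing, against the list of critical category and landing pages, would cover the morning traffic peak described above.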
Solutions 2/2
• Client training on how to handle cache clearing better
• Minimized the number of filters in layered navigation
  • Each filter increases the reindex time and the number of uncached page combinations
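Why each filter multiplies the uncached pages can be shown with a short calculation. The sketch assumes every filter is single-select (either unset or set to exactly one option), and the filter names and option counts are invented for illustration.

```python
from math import prod


def filter_combinations(option_counts):
    """Count the distinct filtered views of one category page.

    Each filter contributes (options + 1) states: unset, or one chosen option.
    """
    return prod(count + 1 for count in option_counts)


# Hypothetical filters: size (20 options), color (12), brand (50).
with_brand = filter_combinations([20, 12, 50])  # 21 * 13 * 51 = 13923
without_brand = filter_combinations([20, 12])   # 21 * 13 = 273
```

Under these assumptions, dropping a single filter with 50 options cuts the page variants per category from 13923 to 273, each of which would otherwise need its own cache entry.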
Backoffice operations
• Initially all the catalog update activities were performed from Magento backoffice
• Problems:
  • Frequent reindexes
  • Frequent cache updates
  • Server load (the backoffice product list filters are CPU-demanding and put load on MySQL)
  • Common operations were slowed down
  • Several BE users ended up working concurrently
Solutions
• Initially a new backoffice server was introduced
  • The MySQL load problem was not solved, and neither were reindexing and re-caching.
• We introduced a new process to handle the catalog, using an Excel file
  • This improved the efficiency of the people managing the master data
  • Massive Excel file import performed every 3 days via FTP
  • Categories still handled from the backoffice
Third party modules integration
• Critical point
• Not all the modules found in the Marketplace are developed in an optimal way
  • They «simply» load the products collection without pagination
  • They execute nested queries
  • They loop over collections in ways that initialize all products unnecessarily
  • …
• Extensive profiling and optimization work was needed
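The pagination point can be made concrete with a language-neutral sketch in Python. It assumes a `fetch_page(page, page_size)` callable that returns one page of rows (hypothetical; in Magento 1 the equivalent would be paging a collection with `setPageSize` and `setCurPage`), and it never holds more than one page in memory.

```python
def iter_pages(fetch_page, page_size=500):
    """Yield a large result set one page at a time instead of loading it whole."""
    page = 1
    while True:
        rows = fetch_page(page, page_size)
        if not rows:
            break
        yield rows
        if len(rows) < page_size:
            break  # a short page means this was the last one
        page += 1


# Illustrative stand-in for a products table.
catalog = list(range(1050))


def fetch_from_list(page, page_size):
    start = (page - 1) * page_size
    return catalog[start:start + page_size]


page_sizes = [len(rows) for rows in iter_pages(fetch_from_list)]
```

With 700k products the difference is between a few hundred rows in memory at any moment and the whole catalog at once.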
Feed export (Google Shopping & Co.) 1/2
• As the catalog grew, the feed export time increased as well
• At the very beginning, the exports were handled by a Magento module
Feed export (Google Shopping & Co.) 2/2
• Solution steps:
  • The module has been replaced with highly optimized ad-hoc procedures
  • The export jobs are executed on the backoffice server during the night, so as not to load the frontend
  • A MySQL slave has been introduced as the data source, so as not to load the master and, as a consequence, the website
Marketplace synch
• We are using M2E Pro
• Client side: EAN code full check
• Tech side: handling the automatic synchronization process
  • An automatic full synchronization is too heavy
  • When to synchronize?
  • What to synchronize?
• Magmi
Disk space (1/2)
• Well, here we are: even if disk space is quite cheap, using too much of it is not convenient.
• Very heavy data exchange logs
  • Frequent data exchange and huge amounts of data
  • Log files were growing fast
  • Log rotation was set to run hourly
  • Logs are archived after a few days
Disk space (2/2)
• High image quantity, continuously growing
• Huge feed export
• Huge CSV import files
• …
• Solutions applied:
  • Constant monitoring activated
  • Activated automatic procedures to clean up logs, old images, expired feeds, etc.
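An automatic cleanup procedure of this kind could look like the following Python sketch. The retention period, the directory layout and the function name are assumptions for illustration; the real procedures are not described in the slides.

```python
import time
from pathlib import Path


def remove_old_files(directory, max_age_days, now=None):
    """Delete regular files older than max_age_days; return the removed names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    for path in Path(directory).iterdir():
        # Subdirectories are left untouched; only files directly inside are checked.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

A scheduler could call it once per directory, for example with a short retention for data exchange logs and a longer one for expired feeds.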
Challenges to be faced
Elasticsearch integration
Growing catalog, up to 1M products
More sales, more page views
Magento 2 migration
Elasticsearch
• For two reasons:
  • Improve the search functionality offered to the client
  • Minimize the load produced by the Magento internal search engine
• Critical issues to be faced:
  • Catalog index time
  • Only configurable products?
  • What about the sizes?
1M products
• Expected growth: in 1 year we’ll have 1M products
• At the moment we are performing tests with fake products
• We didn’t detect other critical aspects
  • For now, we had to further optimize some data exchange and feed generation procedures
More sales, more page views
• Sessions are increasing, so the number of uncached page views is increasing
  • Pre-caching extension
  • Increasing Varnish cache TTL
  • Minimize products in categories and filters used
• Sales are increasing, and so is the frequency of out-of-stock products
  • To be evaluated: the impact of new reindex and re-caching policies on the client
What if..?
• We’re planning with the client a Magento 2 migration
• We started our tests by migrating the current Magento 1 environment (700k products) to a Magento 2 installation
• We collected the results and are still performing some other tests
HW specs
All tests were run on a VirtualBox VM with Linux Ubuntu 16.04.1 LTS, 8 GB RAM, 1 x 2.60 GHz CPU
The LAMP configuration featured PHP version 5.6, Apache 2.4.18, MySQL 14.14
The migration was performed from Magento version 1.9.2.2 to 2.1.3
Magento 2 migration (1/4)
• DB migration time: 1h 20'
• BE performance:
BE operation          Magento 1 with cache   Magento 2 with cache
Access to catalog     almost 5'              7''
Access to product     3''                    10''
Access to categories  7''                    6''
Product searching     1'5''                  3''
Magento 2 migration (2/4)
• FE performance for catalog browsing:
FE operation                   Magento 1 with cache   Magento 2 with cache
Catalog browsing / categories  30''                   7''
Magento 2 migration (4/4)
• We had some issues with the Catalog Fullsearch reindex (Magento 2)
• We had to apply a patch: https://github.com/magento/magento2/issues/5146
• Without the patch, the Catalog Fullsearch reindex takes around 2 hours; with the patch applied it took around 1 hour, so the times are quite comparable
02:12:37
02:12:37
Catalog URL rewrite
• M1 with Dnd_Patchindexurl module: 00:14:34
• M1 without Dnd_Patchindexurl module: 01:03:50
• M2: no catalog URL rewrite reindex. URL rewrites are handled when a product is saved
Yes, we can!
• It’s possible, but not without effort
• Large initial analysis
• Special attention to optimization processes
• What about Magento 2?