Increase Magento Performance

... or how to work with collections correctly.

I want to tell you about the mistakes I have seen on almost every Magento project that had performance problems. Working with Magento, I sometimes have to audit other people's code, so I would like to share experience that will help you improve the performance of your sites and avoid these mistakes in the future.

This article is about Magento 1.*, but what is described also applies to Magento 2.*.

In almost every project where there are performance problems, you can come across something like this:
    $temp = array();
    $collection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect('*');
    foreach ($collection as $product) {
        $product = $product->load($product->getId());
        $temp[] = $product->getSku();
    }

Wrong

instead of

    $temp = array();
    $collection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect('sku');
    foreach ($collection as $product) {
        $temp[] = $product->getSku();
    }

Right

The reasons people write it this way are very simple:

  1. The needed attributes are missing after the collection is loaded.
  2. This is how "programmers" on the Internet do it.
  3. Extra attributes are loaded on the "it can't hurt" principle.

To understand what is wrong here and what we can do with performance, I suggest concentrating on working with collections:

  1. EAV / Flat tables
  2. Cache
  3. Proper work with collections

And of course the conclusions.



EAV / Flat tables


EAV is a data storage approach in which the entity that an attribute belongs to, the attribute itself, and its value are stored in separate tables.

In Magento, EAV entities include: products, categories, customers, and customer addresses. The attributes themselves are stored in the eav_attribute table.

Magento has 5 attribute value types: text, varchar, int, decimal and datetime. There is 1 more type, static; it differs from the other 5 in that its value is stored in the entity table itself.

The attribute table records the type of each attribute, and therefore the table its values live in, so Magento knows where to write a value and where to read it from.

Storing values this way makes attribute sets easy to configure (each entity can have a given attribute or not have it at all), and adding a new attribute is just one more row in the database. Adding a value of an attribute for another store is just a new row in that attribute's value table.

How it is stored in the database
Entity:
Product - catalog_product_entity,
Category - catalog_category_entity,
Customer - customer_entity,
Customer address - customer_address_entity

Attribute:
eav_attribute
catalog_eav_attribute
customer_eav_attribute

Value:
*_text
*_varchar
*_int
*_decimal
*_datetime

Flat is the approach familiar to all of us, where everything lies in one place and no additional tables are needed to get a product with all its attributes: SELECT * FROM table WHERE id = some_id, and that's it.

Of the EAV entities, only categories and products can use the Flat representation.

How it is stored in the database
Product:
catalog_product_flat_1 // *_N, one table per store view
Category:
catalog_category_flat_store_1 // *_N, one table per store view
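
To make the difference concrete, here is a minimal sketch in plain PHP, with arrays standing in for tables. The table contents, the attribute ids (71, 75) and the loadFromEav() helper are made up for illustration; real Magento tables have more columns. It only shows that an EAV read has to assemble the entity from several value tables, while a Flat read is a single row lookup.

```php
<?php
// Hypothetical, simplified "tables"; not the real Magento schema.
$entityTable = array(
    1 => array('entity_id' => 1, 'sku' => 'shirt-01'), // static attribute lives here
);
$attributeTable = array(
    71 => array('attribute_code' => 'name', 'backend_type' => 'varchar'),
    75 => array('attribute_code' => 'price', 'backend_type' => 'decimal'),
);
$valueTables = array(
    'varchar' => array(array('entity_id' => 1, 'attribute_id' => 71, 'value' => 'Blue shirt')),
    'decimal' => array(array('entity_id' => 1, 'attribute_id' => 75, 'value' => '19.99')),
);

// EAV read: start from the entity row, then look up each attribute's value
// in the value table that matches the attribute's type.
function loadFromEav($id, array $entities, array $attributes, array $values)
{
    $row = $entities[$id];
    foreach ($attributes as $attributeId => $attribute) {
        foreach ($values[$attribute['backend_type']] as $valueRow) {
            if ($valueRow['entity_id'] === $id && $valueRow['attribute_id'] === $attributeId) {
                $row[$attribute['attribute_code']] = $valueRow['value'];
            }
        }
    }
    return $row;
}

// Flat read: the same data, already laid out as one row per entity.
$flatTable = array(
    1 => array('entity_id' => 1, 'sku' => 'shirt-01', 'name' => 'Blue shirt', 'price' => '19.99'),
);

$fromEav = loadFromEav(1, $entityTable, $attributeTable, $valueTables);
$fromFlat = $flatTable[1];
var_dump($fromEav == $fromFlat); // bool(true)
```

Both reads end with the same data; the EAV one simply needs more lookups to get there.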

To include an attribute in the Flat table, and to enable the use of Flat tables in general, do the following.
In the admin panel, under Catalog > Attributes > Manage Attributes:

Magento will add an attribute to the Flat table if the attribute has 1 of the following values.



In the admin panel, under System > Configuration > Catalog:

Magento will use Flat tables for the entities listed below.



Note the following facts:

  1. Flat tables are used ONLY where a collection is used: on category pages, in the product list of a grouped product, and so on. They are not used on the product page, in the admin panel, or when calling the load method on a model.
  2. After enabling Flat tables, you must reindex; otherwise Magento will continue to use only the EAV tables.
  3. After Flat tables are enabled, Magento still uses EAV anyway, but it also starts copying changes to the Flat table whenever changes are saved.

Why is all this necessary, and why not use the Flat approach everywhere? Look at the summary of pros and cons.
EAV:
+ A more flexible system than Flat
+ No need to reindex the data when a new attribute is added
+ A virtually unlimited number of attributes
+ All attributes are always available
+ Static attributes (sku, created_at, updated_at) are always present in the result, even if they are not requested
- Fatal error: Call to a member function getBackend() when selecting / filtering by a non-existent attribute
- Performance

Flat:
+ Performance
+ Only existing attributes that have been added to the Flat table can be used for selecting / filtering
- Limits on the row size (up to 65,535 bytes, i.e. about 85 varchar(255) columns) and on the number of columns (InnoDB up to 1000, some engines up to 4096)
- Used only when working with collections (EAV is always used when loading a model)
- The result differs from the result of the same query against EAV (static attributes are not included)
- Reindexing is required after activation; otherwise the EAV tables will be used
- Flat tables must be reindexed when a new attribute is added



Cache


Of course, some of you may ask why we need to figure out how to speed up database queries, or how collections work at all, if the cache will save us and everything will be cached. My short answer: the cache will not save you. None of the caches available in Magento caches collections automatically, and none of them works in the custom controllers and models that you use, for example, when importing data or calculating something. Besides, before data gets into the cache, you still have to produce it somehow and show it to the user quickly.

Types of caches in Magento 1.*:




... and none of them caches collections automatically.


Proper work with collections


To show more clearly why some things need to be done differently from what many are used to, I decided to run performance tests of the different approaches. Let's start with the test bench. For testing, I used:

Test bench:
OS X 10.10
3.1 GHz Intel Core i5 (4 cores)
8GB

Magento configuration:
Magento EE 1.14.0
MySQL 5.5.38
PHP 5.6.2

Content:
3 Categories
2000 Products
2000 CMS pages

Process:
For the tests, I created an extension with 1 controller and 1 action. Each test was run 5 times, and then the average time was calculated. All results are given in seconds.

    class Test_Test_IndexController extends Mage_Core_Controller_Front_Action
    {
        public function indexAction()
        {
            $temp = array();
            $start = microtime(true);

            // Init values
            // Loop start
            //     $temp[] = $product->getSku();
            // Loop end
            // Or
            // Some code snippet

            $stop = microtime(true);
            echo $stop - $start;
        }
    }

Pseudo code
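
The "run 5 times and average" step can be sketched in plain PHP as below. averageTime() is a hypothetical helper introduced here for illustration, not part of Magento and not the author's actual harness; the closure with usleep() merely stands in for the code under test.

```php
<?php
/**
 * Runs $test $runs times and returns the average wall-clock time in seconds.
 * Hypothetical helper for illustration only.
 */
function averageTime($test, $runs = 5)
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        call_user_func($test);
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

// Placeholder workload; in a real test this would be the collection loop.
$calls = 0;
$average = averageTime(function () use (&$calls) {
    $calls++;
    usleep(1000);
});

printf("%d runs, average %.4f s\n", $calls, $average);
```

Averaging over several runs smooths out one-off effects such as a cold disk cache on the first run.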

Tests


  1. EAV / Flat with and without model reload
  2. Collection caching
  3. Proper use of count() and getSize()
  4. Proper use of getFirstItem() and setPage(1,1)

EAV / Flat with and without model reload


Looping over the collection, with a load (reload) of each model inside the loop:

    $temp = array();
    $collection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect(...);
    foreach ($collection as $product) {
        $product = $product->load($product->getId());
        $temp[] = $product->getSku();
    }

Looping over the collection, without loading the models inside the loop:

    $temp = array();
    $collection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect(...);
    foreach ($collection as $product) {
        $temp[] = $product->getSku();
    }

3 ways of selecting data:

  1. addAttributeToSelect('*'); // all attributes
  2. addAttributeToSelect('sku'); // 1 static attribute
  3. addAttributeToSelect('name'); // 1 regular attribute

Results


As you have probably noticed, the time without reloading the models is several times shorter than with reloading. The time is shorter still when Flat tables are enabled (i.e. there are no unnecessary joins and unions) and we select only the attributes we need.

In the first case, we load the data with a bunch of joins ... and then do it all over again for each model, 2000 times.

In the second case, we select a static attribute (it is in the same table as the product itself), so Magento does not need to make any joins. Therefore the time is shorter.

In the third case, Magento needs to join one more table, the one where this attribute is stored.

With Flat tables everything is the same, except that in the last 2 cases the times are identical: both attributes are in the same single table.

I think the numbers speak for themselves.
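
The cost of reloading can be seen even without a database. In the sketch below, QueryCounter is a hypothetical stand-in that only counts queries instead of running them. It assumes, optimistically, that each load costs a single query; a real load of an EAV model typically issues more than one.

```php
<?php
// Hypothetical stand-in for the database: counts queries, returns fake rows.
class QueryCounter
{
    public $queries = 0;

    // One query returns every row, as loading a collection does.
    public function fetchAll($rowCount)
    {
        $this->queries++;
        $rows = array();
        for ($id = 1; $id <= $rowCount; $id++) {
            $rows[] = array('entity_id' => $id, 'sku' => 'sku-' . $id);
        }
        return $rows;
    }

    // One more query per call, as a load() on a single model would need.
    public function fetchOne($id)
    {
        $this->queries++;
        return array('entity_id' => $id, 'sku' => 'sku-' . $id);
    }
}

// Reloading every item inside the loop.
$withReload = new QueryCounter();
$skusReloaded = array();
foreach ($withReload->fetchAll(2000) as $row) {
    $row = $withReload->fetchOne($row['entity_id']);
    $skusReloaded[] = $row['sku'];
}

// Using the data the collection already holds.
$withoutReload = new QueryCounter();
$skus = array();
foreach ($withoutReload->fetchAll(2000) as $row) {
    $skus[] = $row['sku'];
}

echo $withReload->queries, ' vs ', $withoutReload->queries, "\n"; // 2001 vs 1
```

Both loops collect the same 2000 SKUs; only the number of round trips differs.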


Collection caching


Without cache:

    $collection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect('*');

Using the initCache method:

    $collection = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect('*')
        ->initCache(Mage::app()->getCache(), 'our_data', array('SOME_TAGS'));

Custom caching implementation:

    $cache = Mage::app()->getCache();
    $collection = $cache->load('our_data');
    if (!$collection) {
        // Cache miss: load the items and store them serialized.
        $collection = Mage::getModel('catalog/product')->getCollection()
            ->addAttributeToSelect('*')
            ->getItems();
        $cache->save(
            serialize($collection),
            'our_data',
            array(Mage_Core_Model_Resource_Db_Collection_Abstract::CACHE_TAG)
        );
    } else {
        // Cache hit: restore the items from the serialized string.
        $collection = unserialize($collection);
    }

Let's consider a selection without a cache, one using the method that Magento offers us, and one using a hack that I have not seen anywhere else: I put it together myself, based on the cache methods of the model. Please note that in all the tests, after making the query, I loaded the data and converted the collection to an array of objects.

Results


Without a cache there are no surprises: everything is as usual.

With Magento's cache, however, I was personally surprised to see that the time had increased. For EAV, this caching is a pointless exercise anyway, because an EAV collection first loads the entities from the product table (this is what gets cached), and then selects the attribute values with a separate query and fills the objects. With Flat, everything is fetched from 1 table. Nevertheless, working with the cache takes more time than reading from the database (I tested it with both the file system and Redis; the difference is in the 4th decimal place, i.e. on 2k entities there is none).

The essence of the initCache method is that the collection first gathers all of its data (pagination, filters, events, and so on), creates a hash from the SQL query and looks it up in the cache. If something is found there, it is unserialized, and then all the events and subsequent methods are run. This is the slowest procedure in the whole process; this is where the cache turns out slower than a simple database query. On the other hand, it does not send a query to the database, which makes it not so bad.

Separately, there is the example of a cache that I knocked together myself, where we cache the final result of the collection, bypassing all the events and the reloading of attributes. It works for both EAV and Flat collections.
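
The idea of caching the final result can be shown in isolation with a minimal sketch. ArrayCache and loadCached() are hypothetical names introduced here; ArrayCache is an in-memory stand-in that only mimics the load/save calls used in the custom caching snippet above (load returns false on a miss), and the closure stands in for loading a collection and calling getItems().

```php
<?php
// Hypothetical in-memory cache; load() returns false when nothing is stored.
class ArrayCache
{
    private $data = array();

    public function load($id)
    {
        return isset($this->data[$id]) ? $this->data[$id] : false;
    }

    public function save($value, $id)
    {
        $this->data[$id] = $value;
        return true;
    }
}

// Returns the cached result for $id, or builds it with $producer and caches it.
function loadCached($cache, $id, $producer)
{
    $raw = $cache->load($id);
    if ($raw !== false) {
        return unserialize($raw);
    }
    $result = call_user_func($producer);
    $cache->save(serialize($result), $id);
    return $result;
}

$cache = new ArrayCache();
$builds = 0;
$producer = function () use (&$builds) {
    $builds++; // the expensive part: runs only on a cache miss
    return array(array('sku' => 'sku-1'), array('sku' => 'sku-2'));
};

$first = loadCached($cache, 'our_data', $producer);
$second = loadCached($cache, 'our_data', $producer);

echo $builds, "\n"; // 1
```

The second call never reaches the producer, which is the point of caching the finished result rather than the query.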

Proper use of count() and getSize()


getSize()

    $size = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect('*')
        ->getSize();

count()

    $size = Mage::getModel('catalog/product')->getCollection()
        ->addAttributeToSelect('*')
        ->count();

Results


The difference between the methods is that count() loads all the objects in the collection and then counts them with the usual count, returning the number to us. getSize does not load the collection, but generates 1 more query to the database, without limits, ordering or the list of selected attributes: only COUNT(*).

When to use each method:

If you need to know whether there are any rows in the database, or how many there are, use getSize. If you need the loaded collection anyway, or it is already loaded, use count(): it returns the number of items loaded into the collection.
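
The rule of thumb can be illustrated with a toy model. ToyCollection is a hypothetical, heavily simplified class written for this example; it is not Magento's implementation and only logs which query each method would trigger.

```php
<?php
// Hypothetical toy model of the behaviour described above.
class ToyCollection
{
    private $table;
    private $items = null;   // null means "not loaded yet"
    public $log = array();   // queries that would have been sent

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    public function load()
    {
        if ($this->items === null) {
            $this->log[] = 'SELECT * FROM t';
            $this->items = $this->table;
        }
        return $this;
    }

    // Loads every item if needed, then counts the loaded items.
    public function count()
    {
        $this->load();
        return count($this->items);
    }

    // Never loads items; asks the database for the number only.
    public function getSize()
    {
        $this->log[] = 'SELECT COUNT(*) FROM t';
        return count($this->table);
    }
}

$table = array_fill(0, 2000, array('sku' => 'x'));

// Only need to know whether anything exists: getSize() avoids loading rows.
$existsCheck = new ToyCollection($table);
$hasProducts = $existsCheck->getSize() > 0;

// The collection is loaded anyway: count() adds no extra query.
$loaded = new ToyCollection($table);
$loaded->load();
$total = $loaded->count();

print_r($existsCheck->log);
print_r($loaded->log);
```

In the first case only the COUNT(*) query is logged; in the second only the one SELECT that was needed anyway.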

Proper use of getFirstItem() and setPage(1,1)


getFirstItem()

    $product = Mage::getModel('catalog/product')->getCollection()
        ->getFirstItem();

setPage(1,1)

    $product = Mage::getModel('catalog/product')->getCollection()
        ->setPage(1, 1)
        ->getFirstItem();

load()

    $product = Mage::getModel('catalog/product')->load(22);

Results


The problem with getFirstItem is that it loads the entire collection and then simply returns the first item from a foreach; if there is none, it returns an empty object.

setPage (which is the same as $this->setCurPage($pageNum)->setPageSize($pageSize)) limits the selection to exactly 1 record, which, as you can see, significantly speeds up loading the result.

Even load is faster than getFirstItem, but note that load was slower than selecting one item from the collection. This is because load always works with the EAV tables.
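
How much data each variant pulls in can be sketched with plain arrays. fetchRows() and firstItem() are hypothetical helpers written for this example: the first imitates a query with or without a page limit, the second imitates returning the first item from a foreach. It is a simplified model, not Magento code.

```php
<?php
// Hypothetical helper: rows a query would return, with an optional page limit.
function fetchRows(array $table, $curPage = null, $pageSize = null)
{
    if ($pageSize === null) {
        return $table; // no limit: every row is fetched
    }
    return array_slice($table, ($curPage - 1) * $pageSize, $pageSize);
}

// Hypothetical helper: first row, or an empty array if there are none.
function firstItem(array $rows)
{
    foreach ($rows as $row) {
        return $row;
    }
    return array();
}

$table = array();
for ($id = 1; $id <= 2000; $id++) {
    $table[] = array('entity_id' => $id);
}

$all = fetchRows($table);        // what getFirstItem() alone has to load
$page = fetchRows($table, 1, 1); // what setPage(1, 1) limits the load to

printf("%d rows vs %d row\n", count($all), count($page)); // 2000 rows vs 1 row
var_dump(firstItem($all) === firstItem($page));           // bool(true)
```

The first item is the same either way; the limited variant just avoids fetching the other 1999 rows.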



Conclusions


Summarizing everything written above, I want to advise all people working with Magento:

Source: https://habr.com/ru/post/282025/

