Approach to optimizing the application on the example of the popular CMS

An article can help your live project not to slow down (or slow down less), or to become a starting point for researching a third-party product.
For example, you have a task to understand what is happening inside “Samopisnaya system 3.14” and, again, to help her not to have 100 megabytes of RAM per client.

About the study program

WebAsyst Shop-Script is the second attempt by the guys from Artikus to do well. The first attempt was full of holes and bringing to this day many problems of Shop-Script Premium.
Strictly speaking, WebAsyst is a whole complex of programs such as a notebook, calendar and project manager, but for a person who is not the first day in Internet development or business, these solutions are unlikely to be interesting (the same basecamp).
About whether or not their attempt was crowned with success, I can say this, quite recently we celebrated the 666th revision of the alternative branch and this is not the end.

Goals

The goal is to identify the most resource-intensive operations and determine the state of the system with critical data volumes. By data, I mean the number of categories and products. In some cases, optimization recommendations will be given, but seeing the source of the trouble is not so difficult to neutralize it.
')

Preparation: directory structure and dependencies

Most recently, I asked about component in Habravopros for automatic generation of dependencies between files and discovered the inclued extension that is friendly with Graphviz, which I advise you, otherwise understand that practically the main component of the program is located at
\ published \ SC \ html \ scripts \ modules \ test \ class.test.php
and why is needed
published \ SC \ html \ scripts \ modules \ test \ _methods \ search_simple.php will be boring and uninteresting.
I went on the grep path, since I had the time and the need to know the essence, but I don’t want it anymore and I don’t advise you.
Especially considering their experiments with callback, from which I still feel bad.

Preparation: filling the contents

It would be foolish to argue that testing (if it is not just the alpha development stage) on a small amount of data does not make sense. Therefore, first of all fill your CMS content to the eyeballs. If there is no real data, write a spammer who writes to the database while it can or while you want it to be random.
WebAsyst SS behaves in completely different ways on catalogs with 450 and 4500 products.

Preparation: Overloading Standard Functions

I will speak not about overloading from the point of view of the PLO, but overloading into the forehead. The method is simple - we go through all the files in search of a standard function, replace it with ov_ {original_name} and here all the maps are open. We want to log all requests to the database, we want to see who and when knocking on fopen, fwrite, file_get_contents, or trying to use black magic, like eval. Logging is mysql_query is the most useful, since performance usually comes up against a backend.
I use something like this:

function ov_mysql_query($text) { $debug= false ; if ($debug){ $source=debug_backtrace(); $src_array_count=count($source); $what=array( '\r\n' , '=' , ',' , '?' , '&' , '+' , ')' , '(' , '/' ); $to=array( '\r\n' , '_' , '_' , '_' , '_' , '_' , '_' , '_' , '_' ); $filename=str_replace($what,$to,$_SERVER[ 'REDIRECT_URL' ]); static $function_counter_m = 0; $function_counter_m++; $oldDir=getcwd(); chdir($_SERVER[ 'DOCUMENT_ROOT' ]); $fp = fopen( 'logs/' .$filename. '.log' , 'a' ); fwrite($fp, $function_counter_m. ') ' .str_replace( '\r\n' , '' ,trim($text)). "\r\n" ); for ($i=0;$i<$src_array_count;$i++) { fwrite($fp, 'DEBUG INFO:' .$source[$i][ 'file' ]. ' | ' .$source[$i][ 'line' ]. "\r\n" ); } fwrite($fp, "\r\n" ); fclose($fp); chdir($oldDir); } $q=mysql_query($text); return $q; }

As a result, a file with debug information (call stack) about the request and the request itself is saved in the www / logs folder.

Preparation: xDebug

To be honest, I find it difficult to call debugging an attempt to understand another's mechanism. Rather, it is preparation. However, the fact whether you have a debugger directly depends on whether you can identify bottlenecks and optimize the system. If you write programs in php, then you need xDebug-it is free and all supported by at least a little self-respecting php code editors.

The debugger generates special dumps in the directory you set, in which various data is stored (optional). Since my main OS is Windows, you can have an advantage in this step, since Linux’s kcachegrind is much more convenient for wincachegrind (both programs allow you to view these dumps, although in truth these are regular txt files that you can read through notepad when due severity).

Let's start to slow down the zombies.

Test stand

Used version of WebAsyst : 287 (clean, without patches and mods) *
Number of products in the database : 4461
Number of types of characteristics in the database : 144
Number of values of characteristics in the database : 44170
Number of categories in the database : 354
Number of images in the database : 3516

  * but those who follow their changelog don't care, because only the color of the buttons in the sub-window of the admin window is different from 250

Little about the initial configuration

The results of the clean engine on the default one product and one category. In the program there are regular caching mechanisms, but how much they justify themselves, you can judge for yourself.

Page*	Queries to the database with default cache	Queries to the database without default cache	Download speed with default cache **	Download speed without default cache **
the main	64	73	10,304	17,011
Category	83	90	10,616	19,457
Product	100	107	15,010	28,731
Search (successful)	69	76	10,507	18,209

  * the link leads to the screenshot of the page so that it is clear whether the requested data volumes are commensurate with the displayed ones
 ** camulative time from Wincachegrind

If your hair has not stood on end, read on and do not forget that there were only one product and one category.

Configuration with data

It is time to use our test bench with several thousand products and a powerful hierarchy of catalogs.

Page	Queries to the database with default cache	Queries to the database without default cache	Download speed with default cache	Download speed without default cache
the main	64	73	12,323	19,404
Category	186	193	20,333	29,881
Product	108	115	16,156	30,100
Search (successful)	69	76	20,733	25,162
Selection by characteristics (advanced search)	900	907	43,216	50,242

If the main page still somehow remains with its 64 requests (IN (a, b, c, d, ..., z)), then the category is a bit sausage, and selecting by the characteristics will destroy not just the usual hosting but also VPS. But you do not think that disabling the advanced search will help you? This software product has several undocumented features that in the hands of competitors can make life difficult.
You can find out about these features by digging into the class responsible for handling URl (class.furl.php). For example, you can continuously hammer at store.ru/category/category_with_lot_products/ all /. I have 113 pages in this category in the top level.
Plate:

Page	Queries to the database with default cache	Queries to the database without default cache	Download speed with default cache	Download speed without default cache
Category (/ all /)	241	248	430,049	439,102

Small subtotal

At the current stage of the research we know:

It is possible to create a potentially high load.
The number of database requests with a cache and without is too large

Also, if you look at the dump created by the debugger when you load the store.ru/category/category_with_lot_products page, you can safely identify two most voracious operations:

foreach ($Interfaces as $_Interface){ ModulesFabric::callInterface($_Interface); }

and

print $smarty->fetch($CurrDivision->MainTemplate);

In addition to them, a lot of resources are spent on getting the category tree, is_object is called more than 95 thousand times, the program asks LanguagesManager :: getInstance 70 thousand times and counts the string length more than 28 thousand times, and LanguagesManager :: ml_isEmpty calls 2/3 of the slowest operations- getExtraParametrs.

Problem Solving Options

Light

If you do not have many visitors, but the program slows down, you can use file caching with minimal integration time.
I offer this scheme:

Find a heavy function
Determine whether it depends on any global variables.
Rename it to something like {original_function} _cached
Create {original_function}, in the body of which it is called through the special function {original_function} _cached

In the early stages of optimization, when it was necessary for the program to work quickly and there was no time, I used this solution:

function cache_function($buildCallback, array $args = array(), $timeoutHours = 5){ $oldDir=getcwd(); chdir($_SERVER[ 'DOCUMENT_ROOT' ]); // - if (is_array($buildCallback)){ $cacheKey = get_class($buildCallback[0]) . '::' . $buildCallback[1]; } else { $cacheKey = $buildCallback . ':' . serialize($args); } $cacheKey .= ':' . serialize($args); if (!file_exists( 'functions_cache/' .$buildCallback. '/' )) { @mkdir( 'functions_cache/' .$buildCallback. '/' ); } $file_path = 'system_cache/' .$buildCallback. '/' . sha1($cacheKey); if (!file_exists($file_path) OR filemtime($file_path) < (time() - $timeoutHours*60)){ $result = call_user_func_array($buildCallback, $args); file_put_contents($file_path, serialize($result), LOCK_EX); } else { $result = unserialize(file_get_contents($file_path)); } chdir($oldDir); return $result; }

We get:

function original_function($arg1,$arg2){ return cache_function( 'original_function_cached' ,array($arg1,$arg2),10); }

As a result, in the directory www / functions_cache / original_function_cached the serialized result of executing the function original_function_cached will appear and 10 hours will be used.

Difficult

No matter how we cache the results of the functions, we still have the resource-intensive fetch, which collects from a dozen templates that use dozens of controllers and plug-ins a single page.
Here I would suggest optimizing the number of templates, creating a normal hierarchy of them (by default, all templates are stored together) and begin to move in the direction of block caching. Thus, on the most visited pages we will see a rather large increase in speed.

Very difficult

But, if you, like me, have no choice but to work with WA and you will work with him for a long time, these are all half measures.

Optimization, rewriting of algorithms is necessary (look at leisure as they have pagination implemented) and non-hacking caching. In general, it is easier for me in this regard, since I know that new content is added automatically at a certain time and at this time I can afford to reset the entire cache. You also have to tune cache groups and change a lot (from generating URLs to restructuring derrickia) in order to fight price invalidation and characteristics. In most of the problems, of course, you can cope with the help of smarty, the work of the program with which you will have to rebuild nicely, since WebAsyst SS itself seems not to be going to use its caching mechanisms (everything speaks about this).

For example: we cache the entire page with the product, and set the lifetime to 5 hours. It is assumed that the price may change before, and you don’t really want to drop the cache. You can create smarty-plugin, which will apply to the desired method of the desired model (say $ productModel-> getPrice ($ pID)) and return the price. On the page with the goods we receive 1 request to the database. The view cache is not rebuilt.

Conclusion

Some long it turned out, but it seems that everything is essentially.
I hope that the ready-made solutions and recommendations from this article will push you to something new (whether it be inclued or xDebug, or the rule not to take a word from the developers who say that they all go to the database through the class) or will help to develop old ones ideas.

Source: https://habr.com/ru/post/105887/

All Articles