Profiling PHP scripts on a live server

Good day, community!

Some of you have surely run into this problem: a site runs slowly on a production server.
It is important to find out quickly where the bottlenecks are. You cannot use xdebug for this, because it puts a heavy load on the server and itself distorts the measurements. To solve this problem we chose pinba, a system that lets you very quickly collect tree-structured statistics about how the site is working.

Habr has already covered the specifics of working with pinba. If you have not read that article yet, it is worth doing so first.

For the impatient, here is a link to the result right away.
Plus1 WapStart normally runs under a load of more than 1000 requests per second per instance.

How does all this work?

Data collection

Pinba sends the start and end marks of a time interval (hereafter, timers) to its server over UDP, which is very fast, and the server puts the data into MySQL tables that are easy to read. For example:

    $timer = pinba_timer_start(array('tag' => 'some_logic'));
    ....
    pinba_timer_stop($timer);

To build a tree structure, we add two extra tags: tree_id (a unique id each time) and tree_parent_id (the tree_id of the timer in which the current one is nested). For example:

    $parent_timer = pinba_timer_start(array('tag' => 'some_logic', 'tree_id' => 1, 'tree_parent_id' => 'root'));
    $child_timer = pinba_timer_start(array('tag' => 'child_logic', 'tree_id' => 2, 'tree_parent_id' => 1));
    pinba_timer_stop($child_timer);
    pinba_timer_stop($parent_timer);

This way, the nesting of timers can be reproduced on the server and rendered as a readable tree.
We placed timers at every interesting point in the project to measure time (for example, around SQL queries, file writes, and so on).
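
Assigning tree_id and tree_parent_id by hand gets tedious quickly. The sketch below shows one way to automate it with a stack of open timers. TreeTimers is a hypothetical helper written for illustration, not the PinbaClient class from the repository; the id format (prefix plus counter) is also an assumption. It uses only pinba_timer_start() and pinba_timer_stop(), and when the pinba extension is not loaded it simply tracks nesting.

```php
<?php
// Hypothetical helper for illustration; not the PinbaClient class from the repository.
final class TreeTimers
{
    /** @var array[] stack of currently open timers, innermost last */
    private $stack = [];
    /** @var int counter used to number timers within this request */
    private $counter = 0;
    /** @var string prefix that keeps tree_id values unique across requests */
    private $prefix;

    public function __construct(?string $prefix = null)
    {
        $this->prefix = $prefix ?? uniqid();
    }

    /** Starts a timer nested under the innermost open timer and returns its tags. */
    public function start(string $tag): array
    {
        $parent = end($this->stack);
        $tags = [
            'tag' => $tag,
            'tree_id' => $this->prefix . '.' . (++$this->counter),
            'tree_parent_id' => $parent === false ? 'root' : $parent['tags']['tree_id'],
        ];
        // Without the pinba extension the helper only tracks nesting
        $timer = function_exists('pinba_timer_start') ? pinba_timer_start($tags) : null;
        $this->stack[] = ['tags' => $tags, 'timer' => $timer];
        return $tags;
    }

    /** Stops the innermost open timer, if any. */
    public function stop(): void
    {
        $top = array_pop($this->stack);
        if ($top !== null && $top['timer'] !== null) {
            pinba_timer_stop($top['timer']);
        }
    }
}

$timers = new TreeTimers('req42');
$parent = $timers->start('some_logic');
$child = $timers->start('child_logic');
$timers->stop(); // closes child_logic
$timers->stop(); // closes some_logic
```

Because the parent is always the innermost open timer, the calling code only has to pair each start() with a stop().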

Data preparation

Unfortunately, pinba does not use indexes for queries (except PRIMARY), because its tables use its own pinba ENGINE (the tables are actually kept in memory, and data older than N minutes is deleted; in our case, 5 minutes). This is no fault of pinba, since it is not designed for index-based queries.
Indexes matter to us, so we copy all the data from the pinba tables into regular MyISAM tables.

    truncate table pinba_cache.request;
    truncate table pinba_cache.tag;
    truncate table pinba_cache.timer;
    truncate table pinba_cache.timertag;
    insert ignore into pinba_cache.request select * from pinba.request;
    insert ignore into pinba_cache.tag select * from pinba.tag;
    insert ignore into pinba_cache.timer select * from pinba.timer;
    insert ignore into pinba_cache.timertag select * from pinba.timertag;

As the queries show, the live data sits in the pinba database and the copy in the pinba_cache database.

We also need another table to work with the tree_id and tree_parent_id fields.

    truncate table pinba_cache.timer_tag_tree;
    insert ignore into pinba_cache.timer_tag_tree
    SELECT * FROM (
        SELECT
            null,
            timer_id,
            request_id,
            hit_count,
            timer.value,
            GROUP_CONCAT(timertag.value) as tags,
            (select timertag.value from pinba_cache.timertag
              where timertag.timer_id=timer.id
                and tag_id = (select id from pinba_cache.tag where name='treeId')) as tree_id,
            (select timertag.value from pinba_cache.timertag
              where timertag.timer_id=timer.id
                and tag_id = (select id from pinba_cache.tag where name='treeParentId')) as tree_parent_id
        FROM pinba_cache.timertag force index (timer_id)
        LEFT JOIN pinba_cache.timer ON timertag.timer_id=timer.id
        where not tag_id in (
            (select id from pinba_cache.tag where name='treeId'),
            (select id from pinba_cache.tag where name='treeParentId')
        )
        group by timertag.timer_id
        order by timer_id
    ) as tmp
    GROUP BY tree_id;

The structure of the timer_tag_tree table is shown below. The structure of the remaining tables is the same as in pinba.

    CREATE TABLE `timer_tag_tree` (
        `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
        `timer_id` INT(10) NOT NULL DEFAULT '0',
        `request_id` INT(10) NULL DEFAULT NULL,
        `hit_count` INT(10) NULL DEFAULT NULL,
        `value` FLOAT NULL DEFAULT NULL,
        `tags` VARCHAR(128) NULL DEFAULT NULL,
        `tree_id` VARCHAR(35) NOT NULL DEFAULT '',
        `tree_parent_id` VARCHAR(35) NOT NULL DEFAULT '',
        PRIMARY KEY (`id`),
        INDEX `timer_id` (`timer_id`),
        INDEX `tree_id_tree_parent_id` (`tree_id`, `tree_parent_id`),
        INDEX `tree_parent_id_tree_id` (`tree_parent_id`, `tree_id`)
    )
    COLLATE='utf8_general_ci'
    ENGINE=MyISAM
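
To make it clearer what the timer_tag_tree query computes, here is a rough PHP equivalent of its inner part: the per-tag rows of one timer are collapsed into a single row with the ordinary tags concatenated and the two tree tags moved into their own fields. The function name pivotTimerTags and the input row shape (timertag already joined with tag to get the tag name) are assumptions made for this sketch. The tag names treeId and treeParentId are the ones the query looks up.

```php
<?php
/**
 * Collapses per-tag rows into one row per timer.
 *
 * @param array $rows each row: ['timer_id' => int, 'name' => string, 'value' => string],
 *                    i.e. timertag joined with tag to get the tag name
 * @return array timer_id => ['tags' => string, 'tree_id' => ?string, 'tree_parent_id' => ?string]
 */
function pivotTimerTags(array $rows): array
{
    $timers = [];
    foreach ($rows as $row) {
        $id = $row['timer_id'];
        if (!isset($timers[$id])) {
            $timers[$id] = ['tags' => [], 'tree_id' => null, 'tree_parent_id' => null];
        }
        if ($row['name'] === 'treeId') {
            $timers[$id]['tree_id'] = $row['value'];
        } elseif ($row['name'] === 'treeParentId') {
            $timers[$id]['tree_parent_id'] = $row['value'];
        } else {
            // Ordinary tags are collected, like GROUP_CONCAT in the query
            $timers[$id]['tags'][] = $row['value'];
        }
    }
    foreach ($timers as $id => $timer) {
        // GROUP_CONCAT joins values with a comma by default
        $timers[$id]['tags'] = implode(',', $timer['tags']);
    }
    return $timers;
}

$pivoted = pivotTimerTags([
    ['timer_id' => 10, 'name' => 'tag', 'value' => 'some_logic'],
    ['timer_id' => 10, 'name' => 'treeId', 'value' => '1'],
    ['timer_id' => 10, 'name' => 'treeParentId', 'value' => 'root'],
    ['timer_id' => 11, 'name' => 'tag', 'value' => 'child_logic'],
    ['timer_id' => 11, 'name' => 'treeId', 'value' => '2'],
    ['timer_id' => 11, 'name' => 'treeParentId', 'value' => '1'],
]);
```

The sketch leaves out the outer GROUP BY tree_id of the real query, which additionally keeps one row per tree_id.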

Data analysis

Now for the most interesting part. We have collected the data and laid it out the way we need for further work. Next, we need a script that presents all this data in a convenient form.
I will not describe how to display a single tree (from one request to the site), since this is a trivial task.
The problem is that to assess bottlenecks you need to analyze hundreds of requests to PHP, each of which has its own tree of function calls (timers). We need to build one generalized tree from these trees.
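
For completeness, the single-tree case can be sketched roughly as follows: rows of one request from timer_tag_tree are grouped by tree_parent_id and then nested recursively, starting from 'root'. The function names buildTree and renderTree and the node layout are assumptions made for this sketch; this is not the index.php script from the repository.

```php
<?php
/**
 * Turns flat rows of one request (as stored in timer_tag_tree) into a nested tree.
 * Each node: ['tags' => string, 'value' => float, 'children' => array].
 */
function buildTree(array $rows, string $rootId = 'root'): array
{
    // Group rows by parent so that children can be looked up directly
    $byParent = [];
    foreach ($rows as $row) {
        $byParent[$row['tree_parent_id']][] = $row;
    }
    $build = function (string $parentId) use (&$build, $byParent): array {
        $nodes = [];
        foreach ($byParent[$parentId] ?? [] as $row) {
            $nodes[] = [
                'tags' => $row['tags'],
                'value' => (float) $row['value'],
                'children' => $build((string) $row['tree_id']),
            ];
        }
        return $nodes;
    };
    return $build($rootId);
}

/** Renders the tree as indented text, one timer per line. */
function renderTree(array $nodes, int $depth = 0): string
{
    $out = '';
    foreach ($nodes as $node) {
        $out .= str_repeat('  ', $depth) . sprintf("%s %.3f\n", $node['tags'], $node['value']);
        $out .= renderTree($node['children'], $depth + 1);
    }
    return $out;
}

$tree = buildTree([
    ['tree_id' => '1', 'tree_parent_id' => 'root', 'tags' => 'some_logic', 'value' => 0.5],
    ['tree_id' => '2', 'tree_parent_id' => '1', 'tags' => 'child_logic', 'value' => 0.2],
]);
echo renderTree($tree);
```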

The combining algorithm is as follows:

For each node, we compute the sum of the execution times of that node over all trees.
Once you have a function that combines two trees, you can loop over all of them and get the sum.
But here an unpleasant surprise awaits: it runs slowly.
The complexity of combining two trees is O(N * N), where N is the number of nodes in a tree (attentive readers will tell me it can be done in N * log(N), but a simpler optimization, three lines long, follows below). Accordingly, combining small trees is cheap and combining large ones is very expensive.
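
A minimal sketch of combining two trees is shown below. It rests on several assumptions not taken from the article: nodes are matched when they have the same tags under the same parent, children are stored in an associative array keyed by tags, and each node carries 'value' and 'hits' fields. Because it looks children up by key, it is a different technique from the pairwise node comparison that the O(N * N) estimate above refers to.

```php
<?php
/**
 * Merges two trees whose nodes are keyed by tags, for example:
 * ['some_logic' => ['value' => 0.5, 'hits' => 1, 'children' => [...]]]
 * Matching nodes get their values and hit counts summed.
 */
function mergeTwoTrees(array $a, array $b): array
{
    foreach ($b as $tags => $node) {
        if (!isset($a[$tags])) {
            // Node exists only in the second tree: take it as is
            $a[$tags] = $node;
            continue;
        }
        $a[$tags]['value'] += $node['value'];
        $a[$tags]['hits'] += $node['hits'];
        $a[$tags]['children'] = mergeTwoTrees($a[$tags]['children'], $node['children']);
    }
    return $a;
}

$first = ['some_logic' => ['value' => 0.5, 'hits' => 1, 'children' => [
    'sql' => ['value' => 0.25, 'hits' => 1, 'children' => []],
]]];
$second = ['some_logic' => ['value' => 0.25, 'hits' => 1, 'children' => [
    'file_write' => ['value' => 0.125, 'hits' => 1, 'children' => []],
]]];
$merged = mergeTwoTrees($first, $second);
```

Dividing a node's summed value by its hits gives the average time for that node across requests.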
We will try to exploit this property. Let's call the execution tree of a single script a first-level tree, the sum of two first-level trees a second-level tree, and so on. In these terms, we want as many merges as possible to involve first-level trees and as few as possible to involve high-level trees. We do this by merging the trees in pairs, level by level.

For N trees, this takes N - 1 merges in total, of which N / 2 are first-level, N / 4 are second-level, N / 8 are third-level, and so on.
This trick is very easy to implement with recursion (if desired, it can be unrolled into a loop, but for clarity I will not do this).

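    // Input: a list of trees; output: one merged tree
    function mergeTreeList(array $treeList) {
        if (count($treeList) > 2) {
            // Merge each half of the list recursively, then merge the two results
            $half = (int) ceil(count($treeList) / 2);
            return mergeTreeList(array(
                mergeTreeList(array_slice($treeList, 0, $half)),
                mergeTreeList(array_slice($treeList, $half)),
            ));
        }
        //...
        // code that merges the remaining one or two trees goes here
    }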

Thus, we first combine the original trees in pairs, and only then merge the resulting larger trees. This made the merge about 10 times faster (on 1000 trees).
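
The pairwise scheme can be tried out in isolation with the self-contained variant below. It takes the two-tree merge as a callable, which is a choice made for this sketch and not the article's original signature. To keep the merge count easy to check, the "trees" in the usage example are plain numbers and the merge is addition.

```php
<?php
/**
 * Merges a non-empty list of trees pairwise, level by level.
 * $mergeTwo is any function that combines two trees into one.
 */
function mergeTreeList(array $treeList, callable $mergeTwo)
{
    $count = count($treeList);
    if ($count === 0) {
        throw new InvalidArgumentException('Tree list must not be empty');
    }
    if ($count === 1) {
        return reset($treeList);
    }
    // Split the list in half, merge each half, then merge the two results
    $half = intdiv($count + 1, 2);
    return $mergeTwo(
        mergeTreeList(array_slice($treeList, 0, $half), $mergeTwo),
        mergeTreeList(array_slice($treeList, $half), $mergeTwo)
    );
}

// Usage: count how many two-way merges are needed for 8 inputs
$merges = 0;
$sum = mergeTreeList(range(1, 8), function ($a, $b) use (&$merges) {
    $merges++;
    return $a + $b;
});
// $sum is 36 and $merges is 7, i.e. N - 1 merges for N = 8
```

With 8 inputs, four of the merges combine two original items, two combine second-level results, and one combines third-level results, so most of the work is done on small trees.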

Summary

We placed pinba timers in our application wherever we considered it necessary.
We built an aggregated execution tree from many script requests.
Using the resulting tree, you can analyze the project's bottlenecks and plot how fast individual parts of the project execute.
All of this happens right on a live server under heavy load.

Pitfalls and cons

On our project, pinba writes new data (and deletes old data) so quickly that the insert into table_copy select * from table query copies 2-3 times more data than was originally in the table. Therefore, writing to pinba has to be stopped while the tables are being copied (I blocked the network on the server with a firewall).
Pinba consumes a lot of memory (in our case, 2 GB to store 5 minutes of data), since instead of one tag we write three (+ tree_id, + tree_parent_id).
While copying, the network has to be switched off for 5-10 seconds to stop writes to the tables, so the data for those 5-10 seconds is lost.

Useful files:
Script to display the tree: index.php
MySQL script for converting the data: cron.sql
PinbaClient.class.php: a wrapper over pinba for more convenient use, which adds tree_id and tree_parent_id automatically
I would also like to mention the onphp framework, which has native pinba support
https://github.com/ents/pinba-php-profiler/ - source files to bring everything up
http://pinba.org/ - here you can download pinba

Disclaimer: This article is an overview and should not be taken as a step-by-step guide. The actions described here are not the ultimate truth, but rather one of the few ways to visualize information from pinba.

Source: https://habr.com/ru/post/142975/
