Introduction
I will continue the series of articles "Apache 2.x under supervision: monitoring the load on the web server". As before, the subject is the Apache module mod_performance, already covered earlier [2]. At the time of writing, a new version of the module, 0.2, had been published on the module website [1]. The rest of the article is organized as a series of questions and answers.
What's new in mod_performance 0.2?
Once again, let me focus on what the module is intended for:
- the module is designed to collect and accumulate statistics on resource usage (CPU, memory, script execution time, as well as process I/O) by the Apache 2.2 web server;
- The module allows you to analyze the collected data.
Briefly, the new features are as follows:
- Saving the information collected by the module in the MySQL database.
- Saving the information collected by the module into a separate text log in a user-defined format.
- The set of collected data has been expanded: the statistics are no longer limited to percentages, but are also stored in seconds and megabytes.
- Saving the information collected by the module in the PostgreSQL database.
- Fixed a number of bugs that affected the stability of the 0.1-x versions.
- Added compatibility with Apache-itk configurations, as well as Apache + mod_ruid2.
- Added collection of I/O statistics for the monitored process.
- Reduced the module's impact on server operation: the module no longer returns a 503 error when it cannot connect to the daemon.
What are the principles of the module?
Let me repeat this for those who have not read the previous article [2], and add some new details.
The module keeps track of how many resources each request received by the web server consumed, saving a record for every processed request.
Note right away that data about a request is saved only after the request completes; the data is accumulated for history and later analysis. If you are interested in the current server load, use mod_status.
Resource usage is measured with glibtop, rather than the scoreboard used by mod_status and the Perl extensions.
The module can track absolutely all requests, or only specific ones selected by a rule with a regular expression. To be precise: the module ALWAYS processes only those requests that match the filter containing the regular expression.
And now I will describe exactly how the request statistics are collected.
When the Apache web server starts, the mod_performance daemon starts with it. The running daemon opens a unix socket and waits for connections. When the server processes a request, it checks whether statistics should be kept for that request. If the check succeeds, the server process that accepted the connection sends the request information to the daemon, along with the PID (TID) of the process/thread that will handle the request. The daemon starts two threads: 1) the first waits for the final data transfer from the process handling the request; 2) the second periodically polls the memory usage of that process and keeps track of the maximum value. When the process handling the request finishes, the daemon writes the data to the statistics storage.
How are CPU usage statistics collected?
The CPU usage figure is calculated as follows. When a request arrives, the module takes jiffies readings (for the system as a whole and for the current process); at the end of the request it takes them again and sends the data to the daemon, which computes the CPU usage from them. In other words, if in top you saw a load of 0%, 10%, 100%, 20%, do not expect the module to store 100%: what is stored is the exact share of processor time this process received over the lifetime of the request. "By eye", for the example above this number would be about 32%.
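A rough shell sketch of this arithmetic (illustrative only: the module reads the jiffies internally; the field numbers below refer to /proc/[pid]/stat and /proc/stat and assume a process name without spaces):
PID=$$
proc_jiffies()  { awk '{print $14 + $15}' /proc/$1/stat; }   # utime + stime of the process
total_jiffies() { awk '/^cpu / {for (i = 2; i <= NF; i++) s += $i; print s}' /proc/stat; }
p1=$(proc_jiffies $PID); t1=$(total_jiffies)
sleep 1   # stands in for the lifetime of the request
p2=$(proc_jiffies $PID); t2=$(total_jiffies)
echo "scale=2; 100 * ($p2 - $p1) / ($t2 - $t1)" | bc   # CPU share over that interval, in percent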
How are memory usage statistics collected?
The memory figure is collected on a different principle. While the request is being processed, the daemon measures the memory usage of the handling process every 10 milliseconds and, at the end, saves the maximum value.
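The same sampling idea can be sketched in shell (the module itself uses libgtop rather than parsing /proc; the 10 ms interval matches the description above):
PID=$1   # PID of the process handling the request
MAX=0
while kill -0 "$PID" 2>/dev/null; do
    RSS=$(awk '/^VmRSS:/ {print $2}' /proc/$PID/status 2>/dev/null)   # resident memory in kB
    [ -n "$RSS" ] && [ "$RSS" -gt "$MAX" ] && MAX=$RSS
    sleep 0.01   # poll every 10 milliseconds
done
echo "peak RSS: $MAX kB"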
How are I/O statistics collected?
This indicator is monitored like CPU usage: the bytes read and written by the process are measured at the beginning of the request and at its end. The difference between these values is converted to kilobytes and stored in the database. In effect, this indicator tracks the number of bytes written/read during the lifetime of the request. The source of the data is /proc/[pid]/io, namely the read_bytes, write_bytes and cancelled_write_bytes fields.
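A shell sketch of the same measurement (assuming sufficient rights to read /proc/[pid]/io, which normally means root or the process owner):
PID=$1
io_field() { awk -v f="$2:" '$1 == f {print $2}' /proc/$1/io; }
r1=$(io_field $PID read_bytes);  w1=$(io_field $PID write_bytes)
# ... here the request would be processed ...
r2=$(io_field $PID read_bytes);  w2=$(io_field $PID write_bytes)
echo "IO: R - $(( (r2 - r1) / 1024 )) kB  W - $(( (w2 - w1) / 1024 )) kB"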
Recommendations for running the module
By default, the module keeps all its files (the socket, the SQLite database, the global log) in the /etc/httpd/log folder. As practice has shown, this folder cannot always be used: the daemon often has no rights to it, because it runs as the apache user (I am writing about CentOS, hence apache).
So I recommend that, on the machine where you are going to use the module, you create a folder, for example /statistics/apache, make the apache user its owner, and grant it read and write access (be careful with the itk and mod_ruid2 modes, so that the module running under a modified user can still write to the socket in this folder).
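A minimal sketch of that preparation on CentOS (assuming the apache user and the /statistics/apache path from above; adjust the permissions to your itk/mod_ruid2 setup):
mkdir -p /statistics/apache
chown apache:apache /statistics/apache
chmod 750 /statistics/apache   # loosen this if another user must reach the socket
After that the socket can be pointed at this folder: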
PerformanceSocket /statistics/apache/perfsock
Where can I save query statistics?
This is an important question: where should the collected statistics be stored? To make this easier, the new version of the module supports the SQLite, MySQL and PostgreSQL databases, as well as a more exotic option: saving the log to a plain file. You no longer need to adapt to the module; it adapts to you. For the module to work (in any mode other than "save to log"), at least one of the following libraries must be present on the machine:
- libsqlite3.so;
- libmysqlclient_r.so;
- libpq.so.
When building the module, the mysql-devel, sqlite-devel and postgresql-devel packages are not required: the libraries are loaded dynamically while the module is running; more precisely, only the library needed for the selected mode is loaded.
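A quick way to check which of these libraries are present on the machine, for example:
ldconfig -p | grep -E 'libsqlite3|libmysqlclient_r|libpq'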
Example 1. Working with SQLite. This is the simplest option: the database and the table are created automatically, no users need to be set up, and so on. One very important note for those who have already used version 0.1 of the module: the table structures in the old and new databases differ, so for successful operation it is better to delete the old database, because the module does not recreate an existing table.
To work with SQLite you need:
PerformanceDB /statistics/apache/perfdb
PerformanceLogType SQLite
Example 2. Working with MySQL. A slightly more involved option.
Create a database, for example perf, and a perf user with privileges on this database:
mysql> create database perf;
mysql> CREATE USER 'perf'@'localhost' IDENTIFIED BY 'perf';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'perf'@'localhost' WITH GRANT OPTION;
The module will create the table itself. Now, in the module settings:
PerformanceLogType MySQL
PerformanceDbUserName perf
PerformanceDBPassword perf
PerformanceDBName perf
And again, a very important note for those who have already used 0.2 builds earlier than 0.2-8: the table structures in the old and new databases differ, so for successful operation it is better to delete the old database, because the module does not recreate an existing table.
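Before restarting Apache it does not hurt to verify that the credentials work; a simple check could be (the table will appear only after the module has written its first records):
mysql -u perf -pperf perf -e 'SHOW TABLES;'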
Example 3. Working with PostgreSQL. Also a more involved option.
Here, too, you need to create a database and grant user access:
postgres=# CREATE USER perf WITH PASSWORD 'perf';
postgres=# CREATE DATABASE perf;
postgres=# GRANT ALL PRIVILEGES ON DATABASE perf TO perf;
and in the /var/lib/pgsql/data/pg_hba.conf file allow access:
local all all trust
host all all 0.0.0.0/0 trust
host all all ::1/128 trust
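After editing pg_hba.conf, PostgreSQL has to re-read its configuration; on CentOS this could be done, for example, with:
service postgresql reload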
and finally the module settings:
PerformanceLogType Postgres
PerformanceDbUserName perf
PerformanceDBPassword perf
PerformanceDBName perf
Example 4. Working with a text log.
In this mode no additional libraries are required. It is enough to specify a file into which the statistics will be written.
PerformanceLogType Log
PerformanceLog /statistics/apache/perf.log
By default, the data is written to this file in the following format:
[%DATE%] from %HOST% (%URI%) script %SCRIPT%: cpu %CPU% (%CPUS%), memory %MEM% (%MEMMB%), execution time %EXCTIME%, IO: R - %BYTES_R% W - %BYTES_W%
which turns into:
[2011-06-05 19:28:28] from example.com (/index.php) script /var/www/example.com/index.php: cpu 0.093897 (0.010000), memory 0.558202 (5.597656), execution time 10.298639, IO: R - 104.000000 W - 248.000000
[2011-06-05 19:28:39] from example.com (/index2.php) script /var/www/example.com/index2.php: cpu 0.000000 (0.000000), memory 0.558202 (5.597656), execution time 10.159158, IO: R - 0.000000 W - 0.000000
And now in more detail. For this mode you can define the format of the line written to the log. The following predefined macro names are available:
- %DATE% - is replaced by the start date of the request;
- %CPU% - CPU usage in percent;
- %MEM% - memory usage in percent;
- %URI% - request URI;
- %HOST% - the name of the virtual host the request was addressed to;
- %SCRIPT% - script name;
- %EXCTIME% - script execution time in seconds;
- %CPUS% - how many seconds of CPU time the system spent on this process;
- %MEMMB% - memory usage in megabytes;
- %BYTES_W% - kilobytes written;
- %BYTES_R% - kilobytes read;
- %% - outputs the percent sign.
For example:
Hello from %HOST% I use %CPU% %% cpu today %DATE%
turns into
Hello from example.com I use 0.23% cpu today 2011-06-05 19:28:28
Such a log can be global or separate for each virtual host, and each host can have its own unique output format.
Another important point: the screen for analyzing the accumulated data is not available in this mode, i.e. the module's report handlers do not work with it. Log-analysis utilities have to be written separately in this case; a rough sketch is shown below.
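For instance, with the default format shown above, the ten slowest requests could be pulled out with something like this (a rough sketch; the field positions change if you change the format):
awk '{ for (i = 1; i <= NF; i++) if ($i == "time") { t = $(i + 1); sub(/,$/, "", t) }; print t, $0 }' /statistics/apache/perf.log | sort -rn | head -10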
What reports are available in the new version of the module by default?
As in version 0.1, the following reports are available in the new module:
- Show output without analytics - display the collected information without analysis, filtered by host, script and URI (graphic and text mode);
- Maximal %CPU - display only the entries with the maximum %CPU value (filtering included);
- Maximal memory % - display only the entries with the maximum memory % value (filtering included);
- Maximal execution request time - display the longest-running scripts;
- Host requests statistics - show statistics of requests to hosts, sorted in descending order (as a % of the total, taking filters into account);
- Number of requests per domain - show statistics of requests to hosts, sorted in descending order (as a number rather than a percentage);
- Average usage per host - output the average server load per host (sum of %CPU, sum of %MEMORY, total script execution time, average %CPU for the period, average memory usage %, average script execution time);
- Show current daemon threads - show the list of requests monitored by the daemon (displayed only for the performance-status handler and when the PerformanceExtended parameter is on).
Fields displayed in the reports:
- ID - record identifier;
- DATE ADD - when the request was processed;
- HOSTNAME - the name of the virtual host;
- URI - request URI;
- SCRIPT - the script that was run;
- CPU (%) - CPU usage in %;
- MEM (%) - memory usage in %;
- TIME EXEC (sec) - request execution time;
- CPU TM (sec) - processor time in seconds;
- MEM USE (Mb) - memory usage in megabytes;
- IO READ (Kb) - kilobytes read by the process;
- IO WRITE (Kb) - kilobytes written by the process.
Reports are available in the SQLite, MySQL and Postgres modes.
How to build the module?
I will repeat this, because there are changes compared to the previous version (for installation under Debian see [4]).
All actions must be performed under the root user:
1) install the packages required for the build:
yum install httpd-devel apr-devel libgtop2-devel gd-devel
2) create a temporary folder for the source code:
mkdir ~/my_tmp
cd ~/my_tmp
3) download and unpack the source code:
wget http://lexvit.dn.ua/utils/getfile.php?file_name=mod_performance-0.2.tar.gz -O mod_performance-0.2.tar.gz
tar zxvf mod_performance-0.2.tar.gz
cd mod_performance-0.2/
4) build the module:
make
5) warnings can be ignored; the main thing is that there are no errors. If everything is OK, then:
make install
or
cp .libs/mod_performance.so <path to copy>
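After copying, the module still needs to be loaded in httpd.conf. A minimal sketch, assuming the module identifier is performance_module (check the parameter reference below for the exact directive names):
LoadModule performance_module modules/mod_performance.so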
A reference for the module's parameters is available at the link in [3].
How does the module affect the speed of processing requests?
From a theoretical point of view, the module should barely affect request processing speed: during the request itself only the CPU counters are read, and everything else is read by the daemon, so the main burden falls on the daemon. The overall load on the server may still increase, because the daemon has to access the database to record the information; also do not forget about the memory required for its threads.
To examine the practical side of this question, a small test was carried out with the ab (ApacheBench) utility.
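Judging by the numbers below (100 requests, with the mean time per request about five times the across-all-requests mean), each run corresponds to an invocation roughly like the following; the URL here is just a placeholder:
ab -n 100 -c 5 http://example.com/index.php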
Test 1. A PHP script that creates load on the file subsystem:
Without mod_performance module:
Time taken for tests: 205.952423 seconds
Complete requests: 100
Failed requests: 0
Requests per second: 0.49 [#/sec] (mean)
Time per request: 10297.621 [ms] (mean)
Time per request: 2059.524 [ms] (mean, across all concurrent requests)
With mod_performance module:
Time taken for tests: 206.386260 seconds
Complete requests: 100
Failed requests: 0
Requests per second: 0.48 [#/sec] (mean)
Time per request: 10319.313 [ms] (mean)
Time per request: 2063.863 [ms] (mean, across all concurrent requests)
Test 2. A PHP script that loads the CPU.
Without mod_performance module:
Time taken for tests: 60.333852 seconds
Complete requests: 100
Failed requests: 0
Requests per second: 1.66 [#/sec] (mean)
Time per request: 3016.692 [ms] (mean)
Time per request: 603.339 [ms] (mean, across all concurrent requests)
With mod_performance module:
Time taken for tests: 60.714260 seconds
Complete requests: 100
Failed requests: 0
Requests per second: 1.65 [#/sec] (mean)
Time per request: 3035.713 [ms] (mean)
Time per request: 607.143 [ms] (mean, across all concurrent requests)
Test 3. A fast PHP script that creates no load.
Without mod_performance module:
Time taken for tests: 0.075594 seconds
Complete requests: 100
Failed requests: 0
Requests per second: 1322.86 [#/sec] (mean)
Time per request: 3.780 [ms] (mean)
Time per request: 0.756 [ms] (mean, across all concurrent requests)
With mod_performance module:
Time taken for tests: 0.109116 seconds
Complete requests: 100
Failed requests: 0
Requests per second: 916.46 [#/sec] (mean)
Time per request: 5.456 [ms] (mean)
Time per request: 1.091 [ms] (mean, across all concurrent requests)
Test machine: a virtual machine with 1 GB RAM, an AMD Phenom(tm) 8650 Triple-Core processor, running CentOS 5.5.
The heavier the request, the less noticeable the module's impact. The first two tests showed that the module's influence is insignificant; the last test showed request processing time increasing by about one and a half times. Given the absolute request times involved, though, this is an acceptable price to pay.
Links
- Mod_performance module site - http://lexvit.dn.ua/files/
- Previous article about the module - http://habrahabr.ru/blogs/server_side_optimization/119011/
- Instructions on module parameters - http://lexvit.dn.ua/articles/?art_id=mod_performance0_2_mht201105267239
- Building the module under Debian 6.0 (version 0.1, thanks to Maxim for the article) - http://linuxwork.org.ua/debian/ustanovka-i-nastrojka-modulya-mod_performance-dlya-apache-na-debian-6-0-squeeze/