One good person wants to get on Habr. To confirm his good intentions, he wrote an article that I bring to you.Surely many people know about the existence of the function FOUND_ROWS () in mySQL. It is most often used to count the total number of rows in a table that satisfy specified conditions. It is usually used in conjunction with the SQL_CALC_FOUND_ROWS directive as follows:
Mysql> SELECT SQL_CALC_FOUND_ROWS * FROM table WHERE column > 1 LIMIT 0, 50;
Mysql> SELECT FOUND_ROWS();
The result of the second query will be the total number of rows in the "table" table that satisfy the condition "column> 1" (and not just those that were returned by the first query).
Some features of this function should be taken into account when using it without the SQL_CALC_FOUND_ROWS directive, which the mySQL documentation conscientiously warns: in this case, it will return the total number of processed rows (not returned!). For example:
')
Mysql> SELECT * FROM table LIMIT 0, 50;
Mysql> SELECT FOUND_ROWS();
The result, as expected, will be "50". But the following requests will return not too expected result:
Mysql> SELECT * FROM table WHERE column > 1 LIMIT 50, 50;
Mysql> SELECT FOUND_ROWS();
Despite the fact that the first query returns 50 lines, the result will be “100”, since mySQL had to view exactly 100 lines.
In addition, when using UNION with SQL_CALC_FOUND_ROWS, some specific requirements also appear:
- the SQL_CALC_FOUND_ROWS directive should appear only once and only in the first query;
- accurate results are returned only when using UNION ALL, because when using UNION without ALL, some lines will be deleted, which is not taken into account by the FOUND_ROWS () function;
- if global LIMIT is not used, then the SQL_CALC_FOUND_ROWS directive is not ignored.
The mySQL documentation directly recommends using this function to count the total number of rows in a table that satisfy the parameters of the query (for example, when splitting a list into pages), arguing that it is faster than two classically used queries like:
Mysql> SELECT * FROM table WHERE column > 1 LIMIT 50, 50;
Mysql> SELECT COUNT(*) FROM table WHERE column > 1;
Question: if the variant using the FOUND_ROWS () function is indeed faster, then why is it not used everywhere? I decided to compare both approaches. For this table was created:
CREATE TABLE `table_1` (
`id` int (10) unsigned NOT NULL auto_increment,
`column_1` varchar (32) default NULL ,
`column_2` varchar (32) default NULL ,
PRIMARY KEY (`id`),
KEY `column_2` (`column_2`)
) ENGINE=MyISAM AUTO_INCREMENT=1;
* This source code was highlighted with Source Code Highlighter .
The table contains three columns: a numeric id field with an auto_increment and a PRIMARY key; the text column column_1 without an index and the text column with the index column_2. The table was filled with the following PHP script:
<?php
for ($i = 0; $i < 457128; $i ++) {
$db->insert ( 'table_1' , array ( 'column_1' , 'column_2' ), array ( md5(uniqid(rand(), true )), md5(uniqid(rand(), true )) ) );
}
?>
* This source code was highlighted with Source Code Highlighter .
$ db is a wrapper for PHP functions for working with mySQL. In particular, it measures the execution time of queries to the database.
Further, the following types of samples were made:
- “Standard” query with selection by PRIMARY key
SELECT * FROM `table_1` WHERE `id` < $id_limit LIMIT 1000
SELECT COUNT(*) FROM `table_1` WHERE `id` < $id_limit
- SQL_CALC_FOUND_ROWS with selection by PRIMARY key:
SELECT SQL_CALC_FOUND_ROWS * FROM `table_1` WHERE `id` < $id_limit LIMIT 1000
SELECT FOUND_ROWS() AS count
- “Standard” query with a selection of a column WITHOUT A KEY
SELECT * FROM `table_1` WHERE `column_1` LIKE '$column_1_limit%' LIMIT 1000
SELECT COUNT(*) FROM `table_1` WHERE `column_1` LIKE '$column_1_limit%'
- SQL_CALC_FOUND_ROWS with selection by column WITHOUT KEY
SELECT SQL_CALC_FOUND_ROWS * FROM `table_1` WHERE `column_1` LIKE '$column_1_limit%' LIMIT 1000
SELECT FOUND_ROWS() AS count
- “Standard” query with selection by VARCHAR column with KEY
SELECT * FROM `table_1` WHERE `column_2` LIKE '$column_2_limit%' LIMIT 1000
SELECT COUNT(*) FROM `table_1` WHERE `column_2` LIKE '$column_2_limit%'
- SQL_CALC_FOUND_ROWS with selection by column WITHOUT KEY
SELECT SQL_CALC_FOUND_ROWS * FROM `table_1` WHERE `column_2` LIKE '$column_2_limit%' LIMIT 1000
SELECT FOUND_ROWS() AS count
- "Standard" query with a sample of both columns
SELECT * FROM `table_1` WHERE `column_1` LIKE '$column_1_limit_2%' AND `column_2` LIKE '$column_2_limit_2%' LIMIT 1000
SELECT COUNT(*) FROM `table_1` WHERE `column_1` LIKE '$column_1_limit_2%' AND `column_2` LIKE '$column_2_limit_2%'
- SQL_CALC_FOUND_ROWS with a sample of both columns
SELECT SQL_CALC_FOUND_ROWS * FROM `table_1` WHERE `column_1` LIKE '$column_1_limit_2%' AND `column_2` LIKE '$column_2_limit_2%' LIMIT 1000
SELECT FOUND_ROWS() AS count
- "Standard" query with a sample of both columns + PRIMARY
SELECT * FROM `table_1` WHERE `column_1` LIKE '$column_1_limit_3%' AND `column_2` LIKE '$column_2_limit_3%' AND `id` < $id_limit_2 LIMIT 1000
SELECT COUNT(*) FROM `table_1` WHERE `column_1` LIKE '$column_1_limit_3%' AND `column_2` LIKE '$column_2_limit_3%' AND `id` < $id_limit_2
- SQL_CALC_FOUND_ROWS with sample by both columns + PRIMARY
SELECT SQL_CALC_FOUND_ROWS * FROM `table_1` WHERE `column_1` LIKE '$column_1_limit_3%' AND `column_2` LIKE '$column_2_limit_3%' AND `id` < $id_limit_2 LIMIT 1000
SELECT FOUND_ROWS() AS count
Each query was executed 10 times with different parameters, while the parameters of the Standard and SQL_CALC_FOUND_ROWS queries of the same type on the same iteration are the same. Those.:
for ($i = 0; $i < 10; $i ++) {
$id_limit = rand(15000, 20000);
$id_limit_2 = rand(15000, 20000);
$column_1_limit = rand(1, 9);
$column_2_limit = rand(1, 9);
$column_1_limit_2 = rand(10, 20);
$column_2_limit_2 = rand(10, 20);
$column_1_limit_3 = rand(20, 30);
$column_2_limit_3 = rand(20, 30);
// –
}
* This source code was highlighted with Source Code Highlighter .
Test results can be seen in the chart:

It turns out that everything is not so simple, as they say in the documentation of mySQL. When using a sample of columns with indices, the “classical” scheme is definitely faster. When using columns without indices, as well as mixed queries, the FOUND_ROWS () function becomes faster, but its gain is very insignificant.
Thus, the choice of approach is a matter of each specific case. When performance is not important and the convenience and speed of writing code are important, then FOUND_ROWS () + SQL_CALC_FOUND_ROWS can be used. Otherwise, apparently, it is preferable to use the classical scheme of the two queries.
Testing was conducted on two machines:
- Windows XP SP3, Intel Pentium Core 2 Duo E8300 @ 2.83 GHz, 2 GB, mySQL 5.0.51a
- Ubuntu 8.04, AMD Opteron 2344 HE Quad-Core, 4 GB, 5.0.51a-3ubuntu5.4
The chart shows the results in Windows XP. The results (not the sampling time) are the same for both systems.