📜 ⬆️ ⬇️

What inquiries got into QUERY CACHE

We all know that QUERY CACHE is very useful if you need to optimize frequently repeated database queries, provided that there is a low number of DML operations (if you are not at all familiar with this functionality, translation of official documentation for setting up this database structure can be read here ). When you configure query caching options on the server, the question often arises as to what exactly is contained in QUERY CACHE 's at the moment. Finding the answer to this question at one time put me in a dead end. The only practical answer that I considered absolutely correct at the time, I found here . The answer read as follows:
there is no way to find out what queries are now cached

those. This method does not exist. Only after some time I realized that this was not at all the case, and there is a completely legal way of obtaining the contents of this database structure without patching the source codes. This way is to write a small plugin for INFORMATION_SCHEMA .

Query Cache General Information


To simplify a bit, the QUERY CACHE data structure consists of:
- data pool
- hash tables with a list of "query structures"
- hash tables with a list of "table structures"
- various locks to control this structure.
It is precisely because of the last item (locks) that QUERY CACHE often recommended to be disabled on servers in a multi-core architecture and active DML, since the overhead associated with properly maintaining this cache may exceed its usefulness. In general, the memory area used by QUERY CACHE is a complexly connected graph of structures of type Query_cache_block . A large number of links appeared, most likely, in an evolutionary way in the name of optimizing the algorithms for cleaning and adding data. To write a plugin, we need only a couple of "trees". The first part describes the descendants of the HASH Query_cache::queries hash tables of query structures, and the second descendants of the HASH Query_cache::tables hash tables of table structures. For simplicity, we will present the QUERY CACHE pool data as follows:

The description of this structure, of course, varies somewhat from version to version, but the basic idea remains unchanged for many versions: a hash table with a list of “query structures” is used to search for queries and extract the calculated result for them, a hash table with a list of “table structures” it is used to delete the data of the changed tables from the query cache (there are many more connections between the two structures described, but for simplicity of description I will omit them, since in this context they are not interesting to us).
According to the received scheme, the main information we are interested in (marked in red) is contained in the structure
Query_cache_query
 struct Query_cache_query { ulonglong limit_found_rows; ... inline uchar* query() ... }; 

- SQL query and number of records obtained as a result of its execution

it remains only to pull it out.

General information about plugins


As I said before: to get this information, we will use the officially provided MySQL framework - INFORMATION_SCHEMA Plugins . The first job description for the QUERY CACHE structure was produced by Roland Bouman on his blog. His article, quite old, unfortunately has not been preserved to date. For those who are not familiar with how to write plugins, I recommend reading this post in Habré, from the MySQL petropavel developer or the book MySQL 5.1 Plugin Development written with the participation of the same author. In this article I will try to pay more attention to the details of the implementation of this particular plugin under MySQL 5.5.29 - the latest version of the server at the moment. The minimum plugin for INFORMATION_SCHEMA should include:
- ST_FIELD_INFO - table structure description
- mysql_declare_plugin - macro of the plugin library descriptor with the type MYSQL_INFORMATION_SCHEMA_PLUGIN and indicating the name of the resulting table
- st_mysql_information_schema - the structure of the descriptor of the plugin itself
- the initialization function of the plug-in, which specifies the procedure that fills the resulting table
- and the function of deinitialization.
For convenience, we will write a plugin that fills in two tables: a list of queries in the cache and the memory QUERY_CACHE_RESULTS they QUERY_CACHE_RESULTS , and a list of database tables that are used to cache QUERY_CACHE_TABLES queries. To simplify the code, we exclude from the plug-in the management of session variables, pushing through predicates and other useful, but not very important for us, features that can be added later.

We connect request cache


The QUERY CACHE implementation of the MySQL server is an instance of the Query_cache::query_cache . This object is global: its implementation is in sql_cache.cc . This cache is described in the depths of MySQL, so for the ability to connect our plugin to this structure we will have to compile with the directive
 #define MYSQL_SERVER 

This macro will make our plugin incompatible with other versions of MySQL, and even more - even if you try to apply it to the same version but compiled with different compilation directives, at best you will get an error when connecting this plugin, at worst the server will crash. ABORT when trying to execute it. Even if you put it together with the sources, you still need to be extremely careful, as the plugin runs in the memory of the database server, and the errors of this plug-in when working with memory will lead to the collapse of the entire MySQL server. We have activated this directive in order to gain access to the notorious hash tables described above, since they are not included in the public section for the Query_cache class, and we will have to inherit to work with them.
Add a couple of new methods for accessing private class variables.
 class MySQL_IS_Query_Cache : private Query_cache { public: HASH *get_queries_hash() { return &this->queries; } HASH *get_tables_hash() { return &this->tables; } }; 

- now we can directly work with these hash tables in our plugin.

The main methods for working with objects of type HASH described in the file hash.h We need only the my_hash_element method, since the plugin does not support parsing the WHERE and we always my_hash_element a complete table containing the entire list of queries available at the moment. The algorithms for applying predicate filtering and others will be left to the server.
')

QUERY_CACHE_TABLES


This view will display a list of all tables, the results of queries for which were cached, respectively, there will be only two columns in it:
schema name and table name
 #define MAX_SCHEMA_NAME_LENGTH 127 #define MAX_TABLE_NAME_LENGTH 127 #define COLUMN_SCHEMA_NAME 0 #define COLUMN_TABLE_NAME 1 ST_FIELD_INFO query_cache_table_fields[]= { {"SCHEMA_NAME", MAX_SCHEMA_NAME_LENGTH, MYSQL_TYPE_STRING, 0, 0, "Schema Name"}, {"TABLE_NAME", MAX_TABLE_NAME_LENGTH, MYSQL_TYPE_STRING, 0, 0, "Table Name"}, {0,0, MYSQL_TYPE_STRING, 0, 0, 0} }; 

- by default, we will make the maximum length of these elements no more than 127 characters, I think this is reasonable


QUERY_CACHE_RESULTS


In the table containing SQL queries, as well as information on the result of their execution, we will add five columns:
SQL directly, number of lines in the response, as well as information about the memory used by the response
 #define MAX_STATEMENT_TEXT_LENGTH 1024 #define COLUMN_STATEMENT_TEXT 0 #define COLUMN_FOUND_ROWS 1 #define COLUMN_RESULT_BLOCKS_COUNT 2 #define COLUMN_RESULT_BLOCKS_SIZE 3 #define COLUMN_RESULT_BLOCKS_SIZE_USED 4 ST_FIELD_INFO query_cache_result_fields[]= { {"STATEMENT_TEXT", MAX_STATEMENT_TEXT_LENGTH,MYSQL_TYPE_STRING, 0, 0, "Cached statement text"}, {"FOUND_ROWS", 21, MYSQL_TYPE_LONGLONG, 0, 0, "Result row count"}, {"RESULT_BLOCKS_COUNT", 21, MYSQL_TYPE_LONG, 0, 0, "Result Blocks count"}, {"RESULT_BLOCKS_SIZE", 21, MYSQL_TYPE_LONGLONG, 0, 0,"Result Blocks size"}, {"RESULT_BLOCKS_SIZE_USED", 21, MYSQL_TYPE_LONGLONG, 0, 0,"Result Blocks used size"}, {0,0, MYSQL_TYPE_STRING, 0, 0, 0} }; 

- the query text is artificially limited to 1024 characters, if necessary you can increase this parameter


We save data


After that, we proceed to the immediate filling of the table with data. Wherever we store string values, you must specify an encoding.
  // character set information to store varchar values CHARSET_INFO *cs = system_charset_info; 

To access the Query_cache::query_cache you need
set the appropriate lock
  // query_cache defined in sql_cache.h is MySQL Query Cache implementation; MySQL_IS_Query_Cache *qc = (MySQL_IS_Query_Cache *)&query_cache; HASH *h_queries; query_cache.lock(); h_queries = qc->get_queries_hash(); 

- h_queries object we are looking for, by iterating which we will be able to access all queries from QUERY CACHE (in the figure HASH queries )

Access to the data structure of HASH tables can be obtained
the same way
  // query_cache defined in sql_cache.h is MySQL Query Cache implementation; MySQL_IS_Query_Cache *qc = (MySQL_IS_Query_Cache *)&query_cache; HASH *h_tables; query_cache.lock(); h_tables = qc->get_tables_hash(); 

Further with HASH tables everything is simple. We iterate the resulting hash table and
save the elements we need
  for(uint i = 0; i < h_tables->records; i++) { query_cache_block_hash = my_hash_element(h_tables, i); query_cache_block_current = (Query_cache_block*)query_cache_block_hash; Query_cache_table* query_cache_table = query_cache_block_current->table(); // get tables data const char *schema_name = (const char*)query_cache_table->db(); size_t schema_name_length = strlen(schema_name)>MAX_SCHEMA_NAME_LENGTH?MAX_SCHEMA_NAME_LENGTH:strlen(schema_name); is_query_cache_tables->field[COLUMN_SCHEMA_NAME]->store((char*)schema_name, schema_name_length, cs); ... } 

In case of errors when saving, as well as in the case of the regular end of the procedure, do not forget to unlock the query_cache.unlock() block. This lock is global for the entire database instance; when it is installed, all other requests that apply to QUERY CACHE will wait for your plugin to finish.
With the receipt of SQL queries, everything is a bit more complicated, and additional manipulations are required. The block resulting from the iteration of the hash table does not contain the information we need. It contains only a link to structures of type Query_cache_query , reaching which we will get only part of the information we need, namely
query text and number of records in the result set
  // get statement data statement_text = (const char*)query_cache_query->query(); ulonglong found_rows = query_cache_query->found_rows(); 

Information about the amount of memory used for caching the results of this query is available through the result variable for the query_cache_query object. This will be a link to
first memory area
  Query_cache_block *first_result_block = query_cache_query->result(); 

dedicated to storing its result. In order to get information about all the allocated memory areas, you need to go through the linked list, which is cyclic and will eventually lead us to the first Query_cache_block block Query_cache_block . It is also worth considering that the status of the block received as a result of this call may be unrelated to Query_cache_block::RES_INCOMPLETE .
Crawl result
  // loop all query result blocks for current query while( (result_block= result_block->next) != first_result_block && result_block->type != Query_cache_block::RES_INCOMPLETE) /* This type of block can be not lincked yet (in multithread environment)*/ { result_blocks_count++; result_blocks_size += result_block->length; result_blocks_size_used += result_block->used; } 

- in fact, can we go to an unrelated block in such a way or not, I do not know, but it is better to err

We collect the server and enjoy the result
 install plugin QUERY_CACHE_RESULTS soname 'query_cache_results.so'; install plugin QUERY_CACHE_TABLES soname 'query_cache_tables.so'; show plugins; +--------------------------+--------+--------------------+------------------------+---------+ | Name | Status | Type | Library | License | +--------------------------+--------+--------------------+------------------------+---------+ ... | QUERY_CACHE_RESULTS | ACTIVE | INFORMATION SCHEMA | query_cache_results.so | GPL | | QUERY_CACHE_TABLES | ACTIVE | INFORMATION SCHEMA | query_cache_tables.so | GPL | +--------------------------+--------+--------------------+------------------------+---------+ 24 rows in set (0.00 sec) select * from pivot limit 10; ... 10 rows in set (0.00 sec) select * from pivot p1, pivot p2 limit 1000; ... 1000 rows in set (0.00 sec) select * from information_schema.query_cache_tables; +-------------+------------+ | SCHEMA_NAME | TABLE_NAME | +-------------+------------+ | test | pivot | +-------------+------------+ 1 row in set (0.00 sec) select * from information_schema.query_cache_results \G *************************** 1. row *************************** STATEMENT_TEXT: select * from pivot p1, pivot p2 limit 1000 FOUND_ROWS: 1000 RESULT_BLOCKS_COUNT: 1 RESULT_BLOCKS_SIZE: 9992 RESULT_BLOCKS_SIZE_USED: 9985 *************************** 2. row *************************** STATEMENT_TEXT: select * from pivot limit 10 FOUND_ROWS: 10 RESULT_BLOCKS_COUNT: 1 RESULT_BLOCKS_SIZE: 512 RESULT_BLOCKS_SIZE_USED: 208 2 rows in set (0.00 sec) 


PS Since I am a DB developer and writing C ++ code is not even my secondary specialty, the solution described can be quite crooked and non-optimal, for this reason I will be happy to hear your comments on the code in PM.

PPS For self-assembly and experiments, I post the source of the plugin on github, following the advice of users in the comments: github.com/mcshadow/mysql_query_cache

Source: https://habr.com/ru/post/165241/


All Articles