
Some time ago an unpleasant incident happened to me; it triggered a small project on GitHub and eventually turned into this article.
An ordinary day, a regular release: every task had been checked inside and out by our QA engineer, so with serene confidence we rolled out to stage. The application behaved well; the logs were silent. We decided to make the switch (stage <-> prod). We switched and looked at the dashboards...
A couple of minutes pass; the flight is stable. The QA engineer runs a smoke test and notices that the application is unnaturally slow. We write it off as cache warm-up.
A couple more minutes pass, and the first complaint arrives from first-line support: customer data takes forever to load, the application is slow, responses take ages, and so on. We start to worry, digging through the logs for possible causes.
A few more minutes, and a letter arrives from the DB admins: the execution time of queries to the database (hereinafter the DB) has broken all conceivable limits and tends to infinity.
I open monitoring (I use JavaMelody) and find those queries. I launch PGAdmin and reproduce them. Really slow. I add EXPLAIN, look at the execution plan... that's it, we forgot about the indexes.
Why is code review not enough?
That incident taught me a lot. Yes, I "put out the fire" within an hour by creating the necessary index right in production, something like this (do not forget the CONCURRENTLY option):
CREATE INDEX CONCURRENTLY IF NOT EXISTS ix_pets_name ON pets_table (name_column);
You will agree this was tantamount to a deployment with downtime. For the application I work on, that is unacceptable.
I drew my conclusions and added a special bold item to my code-review checklist: if I see that one of the Repository classes was added or changed during development, I check the SQL migrations for a script that creates or modifies an index. If there is none, I ask the author: are you sure an index is not needed here?
Quite possibly the index is not needed if there is little data, but if we are working with a table whose row count runs into the millions, a missing index can become fatal and lead to the story described at the beginning of this article.
In that case, I ask the author of the pull request (hereinafter PR) to make 100% sure that the query they wrote in HQL is at least partially covered by an index (i.e., an Index Scan is used). To do this, the developer:
- starts the application;
- finds the translated (HQL -> SQL) query in the logs;
- opens PGAdmin or another database administration tool;
- generates an amount of data acceptable for the tests (at least 10K-20K rows) in a local database, so that nobody interferes with the experiments;
- executes the query;
- requests its execution plan;
- studies the plan carefully and draws the appropriate conclusions;
- adds or changes an index until the execution plan looks right;
- notes in the PR that query coverage has been checked;
- finally, weighing the risks and the seriousness of the query, I may double-check the author's work.
That is a lot of routine and plenty of room for human error, but for a while it suited me and I lived with it.
On the way home
They say it is useful, at least occasionally, to walk home from work without music or podcasts along the way. Just thinking about life, you can arrive at interesting conclusions and ideas.
One day I was walking home, reflecting on my day. There had been several reviews; for each one I had gone through the checklist and performed the whole series of actions described above. I was so tired of it that I thought: what the hell? Can't this be done automatically?.. I quickened my pace, eager to try the idea out as soon as possible.
Problem statement
What is the most important thing for a developer in an execution plan?
Of course, a Seq Scan over a large amount of data, caused by a missing index.
So I needed a test that:
- runs against a DB whose version and configuration match production;
- intercepts the database queries issued by the JPA repository (Hibernate);
- obtains their execution plans;
- parses each execution plan into a data structure convenient for checks;
- verifies expectations using a convenient set of assert methods, for example that no Seq Scan is used.
The quickest way to test this hypothesis was to build a prototype.
Solution architecture

The first problem to solve was running the test against a real database that matches the one used in production in version and settings.
Thanks to Docker and Testcontainers, this problem is already solved.
SqlInterceptor, ExecutionPlanQuery, ExecutionPlanParser, and AssertService are the interfaces I currently implement for Postgres; implementations for other databases are planned. If you want to participate, you are welcome. The code is written in Kotlin.
I put all this together on GitHub and called it checkinx-utils. You do not need to repeat any of this yourself: it is enough to add a checkinx dependency in Maven/Gradle and use the convenient asserts. I describe how in more detail below.
How the CheckInx components interact
ProxyDataSource
The first problem to solve is intercepting the queries that are ready to be executed against the database: with all parameters already bound, without "?" placeholders, and so on.
To do this, the real DataSource is wrapped in a proxy of sorts, which lets you plug into the query execution pipeline and intercept the queries.
Such a ProxyDataSource has been implemented many times already. I used the ready-made ttddyy datasource-proxy solution, which lets me register my own Listener to intercept the queries I need.
I substitute the proxy for the original DataSource using the DataSourceWrapper class (a BeanPostProcessor).
SqlInterceptor
In essence, its start() method registers a Listener on the proxyDataSource and begins intercepting queries, saving them in an internal list of statements. The stop() method removes that Listener.
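The mechanics can be illustrated with a small self-contained model. The listener and proxy classes below are simplified stand-ins for ttddyy's QueryExecutionListener and ProxyDataSource, not the actual checkinx-utils code; only start(), stop(), and the statements list mirror the interface described above.

```kotlin
// Stand-in for the proxy's listener hook (ttddyy's QueryExecutionListener in reality)
fun interface QueryListener {
    fun onQuery(sql: String)
}

// Stand-in for ProxyDataSource: lets listeners observe every executed query
class FakeProxyDataSource {
    private val listeners = mutableListOf<QueryListener>()
    fun addListener(l: QueryListener) { listeners.add(l) }
    fun removeListener(l: QueryListener) { listeners.remove(l) }
    fun execute(sql: String) = listeners.forEach { it.onQuery(sql) }
}

class SqlInterceptor(private val proxy: FakeProxyDataSource) {
    val statements = mutableListOf<String>()
    private val listener = QueryListener { sql -> statements.add(sql) }

    fun start() {
        statements.clear()
        proxy.addListener(listener)    // begin collecting queries
    }

    fun stop() {
        proxy.removeListener(listener) // stop collecting
    }
}

fun main() {
    val proxy = FakeProxyDataSource()
    val interceptor = SqlInterceptor(proxy)
    interceptor.start()
    proxy.execute("select * from pets where location = ?")
    interceptor.stop()
    println(interceptor.statements[0]) // select * from pets where location = ?
}
```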
ExecutionPlanQuery
Here the original query is transformed into a query for its execution plan. In the case of Postgres, this means prepending the EXPLAIN keyword.
This query is then executed against the same database from Testcontainers, and a "raw" execution plan (a list of lines) is returned.
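A sketch of the transformation step only (the function name is illustrative; the real ExecutionPlanQuery also runs the resulting statement against the Testcontainers database over JDBC and collects the returned lines):

```kotlin
// Turn an intercepted SQL statement into an EXPLAIN request (Postgres style)
fun toExplainQuery(sql: String): String = "EXPLAIN " + sql.trim()

fun main() {
    println(toExplainQuery("select * from pets where location = 'Moscow'"))
    // EXPLAIN select * from pets where location = 'Moscow'
}
```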
ExecutionPlanParser
A "raw" execution plan is inconvenient to work with, so I parse it into a tree of nodes (PlanNode).
Let's examine the PlanNode fields using a real execution plan as an example:
Index Scan using ix_pets_age on pets  (cost=0.29..8.77 rows=1 width=36)
  Index Cond: (age < 10)
  Filter: ((name)::text = 'Jack'::text)
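The parsing idea can be sketched with a simplified, self-contained approximation. The real PlanNode carries more fields (parsed cost, row estimates, and so on), and all names here are illustrative: child operations appear as "->" lines, while other indented lines ("Index Cond:", "Filter:", ...) are treated as properties of the nearest node.

```kotlin
// Simplified sketch: build a PlanNode tree from the indented "raw" plan
data class PlanNode(
    val operation: String,                                 // e.g. "Index Scan using ix_pets_age on pets"
    val properties: MutableList<String> = mutableListOf(), // e.g. "Index Cond: (age < 10)"
    val children: MutableList<PlanNode> = mutableListOf()
)

fun parsePlan(raw: List<String>): PlanNode {
    // strip the trailing "(cost=... rows=... width=...)" part of a node line
    fun operationOf(line: String) = line.replace(Regex("""\s*\(cost.*"""), "").trim()

    val root = PlanNode(operationOf(raw.first()))
    val stack = ArrayDeque(listOf(0 to root)) // pairs of (indent, node)
    for (line in raw.drop(1)) {
        val indent = line.indexOfFirst { it != ' ' }
        val content = line.trim()
        if (content.startsWith("->")) {
            // a new child operation; pop nodes that are at the same or deeper indent
            val node = PlanNode(operationOf(content.removePrefix("->")))
            while (stack.last().first >= indent) stack.removeLast()
            stack.last().second.children.add(node)
            stack.addLast(indent to node)
        } else {
            // an attribute line belongs to the most recently opened node
            stack.last().second.properties.add(content)
        }
    }
    return root
}
```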
AssertService
The data structure returned by the parser is much easier to work with. CheckInxAssertService is a set of checks over the PlanNode tree described above. It lets you supply your own lambda checks or use the predefined ones that are, in my view, the most common: for example, asserting that your query involves no Seq Scan, or that a specific index is (or is not) used.
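The lambda-check idea can be shown with a self-contained sketch. The name assertPlan and this pared-down PlanNode shape are illustrative, not the published CheckInxAssertService API:

```kotlin
// Pared-down plan node: just the operation name and children
data class PlanNode(val operation: String, val children: List<PlanNode> = emptyList())

// Walk the whole plan tree and fail loudly if any node violates the check
fun assertPlan(root: PlanNode, check: (PlanNode) -> Boolean) {
    require(check(root)) { "Plan check failed at node: ${root.operation}" }
    root.children.forEach { assertPlan(it, check) }
}

fun main() {
    val good = PlanNode("Sort", listOf(PlanNode("Index Scan using ix_pets_location on pets")))
    // a "no Seq Scan anywhere" style check passes for this plan
    assertPlan(good) { !it.operation.startsWith("Seq Scan") }

    val bad = PlanNode("Seq Scan on pets")
    // the same check fails for a plan that fell back to a sequential scan
    val failed = runCatching { assertPlan(bad) { !it.operation.startsWith("Seq Scan") } }.isFailure
    println(failed) // true
}
```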
CoverageLevel
A very important enum; I will describe it separately:
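This excerpt does not spell out the enum's constants, so below is a hypothetical sketch of how such levels might be interpreted. Only HALF is confirmed by the usage example later in the article; ZERO, FULL, and the interpretation function are my assumptions, not the actual checkinx-utils definition.

```kotlin
// Hypothetical sketch: CoverageLevel as the minimum acceptable degree of
// index coverage for a query (only HALF appears in the article's examples)
enum class CoverageLevel { ZERO, HALF, FULL }

// Illustrative interpretation: given the operation names from a plan,
// decide whether the requested coverage level is met
fun meets(operations: List<String>, level: CoverageLevel): Boolean = when (level) {
    CoverageLevel.ZERO -> true                                                // any plan is acceptable, even Seq Scan
    CoverageLevel.HALF -> operations.none { it.startsWith("Seq Scan") }       // at least partially covered by an index
    CoverageLevel.FULL -> operations.all { it.startsWith("Index Only Scan") } // answered entirely from the index
}

fun main() {
    // a plan that still contains a Seq Scan meets ZERO but not HALF
    println(meets(listOf("Seq Scan on pets"), CoverageLevel.HALF)) // false
}
```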
Below we will walk through some usage examples.
Test examples using CheckInx
I made a separate project on GitHub, checkinx-demo, where I implemented a JPA repository for a pets table, plus tests for this repository that check coverage, indexes, and so on. It is a useful starting point.
You might have a test like this:
@Test
fun testFindByLocation() {
    sqlInterceptor.start()
    repository.findByLocation("Moscow")
    sqlInterceptor.stop()
    checkInxAssertService.assertCoverage(CoverageLevel.HALF, sqlInterceptor.statements[0])
}
Its execution plan could look like this:
Index Scan using ix_pets_location on pets pet0_  (cost=0.29..4.30 rows=1 width=46)
  Index Cond: ((location)::text = 'Moscow'::text)
...or like this, if we forgot about the index (and the test goes red):
Seq Scan on pets pet0_  (cost=0.00..19.00 rows=4 width=84)
  Filter: ((location)::text = 'Moscow'::text)
In my project I most often use the simplest assert, which states that there is no Seq Scan in the execution plan:
checkInxAssertService.assertCoverage(CoverageLevel.HALF, sqlInterceptor.statements[0])
The presence of such a test shows that I have, at a minimum, studied the execution plan. It also makes intent in the project more explicit and improves the code's documentation and predictability.
Mode for the experienced: I recommend using CheckInxAssertService, but if the need arises you can walk the parse tree (from ExecutionPlanParser) yourself or even analyze the "raw" execution plan (the result of ExecutionPlanQuery) directly.
@Test
fun testFindByLocation() {
    sqlInterceptor.start()
    repository.findByLocation("Moscow")
    sqlInterceptor.stop()
    // instead of CheckInxAssertService, walk the tree produced by
    // ExecutionPlanParser (or the raw plan from ExecutionPlanQuery) yourself
}
Connecting to your project
In my project I put such tests into a separate group called intensive integration tests.
Connecting and using checkinx-utils is easy. Let's start with the build script.
First, add the repository. Someday I will publish checkinx to Maven Central, but for now the artifact can only be downloaded from GitHub via JitPack.
repositories {
    // ...
    maven { url 'https://jitpack.io' }
}
Next, add the dependency:
dependencies {
    // ...
    implementation 'com.github.tinkoffcreditsystems:checkinx-utils:0.2.0'
}
We finish the setup by adding a configuration. For now, only Postgres is supported.
@Profile("test")
@ImportAutoConfiguration(classes = [PostgresConfig::class])
@Configuration
open class CheckInxConfig
Pay attention to the test profile; otherwise you will find a ProxyDataSource in your production application.
PostgresConfig wires up several beans:
- DataSourceWrapper
- PostgresInterceptor
- PostgresExecutionPlanParser
- PostgresExecutionPlanQuery
- CheckInxAssertServiceImpl
If you need some customization that the current API does not provide, you can always replace one of the beans with your own implementation.
Known Issues
Sometimes DataSourceWrapper cannot replace the original dataSource because of Spring's CGLIB proxies: in that case the BeanPostProcessor receives not a DataSource but a ScopedProxyFactoryBean, and type checking fails.
The simplest solution is to create the HikariDataSource for tests manually. Your configuration will then look like this:
@Profile("test")
@ImportAutoConfiguration(classes = [PostgresConfig::class])
@Configuration
open class CheckInxConfig {

    @Primary
    @Bean
    @ConfigurationProperties("spring.datasource")
    open fun dataSource(): DataSource {
        return DataSourceBuilder.create()
            .type(HikariDataSource::class.java)
            .build()
    }

    @Bean
    @ConfigurationProperties("spring.datasource.configuration")
    open fun dataSource(properties: DataSourceProperties): HikariDataSource {
        return properties.initializeDataSourceBuilder()
            .type(HikariDataSource::class.java)
            .build()
    }
}
Development plans
- I would like to understand whether anyone besides me needs this. For that I will create a poll; I would be glad to get honest answers.
- See what is really needed and extend the standard set of assert methods.
- Write implementations for other databases.
- The construct sqlInterceptor.statements[0] does not look very readable; I want to improve it.
I would be glad if someone wants to join in, write some code, and play with Kotlin.
Conclusion
I am sure there will be comments along these lines:
"It is impossible to predict how the query planner will behave in production; everything depends on the collected statistics."
Indeed, using previously collected statistics, the planner may build a plan different from the one under test. But the point is slightly different.
The planner's job is to improve the query, not to make it worse. It will not suddenly switch to a Seq Scan for no good reason, whereas you can introduce one unknowingly.
With a CheckInx test in place, you will not forget to study the query's execution plan and to consider creating an index; or, on the contrary, the test will state explicitly that no index is needed here and a Seq Scan suits you. Either way, it saves you from unnecessary questions during code review.
Links
- https://github.com/TinkoffCreditSystems/checkinx-utils
- https://github.com/dsemyriazhko/checkinx-demo
- https://github.com/ttddyy/datasource-proxy
- https://mvnrepository.com/artifact/org.testcontainers/postgresql
- https://github.com/javamelody/javamelody/wiki