
Five ways to paginate in Postgres, from basic to outlandish

It may surprise you that pagination, as pervasive as it is in web applications, is easy to implement inefficiently. In this article we will try out several methods of server-side pagination and discuss their trade-offs when used with PostgreSQL. The article will help you decide which technique fits your situation, including some you may not have seen before, namely those that rely on physical clustering and the database statistics collector.

Before continuing, a word about application-side pagination. Some applications transfer all (or most) of the server's data to the client and paginate it there. For small amounts of data, application-side pagination can be a good choice, since it reduces the number of HTTP calls. The approach becomes impractical once records number in the thousands. Server-side pagination has the following advantages:

  1. Faster initial page load.
  2. More accuracy when the shared data is changing.
  3. Faster operations on large datasets.
  4. Encapsulation of business logic on the server.
  5. Better performance for resource-constrained clients.


PostgreSQL gives us a number of server-side pagination techniques that differ in speed, result consistency (not losing or duplicating records), and support for particular access patterns, such as jumping to a specific page. Not every method works in every situation; some require special data or queries. We will consider the methods in order of decreasing generality, starting with those that work for arbitrary queries and continuing with those that require ordered data. We will end with a few exotic methods that rest on PostgreSQL's internal structure.

Paginating arbitrary queries


Limit-offset


The easiest method of pagination, limit-offset, is also the riskiest. Unfortunately, it is a staple of web development tutorials. Object-relational mapping (ORM) libraries make this method easy and tempting, from SQLAlchemy's .slice(1, 3) to ActiveRecord's .limit(1).offset(3) to Sequelize's .findAll({ offset: 3, limit: 1 }). It is no coincidence that limit-offset is used everywhere: you can tack it onto any query without further modification.
The ORM limit and offset methods are one thing; pagination helper libraries can be even more deceptive. For example, the popular Ruby library Kaminari uses limit-offset by default, hiding it behind a high-level interface.
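Under the hood, all of these calls generate the same kind of SQL. A minimal sketch of what such an ORM call might produce (the table name "items" and the page parameters here are hypothetical, chosen only for illustration):

-- a sketch of the query an ORM typically emits for "page 3, 25 rows per page"
SELECT * FROM items ORDER BY id LIMIT 25 OFFSET 50;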

This technique has two big problems: result inconsistency and offset inefficiency. Consistency means that a traversal of the results should encounter every element exactly once, with no omissions or duplicates. Offset inefficiency refers to the delay incurred by shifting the results by a large offset.

Here is how limit-offset pagination can become inconsistent. Suppose a user moves from page n to page n + 1 while, at the same time, a new element is inserted into page n. This causes both a duplication (the previously final element of page n is pushed onto page n + 1) and an omission (the new element). Alternatively, suppose an element on page n is deleted just as the user navigates to page n + 1. The pre-loaded first element of page n + 1 shifts onto page n and is skipped.

Now about inefficiency. Large offsets are intrinsically expensive. Even in the presence of an index, the database must scan through storage, counting rows. To use an index we would have to filter a column by a value, but here we require a certain number of rows regardless of their column values. Furthermore, rows need not have the same size in storage, and some may be present on disk but marked as deleted, so the database cannot use simple arithmetic to find a location on disk and start reading results. Let's measure the slowdown.

-- create a table of random strings of varying length
CREATE TABLE medley AS
  SELECT
    generate_series(1,10000000) AS n,
    substr(concat(md5(random()::text), md5(random()::text)), 1, (random() * 64)::integer + 1) AS description;

-- refresh the planner statistics
VACUUM ANALYZE;

-- the baseline query
EXPLAIN ANALYZE SELECT * FROM medley LIMIT 100;

The estimated cost is quite low:

                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.85 rows=100 width=38) (actual time=0.008..0.036 rows=100 loops=1)
   ->  Seq Scan on medley  (cost=0.00..185460.60 rows=9999660 width=38) (actual time=0.007..0.017 rows=100 loops=1)
 Planning time: 0.040 ms
 Execution time: 0.059 ms
(4 rows)

Changing the offset to 1000 raises the estimated cost to about 19 and the runtime to 0.609 ms. Once the offset reaches 5,000,000, the cost climbs to 92734 and the runtime to 758.484 ms.
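For reference, these measurements presumably come from queries of the following shape, with only the OFFSET value changing:

-- a sketch of the measured queries; only the OFFSET differs
EXPLAIN ANALYZE SELECT * FROM medley LIMIT 100 OFFSET 1000;
EXPLAIN ANALYZE SELECT * FROM medley LIMIT 100 OFFSET 5000000;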

These problems do not necessarily mean that limit-offset is inapplicable in your case. In some applications users typically do not navigate many pages into a result set, and you could even enforce a server-side cap on page depth. If data inconsistency and a bounded page depth are not problems for your application, then limit-offset may well suit you.

When to use: limit-offset. Applications with shallow pagination depth that can tolerate result inconsistency.

Cursors


Despite its shortcomings, the limit-offset method does have the advantage of being stateless on the server. In contrast stands another pagination method: cursors. Like offset, cursors can be used with any query, but they differ in requiring the server to hold a dedicated connection and transaction for each HTTP client.

This is how cursors can be used:

-- we must be inside a transaction
BEGIN;
-- open a cursor for the query
DECLARE medley_cur CURSOR FOR SELECT * FROM medley;
-- retrieve ten rows
FETCH 10 FROM medley_cur;
-- ...
-- retrieve the next ten rows, continuing from where we left off
FETCH 10 FROM medley_cur;
-- all done
COMMIT;

Cursors have the desirable property of pagination consistency for arbitrary queries, showing the results as they exist in the database at the moment the transaction starts. The transaction's isolation level guarantees that our paginated view of the results does not change.

Every pagination approach has its weaknesses, and cursors are no exception: their drawbacks are resource usage and client-server coupling. Each open transaction consumes dedicated database resources and does not scale to a large number of clients. There are, of course, WITH HOLD cursors that can exist outside a transaction, but they must materialize their data. Either way, these drawbacks make cursor pagination suitable only for a narrow set of situations, for example an internal network.
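As a rough sketch of the WITH HOLD variant mentioned above (the cursor name is arbitrary; the result set is materialized when the transaction commits, after which the cursor outlives it):

BEGIN;
-- WITH HOLD lets the cursor survive the end of the transaction,
-- at the cost of materializing the result set on commit
DECLARE medley_hold_cur CURSOR WITH HOLD FOR SELECT * FROM medley;
COMMIT;

FETCH 10 FROM medley_hold_cur;
-- holdable cursors must be closed explicitly to free the materialized result
CLOSE medley_hold_cur;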

Putting cursors behind HTTP brings complications. The server must identify clients across requests, whether by a token or by keeping an identifier such as the client's IP address for the session. It must also decide when to release transactions that sit idle. Finally, load balancing across servers becomes difficult, since each client must reconnect to the same server every time.

When to use: cursors. A single-server, intranet application that must paginate queries with varied and changeable ordering, especially when result consistency matters.

Paginating ordered queries


Keyset pagination


The techniques above can paginate the results of any kind of query, including unordered queries. If we are willing to give up this generality, we can reap optimization benefits. In particular, when ordering by indexed columns, a client can use values from the current page to choose which items to show on the next page. This is called keyset pagination.

For example, let's return to our example:

-- add an index for keyset pagination (btrees support inequality)
CREATE INDEX n_idx ON medley USING btree (n);
SELECT * FROM medley ORDER BY n ASC LIMIT 5;

With my random data, it returns:

 n |                         description
---+-------------------------------------------------------------
 1 | 74f70e009396
 2 | 8dac5a085eb670a29058d
 3 | fce303a32e89181bf5df1601487
 4 | fddcced2c12e83516b3bd6cc94f23a012dfd
 5 | f51ae548dd27f51147e53e839eeceb6b0c92922145276d668e73d4a6621
(5 rows)

Now the client can look at the largest n in this result and use it to request the next page:

 SELECT * FROM medley WHERE n > 5 ORDER BY n ASC LIMIT 5; 

Even when filtering with n > 5000000, this runs faster than the limit-offset example.

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..0.62 rows=5 width=38) (actual time=0.101..0.103 rows=5 loops=1)
   ->  Index Scan using n_idx on medley  (cost=0.43..185579.42 rows=5013485 width=38) (actual time=0.100..0.102 rows=5 loops=1)
         Index Cond: (n > 5000000)
 Planning time: 0.071 ms
 Execution time: 0.119 ms
(5 rows)

This pagination is fast and at the same time guarantees result consistency. Any insertions or deletions before the current page leave the results unchanged. The two weak points of the method are the lack of random access and possible coupling between client and server.

In general, there is no way to jump to a chosen page without visiting the previous ones to determine their maximal elements. Under certain conditions, however, we can do better. If the values in the indexed column are uniformly distributed (or, even better, contiguous numbers without gaps), the client can do a little math to find the page of interest, because the index makes it cheap to find the largest value:

EXPLAIN ANALYZE SELECT max(n) FROM medley;

                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Result  (cost=0.46..0.47 rows=1 width=0) (actual time=0.021..0.021 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.43..0.46 rows=1 width=4) (actual time=0.018..0.018 rows=1 loops=1)
           ->  Index Only Scan Backward using n_idx on medley  (cost=0.43..284688.43 rows=10000000 width=4) (actual time=0.017..0.017 rows=1 loops=1)
                 Index Cond: (n IS NOT NULL)
 Heap Fetches: 0
 Planning time: 0.087 ms
 Execution time: 0.042 ms
(8 rows)
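For example, under the (strong) assumption that n runs contiguously from 1 with no gaps, a sketch of jumping straight to page k with a page size of 5 (here k = 1000) would be:

-- a sketch, assuming n is a gapless sequence starting at 1:
-- page k (0-based) of size 5 begins right after n = 5 * k
SELECT * FROM medley WHERE n > 5 * 1000 ORDER BY n ASC LIMIT 5;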

The other issue of keyset pagination, client/server coupling, requires care. First, the client does not know which columns are indexed. The server will likely provide an endpoint with a fixed ordering rather than letting the client change it. Also, the code handed to the client may not know which column the results are ordered by, so the server must provide a hint for how to request the next page. RFC 5988 defines the prev and next HTTP link relations to encode the links the client should follow.
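For instance, a response might carry a header of the following shape (the URL and the "after" parameter are hypothetical, purely for illustration):

Link: <https://example.com/medley?after=5&limit=5>; rel="next"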

Since clients typically access pages of information linearly, keyset pagination is usually the preferred choice for paginating records on high-traffic web servers.

When to use: keyset pagination. Scalable applications serving data sequentially from column(s) indexed for comparison. Supports filtering.

Outlandish, specialized pagination


Clustered TID Scan


We can build non-standard pagination methods for special situations using low-level PostgreSQL features. For example, we can get truly random access to the data if we:

  1. Do not require the pages to all be the same size.
  2. Support only a single ordering of the paginated rows.

The trick is to choose returned pages that correspond directly to database pages on disk, or to sections of those disk pages. Every table in a PostgreSQL database contains a hidden system column, ctid, which identifies the physical location of its rows:

SELECT ctid, * FROM medley WHERE n <= 10;

  ctid  |  n |                         description
--------+----+-------------------------------------------------------------
 (0,1)  |  1 | 74f70e009396
 (0,2)  |  2 | 8dac5a085eb670a29058d
 (0,3)  |  3 | fce303a32e89181bf5df1601487
 (0,4)  |  4 | fddcced2c12e83516b3bd6cc94f23a012dfd
 (0,5)  |  5 | f51ae548dd27f51147e53e839eeceb6b0c92922145276d668e73d4a6621
 (0,6)  |  6 | eb9fe1dfe1e421903f96b3b5c5dfe1ee1253582d728c35b4ee7330b
 (0,7)  |  7 | e95202d7f5c612f8523ae705d
 (0,8)  |  8 | 6573b64aff262a2b940326
 (0,9)  |  9 | a0a43
 (0,10) | 10 | 82cdc134bd249a612cfddd3088dd09e32de5f4fa33
(10 rows)

Each ctid has the form (page, row). PostgreSQL can retrieve rows very quickly by ctid; in fact, this is how indexes work internally: they map column values to ctids.
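A quick sketch of a direct ctid lookup (the particular ctid value is taken from the listing above):

-- fetch a single row by its physical location; PostgreSQL uses a Tid Scan here
SELECT * FROM medley WHERE ctid = '(0,2)'::tid;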

Note that although PostgreSQL defines an order relation on the tid type, it cannot efficiently retrieve ctids by inequality:

EXPLAIN ANALYZE SELECT count(1) FROM medley
WHERE ctid >= '(0,1)'::tid AND ctid < '(1,0)'::tid;

                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=235589.00..235589.01 rows=1 width=0) (actual time=1241.851..1241.852 rows=1 loops=1)
   ->  Seq Scan on medley  (cost=0.00..235464.00 rows=50000 width=0) (actual time=477.933..1241.802 rows=116 loops=1)
         Filter: ((ctid >= '(0,1)'::tid) AND (ctid < '(1,0)'::tid))
         Rows Removed by Filter: 9999884
 Planning time: 0.047 ms
 Execution time: 1241.889 ms
(6 rows)

Range queries don't work, but there is still a way to efficiently request all rows from one disk page. Each page contains current_setting('block_size') bytes of data (usually 8k). Rows are referenced by a 32-bit pointer, so there are at most block_size/4 rows per page. (In fact rows are typically wider than the minimum size, so a quarter of the block size is an upper bound on rows per page.) The following query generates all possible ctids on the j-th page:

SELECT ('(' || j || ',' || s.i || ')')::tid
FROM generate_series(0, current_setting('block_size')::int/4) AS s(i);

Let's use it to fetch all rows of page zero in our example:

SELECT * FROM medley
WHERE ctid = ANY (ARRAY(
  SELECT ('(0,' || s.i || ')')::tid
  FROM generate_series(0, current_setting('block_size')::int/4) AS s(i)
));

The planner estimated the cost of this query at 25.03..65.12, and it runs in 2.765 ms. Requesting page number 10,000 costs the same. So we get truly random access; why not love it?

There are three downsides:

  1. When rows are deleted, they leave holes in the pages.
  2. The order of the rows may not be meaningful. The database inserts new rows into the holes left by deleted rows, which causes the rows to fall out of order.
  3. WHERE clauses are not supported.

In certain situations this is not a problem. One case is data whose natural order corresponds to the order of insertion, such as append-only time-series data. Another is data that rarely changes. This is because we have control over the placement of rows within pages through the CLUSTER command.

Let's go back to our example. Its rows on disk are ordered by the n column in ascending order, since that is the order in which we inserted them. But what if we want to sort them by the description column? For that we have to physically rebuild the table, by creating an index on the description column and clustering on it.

CREATE INDEX description_idx ON medley USING btree (description);
CLUSTER medley USING description_idx;

Now selecting all rows from the first page returns the data sorted alphabetically by the description column. If the table changes, new rows will fall out of alphabetical order, but as long as the table stays unchanged, the returned items will be in perfect order. The table can also be re-clustered periodically after changes, although this operation locks the table and cannot be done while people need to access it.
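PostgreSQL remembers which index a table was last clustered on, so the periodic maintenance step mentioned above is a one-liner:

-- re-cluster using the index remembered from the earlier CLUSTER ... USING
CLUSTER medley;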

Finally, you can determine the total number of pages for the table from its total size in bytes:

 SELECT pg_relation_size('medley') / current_setting('block_size')::int; 
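Combining this with the ctid trick above, a sketch of fetching whatever rows sit on the final disk page might look like this (assuming the zero-based page numbering used earlier, so the last page is the count minus one):

-- a sketch: rows on the last disk page of the table
SELECT * FROM medley
WHERE ctid = ANY (ARRAY(
  SELECT ('(' || p.last_page || ',' || s.i || ')')::tid
  FROM (SELECT pg_relation_size('medley') / current_setting('block_size')::int - 1 AS last_page) AS p,
       generate_series(0, current_setting('block_size')::int / 4) AS s(i)
));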

When to use: TID scan. When fast, deep random access is required and no filtering is needed. Works especially well with append-only time-series data of roughly uniform row width.

Keyset with estimated bookmarks


As we saw, ordinary keyset pagination does not allow jumping to a specific page, except by guessing. However, the PostgreSQL statistics collector maintains per-column histograms of value distribution. We can use these estimates together with limits and small offsets to get fast random-access pagination through a hybrid approach.

To begin, let's take a look at the statistics in our example:

 SELECT array_length(histogram_bounds, 1) - 1 FROM pg_stats WHERE tablename = 'medley' AND attname = 'n'; 

In my database, column n has 101 histogram bound markers, i.e. 100 ranges between them. The particular values are not too surprising, since the data is uniformly distributed.

 {719,103188,193973,288794, … ,9690475,9791775,9905770,9999847} 


Note that the values are approximate. The first number is not exactly zero, and the last is not exactly ten million. The ranges divide our data into blocks of size B = 10,000,000 / 100 = 100,000 rows.

We can use the PostgreSQL statistics collector's histogram ranges to obtain probabilistically correct pages. If we choose a client-side page width of W, how do we get the i-th page? It will live in block iW / B, at offset iW % B.

Choosing W = 20, let's request page 270,000 from the test table. Its rows should live in block (270000 * 20) / 100000 = 54, at offset (270000 * 20) % 100000 = 0:

WITH bookmark AS (
  SELECT (histogram_bounds::text::int[])[((270000 * 20) / 100000)+1] AS start,
         (histogram_bounds::text::int[])[((270000 * 20) / 100000)+2] AS stop
  FROM pg_stats
  WHERE tablename = 'medley' AND attname = 'n'
  LIMIT 1
)
SELECT *
FROM medley
WHERE n >= (SELECT start FROM bookmark)
  AND n < (SELECT stop FROM bookmark)
ORDER BY n ASC
LIMIT 20
OFFSET ((270000 * 20) % 100000);

This runs blazingly fast (note that the offset happens to be close to zero here). The query returns rows with n = 5407259 through 5407278. The true values on page 270,000 are n = 5400001 through 5400020. The error is 7239 rows, or about 0.1%.

We got lucky with our choice of page here. By contrast, page 74,999 requires an offset of 99,980. We do know that the offset will be at most 100,000, and this upper bound is under our control if we want to make a trade-off. By tuning the PostgreSQL statistics collector we can get a more precise column histogram:

ALTER TABLE medley ALTER COLUMN n SET STATISTICS 1000;
VACUUM ANALYZE;

Now there are 1000 histogram values instead of 100. In my database it looks like this:

 {10,10230,20863, …, 9980444,9989948,9999995} 

In this case the offset step will be no more than 10,000. The trade-off is that the query planner now has to examine more values, which slows it down slightly. So it is a balance between offset inefficiency and statistics-collector overhead.

This hybrid keyset/offset method is probably not a fit for many real pagination applications. And WHERE clauses will not work here either. In addition, it is inaccurate and only becomes more so as the table changes, especially if the statistics collector has not run recently.

When to use: keyset with estimated bookmarks. When the client needs deep but approximate random access, with no additional filtering.

Conclusion


Like many other engineering decisions, choosing a pagination technique involves trade-offs. It is safe to say that keyset pagination is most broadly applicable for the average site with ordered, linear access. However, even limit/offset has its strengths, and the more exotic methods provide specific performance characteristics for certain kinds of data. As you can see, there are quite a few possibilities. Pick the right tool for the job and don't let pagination be a closed book.

Source: https://habr.com/ru/post/301044/

