
How I wrote my own “bicycle” for rotating tables in Oracle and PostgreSQL

Greetings, Habr!

In this article I will talk about how we tackled the problem of rapidly growing tables in a high-load EMS system. An extra twist is that the problem had to be solved for two databases at once: Oracle and PostgreSQL. Details are under the cut.

Initial conditions


So, there is an EMS system that receives and processes messages from network elements. Each message is written as a record to the database tables. According to the customer's requirements, the number of incoming messages (and, accordingly, of records in the table) averages 100 per second, while the peak load can rise to 1500.
It is easy to calculate that, on average, more than 8 million records accumulate in the table per day (100 messages/sec × 86,400 sec ≈ 8.6 million). The problem surfaced when it became clear that at volumes above 20 million rows some of the system's queries started to slow down and exceed the response-time limits required by the customer.

Task


So we had to figure out what to do with the data so that no information was lost and the queries stayed fast. The system initially ran on PostgreSQL, but a migration to Oracle was planned for the near future, and we wanted that transition to cause as few problems as possible when porting the functionality.
The option of using native partitioning was dropped immediately: it was known that Oracle Partitioning would definitely not be covered by the license, and we did not want to convert from one partitioning scheme to another, so we started thinking about building a “bicycle” of our own.
The task was considerably simplified by the fact that logs older than a couple of days are not needed for display in the system: for investigating the vast majority of problems, the last two days of messages should be enough. But they still have to be kept around “just in case”. That is when the idea was born of periodically “rotating” the data in the tables, i.e. moving it from the tables used for display into historical tables.

Solution and implementation


It was decided to keep two tables with the most recent data for display (let's call them table, the main one, and table_secondary, the additional one). The table_view view is built on top of these two tables, and the UI reads its data from it: it is needed so that records do not abruptly disappear from the UI at the moment the data is moved. Older records are moved into historical tables with names of the form H$table_NUM, where NUM is the sequential number of the historical table. To avoid cluttering the main tablespace, the historical tables are also periodically dragged into a “cold” tablespace, which can live on slow disks. This operation is, generally speaking, heavy, so it is performed less often, as a separate procedure. In addition, the same procedure drops tables that are too old from the cold tablespace.
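For reference, a minimal sketch of what such a view can look like, assuming both tables share an identical structure (main_table stands in for the article's main table, whose actual name is not given):

     CREATE OR REPLACE VIEW table_view AS
       SELECT * FROM main_table       -- the main table; the name here is illustrative
       UNION ALL
       SELECT * FROM table_secondary; -- the additional table

UNION ALL rather than UNION is the natural choice here: the two tables hold disjoint generations of records, so deduplication would only add cost.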
As for how exactly the data is moved: because of the large number of indexes on the tables, moving records with a plain INSERT was slow, so we chose an approach based on renaming tables and re-creating indexes and triggers.
Schematically, the procedures are shown in the figure below:
[Figure: scheme of rotating the display tables and moving historical tables to the cold tablespace]

So, the resulting algorithm looks roughly as follows (I give the algorithm and sample procedure code for Oracle; the PostgreSQL versions can be found on GitHub):
The rotate_table(primary_table_name) procedure. It runs, say, once an hour.
  1. Check that the number of rows in the main table has exceeded a certain limit;
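    A minimal sketch of such a check (rows_cnt and rotation_limit are assumed local variables; the real procedure may obtain the count differently):

     EXECUTE IMMEDIATE 'select count(*) from ' || primary_table_name INTO rows_cnt;
     IF rows_cnt < rotation_limit THEN
       RETURN; -- not enough rows yet, nothing to rotate
     END IF;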
  2. Check that there is a “cold” tablespace:

    SELECT COUNT(*) INTO if_cold_ts_exists FROM USER_TABLESPACES WHERE tablespace_name = 'EMS_HISTORICAL_DATA'; 

  3. Create an auxiliary empty table new_table modeled on the current main table. For this, PostgreSQL has the convenient CREATE TABLE ... (LIKE ... INCLUDING ALL) syntax, but for Oracle we had to write our own analogue, the create_tbl_like_including_all(primary_table_name, new_table_name, new_idx_trg_postfix, new_idx_trg_prefix) procedure, which creates a similar empty table:

     SELECT replace(dbms_metadata.get_ddl('TABLE', primary_table_name),
                    primary_table_name, new_table_name)
       INTO ddl_query
       FROM dual;
     ddl_query := substr(ddl_query, 1, length(ddl_query) - 1);
     EXECUTE IMMEDIATE ddl_query;

    As well as the triggers and indexes for it:

     FOR idx IN (SELECT idxs.index_name
                   FROM user_indexes idxs
                  WHERE idxs.table_name = primary_table_name) LOOP
       ddl_query := REPLACE(
           REPLACE(dbms_metadata.get_ddl('INDEX', idx.index_name),
                   primary_table_name, new_table_name),
           idx.index_name,
           new_idx_trg_prefix || idx.index_name || new_idx_trg_postfix);
       ddl_query := substr(ddl_query, 1, length(ddl_query) - 1);
       EXECUTE IMMEDIATE ddl_query;
     END LOOP;
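    For comparison, in PostgreSQL this entire step boils down to one statement (a sketch with the article's placeholder names):

     CREATE TABLE new_table (LIKE primary_table INCLUDING ALL);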

  4. Rename tables:

     EXECUTE IMMEDIATE 'alter table ' || secondary_table_name || ' rename to ' || history_table_name;
     EXECUTE IMMEDIATE 'alter table ' || primary_table_name || ' rename to ' || secondary_table_name;
     EXECUTE IMMEDIATE 'alter table ' || new_table_name || ' rename to ' || primary_table_name;

  5. Rename the triggers and indexes accordingly;
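    Oracle has no bulk rename, so each index is renamed individually with ALTER INDEX ... RENAME TO, and each trigger with ALTER TRIGGER ... RENAME TO. A sketch for indexes (the new-name derivation below is an assumption; the actual procedure mirrors the table renaming chain):

     FOR idx IN (SELECT index_name
                   FROM user_indexes
                  WHERE table_name = secondary_table_name) LOOP
       EXECUTE IMMEDIATE 'alter index ' || idx.index_name ||
                         ' rename to ' || idx.index_name || '_SEC'; -- postfix is illustrative
     END LOOP;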
  6. If the cold tablespace does not exist, we assume that historical data need not be stored, and drop the corresponding table:

     EXECUTE IMMEDIATE 'drop table ' || history_table_name || ' cascade CONSTRAINTS'; 

  7. Rebuild the view (Oracle only):

     EXECUTE IMMEDIATE 'select * from ' || view_name || ' where 1=0'; 



The move_history_logs_to_cold_ts(primary_table_name) procedure. It runs, for example, once a day.
  1. If a “cold” tablespace exists, look for all historical tables that are not in this tablespace:

     EXECUTE IMMEDIATE
       'select table_name from user_tables ' ||
       'where table_name like ''' || history_table_pattern || ''' ' ||
       'and (tablespace_name != ''EMS_HISTORICAL_DATA'' or tablespace_name is null)'
       BULK COLLECT INTO history_tables;

  2. Move each table to a “cold” tablespace:

     EXECUTE IMMEDIATE 'ALTER TABLE ' || history_tables(i) || ' MOVE TABLESPACE ems_historical_data'; 

  3. Drop the triggers and indexes of the tables that have been moved;
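    Historical data is read rarely, so the indexes and triggers are not worth keeping on cold storage. A sketch of this step, reusing the history_tables collection from step 1:

     FOR i IN 1 .. history_tables.COUNT LOOP
       FOR idx IN (SELECT index_name FROM user_indexes
                    WHERE table_name = history_tables(i)) LOOP
         EXECUTE IMMEDIATE 'drop index ' || idx.index_name;
       END LOOP;
       FOR trg IN (SELECT trigger_name FROM user_triggers
                    WHERE table_name = history_tables(i)) LOOP
         EXECUTE IMMEDIATE 'drop trigger ' || trg.trigger_name;
       END LOOP;
     END LOOP;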
  4. Delete tables that are too old from the cold tablespace.
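    One possible sketch of that cleanup, under two assumptions of ours: a larger NUM suffix means a newer table, and tables_to_keep is a configured retention count:

     FOR tbl IN (SELECT table_name
                   FROM (SELECT table_name,
                                row_number() OVER (ORDER BY
                                  to_number(regexp_substr(table_name, '\d+$')) DESC) rn
                           FROM user_tables
                          WHERE tablespace_name = 'EMS_HISTORICAL_DATA'
                            AND table_name LIKE history_table_pattern)
                  WHERE rn > tables_to_keep) LOOP
       EXECUTE IMMEDIATE 'drop table ' || tbl.table_name || ' cascade CONSTRAINTS';
     END LOOP;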


Scheduled launches of the procedures were set up with Quartz Scheduler in the PostgreSQL case and with Oracle Scheduler in the Oracle case; the configuration scripts for both are also in the source code.
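For Oracle, registering the hourly rotation job might look roughly like this (the job name, schedule, and table name are illustrative):

     BEGIN
       dbms_scheduler.create_job(
         job_name        => 'ROTATE_TABLE_JOB',
         job_type        => 'PLSQL_BLOCK',
         job_action      => 'BEGIN rotate_table(''MY_LOG_TABLE''); END;',
         repeat_interval => 'FREQ=HOURLY',
         enabled         => TRUE);
     END;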

Conclusion


The complete source code of the procedures, along with the scheduler configuration scripts, can be viewed on GitHub.

Thanks for your attention!

Source: https://habr.com/ru/post/240061/

