📜 ⬆️ ⬇️

Mysql PARTITION BY YEAR (date) / MONTH (date) / DAYOFWEEK (date)

Often I have to deal with tables that contain rarely or never even updated data. A good example of such data are various logs. Some tables are regularly cleared of obsolete data, and some have to keep records "forever." Therefore, such tables “swell up” and working with them becomes a difficult operation for the entire system.

In order to reduce the load on the disk and file system, partitioning was invented; The data file of the table is cut by some condition into several small files - partitions. For the case with logs, it is reasonable to partition the tables by the field containing the dates of the event. It is often reasonable to cut a partition table by year by month or by month / week days.

Something tells you that you will have to cut across the timestamp field.

')
We make a sign:
CREATE TABLE `foo` ( `id` int(11) NOT NULL AUTO_INCREMENT, `date_added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `name` varchar(30) DEFAULT NULL, `email` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`) ); 


But what if you need for example to expand the table with logs by the days of the month? That is, something is written into the table that is stored for a month or two, and then deleted. How to be if we want to cut it like this:

  ALTER TABLE foo PARTITION BY RANGE (YEAR(date_added)) ( PARTITION p2011 VALUES LESS THAN (2012) , PARTITION p2012 VALUES LESS THAN (2013) , PARTITION p2013 VALUES LESS THAN (2014) ); 


We get:
 ERROR 1486 (HY000): Constant, random or timezone-dependent expressions in (sub)partitioning function are not allowed 

The explanation for this is: "TIMESTAMP is internally converted to local sessions timezone."

Okay:
 SELECT UNIX_TIMESTAMP('2012-01-01 00:00:00'); +---------------------------------------+ | UNIX_TIMESTAMP('2012-01-01 00:00:00') | +---------------------------------------+ | 1325361600 | +---------------------------------------+ SELECT UNIX_TIMESTAMP('2013-01-01 00:00:00'); +---------------------------------------+ | UNIX_TIMESTAMP('2013-01-01 00:00:00') | +---------------------------------------+ | 1356984000 | +---------------------------------------+ SELECT UNIX_TIMESTAMP('2014-01-01 00:00:00'); +---------------------------------------+ | UNIX_TIMESTAMP('2014-01-01 00:00:00') | +---------------------------------------+ | 1388520000 | +---------------------------------------+ 


Now:
  ALTER TABLE foo PARTITION BY RANGE (UNIX_TIMESTAMP(date_added)) ( PARTITION p2011 VALUES LESS THAN (1325361600) , PARTITION p2012 VALUES LESS THAN (1356984000) , PARTITION p2013 VALUES LESS THAN (1388520000) , PARTITION pMAXVALUE VALUES LESS THAN (MAXVALUE) ); 


Here, now we get:
 ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function 


It is treated:
 ALTER table foo DROP PRIMARY KEY, add PRIMARY KEY (`id`,`date_added`); 


Once again:
  ALTER TABLE foo PARTITION BY RANGE (UNIX_TIMESTAMP(date_added)) ( PARTITION p2011 VALUES LESS THAN (1325361600) , PARTITION p2012 VALUES LESS THAN (1356984000) , PARTITION p2013 VALUES LESS THAN (1388520000) , PARTITION pMAXVALUE VALUES LESS THAN (MAXVALUE) ); 


All OK.

We get:
 CREATE TABLE `foo` ( `id` int(11) NOT NULL AUTO_INCREMENT, `date_added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `name` varchar(30) DEFAULT NULL, `email` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`,`date_added`) ) ENGINE=InnoDB PARTITION BY RANGE (UNIX_TIMESTAMP(date_added)) (PARTITION p2011 VALUES LESS THAN (1325361600) ENGINE = InnoDB, PARTITION p2012 VALUES LESS THAN (1356984000) ENGINE = InnoDB, PARTITION p2013 VALUES LESS THAN (1388520000) ENGINE = InnoDB, PARTITION pMAXVALUE VALUES LESS THAN MAXVALUE ENGINE = InnoDB ); 


Fine!

This type of "slicing" is suitable if you need to decompose the archived data by file "for a year" or by month.
But how to be if you need for example a table with logs to breed by days months. Ie in the table something is written that is stored a month or two then rubs.

Ie how to be if we want to cut it like this:

  PARTITION BY RANGE (MONTH(date)) 

Or so:

  PARTITION BY RANGE (DAY(date_add)) 

The timestamp type field does not match.

Google says that it is necessary datetime and a point.

OK, create a table:
 CREATE TABLE `foo` ( `id` int(11) NOT NULL AUTO_INCREMENT, `date_added` datetime DEFAULT NULL, `name` varchar(30) DEFAULT NULL, `email` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`,`date_added`) ) ENGINE=InnoDB; 


Note:
  `date_added` datetime DEFAULT NULL 


The fact is that CURRENT_TIMESTAMP does not roll as a default value for a datetime type field, NOW () cannot be specified as a default value because function.

And it is necessary that the date_added be set automatically.

There are two ways out:
1. Or in all queries in INSERT add NOW ().
2. Or to hang up the trigger which at each insert will be date_added = NOW ();

When INSERT codes are made in many places and everywhere corrected for INSERT ... NOW () we will not be able to use a trigger.

Sort of:
 DELIMITER $$ USE `test_db`$$ CREATE TRIGGER `foo_add` BEFORE INSERT ON `foo` FOR EACH ROW BEGIN SET NEW.date_added = IFNULL(NEW.date_added, NOW()); END; $$ 

Now we have a table with the necessary types, keys and a trigger.

And we can easily cut the table by months:
  ALTER TABLE foo PARTITION BY RANGE (MONTH(date_added)) ( PARTITION p01 VALUES LESS THAN (02) , PARTITION p02 VALUES LESS THAN (03) , PARTITION p03 VALUES LESS THAN (04) , PARTITION p04 VALUES LESS THAN (05) , PARTITION p05 VALUES LESS THAN (06) , PARTITION p06 VALUES LESS THAN (07) , PARTITION p07 VALUES LESS THAN (08) , PARTITION p08 VALUES LESS THAN (09) , PARTITION p09 VALUES LESS THAN (10) , PARTITION p10 VALUES LESS THAN (11) , PARTITION p11 VALUES LESS THAN (12) , PARTITION p12 VALUES LESS THAN (13) , PARTITION pmaxval VALUES LESS THAN MAXVALUE ); 

Or even by the day of the week:
  ALTER TABLE foo PARTITION BY RANGE (DAYOFWEEK(date_added)) ( PARTITION p01 VALUES LESS THAN (2) , PARTITION p02 VALUES LESS THAN (3) , PARTITION p03 VALUES LESS THAN (4) , PARTITION p04 VALUES LESS THAN (5) , PARTITION p05 VALUES LESS THAN (6) , PARTITION p06 VALUES LESS THAN (7) , PARTITION p07 VALUES LESS THAN (8) , PARTITION pmaxval VALUES LESS THAN MAXVALUE ); 

Or even 2 days on the partition:
 ALTER TABLE foo PARTITION BY LIST (DAY(date_added)) ( PARTITION p00 VALUES IN (0,1) , PARTITION p02 VALUES IN (2,3) , PARTITION p04 VALUES IN (4,5) , PARTITION p06 VALUES IN (6,7) , PARTITION p08 VALUES IN (8,9) , PARTITION p10 VALUES IN (10,11), PARTITION p12 VALUES IN (12,13), PARTITION p14 VALUES IN (14,15), PARTITION p16 VALUES IN (16,17), PARTITION p18 VALUES IN (18,19), PARTITION p20 VALUES IN (20,21), PARTITION p22 VALUES IN (22,23), PARTITION p24 VALUES IN (24,25), PARTITION p26 VALUES IN (26,27), PARTITION p28 VALUES IN (28,29), PARTITION p30 VALUES IN (30,31) ); 


So now everything is in your hands.

Source: https://habr.com/ru/post/159131/


All Articles