在MySQL中,分区表的数据被分隔到不同的表空间中,但仍被视为单个表。在处理大量数据集的某些情况下,分区可能是一种有用的方法。在分区表中删除庞大的数据集可以加快速度,但是如果处理不当,它可能会把数据放在表的不对应的位置中。
测试环境:mysql 8.0.30
建表
master [localhost:22031] {msandbox} (test) > create table sbtest (
-> id bigint not null,
-> age int not null,
-> from_date date not null,
-> to_date date not null,
-> primary key (id,from_date)
-> )
-> partition by range columns(from_date)
-> (PARTITION p01 VALUES LESS THAN ('2014-12-31') ,
-> PARTITION p02 VALUES LESS THAN ('2015-12-31') ,
-> PARTITION p03 VALUES LESS THAN ('2016-12-31') ,
-> PARTITION p04 VALUES LESS THAN ('2017-12-31') ,
-> PARTITION p05 VALUES LESS THAN ('2018-12-31') ,
-> PARTITION p06 VALUES LESS THAN ('2019-12-31') ,
-> PARTITION p07 VALUES LESS THAN ('2020-12-31') ,
-> PARTITION p08 VALUES LESS THAN ('2021-12-31') ,
-> PARTITION p09 VALUES LESS THAN ('2022-12-31') ,
-> PARTITION p10 VALUES LESS THAN (MAXVALUE) );
Query OK, 0 rows affected (0.05 sec)
create table sbtest_ex_p20(
id bigint not null,
age int not null,
from_date date not null,
to_date date not null,
primary key (id,from_date)
);
在大数据量的情况下,为了更快的替换分区数据,我们会执行如下操作
master [localhost:22031] {msandbox} (test) > ALTER TABLE sbtest EXCHANGE PARTITION p08 WITH TABLE sbtest_ex_p20 WITHOUT VALIDATION;
Query OK, 0 rows affected (0.02 sec)
现在查表的总数据。
master [localhost:22031] {msandbox} (test) > select * from sbtest ;
+----+-----+------------+------------+
| id | age | from_date | to_date |
+----+-----+------------+------------+
| 1 | 10 | 2014-06-01 | 2014-06-01 |
| 2 | 20 | 2015-06-01 | 2015-06-01 |
| 3 | 30 | 2016-06-01 | 2016-06-01 |
| 4 | 40 | 2017-06-01 | 2014-06-01 |
| 5 | 50 | 2018-06-01 | 2014-06-01 |
| 6 | 40 | 2019-06-01 | 2014-06-01 |
| 7 | 50 | 2020-06-01 | 2014-06-01 |
| 8 | 40 | 2021-06-01 | 2014-06-01 |
| 9 | 50 | 2022-06-01 | 2014-06-01 |
| 10 | 50 | 2023-06-01 | 2014-06-01 |
| 11 | 50 | 2023-05-01 | 2014-06-01 |
+----+-----+------------+------------+
11 rows in set (0.00 sec)
当按条件查询时:
master [localhost:22031] {msandbox} (test) > select * from sbtest where id=5 and from_date='2018-06-01';
Empty set (0.00 sec)
而明明全表查询的时候有ID=5 from_date='2018-06-01'的数据,根据条件查询确没有,再看一下执行计划
master [localhost:22031] {msandbox} (test) > explain select * from sbtest where id=5 and from_date='2018-06-01';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
1 row in set, 1 warning (0.01 sec)
执行计划显示no matching row in const table。
这种结果是因为在做表交换分区的时候用了WITHOUT VALIDATION,不加这个选项时,分区交换是会报错的。那出现这种分区数据不在对应分区表中如可解决呢。可以有2种方法。
第一种:check table;
master [localhost:22031] {msandbox} (test) > check table sbtest;
+-------------+-------+----------+-------------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-------+----------+-------------------------------------------------------------------------------------+
| test.sbtest | check | error | Found a misplaced row in part 7 should be in part 0:
from_date:2014-06-01 |
| test.sbtest | check | error | Partition p08 returned error |
| test.sbtest | check | error | Table upgrade required. Please do "REPAIR TABLE `sbtest`" or dump/reload to fix it! |
+-------------+-------+----------+-------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
按照提示方案解决
master [localhost:22031] {msandbox} (test) > REPAIR TABLE `sbtest`;
+-------------+--------+----------+-------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+--------+----------+-------------------------+
| test.sbtest | repair | warning | Moved 10 misplaced rows |
| test.sbtest | repair | status | OK |
+-------------+--------+----------+-------------------------+
2 rows in set (0.00 sec)
再执行查询看看
master [localhost:22031] {msandbox} (test) > select * from sbtest where id=5 and from_date='2018-06-01';
+----+-----+------------+------------+
| id | age | from_date | to_date |
+----+-----+------------+------------+
| 5 | 50 | 2018-06-01 | 2014-06-01 |
+----+-----+------------+------------+
1 row in set (0.00 sec)
查询有数据,显示正常。但是如果表的数据量很大,这个repair表操作会锁表,影响线上业务。
第二种:pt工具,最小可能影响业务。
pt-online-schema-change --execute --alter "ENGINE=InnoDB" h=localhost,D=test,t=sbtest
相当于是重建表,这种操作锁表的时间短,建议在业务低峰执行
标签:01,06,2014,test,分区表,sbtest,MYSQL,date,Found From: https://blog.51cto.com/u_13482808/7068076