在MySQL中,分区表的数据被分到不同的表空间中,但仍然被视为单个表。在处理大量数据集的某些情况下,分区可能是一种有用的方法。分区表可以加快删除大型数据集的速度,但如果处理不当,可能会将数据错误地放置在表中。
在这将分享如何检查和修复这样一个表中的数据,同时尽量减少对表的破坏。
分区表:
mysql> show create table salariesG *************************** 1. row *************************** Table: salaries Create Table: CREATE TABLE `salaries` ( `emp_no` int(11) NOT NULL, `salary` int(11) NOT NULL, `from_date` date NOT NULL, `to_date` date NOT NULL, PRIMARY KEY (`emp_no`,`from_date`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 /*!50500 PARTITION BY RANGE COLUMNS(from_date) (PARTITION p01 VALUES LESS THAN ('1985-12-31') ENGINE = InnoDB, PARTITION p02 VALUES LESS THAN ('1986-12-31') ENGINE = InnoDB, PARTITION p03 VALUES LESS THAN ('1987-12-31') ENGINE = InnoDB, PARTITION p04 VALUES LESS THAN ('1988-12-31') ENGINE = InnoDB, PARTITION p05 VALUES LESS THAN ('1989-12-31') ENGINE = InnoDB, PARTITION p06 VALUES LESS THAN ('1990-12-31') ENGINE = InnoDB, PARTITION p07 VALUES LESS THAN ('1991-12-31') ENGINE = InnoDB, PARTITION p08 VALUES LESS THAN ('1992-12-31') ENGINE = InnoDB, PARTITION p09 VALUES LESS THAN ('1993-12-31') ENGINE = InnoDB, PARTITION p10 VALUES LESS THAN ('1994-12-31') ENGINE = InnoDB, PARTITION p11 VALUES LESS THAN ('1995-12-31') ENGINE = InnoDB, PARTITION p12 VALUES LESS THAN ('1996-12-31') ENGINE = InnoDB, PARTITION p13 VALUES LESS THAN ('1997-12-31') ENGINE = InnoDB, PARTITION p14 VALUES LESS THAN ('1998-12-31') ENGINE = InnoDB, PARTITION p15 VALUES LESS THAN ('1999-12-31') ENGINE = InnoDB, PARTITION p16 VALUES LESS THAN ('2000-12-31') ENGINE = InnoDB, PARTITION p17 VALUES LESS THAN ('2001-12-31') ENGINE = InnoDB, PARTITION p18 VALUES LESS THAN ('2002-12-31') ENGINE = InnoDB, PARTITION p19 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */
查询任意一个数据集,MySQL可以快速地返回数据,没有任何错误:
mysql> SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23'; +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 10001 | 84917 | 1999-06-23 | 2000-06-22 | +--------+--------+------------+------------+ 1 row in set (0.00 sec)
但是如果数据发生了分区错位,就不能返回任何值了:
mysql> SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23';
乍看数据像是已经被删除了,即使使用explain查看:
mysql> EXPLAIN SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23'G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: NULL partitions: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL filtered: NULL Extra: no matching row in const table 1 row in set, 1 warning (0.00 sec)
但是数据还在表中,只是在错误的分区中:
mysql> SELECT * FROM salaries PARTITION(p08) LIMIT 1; +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 10001 | 84917 | 1999-06-23 | 2000-06-22 | +--------+--------+------------+------------+ 1 row in set (0.00 sec) Empty set (0.00 sec)
如何检测是否有错位的行呢?
可以借助工具check,来检测:(大表需要的时间会很长)
mysql> CHECK TABLE salaries; +--------------------+-------+----------+---------------------------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +--------------------+-------+----------+---------------------------------------------------------------------------------------+ | employees.salaries | check | error | Found a misplaced row in part 7 should be in part 14: from_date:1999-06-23 | | employees.salaries | check | error | Partition p08 returned error | | employees.salaries | check | error | Table upgrade required. Please do "REPAIR TABLE `salaries`" or dump/reload to fix it! | +--------------------+-------+----------+---------------------------------------------------------------------------------------+ 3 rows in set (0.83 sec)
行是如何错位的?
上面的例子抛出了一个问题:MySQL怎么允许这种情况发生的,如何修复它?对于前者的答案是MySQL允许用户用一个单独的表交换一个分区;默认情况下,它在插入数据到表之前会检查每一行。
mysql> ALTER TABLE salaries EXCHANGE PARTITION p15 WITH TABLE salaries_p15; Query OK, 0 rows affected (0.04 sec)
这里,salaries_p15是一个空表,结构和salaries表相同,只是没有做分区而已。
mysql> show create table salaries_p15G *************************** 1. row *************************** Table: salaries_p15 Create Table: CREATE TABLE `salaries_p15` ( `emp_no` int(11) NOT NULL, `salary` int(11) NOT NULL, `from_date` date NOT NULL, `to_date` date NOT NULL, PRIMARY KEY (`emp_no`,`from_date`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
交换分区之后,分区p15就是空的,而salaries_p15包含了分区P15的数据。
mysql> SELECT count(*) from salaries_p15; +----------+ | count(*) | +----------+ | 260926 | +----------+ 1 row in set (0.02 sec) mysql> SELECT count(*) from salaries PARTITION(p15); +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.06 sec)
MySQL为了加快分区交换的速度,可以不做数据检查:
mysql> ALTER TABLE salaries EXCHANGE PARTITION p08 WITH TABLE salaries_p15 WITHOUT VALIDATION; Query OK, 0 rows affected (0.04 sec)
这里分区p8可以和salaries_p15快速地交换,从而就导致了记录的错位!在生产环境下,建议谨慎使用without validation选项。
mysql> SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23'; +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 10001 | 84917 | 1999-06-23 | 2000-06-22 | +--------+--------+------------+------------+ 1 row in set (0.00 sec)
如何修复?
有两种修复方法:
1.使用repair table命令
2.执行一次空的alter交换操作
repair操作会将错位的记录安排到正确的分区,但是需要在表上加上system lock。
mysql> repair table salaries; +--------------------+--------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | +--------------------+--------+----------+-----------------------------+ | employees.salaries | repair | warning | Moved 260926 misplaced rows | | employees.salaries | repair | status | OK | +--------------------+--------+----------+-----------------------------+ 2 rows in set (5.30 sec)
第二种方法也要在表上加锁,直到运行结束:
ALTER employees.salaries ENGINE=Innodb;
此外,可以借助pt-online工具,在不加锁的情况下重建表。比如:
pt-online-schema-change --execute --alter "ENGINE=InnoDb" h=localhost,D=employees,t=salaries
标签:ENGINE,错位,修复,PARTITION,salaries,分区表,InnoDB,date,THAN From: https://www.cnblogs.com/abclife/p/17283604.html