适用范围
1.适用于MySQL 或者Oceanbase for MySQL
2.适用于两表或多表join 的字段字符序不同的场景。
如本例:
COLLATE utf8mb4_general_ci 和 COLLATE utf8mb4_bin
`cust_no` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
`cust_no` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
问题概述
两表或者多表Join 因Join字段的字符序不同,导致产生隐式转换而无法使用到index。详细的原理内容请参考文章末尾的参考文档。以供大家参考。(早前有写过Oceanbase For MySQL隐式转换的类似案例。)
问题原因
对两表或多表Join的连接字段各自指定了字符序,但指定的不一致,导致产生隐式转换。(本例就属于这类。)
也有在我服务的客户这边,因MySQL 或Oceanbase For MySQL的开发规范规定表的字符序需要为utf8mb4_bin,偏偏这两类数据库如果表的字符集选择utf8mb4字符序不指定,则默认的字符序为:utf8mb4_general_ci,由于有的开发人员会落实开发规范的差异,导致两表或多表Join的字段因字符序的不同而产生隐式转换成为一项比较普遍存在的问题。
详细过程
原始SQL
SELECT count(1) locationCount
FROM yhtest_base_addrs_info a
LEFT JOIN wyh_testinfo_location b
ON a.cust_no = b.cust_no
WHERE b.location IS NULL
AND a.address IS NOT NULL
原始SQL 的执行时间及执行计划
-- 原始SQL 执行时间
用户反馈长时间执行不出结果。
-- 原始SQL 执行计划
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+------------------------------------------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 1880706 | 90.00 | Using where |
| 1 | SIMPLE | b | NULL | ALL | PRIMARY | NULL | NULL | NULL | 1828366 | 10.00 | Range checked for each record (index map: 0x1) |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+------------------------------------------------+
2 rows in set, 2 warnings (0.01 sec)
-- 原始SQL 执行计划warning信息。
root@localhost [yhtest]> show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'PRIMARY' due to type or collation conversion on field 'cust_no' |
| Note | 1003 | /* select#1 */ select count(1) AS `locationCount` from `yhtest`.`yhtest_base_addrs_info` `a` left join `yhtest`.`wyh_testinfo_location` `b` on((`yhtest`.`a`.`cust_no` = `yhtest`.`b`.`cust_no`)) where ((`yhtest`.`b`.`location` is null) and (`yhtest`.`a`.`address` is not null)) |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
从执行计划看a,b 两表Join没什么index可使用到。但执行计划有warnings,(通常有隐式转换的执行计划下面均有warnings,需要show warnings可以看到详细的隐式转换字段。)
通过show warnings可见 cust_no 字段有隐式转换。Cannot use ref access on index ‘PRIMARY’ due to type or collation conversion on field ‘cust_no’
另外关于 Range checked for each record (index map: 0x1) 的描述参考官网: EXPLAIN Output Format
相关表基础信息
-- 表信息
CREATE TABLE `yhtest_base_addrs_info` (
`cust_no` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL COMMENT '客户编号',
`address` varchar(2000) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT '地址信息',
PRIMARY KEY (`cust_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
CREATE TABLE `wyh_testinfo_location` (
`cust_no` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '客户编号',
`location` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '经纬度(根据地址通过百度地图api获得)',
PRIMARY KEY (`cust_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ;
-- 表数据量
root@localhost [yhtest]> select count(*) from yhtest_base_addrs_info where address IS NOT NULL;
+----------+
| count(*) |
+----------+
| 1774412 |
+----------+
1 row in set (0.69 sec)
root@localhost [yhtest]> select count(*) from wyh_testinfo_location where location IS NULL;
+----------+
| count(*) |
+----------+
| 1151313 |
+----------+
1 row in set (0.44 sec)
发现表cust_no为两表的主键。
但 两表collation不同
`cust_no` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
`cust_no` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
修改字符序消除隐式转换
alter table wyh_testinfo_location modify cust_no varchar(30) character set utf8mb4 collate utf8mb4_bin;
消除隐式转换后的执行时间及执行计划
-- 执行时间
+---------------+
| locationCount |
+---------------+
| 1202357 |
+---------------+
1 row in set (5.78 sec)
-- 执行计划
``` language
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+---------+----------+-------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 1880706 | 90.00 | Using where |
| 1 | SIMPLE | b | NULL | eq_ref | PRIMARY | PRIMARY | 122 | yhtest.a.cust_no | 1 | 10.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+---------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
结论:
该案例两表Join 的连接字段因字符序不同引起的隐式转换,通过修改统一连接字段的字符序而使长时间执行不出结果的SQL 的执行时间提升到了5.78 sec出结果。
标签:cust,utf8mb4,no,yhtest,COLLATE,mysql,collation,NULL,隐式 From: https://blog.51cto.com/u_13482808/8305502