mysql 索引知识梳理
如果说数据是字典的话,索引就是字典的目录。mysql常用的索引有两种 B-Tree(实际上是B+Tree),和 Hash,默认是前者(实际上InnoDB只支持BTree)
哪些列适合创建索引
哪些列不适合创建索引
主键索引
It has an associated index, for fast query performance. Query performance benefits from the NOT NULL optimization, because it cannot include any NULL values. With the InnoDB storage engine, the table data is physically organized to do ultra-fast lookups and sorts based on the primary key column or columns.
主键有唯一
外键优化
大表拆成多张单表,然后通过外键关联起来,这样可以减少查询的数据量,提高查询效率。
If a table has many columns, and you query many different combinations of columns, it might be efficient to split the less-frequently used data into separate tables with a few columns each, and relate them back to the main table by duplicating the numeric ID column from the main table.
底层原理:每次尽可能加载少量的数据块,减少磁盘IO,占用更少的缓存内存
the queries might perform less I/O and take up less cache memory because the relevant columns are packed together on disk. (To maximize performance, queries try to read as few data blocks as possible from disk; tables with only a few columns can fit more rows in each data block.)
多列索引
composite indexes 或者称为 组合索引
1. 基础概念
-
sorted array
-
leftmost prefix
-
Index Merge Optimization
-
alternative
As an alternative to a composite index, you can introduce a column that is “hashed” based on information from other columns. If this column is short, reasonably unique, and indexed, it might be faster than a “wide” index on many columns. In MySQL, it is very easy to use this extra column:
SELECT * FROM tbl_name
WHERE hash_col=MD5(CONCAT(val1,val2))
AND col1=val1 AND col2=val2;
2. 官网摘要
-
A multiple-column index can be considered a sorted array, the rows of which contain values that are created by concatenating the values of the indexed columns.
-
The first column in the index is the most significant column, the second column is the next most significant column, and so on.
-
If a multiple-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer attempts to use the Index Merge optimization (see Section 8.2.1.3, “Index Merge Optimization”), or attempts to find the most restrictive index by deciding which index excludes more rows and using that index to fetch the rows.
单列索引,sql优化器会尝试使用
Index Merge优化
,或者尝试找到最具限制性的索引
,通过决定哪个索引排除更多行并使用该索引来获取行。 -
demo
SELECT * FROM tbl_name WHERE col1=val1; SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2; SELECT * FROM tbl_name WHERE col2=val2; SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).
leftmost prefix
指的是索引的最左前缀,如果表有一个多列索引,优化器可以使用索引的任何最左前缀来查找行。例如,如果您在(col1,col2,col3)上有一个三列索引,则可以在(col1),(col1,col2)和(col1,col2,col3)上进行索引搜索。If an index exists on (col1, col2, col3), only the first two queries use the index. The third and fourth queries do involve indexed columns, but do not use an index to perform lookups because (col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).
参考链接
- MySQL索引背后的数据结构及算法原理
- 线上高并发应用重构(写)填坑经验分享(一)
通过工程实践讲述了 MySQL分库分表 MySQL索引、字段设计的一些问题