《MySQL高级》索引分析和优化笔记 - 按分析类型排序

terry 2年前 (2023-09-26) 阅读数 78 #数据库

学习《MySQL高级》高阳老师关于索引课程的笔记。本文重点介绍order by type分析

建表

# 建表
CREATE TABLE tblA(
#id int primary key not null autp_increment,
age int,
birth timestamp not null
);

insert into tblA(age,birth) values(22,now());
insert into tblA(age,birth) values(23,now());
insert into tblA(age,birth) values(24,now());

# 建立复合索引
CREATE INDEX idx_A_ageBirth on tblA(age,birth);

select * from tblA;

order by optimization（索引分析）

由于这张表只有age和birth两个字段，被复合索引覆盖，所以选择*相当于年龄、出生的选择。查询直接到索引，不回表。 这里我们只关注发生排序（order by）时文件的排序（filesort）。

1.1 声明 select * from tblA where age > 20 按年龄排序；

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+

使用age字段的索引进行排序，不出现文件类型。

1.2声明select * from tblA where age > 20 按年龄、出生排序；

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+

排序，年龄、出生对应复合索引顺序，因此排序使用年龄、出生这两个字段的索引，文件排序器不会出现。

1.3声明select from tblA where Age > 20 order bybirth；

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+

查询时使用age字段的索引；
排序时，因为age是一个范围，范围后面的内容都是无效的，所以age不能用，出生索引排序，然后出现文件排序。

如果age是等价查询，则排序时不会出现文件类型。参见如下语句：

mysql> explain select * from tblA where age = 22 order by birth;
+----+-------------+-------+------------+------+----------------+----------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys  | key            | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+----------------+----------------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | ref  | idx_A_ageBirth | idx_A_ageBirth | 5       | const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+------+----------------+----------------+---------+-------+------+----------+--------------------------+

1.4 声明 select * from tblA where Age > 20 按出生、年龄排序；

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+

排序时，因为出生、年龄与复合索引的顺序不符，会出现Filesort。

2.1解释select * from tblA order bybirth；

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+

排序时，只会出现birth，Filesort才会出现。

2.2 声明 select * from tblA wherebirth > '2016-01-28 00:00:00' order bybirth；

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |    33.33 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+------------------------------------------+

排序时，where和order的字段顺序与协作索引order不匹配，并且文件出现排序。

2.3声明 select from tblA wherebirth > '2016-01-28 00:00:00' order by Age;

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |    33.33 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+

排序时，字段顺序由优化器从where in order开始优化，匹配指数。订单、文件排序不会发生。

2.4 声明 select from tblA order by age asc,birth desc;

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+

排序时，虽然字段age和birth按照索引顺序匹配，但age是升序排列，birth是降序排列，所以索引无法使用排序，出现文件排序。如果年龄和出生同时增加或减少，文件排序将不会发生。

按索引排序无法使用时的优化

如果排序无法使用索引，则会出现文件类型。 filesort 有两种算法：双向排序和单向排序。

双路排序

MySQL 4.1使用了双路排序，字面意思就是扫描磁盘两次，最终得到数据。

按列顺序读取行字符，排序，然后扫描已经排序的列表，根据列表中的值从列表中读取对应的数据输出。从磁盘获取排序字段，在缓冲区中排序，然后从磁盘获取其他字段。

要获取一批数据，必须扫描磁盘两次。众所周知，I/O是非常耗时的，所以在mysql4.1之后，出现了第二种改进算法，就是单向排序。

单向排序

从磁盘读取查询所需的所有列，在缓冲区中按列顺序排序，然后扫描排序后的列表输出。效率更高，避免了二次读取数据的次数。并且将随机IO变成了顺序IO，但是会使用更多的空间，因为它把每一行都保存在内存中。

总体来说，单向排序比较好，但也存在问题。为了什么？

在sort_buffer中单路排序比多路取出所有字段，因此有可能提取的数据d '超过了sort_buffer的容量。导致每次只能采集sort_buffer容量的数据，进行排序（创建tmp文件，多路合并），然后排序后sort_buffer容量减少。 size，然后排序...从而多次I/O。

本来想节省一次I/O操作，结果却导致大量的I/O操作，得不偿失。

如何优化？通过增加sort_buffer_size容量和max_length_for_sort_data

提高排序No

1的速度。订购时选择*是禁忌，仅检查必填字段非常重要。这里的影响是：

1.1 如果查询的字段大小总和小于 max_length_for_sort_data 并且排序字段不是 TEXT|BLOB 类型，则使用改进算法——单向排序，否则使用旧算法- 使用多路排序。

1.2 两种算法的数据都可能超出sort_buffer的容量。覆盖后，会创建一个 tmp 文件来执行合并排序，从而产生多个 I/O。但是使用单向排序算法的风险会比较大，所以需要增大sort_buffer_size。

2。尝试增大sort_buffer_size

无论使用什么算法，增大该参数都会提高效率。当然，必须根据系统的能力进行改进，因为这个参数是针对每个进程的。

3。尝试 max_length_for_sort_data

增加此参数会增加使用改进算法的概率。但如果设置得太高，总数据容量超过sort_buffer_size的概率就会增加。明显的症状是磁盘 I/O 活动高和处理器利用率低。

优化后的分组

1. Gourp By 实际上是先排序后分组，秉承复合索引的最佳左前缀原则；

2。如果索引列不能使用，也必须进行排序，增加max_length_for_sort_data和sort_buffer_size；

3。 where之前，尽量使用where进行条件过滤。

SQL调优顺序

启用慢查询以捕获慢日志
解释+分析慢SQL
Profile显示查询执行细节和SQL参数的生命周期MySQL Server DBA中的QL参数）

版权声明

本文仅代表作者观点，不代表Code前端网立场。
本文系作者Code前端网发表，如需转载，请注明页面地址。