MySQL性能优化

为了保持生产环境中数据库的稳定性和性能，增强用户体验。同时也为了避免因数据库连接超时产生页面5xx的错误，有时候我们需要对数据库进行某些方面的优化。主要包括以下几个方面：

SQL及索引优化
数据库表结构
数据库系统配置参数
操作系统及硬件

它们具体的优化效果及成本关系如下图所示：

在生产环境下，SQL及索引优化占有比较大的比重，而且效果比较明显，根据左边的箭头，越往上优化的成本越高；效果却越来越不明显。

【说明】：本实验环境基于mysql5.6.21版本。

一、SQL优化

1、如何发现有问题的SQL?

通过慢查询日志可以发现，在使用慢查询日志之前，需要设置以下变量参数：

(product)root@localhost [(none)]> show variables like 'slow_query_log';   #开启慢查询
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| slow_query_log | ON    |
+----------------+-------+
1 row in set (0.01 sec)

(product)root@localhost [(none)]> show variables like 'long_query_time';  #记录超过多少秒的sql将记录到慢查询日志中。为0表示记录所有sql
+-----------------+----------+
| Variable_name   | Value    |
+-----------------+----------+
| long_query_time | 1.000000 |
+-----------------+----------+
1 row in set (0.00 sec)
(product)root@localhost [(none)]> show variables like 'log_queries_not_using_indexes'; #记录没有使用索引的sql到日志中
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| log_queries_not_using_indexes | ON    |
+-------------------------------+-------+
1 row in set (0.00 sec)

1）慢查询日志的存储格式?

### 开始时间
# Time: 151224 17:27:43
### 执行SQL的主机信息
# User@Host: root[root] @ localhost []  Id:     4
### SQL的执行信息
# Query_time: 0.125071  Lock_time: 0.122781 Rows_sent: 10  Rows_examined: 10
use sakila;
### 执行时间
SET timestamp=1450949263;
### SQL的内容
select * from actor limit 10;

2）使用官方mysqldumpslow工具

[root@node1 ~]# mysqldumpslow -t 1 /data/mysql/mysql_3306/data/slow.log -a

Reading mysql slow query log from /data/mysql/mysql_3306/data/slow.log
Count: 1  Time=0.00s (0s)  Lock=0.00s (0s)  Rows=0.0 (0), 0users@0hosts
  # Time: 151224 17:27:43
  # User@Host: root[root] @ localhost []  Id:     4
  # Query_time: 0.125071  Lock_time: 0.122781 Rows_sent: 10  Rows_examined: 10
  use sakila;
  SET timestamp=1450949263;
  select * from actor limit 10

官方提供的慢查询分析工具比较方便快捷，但是提供的分析数据比较有限，下面介绍一款更强大的工具

3） pt-query-digest工具

[root@node1 bin]# pt-query-digest /data/mysql/mysql_3306/data/slow.log 
#############################第一部分######################################
# 190ms user time, 490ms system time, 25.93M rss, 213.35M vsz
# Current date: Mon Dec 28 15:00:03 2015
# Hostname: node1
# Files: /data/mysql/mysql_3306/data/slow.log
# Overall: 1 total, 1 unique, 0 QPS, 0x concurrency ______________________
# Time range: all events occurred at 2015-12-24 17:27:43
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time          125ms   125ms   125ms   125ms   125ms       0   125ms
# Lock time          123ms   123ms   123ms   123ms   123ms       0   123ms
# Rows sent             10      10      10      10      10       0      10
# Rows examine          10      10      10      10      10       0      10
# Query size            28      28      28      28      28       0      28

###############################第二部分####################################
# Profile
# Rank Query ID           Response time Calls R/Call V/M   Item
# ==== ================== ============= ===== ====== ===== ============
#    1 0x5665CD6BAE86EAEC 0.1251 100.0%     1 0.1251  0.00 SELECT actor

###############################第三部分####################################
# Query 1: 0 QPS, 0x concurrency, ID 0x5665CD6BAE86EAEC at byte 0 ________
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: all events occurred at 2015-12-24 17:27:43
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count        100       1
# Exec time    100   125ms   125ms   125ms   125ms   125ms       0   125ms
# Lock time    100   123ms   123ms   123ms   123ms   123ms       0   123ms
# Rows sent    100      10      10      10      10      10       0      10
# Rows examine 100      10      10      10      10      10       0      10
# Query size   100      28      28      28      28      28       0      28
# String:
# Databases    sakila
# Hosts        localhost
# Users        root
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms  ################################################################
#    1s
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `sakila` LIKE 'actor'G
#    SHOW CREATE TABLE `sakila`.`actor`G
# EXPLAIN /*!50100 PARTITIONS*/
select * from actor limit 10G

pt-query-digest输出内容大体分三部分，那么如何通过这个结果找到问题SQL？

查询次数多且查询时间长的SQL，一般pt-query-digest输出内容的前几个查询
IO大的SQL:第三部分的Rows examine项
未命中索引的项：注意Rows examine和Rows sent项对比

4） explain工具

explain可以分析某个查询的执行计划，通过执行计划分析查询是否利用索引。具体的输出参数如下：

(product)root@localhost [sakila]> explain select * from actor order by last_update;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | actor | ALL  | NULL          | NULL | NULL    | NULL |  200 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

select_type:表示查询类型，有简单查询、连接查询等。
type：重要的列，从优到差的类型为：const、eq_reg、ref、range、index、all。
possible_keys:可能用到的索引有哪些。为空表示没有可利用的索引
key:实际利用的索引，为空表示没有使用索引。
key_len：索引的长度，越短越好
ref：显示索引哪一列被使用了。常数是最佳值
rows:扫描的行数
extra：重点关注using_filesort和using_temporary两项。using_filesort表示用到了排序，一般在order by情境下；using_temporary表示用到额外的临时表来存储查询的数据，一般在order by和group by情况下，出现以上两个选项表示该查询需要优化了。

2、count()和max()的优化

1）Max()优化：

(product)root@localhost [sakila]> explain select max(payment_date) from payment;
+----+-------------+---------+------+---------------+------+---------+------+-------+-------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows  | Extra |
+----+-------------+---------+------+---------------+------+---------+------+-------+-------+
|  1 | SIMPLE      | payment | ALL  | NULL          | NULL | NULL    | NULL | 16088 | NULL  |
+----+-------------+---------+------+---------------+------+---------+------+-------+-------+
1 row in set (0.00 sec)
########################################################################################################################################
以上payment_date字段没有索引，故进行全表扫描后获得最大值
########################################################################################################################################

下面为payment_date字段添加索引，再看看效果：

(product)root@localhost [sakila]> create index idx_payment_date on payment(payment_date);
Query OK, 0 rows affected (0.23 sec)
Records: 0  Duplicates: 0  Warnings: 0

(product)root@localhost [sakila]> explain select max(payment_date) from payment;+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

########################################################################################################################################
不用全表扫描了，直接从mysql内部一次读取结果，不需要再优化了。
########################################################################################################################################

2）count()优化：

提到count()，一般都会有人误解，有人说count(*)比count(column)慢，下面实验说明。

count(*)与count(column)区别

(product)root@localhost [sakila]> select * from test1;
+----+--------+
| id | name   |
+----+--------+
|  1 | NULL   |
|  2 | darren |
+----+--------+

(product)root@localhost [sakila]> select count(*) from test1;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

(product)root@localhost [sakila]> select count(name) from test1;
+-------------+
| count(name) |
+-------------+
|           1 |
+-------------+
1 row in set (0.00 sec)

########################################################################################################################################
如果字段的值有NULL，那么count(*)会包括该行，而count(有空值col)会忽略该行；
########################################################################################################################################

而在性能上，不带where条件的情况下，count(*)比count(column)快点：

View Code

带where条件时，两者基本上差不多，可以视为无差别：

View Code

当带有where条件时，将条件写到count括号里效率较高，比一般写到where条件后性能好：

View Code

【总结】：

count(*)和count(column)在性能上差别不是很大，count(*)稍微好些，它们的主要差异在于统计数据准确性上，count(*)包括null行，所以在无特殊需求下建议使用count(*)
如果带有where条件，建议将条件写到前面count括号中，注意加上or null，这种写法在效率上比where条件写法高效。

3、子查询的优化

(product)root@localhost [sakila]> select count(*) from book where user_id in(select id from test); #未使用索引
+----------+
| count(*) |
+----------+
|   129987 |
+----------+
1 row in set (0.13 sec)

(product)root@localhost [sakila]> select count(*) from book where user_id in(select id from test); #使用索引后
 +----------+ 
 | count(*) | 
 +----------+ 
 | 129987 | +
 ----------+ 
1 row in set (0.00 sec)

对于子查询优化，首先要确保用上索引，这是关键。以前常用的有两种方案，一种是转换为join连接查询，另一种是采用exsits取代in，但是听说mysql新版本正在对子查询进行优化，下面是5.6执行比较，这里不评价：

View Code

4、limit的优化

limit给分页带来了很大的方便，但是随着数据量增加，分页的性能也会越来越慢，比如下面的limit取数，同样获取10行数据，但是数据量不同的情况下，性能也是不同的：

(product)root@localhost [sakila]> select * from test order by name limit 1,10;+-------+------------+
| id    | name       |
+-------+------------+
|     2 | darren1    |
|    11 | darren10   |
|   101 | darren100  |
| 10006 | darren1000 |
| 10005 | darren1000 |
| 10004 | darren1000 |
| 10003 | darren1000 |
| 10002 | darren1000 |
| 10001 | darren1000 |
|  1001 | darren1000 |
+-------+------------+
10 rows in set (5.69 sec)

(product)root@localhost [sakila]> select * from test order by name limit 10000,10;
+---------+------------+
| id      | name       |
+---------+------------+
| 1008998 | darren1008 |
| 1008999 | darren1008 |
| 1009000 | darren1008 |
| 1009001 | darren1009 |
| 1009002 | darren1009 |
| 1009003 | darren1009 |
| 1009004 | darren1009 |
| 1009005 | darren1009 |
| 1009006 | darren1009 |
| 1009007 | darren1009 |
+---------+------------+
10 rows in set (38.49 sec)

(product)root@localhost [sakila]> explain select * from test order by name limit 1,10;   #排序没有利用索引
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------+
|  1 | SIMPLE      | test  | ALL  | NULL          | NULL | NULL    | NULL | 9730146 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------+
1 row in set (0.00 sec)

那么怎么进行优化呢？这里分两个步骤：

order by后的字段需要利用索引，建议使用主键或索引键

这里换成主键或者索引字段排序，再来看下效果：

(product)root@localhost [sakila]> select * from test order by id  limit 1,10;
+----+----------+
| id | name     |
+----+----------+
|  2 | darren1  |
|  3 | darren2  |
|  4 | darren3  |
|  5 | darren4  |
|  6 | darren5  |
|  7 | darren6  |
|  8 | darren7  |
|  9 | darren8  |
| 10 | darren9  |
| 11 | darren10 |
+----+----------+
10 rows in set (0.00 sec)

(product)root@localhost [sakila]> select * from test order by id  limit 10000,10;
+-------+------------+
| id    | name       |
+-------+------------+
| 10001 | darren1000 |
| 10002 | darren1000 |
| 10003 | darren1000 |
| 10004 | darren1000 |
| 10005 | darren1000 |
| 10006 | darren1000 |
| 10007 | darren1000 |
| 10008 | darren1000 |
| 10009 | darren1000 |
| 10010 | darren1000 |
+-------+------------+
10 rows in set (0.00 sec)

这时看下执行计划：

(product)root@localhost [sakila]> explain select * from test order by id  limit 1,10;     #跟页数有关，页数比较少时扫描的行数较少
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
|  1 | SIMPLE      | test  | index | NULL          | PRIMARY | 4       | NULL |   11 | NULL  |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
1 row in set (0.00 sec)

(product)root@localhost [sakila]> explain select * from test order by id  limit 10000,10; #跟页数有关，页数比较多时扫描的行数也越来越多了，随着数据量增加，性能也肯定下降   
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows  | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------+
|  1 | SIMPLE      | test  | index | NULL          | PRIMARY | 4       | NULL | 10010 | NULL  |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------+
1 row in set (0.00 sec)

跟页数有关，页数比较少时扫描的行数较少；页数比较多时扫描的行数也越来越多了，随着数据量增加，性能也肯定下降；下面进一步进行优化：

采用子查询方式记住上次offset位置，取页的固定大小

View Code

效果：取900万行以后的10行，直接用limit 9000000，10执行多次，大约耗时2.10秒；而采用子查询修改后，执行多次，大约耗时1.77秒；

二、索引优化

索引优化设计内容很多，这里不多详述。

三、数据库结构优化

1、选择合适的数据类型

数据类型的选择重点在于“合适”二字，如何确定数据类型是否合适呢？

使用可以存下你数据的最小数据类型
使用简单的数据类型，如int比varchar处理简单
尽可能使用not null定义字段
尽量少用text字段，如要使用事前考虑分表

如存储ip地址，一般我们都会选择char或者varchar来存储，但是建议使用bigint来存储，使用bigint只需要8个字节，而使用varchar最多占用15字节，而且整数比较效率也高。INET_ATON()和INET_NTOA()进行ip转换。

四、系统配置优化

第三方工具https://tools.percona.com/wizard

相关阅读:
1107 Social Clusters (30)
1066 Root of AVL Tree (25）
1099 Build A Binary Search Tree (30)
1064 Complete Binary Search Tree (30)
1043 Is It a Binary Search Tree (25)
Hadoop学习总结（1）——大数据以及Hadoop相关概念介绍
 MyBatis学习总结（10）——批量操作
 MyBatis学习总结（10）——批量操作
 Java基础学习总结（23）——GUI编程
 Java基础学习总结（23）——GUI编程
原文地址：https://www.cnblogs.com/henrylinux/p/9926374.html