一直听说exists性能比in快,但为啥快,一直不明白,乘着今天重点研究mysql,看到底是不是这么回事,原因又是在哪里。
1、我们先准备2张表和数据,人员表插入100W条数据,部门表插入5条数据。
DROP TABLE IF EXISTS `bd_dept`; CREATE TABLE `bd_dept` ( `id` int(0) NOT NULL, `dept_name` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '部门名称',PRIMARY KEY (`id`) USING BTREE ) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci COMMENT = '部门' ROW_FORMAT = Dynamic; DROP TABLE IF EXISTS `bd_user`; CREATE TABLE `bd_user` ( `id` bigint(0) NOT NULL, `dept_id` int(0) NULL DEFAULT NULL COMMENT '部门ID', `user_name` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '用户名称',PRIMARY KEY (`id`) USING BTREE,INDEX `dept_id`(`dept_id`) USING BTREE, CONSTRAINT `bd_user_ibfk_1` FOREIGN KEY (`dept_id`) REFERENCES `bd_dept` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT ) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci COMMENT = '人员' ROW_FORMAT = DYNAMIC;
2、我们测试一下结果
1)根据部门名称找所有该部门下的人员信息:
select * from bd_user a where exists (select 1 from bd_dept b where b.dept_name='研发中心' and b.id=a.dept_id) > OK > Time: 0.814s select * from bd_user a where a.dept_id in (select id from bd_dept b where b.dept_name='研发中心') > OK > Time: 0.8s
从结果上看,in和exists查询效率基本一致,多次执行执行查询,看不出来谁快谁慢。我们看下他们的执行计划:
mysql> explain select * from bd_user a where exists (select 1 from bd_dept b where b.dept_name='研发中心' and b.id=a.dept_id);
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+--------+----------+-------------+
| 1 | SIMPLE | b | NULL | ALL | PRIMARY | NULL | NULL | NULL | 4 | 25.00 | Using where |
| 1 | SIMPLE | a | NULL | ref | dept_id | dept_id | 5 | zhi_test.b.id | 332010 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+--------+----------+-------------+
mysql> explain select * from bd_user a where a.dept_id in (select id from bd_dept b where b.dept_name='研发中心');
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+--------+----------+-------------+
| 1 | SIMPLE | b | NULL | ALL | PRIMARY | NULL | NULL | NULL | 4 | 25.00 | Using where |
| 1 | SIMPLE | a | NULL | ref | dept_id | dept_id | 5 | zhi_test.b.id | 332010 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+--------+----------+-------------+
他们的执行计划完全相同。
2)根据人员名称找他所在部门的信息:
select * from bd_dept a where exists (select 1 from bd_user b where b.user_name='测试0000000031' and a.id=b.dept_id) > OK > Time: 0.19s select * from bd_dept a where a.id in (select b.dept_id from bd_user b where b.user_name='测试0000000031') > OK > Time: 0.19s
从结果上看in和exists效率还是一样的,我们再看看执行计划
mysql> explain select * from bd_dept a where exists (select 1 from bd_user b where b.user_name='测试0000000031' and a.id=b.dept_id); +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+---------------+--------+----------+-------------+ | 1 | SIMPLE | a | NULL | ALL | PRIMARY | NULL | NULL | NULL | 4 | 100.00 | Using where | | 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | zhi_test.a.id | 1 | 100.00 | NULL | | 2 | MATERIALIZED | b | NULL | ALL | dept_id | NULL | NULL | NULL | 996030 | 10.00 | Using where | +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+---------------+--------+----------+-------------+
mysql> explain select * from bd_dept a where a.id in (select b.dept_id from bd_user b where b.user_name='测试0000000031'); +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+---------------+--------+----------+-------------+ | 1 | SIMPLE | a | NULL | ALL | PRIMARY | NULL | NULL | NULL | 4 | 100.00 | Using where | | 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | zhi_test.a.id | 1 | 100.00 | NULL | | 2 | MATERIALIZED | b | NULL | ALL | dept_id | NULL | NULL | NULL | 996030 | 10.00 | Using where | +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+---------------+--------+----------+-------------+
这2个SQL的执行计划也完全相同。
3、结论
实践出真知,in和exists的性能在MySQL8.x中是一样的。