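【Oracle SQL】Removing duplicate names from a two-million-row table: direct DELETE versus a temporary-table approach (about 70% of the rows are deleted; the direct DELETE is the slower of the two)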

【Test environment】

Oracle 11g

【Test table and data】
```
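-- Test table: numeric primary key plus a short name column
create table test05(
    id number(10),
    name nvarchar2(5),
    primary key(id)
);

-- Fill the table with 2,000,000 rows of short random names (at most 5 characters)
insert into test05
select
       rownum,
       dbms_random.string('*',dbms_random.value(1,5))
from dual
connect by level<2000001;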
```
This took about 15 seconds.
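
Because the names are at most five characters long, the shorter ones are likely to repeat many times. One optional way to see this, sketched here as a check that was not part of the original experiment (the column aliases are illustrative), is to group the rows by name length:

```sql
-- Sketch: row count and distinct-name count for each name length
select length(name)         as name_len,
       count(*)             as row_cnt,
       count(distinct name) as distinct_names
from test05
group by length(name)
order by name_len;
```

A large gap between row_cnt and distinct_names for a given length indicates where most of the duplicates come from.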

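【Requirement】

If the name column contains duplicate values, delete the duplicate rows so that each name appears only once. Both approaches below keep the row with the smallest id for each name.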
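
Before deleting anything, it can be useful to estimate how many rows the requirement affects. The query below is a minimal sketch, not part of the original experiment; it assumes that name is never NULL, and the aliases are illustrative.

```sql
-- Sketch: one row survives per distinct name, so
-- rows to delete = all rows minus distinct names.
-- Assumes name is never NULL (count(distinct ...) ignores NULLs).
select count(*)                        as total_rows,
       count(distinct name)            as rows_to_keep,
       count(*) - count(distinct name) as rows_to_delete
from test05;
```

On the data in this post, rows_to_delete would be expected to agree with the number of rows the DELETE reports in the first approach.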

【Back up the data so the experiment can be repeated】

```
create table test06 as select * from test05;
```

【First approach: direct delete】

```
select count(*) from (select name from test05 group by name having count(id)>1);
```
This shows that 157,551 names have duplicate values.

```
delete from test05 a where exists (select null from test05 b where b.name=a.name and b.id<a.id);

1431272 rows deleted.

Elapsed: 00:00:50.04
```
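
The same rule (keep the smallest id per name) can also be written with an analytic function. This is only an alternative sketch; it was not timed in this experiment, so no claim is made about whether it is faster or slower than the EXISTS form above. The aliases rid and rn are illustrative.

```sql
-- Sketch: delete every row except the one with the smallest id within each name
delete from test05
where rowid in (
    select rid
    from (
        select rowid as rid,
               row_number() over (partition by name order by id) as rn
        from test05
    )
    where rn > 1
);
```

One difference to be aware of: row_number() places all NULL names in one partition, whereas the EXISTS form never matches NULL names. The test data here contains no NULL names, so both forms should remove the same rows.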

```
SQL> select count(*) from test05;

  COUNT(*)
----------
    568728

Elapsed: 00:00:00.03
```

【Restore the data】
```
truncate table test05;
insert into test05 select * from test06;
```
This took about 16 seconds.

【Second approach: temporary table】
```
select count(*) from (select name from test05 group by name having count(id)>1);
```
Again 157,551 names have duplicate values, which confirms that both approaches start from the same data.

```
create table test07 as select * from test05 a where not exists (select null from test05 b where b.name=a.name and b.id<a.id);
```
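This took about 1.5 seconds. Note that test07 is an ordinary table used as a staging copy, not an Oracle global temporary table.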

```
SQL> select count(*) from test07;

  COUNT(*)
----------
    568728

Elapsed: 00:00:00.04
```

```
truncate table test05;
insert into test05 select * from test07;
```
This took about 12 seconds.
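
A possible variant of this reload step is a direct-path insert using the APPEND hint, followed by a check that no duplicates remain. This is a sketch only: it was not measured in this experiment, and whether it shortens the 12 seconds depends on the environment.

```sql
truncate table test05;

-- Direct-path insert; the session must commit before it can query test05 again
insert /*+ append */ into test05 select * from test07;
commit;

-- Sanity check: should return no rows if every name is now unique
select name, count(*)
from test05
group by name
having count(*) > 1;
```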

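To summarize: the direct delete took about 50 seconds, while the temporary-table approach took about 1.5 + 12 = 13.5 seconds, roughly 14 seconds and less than a third of the time. This shows that when a large share of the rows has to be deleted (about 70% here), the temporary-table approach wins.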
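
In the second approach, most of the time (about 12 of roughly 14 seconds) goes into copying the surviving rows back into test05. One variation that avoids the copy, sketched here without having been timed in this experiment, is to replace the original table with the staging table. It assumes nothing else depends on test05.

```sql
-- Assumes no grants, triggers, foreign keys or views depend on test05
drop table test05;
rename test07 to test05;

-- create table ... as select does not carry over the primary key, so add it back
alter table test05 add primary key (id);
```

Building the primary key index takes some time as well, so this would need to be measured before concluding that it is faster.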

END
Original article: https://www.cnblogs.com/heyang78/p/15964524.html