• mysql 删除重复数据


    如下表:

    单字段重复:

    例如:删除仅stu_name一个字段重复的数据

    全部删除

    -- 查询全部重复数据
    select * from tb_4 where stu_name in (select stu_name from tb_4 group by stu_name having count(1) > 1)
    -- 删除全部重复数据
    delete from tb_4 where stu_name in (select a.stu_name from (select stu_name from tb_4 group by stu_name having count(1) > 1)a)

    部分删除(保留一行)

    -- 查询多余重复数据
    select * from tb_4 where id not in (select min(id) from tb_4 group by stu_name)
    -- 删除多余重复数据(保留一行)
    delete from tb_4 where id not in (select a.min_id from (select min(id) as min_id from tb_4 group by stu_name)a)

      

    多字段重复:

     在单字段的基础上,将group by中的字段修改为你想要的即可

     例如:删除stu_name、score两个字段重复的数据

    全部删除

    -- 查询全部重复数据
    select * from tb_4 where  stu_name in (select a.stu_name from (select stu_name from tb_4 group by stu_name, score having count(1) > 1)a)
    and score in (select a.score from (select score from tb_4 group by stu_name, score having count(1) > 1)a)
    -- 删除全部重复数据
    delete from tb_4 where stu_name in (select a.stu_name from (select stu_name from tb_4 group by stu_name, score having count(1) > 1)a)
    and score in (select a.score from (select score from tb_4 group by stu_name, score having count(1) > 1)a)

     

    部分删除(保留一行)

    -- 查询多余重复数据
    select * from tb_4 where id not in (select min(id) from tb_4 group by stu_name, score)
    -- 删除多余重复数据(保留一行)
    delete from tb_4 where id not in (select a.min_id from (select min(id) as min_id from tb_4 group by stu_name, score)a)

    保留多行

    如下表:删除相同医院、相同科室5个以上的医生数据(删除红框中的数据)

     结合sql和python实现

    1. 以下sql可以将相同医院相同科室符合条件的数据删除一行

    delete from tb_test_doctor_1 where phone in 
    (
    select b.phone from 
    (select phone from tb_test_doctor_1 group by hospital, department having count(1) > 5 ORDER BY count(1) desc) as b 
    )

    2. 使用python循环,直到删完

    import MySQLdb
    
    conn = MySQLdb.connect(
        host='192.168.1.0',
        port=3306,
        user='root',
        passwd='123',
        db='test'
    )
    cur = conn.cursor()
    
    # sql = 'select version()'
    #
    # cur.execute(sql)
    # a = cur.fetchall()
    # print(a)
    
    
    sql_sum = 'select count(1) from tb_test_doctor_1 group by hospital, department having count(1) > 5 ORDER BY count(1) DESC limit 1'
    cur.execute(sql_sum)
    sum_tuple = cur.fetchone()
    sum = sum_tuple[0]
    print(sum)
    
    sql2 = """
    delete from tb_test_doctor_1 where phone in 
    (
    select b.phone from 
    (select phone from tb_test_doctor_1 group by hospital, department having count(1) > 5 ORDER BY count(1) desc) as b 
    
    )
    """
    for n in range(sum):
        cur.execute(sql2)
        conn.commit()
    
    
    cur.close()
    conn.close()
  • 相关阅读:
    NodeJS学习笔记 (7)网络服务-http-client(ok)
    NodeJS学习笔记 (6)网络服务-http-res(ok)
    NodeJS学习笔记 (5)网络服务-http-req(ok)
    NodeJS学习笔记 (4)网络服务-http(ok)
    NodeJS学习笔记 (3)域名解析-dns(ok)
    NodeJS学习笔记 (2)文件系统操作-fs(ok)
    NodeJS学习笔记 (1)资源压缩-zlib(ok)
    indexedDB介绍
    React diff机制(介绍虚拟DOM的机制)
    区间dp总结
  • 原文地址:https://www.cnblogs.com/xiaochongc/p/15621469.html
Copyright © 2020-2023  润新知