• SQL optimization in practice: replacing a full join with left join + union all (from 5 minutes down to 10 seconds)


    Today I received a request to rewrite the logic of a report. After making the change I ran the report again, and it timed out.

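    For special reasons I could not access the customer's server, so I could not look at the SQL execution plan or find out which indexes exist on the tables. I therefore tried to optimize by rewriting the statement itself.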

    1. The original statement:

    select  isnull(vv.customer_id, v.customer_id) as customer_id,
            isnull(vv.business_date, replace(v.business_date, '-', '')) as business_date,
            v.prod_id,
            v.sales,
            vv.visit_count,
            v.all_sales
    from
    (
        -- sales per customer and product for the month, plus each customer's total
        SELECT  a.customer_id,
                max(month) + '-01' as business_date,
                a.PROD_ID,
                SUM(CAST(VALUE AS NUMERIC(38, 3))) sales,
                sum(SUM(CAST(VALUE AS NUMERIC(38, 3)))) over(partition by a.customer_id) as all_sales
        FROM    TB_IMPORT_SALES a
        WHERE   a.customer_id IS NOT NULL
                AND a.PROD_ID IS NOT NULL
                and a.month = '2016-11'
        GROUP BY a.customer_id,
                 a.PROD_ID
    ) v
    full join
    (
        -- visit count per customer for the same month
        SELECT  customer_id,
                max(a.business_date) as business_date,
                COUNT(*) AS VISIT_COUNT
        FROM    TB_CALL_STORE a WITH(NOLOCK)
        inner join TB_TIME d
                on a.business_date = d.t_date
        where   d.section = '2016-11'
        GROUP BY customer_id
    ) vv
    on v.customer_id = vv.customer_id

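    Originally this was a left join. The query was slow, but it returned in about 2 minutes. The business now needs to see all the data, so I changed it to a full join, and after that change it did not return a result even after 5 minutes.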


    2. The rewritten code

    select  v.customer_id,
            replace(max(v.business_date), '-', '') as business_date,
            v.prod_id,
            max(v.sales_volume) sales_volume,
            max(v.visit_count) visit_count,
            max(v.all_sales_volume) all_sales_volume
    from
    (
        -- branch 1: sales per customer and product; visit_count is padded with null
        SELECT  a.customer_id,
                max(a.month) + '-01' as business_date,
                a.PROD_ID,
                SUM(CAST(VALUE1 AS NUMERIC(38, 8))) sales_volume,
                sum(SUM(CAST(VALUE1 AS NUMERIC(38, 8)))) over(partition by a.customer_id) as all_sales_volume,
                null as visit_count
        FROM    TB_IMPORT_SALES a
        WHERE   a.customer_id IS NOT NULL
                AND a.PROD_ID IS NOT NULL
                and a.month = '2016-11'
        GROUP BY a.customer_id,
                 a.PROD_ID

        union all

        -- branch 2: visit count per customer; the two sales columns are padded with null
        SELECT  customer_id,
                max(a.business_date) as business_date,
                p.prod_id,
                null,
                null,
                COUNT(*) AS VISIT_COUNT
        FROM    TB_CALL_STORE a WITH(NOLOCK)
        cross apply
        (
            -- arbitrary placeholder product id (top 1 without order by)
            select top 1 prod_id from TB_PRODUCT with(nolock)
        ) p
        inner join TB_TIME d
                on a.business_date = d.t_date
        where   d.section = '2016-11'
        GROUP BY customer_id, p.prod_id
    ) v
    group by v.customer_id,
             v.prod_id

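    The code itself is fairly simple, so there was no room to simplify it further. Because I could not connect to the server, the other approaches were not available either, and I could not even analyze what was actually making it run so slowly.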

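    Thinking it over, a full join is essentially two left joins plus a union; in other words, it just merges data. So I tried merging the data directly with union all. With union all in place, the full join is no longer needed.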
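
    The equivalence described above can be checked on toy data. The following is a minimal sketch in Python with an in-memory SQLite database, not the SQL Server query from this post; the tables `sales` and `visits`, their columns, and their rows are invented for illustration. It writes a full join on customer_id as a left join plus a UNION ALL of the right-side rows that found no match.

```python
import sqlite3


def build_toy_db():
    """Create a tiny in-memory database (hypothetical tables and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales  (customer_id TEXT, prod_id TEXT, sales REAL);
        CREATE TABLE visits (customer_id TEXT, visit_count INTEGER);
        INSERT INTO sales  VALUES ('c1', 'p1', 10.0), ('c1', 'p2', 5.0), ('c2', 'p1', 7.0);
        INSERT INTO visits VALUES ('c1', 3), ('c3', 4);
        """
    )
    return conn


def emulated_full_join(conn):
    """Full join of sales and visits on customer_id, without FULL JOIN."""
    sql = """
        -- every sales row, with the visit count when the customer has one
        SELECT s.customer_id, s.prod_id, s.sales, v.visit_count
        FROM sales s
        LEFT JOIN visits v ON v.customer_id = s.customer_id
        UNION ALL
        -- visit rows whose customer has no sales row at all
        SELECT v.customer_id, NULL, NULL, v.visit_count
        FROM visits v
        LEFT JOIN sales s ON s.customer_id = v.customer_id
        WHERE s.customer_id IS NULL
        ORDER BY 1, 2
    """
    return conn.execute(sql).fetchall()


rows = emulated_full_join(build_toy_db())
for row in rows:
    print(row)
```

    In this sketch customer c2 has sales but no visits and c3 has visits but no sales, so both sides of the full join contribute an unmatched row.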

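    However, the second query has no prod_id column, so I added a cross apply to it that picks an arbitrary product id. That gives it a prod_id column, and the two result sets can then be merged.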
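
    To see what the placeholder prod_id does to the merged result, here is a minimal sketch of the same UNION ALL + GROUP BY pattern, again in Python with in-memory SQLite. All table names and rows are made up. SQLite has no CROSS APPLY, so the sketch assumes a CROSS JOIN against a one-row subquery is an acceptable stand-in, and it adds an ORDER BY so the placeholder is deterministic (the query in this post uses top 1 without order by).

```python
import sqlite3


def merged_with_placeholder(conn):
    """UNION ALL the two aggregates, then collapse them with GROUP BY."""
    sql = """
        SELECT customer_id,
               prod_id,
               MAX(sales)       AS sales,
               MAX(visit_count) AS visit_count
        FROM (
            -- branch 1: sales per customer and product, visit_count padded with NULL
            SELECT customer_id, prod_id, SUM(amount) AS sales, NULL AS visit_count
            FROM sales
            GROUP BY customer_id, prod_id
            UNION ALL
            -- branch 2: visits per customer, tagged with one placeholder product
            SELECT v.customer_id, p.prod_id, NULL, COUNT(*)
            FROM visits v
            CROSS JOIN (SELECT prod_id FROM product ORDER BY prod_id LIMIT 1) AS p
            GROUP BY v.customer_id, p.prod_id
        ) AS u
        GROUP BY customer_id, prod_id
        ORDER BY 1, 2
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (prod_id TEXT);
    CREATE TABLE sales   (customer_id TEXT, prod_id TEXT, amount REAL);
    CREATE TABLE visits  (customer_id TEXT);
    INSERT INTO product VALUES ('p1'), ('p2');
    INSERT INTO sales   VALUES ('c1', 'p1', 10.0), ('c1', 'p2', 5.0), ('c2', 'p2', 7.0);
    INSERT INTO visits  VALUES ('c1'), ('c1'), ('c1'), ('c3');
    """
)

rows = merged_with_placeholder(conn)
for row in rows:
    print(row)
```

    On this toy data the visit count lands only on rows whose prod_id equals the placeholder: the c1/p2 row has no visit count, whereas a full join on customer_id would repeat it on every product row of c1. It is worth comparing the two outputs on real data before relying on the rewrite.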

    After the change, the run time did indeed drop to a little over 10 seconds.


  • Original article: https://www.cnblogs.com/momogua/p/8304398.html