• UNION or OR in SQL Server Queries


    Recently I came across an article on DB2 about using UNION instead of OR. That prompted me to investigate, on SQL Server, in which scenarios UNION is optimal and in which OR is the better choice. I will analyze this with a few scenarios using samples taken from the Sales.SalesOrderDetail table of the AdventureWorks database.

    Scenario 1: Selecting all columns

    We are going to select all columns, and the table has a non-clustered index on the ProductID column.
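
    To confirm that assumption on your own copy of the database, the index definitions can be listed from the catalog views. This is only a minimal sketch; the index names it returns depend on your AdventureWorks version.

    ```sql
    -- List every index on Sales.SalesOrderDetail with its key and included columns.
    SELECT i.name AS index_name,
           i.type_desc,
           c.name AS column_name,
           ic.key_ordinal,
           ic.is_included_column
    FROM sys.indexes AS i
    JOIN sys.index_columns AS ic
        ON ic.object_id = i.object_id AND ic.index_id = i.index_id
    JOIN sys.columns AS c
        ON c.object_id = ic.object_id AND c.column_id = ic.column_id
    WHERE i.object_id = OBJECT_ID('Sales.SalesOrderDetail')
    ORDER BY i.index_id, ic.key_ordinal;
    ```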

    --Query 1 : OR
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 714 OR ProductID =709
         OR ProductID =998 OR ProductID =875
         OR ProductID =976 OR ProductID =874
    --Query 2 : UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
    UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 709
    UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 998
    UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 875
    UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 976
    UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 874

    Query 1 uses OR and Query 2 uses UNION. Let us analyze the execution plans for these queries.

    [Figure: execution plan for Query 1]

    [Figure: execution plan for Query 2]

    As expected, Query 1 uses a Clustered Index Scan, whereas Query 2 uses a considerably more complex plan with many more operators. Because that plan runs in parallel across multiple CPUs, you might see CXPACKET waits as well.
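
    If you want to check whether parallelism waits are actually accumulating, the wait statistics DMV can be queried as sketched below. The figures are cumulative for the whole instance (not for one query), and reading the DMV requires the VIEW SERVER STATE permission. The second statement is a shortened, hypothetical variant of Query 2 that uses the MAXDOP hint to force a serial plan for comparison; it was not part of the original test.

    ```sql
    -- Cumulative CXPACKET waits since the instance started (or since the stats were cleared).
    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type = 'CXPACKET';

    -- Shortened UNION query restricted to one CPU; the OPTION clause applies to the whole statement.
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
    UNION
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 709
    OPTION (MAXDOP 1);
    ```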

    Let’s look at the profiler results for these two queries:

              CPU    Reads    Duration    Row Counts
    OR         78     1252         389          3854
    UNION     250     7495         660          3854

    You can see from the above table that the UNION query does not perform as well as the OR query, although both return the same number of rows (3854). These results indicate that, for this scenario, OR should be used.
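
    If Profiler is not at hand, comparable figures can be collected per session with SET STATISTICS. The following is a sketch rather than the method used for the table above; the logical reads and CPU/elapsed times it prints on the Messages tab will vary with hardware and cache state.

    ```sql
    -- Report I/O and timing for each statement run in this session.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    GO
    SELECT * FROM Sales.SalesOrderDetail
    WHERE ProductID = 714 OR ProductID = 709
         OR ProductID = 998 OR ProductID = 875
         OR ProductID = 976 OR ProductID = 874;
    GO
    -- Switch the reporting off again once the comparison is done.
    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    GO
    ```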

    Scenario 2: Non-Clustered and Clustered Index Columns only



    --Query 1 : OR
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 714 OR ProductID =709
         OR ProductID =998 OR ProductID =875
         OR ProductID =976 OR ProductID =874
    GO
    --Query 2 : UNION
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
    UNION
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 709
    UNION
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 998
    UNION
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 875
    UNION
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 976
    UNION
    SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 874
    GO

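    This time we select only index columns. A non-clustered index also carries the clustered index key columns (here SalesOrderID and SalesOrderDetailID), so the index on ProductID covers these queries and they avoid a data page lookup. As in the previous case, we will analyze the execution plans: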

    [Figure: execution plan for Query 1]

    [Figure: execution plan for Query 2]

    Again, the plan for Query 2 is more complex than the plan for Query 1. Let us look at the profiler results:

              CPU    Reads    Duration    Row Counts
    OR          0       24         208          3854
    UNION       0       38         193          3854

    In this analysis, there is only a slight difference between OR and UNION.
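
    One variation that was not measured here: each branch filters on a different ProductID value, and the selected columns include the clustered key, so no row can appear twice. Under those conditions UNION ALL returns the same rows as UNION and may spare the duplicate-removal work. A sketch of that variant:

    ```sql
    -- UNION ALL variant: safe only because the branches cannot overlap.
    SELECT ProductID, SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
    UNION ALL
    SELECT ProductID, SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 709
    UNION ALL
    SELECT ProductID, SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 998
    UNION ALL
    SELECT ProductID, SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 875
    UNION ALL
    SELECT ProductID, SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 976
    UNION ALL
    SELECT ProductID, SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail
    WHERE ProductID = 874;
    ```

    Whether this is actually faster would need to be measured in the same way as the other queries.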

    Scenario 3: Selecting all columns for different fields

    Up to now, we have used only one column (ProductID) in the WHERE clause. What if the WHERE clause involves two columns, and both are covered by non-clustered indexes?
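
    A stock AdventureWorks installation may not ship with an index on CarrierTrackingNumber. If yours does not, an index along the following lines sets up the assumption; the index name is made up for illustration and is not taken from the original test.

    ```sql
    -- Hypothetical index so that both predicates have a non-clustered index to seek on.
    CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_CarrierTrackingNumber
        ON Sales.SalesOrderDetail (CarrierTrackingNumber);

    -- Remove it again after testing:
    -- DROP INDEX IX_SalesOrderDetail_CarrierTrackingNumber ON Sales.SalesOrderDetail;
    ```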

    --Query 1 : OR
    SELECT *
    FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
          OR CarrierTrackingNumber LIKE 'D0B8%'
    --Query 2 : UNION
    SELECT *
    FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
    UNION
    SELECT *
    FROM Sales.SalesOrderDetail
    WHERE CarrierTrackingNumber
     LIKE 'D0B8%'

    [Figure: execution plan for Query 1]

    [Figure: execution plan for Query 2]

    As we can see, the query plan for the second query has improved. Let us see the profiler results.

              CPU    Reads    Duration    Row Counts
    OR         47     1278         443          1228
    UNION      31     1334         400          1228

    So in this case too, there is little difference between OR and UNION.
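
    Unlike Scenario 1, the two predicates here can match the same row, so UNION ALL would not be a drop-in replacement: it is UNION's duplicate removal that keeps the result identical to the OR query. A quick way to see how many rows fall into both branches is sketched below.

    ```sql
    -- Rows that satisfy both predicates; UNION ALL would return each of these twice.
    SELECT COUNT(*) AS overlapping_rows
    FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
      AND CarrierTrackingNumber LIKE 'D0B8%';
    ```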

    Scenario 4: Selecting Clustered index columns for different fields

    Now let us go only with clustered indexes:

    --Query 1 : OR
    SELECT *
    FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
          OR CarrierTrackingNumber LIKE 'D0B8%'
    --Query 2 : UNION
    SELECT *
    FROM Sales.SalesOrderDetail
    WHERE ProductID = 714
    UNION
    SELECT *
    FROM Sales.SalesOrderDetail
    WHERE CarrierTrackingNumber
     LIKE 'D0B8%'

    [Figure: execution plan for Query 1]

    [Figure: execution plan for Query 2]

    Now both execution plans are almost identical, except that an additional Stream Aggregate is used in the first query. This means UNION has an advantage over OR in this scenario. Let us see the profiler results for these queries again.

              CPU    Reads    Duration    Row Counts
    OR          0      319         366          1228
    UNION       0       50         193          1228

    Now the difference is visible: in this scenario UNION has somewhat of an advantage over OR.

    Conclusion

    Whether to use UNION or OR depends on the scenario you face, so you need to do your own analysis before selecting the appropriate method. The four scenarios above are not an exhaustive list; I selected them for broad illustration only.

  • Original article: https://www.cnblogs.com/qanholas/p/2199646.html