• Consider using EXISTS instead of IN


    redgate给出的提示

    https://www.red-gate.com/hub/product-learning/sql-prompt/consider-using-not-exists-instead-not-subquery

    Phil Factor explains why you should prefer use of [NOT] EXISTS over [NOT] IN, when comparing data sets using a subquery.

    While there is no longer any significant performance advantage, using NOT EXISTS will avoid unexpected results when the subquery’s source data contains NULL values.

    It used to be that the EXISTS logical operator was faster than IN, when comparing data sets using a subquery. For example, in cases where the query had to perform a certain task, but only if the subquery returned any rows, then when evaluating WHERE [NOT] EXISTS (subquery), the database engine could quit searching as soon as it had found just one row, whereas WHERE [NOT] IN (subquery) would always collect all the results from the sub-query, before further processing.

    However, the query optimizer now treats EXISTS and IN the same way, whenever it can, so you’re unlikely to see any significant performance differences. Nevertheless, you need to be cautious when using the NOT IN operator if the subquery’s source data contains NULL values. If so, you should consider using a NOT EXISTS operator instead of NOT IN, or recast the statement as a left outer join.

    A recommendation to prefer use of [NOT] EXISTS over [NOT] IN is included as a code analysis rule in SQL Prompt

    https://documentation.red-gate.com/codeanalysis/performance-rules/pe019

    In theory, EXISTS is faster because the search stops as soon as the condition is true, whereas IN has to collect all sub-query results before testing the condition.

    In practice, the query optimizer treats EXISTS and IN the same way whenever it can.

    Stack Overflow上的解释

    https://stackoverflow.com/questions/24929/difference-between-exists-and-in-in-sql

    I'm assuming you know what they do, and thus are used differently, so I'm going to understand your question as: When would it be a good idea to rewrite the SQL to use IN instead of EXISTS, or vice versa.

    Is that a fair assumption?


    Edit: The reason I'm asking is that in many cases you can rewrite an SQL based on IN to use an EXISTS instead, and vice versa, and for some database engines, the query optimizer will treat the two differently.

    For instance:

    SELECT *
    FROM Customers
    WHERE EXISTS (
        SELECT *
        FROM Orders
        WHERE Orders.CustomerID = Customers.ID
    )

    can be rewritten to:

    SELECT *
    FROM Customers
    WHERE ID IN (
        SELECT CustomerID
        FROM Orders
    )

    or with a join:

    SELECT Customers.*
    FROM Customers
        INNER JOIN Orders ON Customers.ID = Orders.CustomerID

    So my question still stands, is the original poster wondering about what IN and EXISTS does, and thus how to use it, or does he ask wether rewriting an SQL using IN to use EXISTS instead, or vice versa, will be a good idea?

    SQL Server IN vs. EXISTS Performance

    EXISTS will be faster because once the engine has found a hit, it will quit looking as the condition has proved true.

    With IN, it will collect all the results from the sub-query before further processing.

    • This used to be true but in current versions (at least 2008) the optimizer is much smarter... it actually treats IN () just like an EXISTS (). – Aaron Bertrand Jan 14 '10 at 16:51
    • @Aaron - yes, typically the optimzer will internally produce a better plan. However, relying on internal shortcuts could be detrimental in more complex scenarios.

    Example

    使用In

    SELECT *
    FROM   dbo.CMS_Transformation
    WHERE  TransformationClassID IN (   SELECT ClassID
                                        FROM   dbo.CMS_Class
                                        WHERE  ClassName LIKE '%lisa%' );

    使用Exists

    SELECT *
    FROM   dbo.CMS_Transformation
    WHERE  EXISTS (   SELECT *
                      FROM   dbo.CMS_Class
                      WHERE  ClassName LIKE '%lisa%'
                             AND ClassID = CMS_Transformation.TransformationClassID );

     对比

     执行结果,因为数据量较少,看起来似乎是一样的

    扩展

    Not Exists 和Not In是完全不同的东西

     https://stackoverflow.com/questions/129077/not-in-clause-and-null-values

    To state it simply, why does query A return a result but B doesn't?

    A: select 'true' where 3 in (1, 2, 3, null)

    B: select 'true' where 3 not in (1, 2, null)

    This was on SQL Server 2005. I also found that calling set ansi_nulls off causes B to return a result.

    Query A is the same as:

    select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null

    Since 3 = 3 is true, you get a result.

    Query B is the same as:

    select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null

    When ansi_nulls is on, 3 <> null is UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.

    When ansi_nulls is off, 3 <> null is true, so the predicate evaluates to true, and you get a row.

  • 相关阅读:
    logstash收集nginx日志写入kafka
    Logstash收集日志写入Redis
    Nginx反向代理kibana实现认证访问
    logstash收集nginx日志写入kafka1
    201999:渗透测试,基础学习,windows基础命令,笔记
    2019910:渗透测试,基础学习
    中华吸血鬼恶意病毒分析
    磁碟机病毒分析
    利用SQL语句对不同数据库进行高效果分页
    TimeBased Blind SQL Injection with Heavy Queries
  • 原文地址:https://www.cnblogs.com/chucklu/p/9046347.html
Copyright © 2020-2023  润新知