• Using more than one index per table is dangerous?


    Using more than one index per table is dangerous?

    问题

    In a former company I worked at, the rule of thumb was that a table should have no more than one index (allowing the odd exception, and certain parent-tables holding references to nearly all other tables and thus are updated very frequently).

    The idea being that often, indexes cost the same or more to uphold than they gain. Note that this question is different to indexed-view-vs-indexes-on-table as the motivation is not only reporting.

    Is this true? Is this index-purism worth it?

    In your career do you generally avoid using indexes?

    What are the general large-scale recommendations regarding indexes?

    Currently and at the last company we use SQL Server, so any product specific guidelines are welcome too.

    回答1

    You need to create exactly as many indexes as you need to create. No more, no less. It is as simple as that.

    Everybody "knows" that an index will slow down DML statements on a table. But for some reason very few people actually bother to test just how "slow" it becomes in their context. Sometimes I get the impression that people think that adding another index will add several seconds to each inserted row, making it a game changing business tradeoff that some fictive hotshot user should decide in a board room.

    I'd like to share an example that I just created on my 2 year old pc, using a standard MySQL installation. I know you tagged the question SQL Server, but the example should be easily converted. I insert 1,000,000 rows into three tables. One table without indexes, one table with one index and one table with nine indexes.

    drop table numbers;
    drop table one_million_rows;
    drop table one_million_one_index;
    drop table one_million_nine_index;
    
    /*
    || Create a dummy table to assist in generating rows
    */
    create table numbers(n int);
    
    insert into numbers(n) values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
    
    /*
    || Create a table consisting of 1,000,000 consecutive integers
    */   
    create table one_million_rows as
        select d1.n + (d2.n * 10)
                    + (d3.n * 100)
                    + (d4.n * 1000)
                    + (d5.n * 10000)
                    + (d6.n * 100000) as n
          from numbers d1
              ,numbers d2
              ,numbers d3
              ,numbers d4
              ,numbers d5
              ,numbers d6;
    
    
    /*
    || Create an empty table with 9 integer columns.
    || One column will be indexed
    */
    create table one_million_one_index(
       c1 int, c2 int, c3 int
      ,c4 int, c5 int, c6 int
      ,c7 int, c8 int, c9 int
      ,index(c1)
    );
    
    /*
    || Create an empty table with 9 integer columns.
    || All nine columns will be indexed
    */
    create table one_million_nine_index(
       c1 int, c2 int, c3 int
      ,c4 int, c5 int, c6 int
      ,c7 int, c8 int, c9 int
      ,index(c1), index(c2), index(c3)
      ,index(c4), index(c5), index(c6)
      ,index(c7), index(c8), index(c9)
    );
    
    
    /*
    || Insert 1,000,000 rows in the table with one index
    */
    insert into one_million_one_index(c1,c2,c3,c4,c5,c6,c7,c8,c9)
    select n, n, n, n, n, n, n, n, n
      from one_million_rows;
    
    /*
    || Insert 1,000,000 rows in the table with nine indexes
    */
    insert into one_million_nine_index(c1,c2,c3,c4,c5,c6,c7,c8,c9)
    select n, n, n, n, n, n, n, n, n
      from one_million_rows;
    

    My timings are:

    • 1m rows into table without indexes: 0,45 seconds
    • 1m rows into table with 1 index: 1,5 seconds
    • 1m rows into table with 9 indexes: 6,98 seconds

    I'm better with SQL than statistics and math, but I'd like to think that: Adding 8 indexes to my table, added (6,98-1,5) 5,48 seconds in total. Each index would then have contributed 0,685 seconds (5,48 / 8) for all 1,000,000 rows. That would mean that the added overhead per row per index would have been 0,000000685 seconds. SOMEBODY CALL THE BOARD OF DIRECTORS!

    In conclusion, I'd like to say that the above test case doesn't prove a shit. It just shows that tonight, I was able to insert 1,000,000 consecutive integers into in a table in a single user environment. Your results will be different.

  • 相关阅读:
    Linux 进程通信之:内存共享(Shared Memory)(转,好文章)
    进程和线程的对比和区别(转)
    CentOS下安装gdb的方法
    有关职业发展的帖子(整理)
    阿里钉钉面试题
    SSIS添加分区-动态
    java内存占用问题(一)
    网络网关TCP/IP
    java代码数组求平均值,最大值,最小值
    java代码。。。圆的面积好搞人。。。不是一般的搞人。。。欢迎指点指点
  • 原文地址:https://www.cnblogs.com/chucklu/p/14838121.html
Copyright © 2020-2023  润新知