假如有一张大表,现在需要增加一个非聚集索引,对于DBA来说,要有预估其大小以及执行时间的估算能力,尤其对一些企业使用SSD硬盘,其硬盘空间很是宝贵,增加索引如果错误预估其大小,很有可能导致硬盘资源超出预期使用量,造成没必要的麻烦,这里只针对其预估硬盘占用空间展开讨论,行为标准8060 in_row_data,不涉及行溢出,大对象等情况。
举个例子给大家
create table Index_test (id int,a char(10)) go insert into Index_test select 100,'aaaaa' go 4000 create nonclustered index ix_id_a on Index_test (id,a) go
如果认为索引行的单行大小为4 + 10 = 14字节,那么最后的计算结果应该是
索引占用了(4000/(8096/14))*8192/1024 = 48KB。实际呢?
索引占用了(14+1)*8192/1024 = 120KB
与预期的值相差了2倍以上,设想一下,如果你的某张大表的索引错误的预测为50G,实际则会占用100G以上。
网络上很多资料给出了非聚集索引的存储格式,这里我简单明了的说明下非聚集索引的内部结构和预测其大小的方法。:
(非聚集索引分为 根叶,中间级页面,叶级页面,实际在物理存储上,我们可以将根叶和中间级页面合并看成中间级页面,因为两者的存储格式是相同的。我后面的讨论也都是分为叶级页面和中间级页面)
非聚集索引叶级页面单行:存储格式
-
状态位(1字节),标识此行是否有变长,空值,以及此行是否为索引行等等
-
非聚集索引(定长列)键值大小
-
如果是堆表,则是Rowid(8字节),如果是聚集索引,则是聚集索引键值(定长部分)大小。
-
包行列(定长列)长度
-
索引列数(2字节)
-
Null位(1字节)用来标识哪列值为null
-
变长列数量(2字节)
-
变长列1偏移长度(2字节)+变长列2偏移长度(2字节).. 变长列n偏移长度(2字节)
-
包行列(变长列)长度 2字节
-
非聚集索引(变长列)键值大小 + 聚集索引(变长列)键值大小 +包行列(变长列)键值大小
-
行偏移量(2字节)
非聚集索引中间级页面单行:存储格式
-
状态位(1字节),标识此行是否有变长,空值,以及此行是否为索引行等等
-
非聚集索引(定长列)键值大小
-
如果是堆表,则是Rowid(8字节),如果是聚集索引,则是聚集索引键值(定长部分)大小。
-
非聚集索引键值所在的pageid(4字节) + 非聚集索引键值所在的页面的文件id(2字节)
-
索引列数(2字节)
-
Null位(1字节)用来标识哪列值为null
-
变长列数量(2字节)
-
变长列1偏移长度(2字节)+变长列2偏移长度(2字节).. 变长列n偏移长度(2字节)
-
包行列(变长列)长度
-
非聚集索引(变长列)键值大小 + 聚集索引(变长列)键值大小 +包行列(变长列)键值大小
-
行偏移量(2字节)
下面总结了几条规律,方便理解上面的结构:
-
如果索引涉及的列有一个允许null,则索引行会包含索引列数(2字节)和Null位(1字节)
-
如果索引包含聚集索引,则需将rowid替换为聚集索引键值
-
如果索引包含唯一约束,则中间层页面不会包含rowid或者聚集索引键值
-
如果聚集索引不是唯一索引,而且存在重复值,则重复的聚集键值为(指定列 + 内部4字节整数列)来标识唯一性,要点:内部4字节整数列也属于聚集索引键值,并且是变长列类型
-
如果没有变长列,则在叶子页面和中间层页面不会包含:变长列数(2字节) +变长列长度(2字节)*变长列数 + 变长列键值。
结论
所以,对于一个非聚集索引来说,如果想确保内部系统开销最小,索引行最节省空间,除了限制索引引用没必要的列以外,还要考虑所有引用列均为not null,并且设置为unique唯一约束,同时最好具有聚集索引。
预估非聚集索引行大小
我根据以上规律总结了一个预估非聚集索引大小的脚本,因为索引填充率、变长列需要预先用最大值考虑等关系,最终结果会有稍微的误差,只能当做最小预估空间的参考值。脚本并不能保证很完善,大家可以自行改良。
文章开头的例子,如果使用这个脚本,结果为:
和DMV输出结果一样
declare @fix_length int --定长字段长度(byte) declare @columns_count int --字段数量 declare @variable_length int --变长字段长度(byte) declare @variable_count int --变长字段数量 declare @Pri_Key_Length int --聚集索引长度(byte) declare @is_primarykey bit --是否存在聚集索引,0不存在,1存在 declare @is_unique bit --是否指定唯一约束,0没有指定,1为指定 declare @is_Null bit --是否允许为null,0为允许,1为不允许 declare @Num_Rows int --记录数 declare @fillfactor float --填充因子,默认为100 set @fillfactor = 100 set @fix_length = 14 set @columns_count = 2 set @variable_length =0 set @variable_count = 0 set @Pri_Key_Length = 0 set @is_primarykey =0 set @is_unique =0 set @is_Null =0 set @Num_Rows = 4000 declare @yezi int declare @zhongjian int declare @ye_r_length int declare @ye_r_count int declare @zhong_r_length int declare @zhong_r_count int --1. 无聚集索引,无唯一约束,允许null if @is_primarykey = 0 and @is_unique = 0 and @is_Null = 0 begin if @variable_count =0 begin set @ye_r_length= @fix_length + CEILING(@columns_count/8.0) + (1 + 8 + 2+2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + CEILING(@columns_count/8.0) + (1 + 8 + 6 + 2 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length + CEILING(@columns_count/8.0) + (1 + 8 + 2 + 2 + 2*@variable_count + @variable_length + 2) set @ye_r_count = (8192-96) *(@fillfactor/100.00)/ @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + CEILING(@columns_count/8.0) + (1 + 8 + 6 + 2+2 + 2*@variable_count + @variable_length + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --2. 无聚集索引,无唯一约束,不允许null if @is_primarykey = 0 and @is_unique = 0 and @is_Null = 1 begin if @variable_count = 0 begin set @ye_r_length= @fix_length + (1 + 8 +2) set @ye_r_count = (8192-96) *(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + (1 + 8 + 6 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length + (1 + 8 + 2 + 2*@variable_count + @variable_length + 2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + (1 + 8 + 6 + 2*@variable_count + @variable_length + 2) set @zhong_r_count = (8192-96) /convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --3. 无聚集索引,有唯一约束,允许null if @is_primarykey = 0 and @is_unique = 1 and @is_Null = 0 begin if @variable_count = 0 begin set @ye_r_length= @fix_length + CEILING(@columns_count/8.0) + (1 + 8 + 2+2) set @ye_r_count = (8192-96) *(@fillfactor/100.00)/ @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + CEILING(@columns_count/8.0) + (1 + 6 + 2 + 2) set @zhong_r_count = (8192-96)*0.97/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length + CEILING(@columns_count/8.0) + (1 + 8 + 2 + 2 + 2*@variable_count + @variable_length +2) set @ye_r_count = (8192-96) *(@fillfactor/100.00)/ @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + CEILING(@columns_count/8.0) + (1 + 6 + 2 +2 + 2*@variable_count + @variable_length + 2) set @zhong_r_count = (8192-96)*0.86/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --4. 无聚集索引,有唯一约束,不允许null if @is_primarykey = 0 and @is_unique = 1 and @is_Null = 1 begin if @variable_count = 0 begin set @ye_r_length= @fix_length + (1 + 8 +2) set @ye_r_count = (8192-96*(@fillfactor/100.00)) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + (1 + 6 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length + (1 + 8 + 2 + 2*@variable_count + @variable_length + 2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + (1 + 6 + 2 + 2*@variable_count + @variable_length + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --5. 有聚集索引,无唯一约束,允许null if @is_primarykey = 1 and @is_unique = 0 and @is_Null = 0 begin if @variable_count = 0 begin set @ye_r_length= @fix_length + @Pri_Key_Length+CEILING(@columns_count/8.0) + (1 + 2+2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length +@Pri_Key_Length+ CEILING(@columns_count/8.0) + (1 + 6 + 2 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length + @Pri_Key_Length+CEILING(@columns_count/8.0) + (1 + 2+ 2 + 2*@variable_count +@variable_length +2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length +@Pri_Key_Length+ CEILING(@columns_count/8.0) + (1 + 6 + 2 + 2 + 2*@variable_count +@fix_length + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --6. 有聚集索引,无唯一约束,不允许null if @is_primarykey = 1 and @is_unique = 0 and @is_Null = 1 begin if @variable_count = 0 begin set @ye_r_length= @fix_length +@Pri_Key_Length+ (1 +2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length +@Pri_Key_Length+ (1 + 6 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length +@Pri_Key_Length+ (1 + 2 + 2*@variable_count + @variable_length +2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length +@Pri_Key_Length+ (1 + 6+ 2 + 2*@variable_count + @variable_length + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --7. 有聚集索引,有唯一约束,允许null if @is_primarykey = 1 and @is_unique = 1 and @is_Null = 0 begin if @variable_count = 0 begin set @ye_r_length= @fix_length +@Pri_Key_Length+ CEILING(@columns_count/8.0) + (1 + 2+2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + CEILING(@columns_count/8.0) + (1 + 6 + 2 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length +@Pri_Key_Length+ CEILING(@columns_count/8.0) + (1 + 2+ 2 + 2 *@variable_count + @variable_length +2) set @ye_r_count = (8192-96) *(@fillfactor/100.00)/ @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + CEILING(@columns_count/8.0) + (1 + 6 + 2+ 2 + 2 *@variable_count + @variable_length + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end --8. 有聚集索引,无唯一约束,不允许null if @is_primarykey = 1 and @is_unique = 1 and @is_Null = 1 begin --计算叶级页面数量 if @variable_count = 0 begin set @ye_r_length= @fix_length +@Pri_Key_Length+ (1 +2) set @ye_r_count = (8192-96)*(@fillfactor/100.00) / @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + (1 + 6 + 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end else begin set @ye_r_length= @fix_length +@Pri_Key_Length+ (1 +2 + 2 *@variable_count + @variable_length+2) set @ye_r_count = (8192-96 )*(@fillfactor/100.00)/ @ye_r_length set @yezi = ceiling(@Num_Rows / convert(float,convert(int,@ye_r_count))) set @zhong_r_length = @fix_length + (1 + 6 + 2 + 2 *@variable_count + @variable_length+ 2) set @zhong_r_count = (8192-96)/convert(int,@zhong_r_length) set @zhongjian = ceiling(@yezi/convert(float,(@zhong_r_count))) end end select @ye_r_length as 叶子页面记录长度, @ye_r_count as 叶子页面行数, @yezi as 叶页面数量, @zhong_r_length as 中间页面单条记录长度, @zhong_r_count as 中间页行数,@zhongjian as 中间页面数量 select (@yezi + @zhongjian) * 8192 / 1024.00 as '至少需要物理空间(KB)', (@yezi + @zhongjian) * 8192 / 1024.00/1024.00 as '至少需要物理空间(MB)', (@yezi + @zhongjian) * 8192 / 1024.00/1024.00/1024.00 as '至少需要物理空间(GB)'
微软technet,预估非聚集索引空间地址:http://technet.microsoft.com/zh-cn/library/ms190620