• MySQL Innodb 存储结构 & 存储Null值 解析


    背景:
    表空间:INNODB 所有数据都存在表空间当中(共享表空间),要是开启innodb_file_per_table,则每张表的数据会存到单独的一个表空间内(独享表空间)。
    独享表空间包括:数据,索引,插入缓存,数据字典。共享表空间包括:Undo信息(不会回收<物理空间上>),双写缓存信息,事务信息等。
    段(segment):组成表空间,有区组成。
    区(extent):有64个连续的页组成。每个页16K,总共1M。对于大的数据段,每次最后可申请4个区。
    页(page):是INNODB 磁盘管理的单位,有行组成。
    行(row):包括事务ID,回滚指针,列信息等。

    目的1:
    了解表空间各个页的信息和溢出行数据存储的信息。通过该书作者蒋承尧编写的工具:http://code.google.com/p/david-mysql-tools/source/browse/trunk/py_innodb_page_type/
    3个脚本:
    py_innodb_page_info.py

    View Code
    #! /usr/bin/env python 
    #encoding=utf-8
    import mylib
    from sys import argv
    from mylib import myargv
    
    if __name__ == '__main__':
        myargv = myargv(argv)
        if myargv.parse_cmdline() == 0:
            pass
        else:
            mylib.get_innodb_page_type(myargv)

    mylib.py

    View Code
    encoding=utf-8
    import os
    import include
    from include import *
    
    TABLESPACE_NAME='D:\\mysql_data\\test\\t.ibd'
    VARIABLE_FIELD_COUNT = 1
    NULL_FIELD_COUNT = 0
    
    class myargv(object):
        def __init__(self, argv):
            self.argv = argv
            self.parms = {}
            self.tablespace = ''
    
        def parse_cmdline(self):
            argv = self.argv
            if len(argv) == 1:
                print 'Usage: python py_innodb_page_info.py [OPTIONS] tablespace_file'
                print 'For more options, use python py_innodb_page_info.py -h'
                return 0
            while argv:
                if argv[0][0] == '-':
                    if argv[0][1] == 'h':
                        self.parms[argv[0]] = ''
                        argv = argv[1:]
                        break
                    if argv[0][1] == 'v':
                        self.parms[argv[0]] = ''
                        argv = argv[1:]
                    else:
                        self.parms[argv[0]] = argv[1]
                        argv = argv[2:]
                else:
                    self.tablespace = argv[0]
                    argv = argv[1:]
            if self.parms.has_key('-h'):
                print 'Get InnoDB Page Info'
                print 'Usage: python py_innodb_page_info.py [OPTIONS] tablespace_file\n'
                print 'The following options may be given as the first argument:'
                print '-h        help '
                print '-o output put the result to file'
                print '-t number thread to anayle the tablespace file'
                print '-v        verbose mode'
                return 0
            return 1
    
    def mach_read_from_n(page,start_offset,length):
        ret = page[start_offset:start_offset+length]
        return ret.encode('hex')
    
    def get_innodb_page_type(myargv):
        f=file(myargv.tablespace,'rb')
        fsize = os.path.getsize(f.name)/INNODB_PAGE_SIZE
        ret = {}
        for i in range(fsize):
            page = f.read(INNODB_PAGE_SIZE)
            page_offset = mach_read_from_n(page,FIL_PAGE_OFFSET,4)
            page_type = mach_read_from_n(page,FIL_PAGE_TYPE,2)
            if myargv.parms.has_key('-v'):
                if page_type == '45bf':
                    page_level = mach_read_from_n(page,FIL_PAGE_DATA+PAGE_LEVEL,2)
                    print "page offset %s, page type <%s>, page level <%s>"%(page_offset,innodb_page_type[page_type],page_level)
                else:
                    print "page offset %s, page type <%s>"%(page_offset,innodb_page_type[page_type])
            if not ret.has_key(page_type):
                ret[page_type] = 1
            else:
                ret[page_type] = ret[page_type] + 1
        print "Total number of page: %d:"%fsize
        for type in ret:
            print "%s: %s"%(innodb_page_type[type],ret[type])

    include.py

    View Code
    #encoding=utf-8
    INNODB_PAGE_SIZE = 16*1024*1024
    
    # Start of the data on the page
    FIL_PAGE_DATA = 38
    
    
    FIL_PAGE_OFFSET = 4 # page offset inside space
    FIL_PAGE_TYPE = 24 # File page type
    
    # Types of an undo log segment */
    TRX_UNDO_INSERT = 1
    TRX_UNDO_UPDATE = 2
    
    # On a page of any file segment, data may be put starting from this offset
    FSEG_PAGE_DATA = FIL_PAGE_DATA
    
    # The offset of the undo log page header on pages of the undo log
    TRX_UNDO_PAGE_HDR = FSEG_PAGE_DATA
    
    PAGE_LEVEL = 26 #level of the node in an index tree; the leaf level is the level 0 */
    
    innodb_page_type={
        '0000':u'Freshly Allocated Page',
        '0002':u'Undo Log Page',
        '0003':u'File Segment inode',
        '0004':u'Insert Buffer Free List',
        '0005':u'Insert Buffer Bitmap',
        '0006':u'System Page',
        '0007':u'Transaction system Page',
        '0008':u'File Space Header',
        '0009':u'扩展描述页',
        '000a':u'Uncompressed BLOB Page',
        '000b':u'1st compressed BLOB Page',
        '000c':u'Subsequent compressed BLOB Page',
        '45bf':u'B-tree Node'
        }
    
    innodb_page_direction={
        '0000': 'Unknown(0x0000)',
        '0001': 'Page Left',
        '0002': 'Page Right',
        '0003': 'Page Same Rec',
        '0004': 'Page Same Page',
        '0005': 'Page No Direction',
        'ffff': 'Unkown2(0xffff)'
    }
    
    
    INNODB_PAGE_SIZE=1024*16 # InnoDB Page 16K

    测试1:

    root@localhost : test 02:26:13>create table tt(id int auto_increment,name varchar(10),age int,address varchar(20),primary key (id))engine=innodb;
    Query OK, 0 rows affected (0.17 sec)
    root@zhoujy:/var/lib/mysql/test# ls -lh tt.ibd 
    -rw-rw---- 1 mysql mysql 96K 2012-10-17 14:26 tt.ibd

    查看ibd:

    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd -v
    page offset 00000000, page type <File Space Header>
    page offset 00000001, page type <Insert Buffer Bitmap>
    page offset 00000002, page type <File Segment inode>
    page offset 00000003, page type <B-tree Node>, page level <0000> ---叶子节点
    page offset 00000000, page type <Freshly Allocated Page>
    page offset 00000000, page type <Freshly Allocated Page>
    Total number of page: 6: 
    Freshly Allocated Page: 2
    Insert Buffer Bitmap: 1
    File Space Header: 1
    B-tree Node: 1
    File Segment inode: 1

    解释:
    Total number of page: 总页数
    Freshly Allocated Page:可用页
    Insert Buffer Bitmap:插入缓存位图页
    Insert Buffer Free List:插入缓存空闲列表页
    B-tree Node:数据页

    Uncompressed BLOB Page:二进制大对象页,存放溢出行的页,即溢出页
    上面得到的信息是表初始化大小为96K,他是有 Total number of page * 16 得来的。1个数据页,2个可用页面。

    root@localhost : test 02:42:58>insert into tt values(name,age,address) values('aaa',23,'HZZZ');

    疑惑:为什么没有申请区?区是64个连续的页,大小1M。那么表大小也应该是至少1M。但是现在只有96K(默认)。原因是因为每个段开始的时候,先有32个页大小的碎片页存放数据,使用
    完之后才是64页的连续申请,最多每次可以申请4个区,保证数据的顺序。这里看出表大小增加是按照至少64页的大小的空间来增加的,即1M增加。
    验证:
    填充数据,写满这32个碎片页,32*16 = 512K。看看是否能申请大于1M的空间。

    View Code
    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# ls -lh /var/lib/mysql/test/tt.ibd 
    -rw-rw---- 1 mysql mysql 576K 2012-10-17 15:30 /var/lib/mysql/test/tt.ibd
    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd -v
    page offset 00000000, page type <File Space Header>
    page offset 00000001, page type <Insert Buffer Bitmap>
    page offset 00000002, page type <File Segment inode>
    page offset 00000003, page type <B-tree Node>, page level <0001>
    page offset 00000004, page type <B-tree Node>, page level <0000>
    page offset 00000005, page type <B-tree Node>, page level <0000>
    page offset 00000006, page type <B-tree Node>, page level <0000>
    page offset 00000007, page type <B-tree Node>, page level <0000>
    page offset 00000008, page type <B-tree Node>, page level <0000>
    page offset 00000009, page type <B-tree Node>, page level <0000>
    page offset 0000000a, page type <B-tree Node>, page level <0000>
    page offset 0000000b, page type <B-tree Node>, page level <0000>
    page offset 0000000c, page type <B-tree Node>, page level <0000>
    page offset 0000000d, page type <B-tree Node>, page level <0000>
    page offset 0000000e, page type <B-tree Node>, page level <0000>
    page offset 0000000f, page type <B-tree Node>, page level <0000>
    page offset 00000010, page type <B-tree Node>, page level <0000>
    page offset 00000011, page type <B-tree Node>, page level <0000>
    page offset 00000012, page type <B-tree Node>, page level <0000>
    page offset 00000013, page type <B-tree Node>, page level <0000>
    page offset 00000014, page type <B-tree Node>, page level <0000>
    page offset 00000015, page type <B-tree Node>, page level <0000>
    page offset 00000016, page type <B-tree Node>, page level <0000>
    page offset 00000017, page type <B-tree Node>, page level <0000>
    page offset 00000018, page type <B-tree Node>, page level <0000>
    page offset 00000019, page type <B-tree Node>, page level <0000>
    page offset 0000001a, page type <B-tree Node>, page level <0000>
    page offset 0000001b, page type <B-tree Node>, page level <0000>
    page offset 0000001c, page type <B-tree Node>, page level <0000>
    page offset 0000001d, page type <B-tree Node>, page level <0000>
    page offset 0000001e, page type <B-tree Node>, page level <0000>
    page offset 0000001f, page type <B-tree Node>, page level <0000>
    page offset 00000020, page type <B-tree Node>, page level <0000>
    page offset 00000021, page type <B-tree Node>, page level <0000>
    page offset 00000022, page type <B-tree Node>, page level <0000>
    page offset 00000023, page type <B-tree Node>, page level <0000>
    Total number of page: 36:
    Insert Buffer Bitmap: 1
    File Space Header: 1
    B-tree Node: 33
    File Segment inode: 1

    "额外"页:4个
    page offset 00000000, page type <File Space Header> :文件头空间页
    page offset 00000001, page type <Insert Buffer Bitmap>:插入缓存位图页
    page offset 00000002, page type <File Segment inode>:文件段节点
    page offset 00000003, page type <B-tree Node>, page level <0001>:根页
    碎片页:32个
    page type <B-tree Node>, page level <0000>
    总共36个页,ibd大小 576K的由来:32*16=512K(碎片页)+ 4*16=64(额外页),这里开始要是再插入的话,应该申请最少1M的页:

    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# ls -lh /var/lib/mysql/test/tt.ibd 
    -rw-rw---- 1 mysql mysql 2.0M 2012-10-17 16:10 /var/lib/mysql/test/tt.ibd
    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd
    Total number of page: 128:
    Freshly Allocated Page: 91
    Insert Buffer Bitmap: 1
    File Space Header: 1
    B-tree Node: 34
    File Segment inode: 1

    页从36跳到了128,因为已经用完了32个碎片页,新的页会采用区的方式进行空间申请。信息中看到有很多可用页,正好说明这点。

     ▲溢出行数据存放:INNODB存储引擎是索引组织的,每页中至少有两行记录,如果页中只能存放一行记录,INNODB会自动将行数据放到溢出页中。当发生溢出行的时候,实际数据保存在BLOB页中,数据页只保存数据的前768字节(老的文件格式),新的文件格式(Barracuda)采用完全行溢出的方式,数据页只保存20个字节的指针,BLOB也保存所有数据。如何查看表中有溢出行数据呢?

    root@localhost : test 04:52:34>create table t1 (id int,name varchar(10),memo varchar(8000))engine =innodb default charset utf8;
    Query OK, 0 rows affected (0.16 sec)
    
    root@localhost : test 04:53:10>insert into t1 values(1,'zjy',repeat('',8000));
    Query OK, 1 row affected (0.00 sec)

     查看ibd:

    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/t1.ibd -v
    page offset 00000000, page type <File Space Header>
    page offset 00000001, page type <Insert Buffer Bitmap>
    page offset 00000002, page type <File Segment inode>
    page offset 00000003, page type <B-tree Node>, page level <0000>
    page offset 00000004, page type <Uncompressed BLOB Page>
    page offset 00000005, page type <Uncompressed BLOB Page>
    Total number of page: 6:
    Insert Buffer Bitmap: 1
    Uncompressed BLOB Page: 2
    File Space Header: 1
    B-tree Node: 1
    File Segment inode: 1

    从信息中看到,刚才插入的一行记录,已经溢出了,保存到了2个BLOB页中(<Uncompressed BLOB Page>)。因为1页只有16K,又要存2行数据,所以每行记录最好小于8K,而上面的远远大于8K,所以被溢出了。当然这个也不是包括特大字段,要是一张表里面有5个字段都是varchar(512)【多个varchar的总和大于8K就可以】,也会溢出:

    root@localhost : test 05:08:39>create table t2 (id int,name varchar(1000),address varchar(512),company varchar(200),xx varchar(512),memo varchar(512),dem varchar(1000))engine =innodb default charset utf8;
    Query OK, 0 rows affected (0.17 sec)
    root@localhost : test 05:08:43>insert into t2 values(1,repeat('',1000),repeat('',500),repeat('',500),repeat('',500),repeat('',500),repeat('阿a',500));

    1000+500+500+500+500+500=3500*3>8000字节;行会被溢出:

    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/t2.ibd -v
    page offset 00000000, page type <File Space Header>
    page offset 00000001, page type <Insert Buffer Bitmap>
    page offset 00000002, page type <File Segment inode>
    page offset 00000003, page type <B-tree Node>, page level <0000>
    page offset 00000004, page type <Uncompressed BLOB Page>
    page offset 00000000, page type <Freshly Allocated Page>
    Total number of page: 6:
    Insert Buffer Bitmap: 1
    Freshly Allocated Page: 1
    File Segment inode: 1
    B-tree Node: 1
    File Space Header: 1
    Uncompressed BLOB Page: 1

    <Uncompressed BLOB Page> 页存放真正的数据,那数据页到底存放什么?用hexdump查看:

    root@zhoujy:/home/zhoujy/jiaoben/read_ibd# hexdump -C -v  /var/lib/mysql/test/t1.ibd  > t1.txt

    查看ibd:

    View Code
    3082 0000c090  00 32 01 10 80 00 00 01  7a 6a 79 e6 88 91 e6 88  |.2......zjy.....|
    3083 0000c0a0  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3084 0000c0b0  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3085 0000c0c0  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3086 0000c0d0  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3087 0000c0e0  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3088 0000c0f0  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3089 0000c100  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3090 0000c110  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3091 0000c120  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3092 0000c130  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3093 0000c140  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3094 0000c150  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3095 0000c160  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3096 0000c170  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3097 0000c180  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3098 0000c190  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3099 0000c1a0  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3100 0000c1b0  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3101 0000c1c0  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3102 0000c1d0  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3103 0000c1e0  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3104 0000c1f0  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3105 0000c200  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3106 0000c210  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3107 0000c220  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3108 0000c230  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3109 0000c240  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3110 0000c250  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3111 0000c260  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3112 0000c270  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3113 0000c280  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3114 0000c290  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3115 0000c2a0  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3116 0000c2b0  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3117 0000c2c0  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3118 0000c2d0  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3119 0000c2e0  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3120 0000c2f0  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3121 0000c300  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3122 0000c310  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3123 0000c320  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3124 0000c330  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3125 0000c340  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3126 0000c350  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3127 0000c360  88 91 e6 88 91 e6 88 91  e6 88 91 e6 88 91 e6 88  |................|
    3128 0000c370  91 e6 88 91 e6 88 91 e6  88 91 e6 88 91 e6 88 91  |................|
    3129 0000c380  e6 88 91 e6 88 91 e6 88  91 e6 88 91 e6 88 91 e6  |................|
    3130 0000c390  88 91 e6 88 91 e6 88 91  e6 88 91 00 00 02 1c 00  |................|

    文本中刚好是48行,每行16字节。48*16=768字节,刚好验证了之前说的:数据页只保存数据的前768字节(老的文件格式)。

    总结1:
         通过上面的信息,可以能清楚的知道ibd表空间各个页的分布和利用信息以及表空间大小增加的步长;特别注意的是溢出行,一个页中至少包含2行数据,如果页中存放的行数越多,性能就越好。

    ************************************
    ************************************
    目的2:
         
    了解表空间如何存储数据,以及对NULL值的存储。

    测试2:
    在测试前先了解INNODB的存储格式(row_format)。老格式(Antelope):Compact<默认>,Redumdant;新格式(Barracuda):Compressed ,Dynamic。

    这里测试指针对默认的存储格式。
    Compact行记录方式如下:

       |变长字段长度列表(1~2字节)|NULL标志位(1字节)|记录头信息(5字节)|RowID(6字节)|事务ID(6字节)|回滚指针(7字节)|

    上面信息除了 "NULL标志位"[表中所有字段都定义为NOT NULL],"RowID"[表中有主键] ,"变长字段长度列表" [没有变长字段] 可能不存在外,其他信息都会出现。所以一行数据除了列数据所占用的字段外,还需要额外18字节。

    一:字段全NULL

    mysql> create table mytest(t1 varchar(10),t2 varchar(10),t3 varchar(10) ,t4 varchar(10))engine=innodb charset = latin1 row_format=compact;
    Query OK, 0 rows affected (0.08 sec)
    
    mysql> insert into mytest values('a','bb','bb','ccc');
    Query OK, 1 row affected (0.02 sec)
    
    mysql> insert into mytest values('a','ee','ee','fff');
    Query OK, 1 row affected (0.01 sec)
    
    mysql> insert into mytest values('a',NULL,NULL,'fff');
    Query OK, 1 row affected (0.00 sec)

    测试数据准备完之后,执行shell命令:

    root@zhoujy:/usr/local/mysql/test# hexdump -C -v mytest.ibd > /home/zhoujy/mytest.txt

    打开mytest.txt文件找到supremum这一行:

    0000c070  73 75 70 72 65 6d 75 6d  03 02 02 01 00 00 00 10  |supremum........|   ----------->一行,16字节
    0000c080  00 25 00 00 00 03 b9 00  00 00 00 02 49 01 82 00  |.%..........I...|
    0000c090  00 01 4a 01 10 61 62 62  62 62 63 63 63 03 02 02  |..J..abbbbccc...|
    0000c0a0  01 00 00 00 18 00 23 00  00 00 03 b9 01 00 00 00  |......#.........|
    0000c0b0  02 49 02 83 00 00 01 4b  01 10 61 65 65 65 65 66  |.I.....K..aeeeef|
    0000c0c0  66 66 03 01 06 00 00 20  ff a6 00 00 00 03 b9 02  |ff..... ........|
    0000c0d0  00 00 00 02 49 03 84 00  00 01 4c 01 10 61 66 66  |....I.....L..aff|
    0000c0e0  66 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |f...............|

    释:
    第一行数据:
    03 02 02 01 /*变长字段*/ ---- 表中4个字段类型为varchar,并且没有NULL数据,而且每个字段君小于255。
    00 /*NULL标志位,第一行没有null的数据*/
    00 00 10 00 25 /*记录头信息,固定5个字节*/
    00 00 00 03 b9 00 /*RowID,固定6个字节,表没有主键*/
    00 00 00 02 49 01 /*事务ID,固定6个字节*/
    82 00 00 01 4a 01 10 /*回滚指针,固定7个字节*/
    61 62 62 62 62 63 63 63 /*列的数据*/
    第二行数据和第一行数据一样(颜色匹配)。
    第三行数据(有NULL值)和第一行的解释的颜色对应起来比较差别:

    03 02 02 01  VS  03 01   ----------当值为NULL时,变长字段列表不会占用存储空间。
    61 62 62  62 62 63 63 63 VS 61 66 66 66  --------- NULL值没有存储,不占空间

    结论:当值为NULL时,变长字段列表不会占用存储空间。NULL值没有存储,不占空间,但是需要一个标志位(一行一个)。

    二:字段全NOT NULL

    mysql> create table mytest(t1 varchar(10) NOT NULL,t2 varchar(10) NOT NULL,t3 varchar(10) NOT NULL,t4 varchar(10) NOT NULL)engine=innodb charset = latin1 row_format=compact;
    Query OK, 0 rows affected (0.03 sec)
    
    mysql> insert into mytest values('a','bb','bb','ccc');
    Query OK, 1 row affected (0.01 sec)
    
    mysql> insert into mytest values('a','ee','ee','fff');
    Query OK, 1 row affected (0.01 sec)
    
    mysql> insert into mytest values('a',NULL,NULL,'fff');
    ERROR 1048 (23000): Column 't2' cannot be null

     步骤和上面一样,得到的ibd的结果是:

    0000c070  73 75 70 72 65 6d 75 6d  03 02 02 01 00 00 10 00  |supremum........|
    0000c080  24 00 00 00 03 b9 03 00  00 00 02 49 07 87 00 00  |$..........I....|
    0000c090  01 4f 01 10 61 62 62 62  62 63 63 63 03 02 02 01  |.O..abbbbccc....|
    0000c0a0  00 00 18 ff cb 00 00 00  03 b9 04 00 00 00 02 49  |...............I|
    0000c0b0  08 88 00 00 01 50 01 10  61 65 65 65 65 66 66 66  |.....P..aeeeefff|

    和上面比较,发现少了NULL的标志位信息。
    结论:  NULL值会有额外的空间来存储,即每行1字节的大小。对于相同数据的表,字段中有NULL值的表比NOT NULL的大。

    三:1个NULL,和1个''的数据:

    mysql> create table mytest(t1 varchar(10) NOT NULL,t2 varchar(10) NOT NULL DEFAULT '',t3 varchar(10) NOT NULL ,t4 varchar(10))engine=innodb charset = latin1 row_format=compact;
    Query OK, 0 rows affected (0.02 sec)
    mysql> insert into mytest(t1,t2) values('A','BB');
    Query OK, 1 row affected, 1 warning (0.01 sec)

     步骤和上面一样,得到的ibd的结果是:

    0000c070  73 75 70 72 65 6d 75 6d  00 02 01 01 00 00 10 ff  |supremum........|
    0000c080  ef 00 00 00 43 b9 03 00  00 00 02 4a 15 90 00 00  |....C......J....|
    0000c090  01 c2 01 10 41 42 42 00  00 00 00 00 00 00 00 00  |....ABB.........|

    和上面2个区别主要在于变长列表和列数据这里。

    结论:列数据信息里表明了 NULL数据和''数据都不占用任何空间,对于变长字段列表的信息,和一对比得出:‘’数据虽然不需要占用任何存储空间,但是在变长字段列表里面还是需要占用一个字节<毕竟还是一个‘’值>,NULL值不需要占用”,只是NULL会有额外的一个标志位,所以能有个优化的说法:“数据库表中能设置NOT NULL的就尽量设置为NOT NULL,除非确实需要NULL值得。” 在此得到了证明。

    上面的测试都是针对VARCHAR的变长类型,那对于CHAR呢?

    CHAR 测试:

    root@localhost : test 10:33:35>create table mytest(t1 char(10),t2 char(10),t3 char(10) ,t4 char(10))engine=innodb charset = latin1 row_format=compact;Query OK, 0 rows affected (0.16 sec)
    
    root@localhost : test 10:33:59>insert into mytest values('a','bb','bb','ccc');
    Query OK, 1 row affected (0.00 sec)
    
    root@localhost : test 10:34:09>insert into mytest values('a','ee','ee','fff');
    Query OK, 1 row affected (0.00 sec)
    
    root@localhost : test 10:34:19>insert into mytest values('a',NULL,NULL,'fff');
    Query OK, 1 row affected (0.00 sec)

    打开ibd生成的文件:

    0000c060  02 00 1b 69 6e 66 69 6d  75 6d 00 04 00 0b 00 00  |...infimum......|
    0000c070  73 75 70 72 65 6d 75 6d  00 00 00 10 00 41 00 00  |supremum.....A..|
    0000c080  00 0a f5 00 00 00 00 81  2d 07 80 00 00 00 32 01  |........-.....2.|
    0000c090  10 61 20 20 20 20 20 20  20 20 20 62 62 20 20 20  |.a         bb   |
    0000c0a0  20 20 20 20 20 62 62 20  20 20 20 20 20 20 20 63  |     bb        c|
    0000c0b0  63 63 20 20 20 20 20 20  20 00 00 00 18 00 41 00  |cc       .....A.|
    0000c0c0  00 00 0a f5 01 00 00 00  81 2d 08 80 00 00 00 32  |.........-.....2|
    0000c0d0  01 10 61 20 20 20 20 20  20 20 20 20 65 65 20 20  |..a         ee  |
    0000c0e0  20 20 20 20 20 20 65 65  20 20 20 20 20 20 20 20  |      ee        |
    0000c0f0  66 66 66 20 20 20 20 20  20 20 06 00 00 20 ff 70  |fff       ... .p|
    0000c100  00 00 00 0a f5 02 00 00  00 81 2d 09 80 00 00 00  |..........-.....|
    0000c110  32 01 10 61 20 20 20 20  20 20 20 20 20 66 66 66  |2..a         fff|
    0000c120  20 20 20 20 20 20 20 00  00 00 00 00 00 00 00 00  |       .........|

    和一的varchar比较发现:少了变长字段列表,但是对于char来讲,需要固定长度来存储的,存不到固定长度,也会被填充满。如:20;并且NULL值也不需要占用存储空间。

    混合(varchar,char):

    root@localhost : test 11:21:48>create table mytest(t1 int,t2 char(10),t3 varchar(10) ,t4 char(10))engine=innodb charset = latin1 row_format=compact;
    Query OK, 0 rows affected (0.17 sec)
    
    root@localhost : test 11:21:50>insert into mytest values(1,'a','b','c');
    Query OK, 1 row affected (0.00 sec)
    
    root@localhost : test 11:22:06>insert into mytest values(11,'aa','bb','cc');
    Query OK, 1 row affected (0.00 sec)

     从上面的表结构中看出:
    1,变长字段列表长度:1
    2,NULL标志位:1
    3,记录头信息:5
    4,RowID:6
    5,事务ID:6
    6,回滚指针:7

    idb的信息

    0000c070  73 75 70 72 65 6d 75 6d  01 00 00 00 10 00 33 00  |supremum......3.| 
    0000c080  00 00 0a f5 07 00 00 00  81 2d 1a 80 00 00 00 32  |.........-.....2|
    0000c090  01 10 80 00 00 01 61 20  20 20 20 20 20 20 20 20  |......a         |
    0000c0a0  62 63 20 20 20 20 20 20  20 20 20 02 00 00 00 18  |bc         .....|
    0000c0b0  ff be 00 00 00 0a f5 08  00 00 00 81 2d 1b 80 00  |............-...|
    0000c0c0  00 00 32 01 10 80 00 00  0b 61 61 20 20 20 20 20  |..2......aa     |
    0000c0d0  20 20 20 62 62 63 63 20  20 20 20 20 20 20 20 00  |   bbcc        .|

    从上信息得出和之前预料的一样:因为表中只有一个varchar字段,所以,变长列表长度就只有:01 
    特别注意的是:各个列数据存储的信息:t1字段为int 类型,占用4个字节的大小。第一行:80 00 00 01 就是表示 1 数字;第二行:80 00 00   0b 表示了11的数字。[select hex(11)  == B ],其他的和上面的例子一样。

    上面都是latin1单字节字符集的说明,那对于多字节字符集的情况怎么样?

    root@localhost : test 11:52:10>create table mytest(id int auto_increment,t2 varchar(10),t3 varchar(10) ,t4 char(10),primary key(id))engine=innodb charset = utf8 row_format=compact;
    Query OK, 0 rows affected (0.17 sec)
    
    root@localhost : test 11:52:11>insert into mytest(t2,t3,t4) values('bb','bb','ccc');
    Query OK, 1 row affected (0.00 sec)
    
    root@localhost : test 11:55:34>insert into mytest(t2,t3,t4) values('我们','他们','我们的');
    Query OK, 1 row affected (0.00 sec)

     ibd信息如下:

    0000c070  73 75 70 72 65 6d 75 6d  0a 02 02 00 00 00 10 00  |supremum........|
    0000c080  28 80 00 00 01 00 00 00  81 2d 27 80 00 00 00 32  |(........-'....2|
    0000c090  01 10 62 62 62 62 63 63  63 20 20 20 20 20 20 20  |..bbbbccc       |
    0000c0a0  0a 06 06 00 00 00 18 ff  c7 80 00 00 02 00 00 00  |................|
    0000c0b0  81 2d 28 80 00 00 00 32  01 10 e6 88 91 e4 bb ac  |.-(....2........|
    0000c0c0  e4 bb 96 e4 bb ac e6 88  91 e4 bb ac e7 9a 84 20  |............... |

    因为表有了主键,所以ROWID(6字节)不见了。
    特别注意的是:变长字段列表是3?表里面的varchar类型的列只有2个啊。经测试得出:在多字节字符集的条件下,char类型被当成可变长度的类型来处理,他们的行存储基本没有区别,所以这个就出现变长列表是3了,因为是utf8字符集,占用三个字节。所以一个汉字均占用了一个页中3个字节的空间(”我们“ :e6 88 91 e4 bb ac)。
    数据列的信息:
    id列的1值,应该是
    80 00 00 01,为什么这个显示00 32 01 10,而且所有的id都是00 32 01 10。测试发现,id为自增主键的时候,id的4个字节长度都是以00 32 01 10 表示。否则和前面一个例子里说的,用select HEX(X) 表示。

    总结2:
         上面的测试都是基于COMPACT存储格式的,不管是varchar还是char,NULL值是不需要占用存储空间的;特别需要注意的是Redumdant的记录头信息需要6个固定字节;在多字节字符集的条件下,CHAR和VARCHAR的行存储基本是没有区别的

    ~~~~~~~~~~~~~~~ 万物之中,希望至美 ~~~~~~~~~~~~~~~
  • 相关阅读:
    BlockingQueue(阻塞队列)详解
    异步远程调用线程池队列
    JVM中的五大内存区域划分详解及快速扫盲
    http链接管理器
    influxDB 增加
    ArcGIS的sde数据库连接
    arcgis server注册文件夹
    arcmap编辑oracle数据库
    create database connection
    发布gp服务失败
  • 原文地址:https://www.cnblogs.com/zhoujinyi/p/2726462.html
Copyright © 2020-2023  润新知