接前面,看 SeqNext 函数:
/* ---------------------------------------------------------------- * SeqNext * * This is a workhorse for ExecSeqScan * ---------------------------------------------------------------- */ static TupleTableSlot * SeqNext(SeqScanState *node) { HeapTuple tuple; HeapScanDesc scandesc; EState *estate; ScanDirection direction; TupleTableSlot *slot; /* * get information from the estate and scan state */ scandesc = node->ss_currentScanDesc; estate = node->ps.state; direction = estate->es_direction; slot = node->ss_ScanTupleSlot; /* * get the next tuple from the table */ tuple = heap_getnext(scandesc, direction); /* * save the tuple and the buffer returned to us by the access methods in * our scan tuple slot and return the slot. Note: we pass 'false' because * tuples returned by heap_getnext() are pointers onto disk pages and were * not created with palloc() and so should not be pfree()'d. Note also * that ExecStoreTuple will increment the refcount of the buffer; the * refcount will not be dropped until the tuple table slot is cleared. */ if (tuple) ExecStoreTuple(tuple, /* tuple to store */ slot, /* slot to store in */ scandesc->rs_cbuf, /* buffer associated with this * tuple */ false); /* don't pfree this pointer */ else ExecClearTuple(slot); return slot; }
先来看看 tuple 的数据结构:
/* * HeapTupleData is an in-memory data structure that points to a tuple. * * There are several ways in which this data structure is used: * * * Pointer to a tuple in a disk buffer: t_data points directly into the * buffer (which the code had better be holding a pin on, but this is not * reflected in HeapTupleData itself). * * * Pointer to nothing: t_data is NULL. This is used as a failure indication * in some functions. * * * Part of a palloc'd tuple: the HeapTupleData itself and the tuple * form a single palloc'd chunk. t_data points to the memory location * immediately following the HeapTupleData struct (at offset HEAPTUPLESIZE). * This is the output format of heap_form_tuple and related routines. * * * Separately allocated tuple: t_data points to a palloc'd chunk that * is not adjacent to the HeapTupleData. (This case is deprecated since * it's difficult to tell apart from case #1. It should be used only in * limited contexts where the code knows that case #1 will never apply.) * * * Separately allocated minimal tuple: t_data points MINIMAL_TUPLE_OFFSET * bytes before the start of a MinimalTuple. As with the previous case, * this can't be told apart from case #1 by inspection; code setting up * or destroying this representation has to know what it's doing. * * t_len should always be valid, except in the pointer-to-nothing case. * t_self and t_tableOid should be valid if the HeapTupleData points to * a disk buffer, or if it represents a copy of a tuple on disk. They * should be explicitly set invalid in manufactured tuples. */ typedef struct HeapTupleData { uint32 t_len; /* length of *t_data */ ItemPointerData t_self; /* SelfItemPointer */ Oid t_tableOid; /* table the tuple came from */ HeapTupleHeader t_data; /* -> tuple header and data */ } HeapTupleData; typedef HeapTupleData *HeapTuple;
实际测试 ,
select id from tst04 where id>1, tuple 长度 32。
select id from tst01 where id<10 , tuple 长度 28。
运行如下查询可以看到同样的结果:
postgres=# select pg_column_size(t) from tst01 t limit 1; pg_column_size ---------------- 28 (1 row) postgres=# select pg_column_size(t) from tst04 t limit 1; pg_column_size ---------------- 32 (1 row) postgres=#
也就是说,tuple 的长度包含了 附属信息的长度和字段本身内容所占的长度(integer 4字节)。
看下一层面的结构:
/* * ItemPointer: * * This is a pointer to an item within a disk page of a known file * (for example, a cross-link from an index to its parent table). * blkid tells us which block, posid tells us which entry in the linp * (ItemIdData) array we want. * * Note: because there is an item pointer in each tuple header and index * tuple header on disk, it's very important not to waste space with * structure padding bytes. The struct is designed to be six bytes long * (it contains three int16 fields) but a few compilers will pad it to * eight bytes unless coerced. We apply appropriate persuasion where * possible, and to cope with unpersuadable compilers, we try to use * "SizeOfIptrData" rather than "sizeof(ItemPointerData)" when computing * on-disk sizes. */ typedef struct ItemPointerData { BlockIdData ip_blkid; OffsetNumber ip_posid; } #ifdef __arm__ __attribute__((packed)) /* Appropriate whack upside the head for ARM */ #endif ItemPointerData;
在 ItemPointerData中,估计 ip_blkid 是块号, ip_posid是 块内的本条记录所在序号。
再下一层:
typedef struct BlockIdData { uint16 bi_hi; uint16 bi_lo; } BlockIdData;
/* * OffsetNumber: * * this is a 1-based index into the linp (ItemIdData) array in the * header of each disk page. */ typedef uint16 OffsetNumber;
在上面的SeqNext里,加入下列调试内容:
... /* * get the next tuple from the table */ tuple = heap_getnext(scandesc, direction); if (tuple) { fprintf(stderr,"tuple's length is: %d\n",(int)(tuple->t_len)); fprintf(stderr,"tuple's block number hi is: %d...lo is %d \n", tuple->t_self.ip_blkid.bi_hi, tuple->t_self.ip_blkid.bi_lo); fprintf(stderr, "tuple's offset number is : %d\n", tuple->t_self.ip_posid); fprintf(stderr, "-------------------------------\n\n\n"); } ...
看看结果:
postgres=# select id from tst04 where id>1; id ---- 4 3 2 (3 rows) postgres=#
tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 1 ------------------------------- tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 2 ------------------------------- tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 3 ------------------------------- tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 4 -------------------------------
再看 HeapTupleData 的 t_tableoid:
先看 tst04 的 oid:
postgres=# select oid from pg_class where relname='tst04'; oid ------- 16393 (1 row) postgres=#
再加入调试信息:
...
/* * get the next tuple from the table */ tuple = heap_getnext(scandesc, direction); if (tuple) { fprintf(stderr,"tuple's length is: %d\n",(int)(tuple->t_len)); fprintf(stderr,"tuple's block number hi is: %d...lo is %d \n", tuple->t_self.ip_blkid.bi_hi, tuple->t_self.ip_blkid.bi_lo); fprintf(stderr, "tuple's offset number is : %d\n", tuple->t_self.ip_posid); fprintf(stderr, "tuple's table oid is: %d\n",tuple->t_tableOid); fprintf(stderr, "-------------------------------\n\n\n"); }
...
然后,我再看运行的情况:
postgres=# select id from tst04 where id>1; id ---- 4 3 2 (3 rows) postgres=#
tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 1 tuple's table oid is: 16393 ------------------------------- tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 2 tuple's table oid is: 16393 ------------------------------- tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 3 tuple's table oid is: 16393 ------------------------------- tuple's length is: 32 tuple's block number hi is: 0...lo is 0 tuple's offset number is : 4 tuple's table oid is: 16393 -------------------------------