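When reposting, please include a link to the original article: http://www.cnblogs.com/wingsless/p/5582063.html
Yesterday I wrote about read-ahead in the InnoDB buffer pool in "InnoDB Source Code Analysis: Buffer Pool (Part 2)". I was in a hurry to watch the European Championship and did not finish the part on linear read-ahead, so I am picking it up again today.
Linear read-ahead is implemented by the function buf_read_ahead_linear. As with random read-ahead, the first step is to determine the boundaries of an area. If the number of pages inside that area that have already been accessed reaches a threshold (BUF_READ_AHEAD_LINEAR_THRESHOLD), read-ahead is triggered. The boundaries are computed from BUF_READ_AHEAD_LINEAR_AREA: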
    low = (offset / BUF_READ_AHEAD_LINEAR_AREA)
        * BUF_READ_AHEAD_LINEAR_AREA;
    high = (offset / BUF_READ_AHEAD_LINEAR_AREA + 1)
        * BUF_READ_AHEAD_LINEAR_AREA;

    if ((offset != low) && (offset != high - 1)) {
        /* This is not a border page of the area: return */
        return(0);
    }
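To make the boundary arithmetic concrete, here is a minimal standalone sketch. The struct linear_area_t and the function linear_area_of are hypothetical names made up for this illustration, they are not part of InnoDB, and the area size is passed in as a parameter instead of coming from BUF_READ_AHEAD_LINEAR_AREA.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of InnoDB: describes the [low, high)
   area that contains a page and whether that page is a border page. */
typedef struct {
    unsigned long low;       /* first page number of the area */
    unsigned long high;      /* one past the last page number of the area */
    bool          is_border; /* true if the page is low or high - 1 */
} linear_area_t;

static linear_area_t linear_area_of(unsigned long offset, unsigned long area)
{
    linear_area_t r;

    assert(area > 0);
    r.low = (offset / area) * area;  /* round down to a multiple of area */
    r.high = r.low + area;           /* same as (offset / area + 1) * area */
    r.is_border = (offset == r.low) || (offset == r.high - 1);
    return r;
}
```

Assuming an area of 64 pages, page 130 falls into [128, 192) and is not a border page, so the function would return early for it. Pages 128 and 191 are the two border pages of that area.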
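Note that if offset is not on a border of the area, no read-ahead is done at all, which is different from random read-ahead. Linear read-ahead is about sequential access: if offset is at low, the pages are treated as being read in descending order, and if offset is at high - 1, in ascending order. Every page in the area is checked. A page that has not been accessed, or that was accessed in the wrong order, counts as a failure. If the number of accessed pages reaches the threshold mentioned above, the condition for linear read-ahead is met; otherwise no read-ahead is done. The code is as follows: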
    asc_or_desc = 1;            // ascending by default

    if (offset == low) {
        asc_or_desc = -1;       // offset is at low: switch to descending
    }

    fail_count = 0;

    for (i = low; i < high; i++) {
        block = buf_page_hash_get(space, i);    // walk the pages in the area

        if ((block == NULL) || !block->accessed) {
            /* Not accessed */
            fail_count++;       // count pages that were not accessed

        } else if (pred_block
                   && (ut_ulint_cmp(block->LRU_position,
                                    pred_block->LRU_position)
                       != asc_or_desc)) {
            /* Accesses not in the right order */
            fail_count++;
            pred_block = block;
        }
    }

    if (fail_count > BUF_READ_AHEAD_LINEAR_AREA
        - BUF_READ_AHEAD_LINEAR_THRESHOLD) {
        // read-ahead condition not met: bail out
        /* Too many failures: return */
        mutex_exit(&(buf_pool->mutex));
        return(0);
    }
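The threshold test is easier to see in isolation. The sketch below is a simplification under stated assumptions: enough_pages_accessed is a hypothetical function, the buffer pool lookup is replaced by a plain array of flags, and the LRU order check from the loop above is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplification: accessed[i] is true when page i of the
   area is in the buffer pool and has been accessed. The LRU order
   check of the original loop is not modelled here. */
static bool enough_pages_accessed(const bool *accessed, size_t area,
                                  size_t threshold)
{
    size_t fail_count = 0;
    size_t i;

    assert(threshold <= area);

    for (i = 0; i < area; i++) {
        if (!accessed[i]) {
            fail_count++;   /* page missing or never accessed */
        }
    }

    /* Mirrors: if (fail_count > AREA - THRESHOLD) return(0); */
    return fail_count <= area - threshold;
}
```

With an illustrative area of 8 pages and a threshold of 3 (values chosen for the example, not InnoDB's), three accessed pages give 5 failures, which is not greater than 8 - 3, so the check passes. With only two accessed pages there are 6 failures and the check fails.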
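I once read in a book, roughly, that pages in memory may be logically contiguous without being physically contiguous. Linear read-ahead additionally requires physical contiguity here: the logical neighbour recorded in the page must also be the page with the adjacent page number: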
    pred_offset = fil_page_get_prev(frame);
    succ_offset = fil_page_get_next(frame);

    mutex_exit(&(buf_pool->mutex));

    if ((offset == low) && (succ_offset == offset + 1)) {
        /* This is ok, we can continue */
        new_offset = pred_offset;   // condition met (descending): continue

    } else if ((offset == high - 1)
               && (pred_offset == offset - 1)) {
        /* This is ok, we can continue */
        new_offset = succ_offset;   // condition met in the ascending case
    } else {
        /* Successor or predecessor not in the right order */
        return(0);
    }
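What happens here is this: fil_page_get_prev and fil_page_get_next read the 4-byte predecessor and successor page numbers stored in the frame of the page at offset. If they satisfy the ordering condition, linear read-ahead can go on.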
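The decision above can be written as a small pure function, which makes the two accepted cases easy to try out. pick_new_offset is a hypothetical name used only for this sketch; in InnoDB the logic is inline in buf_read_ahead_linear and the neighbour page numbers come from the page frame.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: decides which page the next read-ahead step
   continues from. Returns false when the neighbours recorded in the
   page are not the adjacent page numbers in the expected direction. */
static bool pick_new_offset(unsigned long offset,
                            unsigned long low, unsigned long high,
                            unsigned long pred_offset,
                            unsigned long succ_offset,
                            unsigned long *new_offset)
{
    assert(low < high);

    if (offset == low && succ_offset == offset + 1) {
        *new_offset = pred_offset;  /* descending scan: go to predecessor */
        return true;
    }
    if (offset == high - 1 && pred_offset == offset - 1) {
        *new_offset = succ_offset;  /* ascending scan: go to successor */
        return true;
    }
    return false;                   /* neighbours not in the right order */
}
```

For the area [128, 192), page 191 with predecessor 190 and successor 192 continues at page 192, and page 128 with predecessor 127 and successor 129 continues at page 127. If page 191 records some unrelated page as its predecessor, the function reports that read-ahead should stop.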
    for (i = low; i < high; i++) {
        /* It is only sensible to do read-ahead in the non-sync
           aio mode: hence FALSE as the first parameter */
        if (!ibuf_bitmap_page(i)) {
            count += buf_read_page_low(
                &err, FALSE,
                ibuf_mode | OS_AIO_SIMULATED_WAKE_LATER,
                space, tablespace_version, i);

            if (err == DB_TABLESPACE_DELETED) {
                ut_print_timestamp(stderr);
                fprintf(stderr,
                        " InnoDB: Warning: in"
                        " linear readahead trying to access "
                        "InnoDB: tablespace %lu page %lu, "
                        "InnoDB: but the tablespace does not"
                        " exist or is just being dropped. ",
                        (ulong) space, (ulong) i);
            }
        }
    }
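Linear read-ahead again goes through buf_read_page_low, just like random read-ahead, and the reads are issued asynchronously.
That completes linear read-ahead.
Both random and linear read-ahead are skipped under certain conditions, for example when the system is under heavy load. This is implemented as follows: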
    if (buf_pool->n_pend_reads
        > buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT) {
        mutex_exit(&(buf_pool->mutex));
        return(0);
    }
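This says that once the number of pending reads exceeds buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT, that is, half of buf_pool->curr_size, no read-ahead is done. There are many similar conditions in the code, which I will not go through here.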
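As a quick illustration of the load check, here is a minimal sketch. too_many_pending_reads is a hypothetical function, and the limit is passed as a parameter; a value of 2 is assumed in the example because it matches the "half" described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when read-ahead should be skipped because
   too many reads are already pending. pend_limit stands in for
   BUF_READ_AHEAD_PEND_LIMIT. */
static bool too_many_pending_reads(unsigned long n_pend_reads,
                                   unsigned long curr_size,
                                   unsigned long pend_limit)
{
    assert(pend_limit > 0);
    return n_pend_reads > curr_size / pend_limit;
}
```

With a pool of 1024 pages and an assumed limit of 2, read-ahead is still allowed at 512 pending reads and is skipped from 513 on, because the comparison is strictly greater than.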