Please include a link to the original when reposting: http://www.cnblogs.com/wingsless/p/5582063.html
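Yesterday I wrote about read-ahead in the InnoDB buffer pool in "InnoDB Source Code Analysis: The Buffer Pool (Part 2)". I was in a hurry to watch the European Championship, so I did not finish the part on linear read-ahead. This post picks up where that one left off.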
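Linear read-ahead is implemented by the function buf_read_ahead_linear. As with random read-ahead, the first step is to determine the boundaries of the area. If the number of accessed pages inside that area reaches a threshold (BUF_READ_AHEAD_LINEAR_THRESHOLD), read-ahead is triggered. The boundaries are computed from BUF_READ_AHEAD_LINEAR_AREA: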
    low = (offset / BUF_READ_AHEAD_LINEAR_AREA)
        * BUF_READ_AHEAD_LINEAR_AREA;
    high = (offset / BUF_READ_AHEAD_LINEAR_AREA + 1)
        * BUF_READ_AHEAD_LINEAR_AREA;

    if ((offset != low) && (offset != high - 1)) {
        /* This is not a border page of the area: return */

        return(0);
    }
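To make the boundary arithmetic concrete, here is a minimal standalone sketch. It is not InnoDB code: the names area_bounds, linear_area_bounds and is_border_page are made up for illustration, and the area size is passed in as a parameter because the real value of BUF_READ_AHEAD_LINEAR_AREA depends on the build and the buffer pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type, not part of InnoDB. */
typedef struct {
    unsigned long low;   /* first page number of the area (inclusive) */
    unsigned long high;  /* one past the last page number (exclusive) */
} area_bounds;

/* Round offset down to the start of its area; the area ends one
   area size later. Integer division does the rounding. */
static area_bounds linear_area_bounds(unsigned long offset, unsigned long area)
{
    area_bounds b;

    assert(area > 0);
    b.low = (offset / area) * area;
    b.high = (offset / area + 1) * area;
    return b;
}

/* True only for the first or the last page of the area, which are
   the only pages that can trigger linear read-ahead. */
static bool is_border_page(unsigned long offset, unsigned long area)
{
    area_bounds b = linear_area_bounds(offset, area);

    return offset == b.low || offset == b.high - 1;
}
```

With an assumed area size of 64, page 130 falls in the area [128, 192). It is not a border page, so the function would return early. Pages 128 and 191 are the two border pages of that area.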
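Note that if offset is not on a border of the area, no read-ahead is performed; this differs from random read-ahead. Linear read-ahead is essentially sequential: if offset is at low, the pages are expected to have been accessed in descending order, and if offset is at high - 1, in ascending order. Every page in the area is checked. A page that has not been accessed, or that was accessed in the wrong order, counts as a failure. If the number of failures exceeds BUF_READ_AHEAD_LINEAR_AREA - BUF_READ_AHEAD_LINEAR_THRESHOLD, that is, if too few pages qualify, no read-ahead is performed. The code is as follows: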
    asc_or_desc = 1; // ascending by default

    if (offset == low) {
        asc_or_desc = -1; // offset is at low: switch to descending
    }

    fail_count = 0;

    for (i = low; i < high; i++) {
        block = buf_page_hash_get(space, i); // walk the pages inside the area

        if ((block == NULL) || !block->accessed) {
            /* Not accessed */
            fail_count++; // count pages that have not been accessed

        } else if (pred_block
                   && (ut_ulint_cmp(block->LRU_position,
                                    pred_block->LRU_position)
                       != asc_or_desc)) {
            /* Accesses not in the right order */

            fail_count++;
            pred_block = block;
        }
    }

    if (fail_count > BUF_READ_AHEAD_LINEAR_AREA
        - BUF_READ_AHEAD_LINEAR_THRESHOLD) { // read-ahead condition not met: bail out
        /* Too many failures: return */

        mutex_exit(&(buf_pool->mutex));

        return(0);
    }
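The threshold test can be sketched in isolation. The block below is a simplification under stated assumptions, not the InnoDB implementation: page_state is a made-up descriptor, the access-order comparison on LRU_position is left out, and the area size of 8 and threshold of 6 are arbitrary demo values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_AREA 8       /* assumed area size, for illustration only */
#define DEMO_THRESHOLD 6  /* assumed threshold, for illustration only */

/* Simplified, hypothetical page descriptor; not InnoDB's block type. */
typedef struct {
    bool in_pool;   /* page is present in the buffer pool */
    bool accessed;  /* page has been accessed since it was read in */
} page_state;

/* Count pages in the area that are missing or not yet accessed. */
static size_t count_failures(const page_state *pages, size_t area)
{
    size_t fail_count = 0;

    for (size_t i = 0; i < area; i++) {
        if (!pages[i].in_pool || !pages[i].accessed) {
            fail_count++;
        }
    }
    return fail_count;
}

/* Read-ahead may proceed only if at most (area - threshold) pages failed. */
static bool enough_pages_accessed(const page_state *pages, size_t area,
                                  size_t threshold)
{
    assert(threshold <= area);
    return count_failures(pages, area) <= area - threshold;
}

/* Demo: mark the first n pages of the area as accessed and run the check. */
static bool demo_first_n_accessed(size_t n)
{
    page_state pages[DEMO_AREA] = {{false, false}};

    assert(n <= DEMO_AREA);
    for (size_t i = 0; i < n; i++) {
        pages[i].in_pool = true;
        pages[i].accessed = true;
    }
    return enough_pages_accessed(pages, DEMO_AREA, DEMO_THRESHOLD);
}
```

With these demo values, six accessed pages leave two failures, which is within the allowed 8 - 6 = 2, so the check passes. Five accessed pages leave three failures and the check fails.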
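I once read in a book something to the effect that pages in memory need not be physically contiguous as long as they are logically contiguous. Linear read-ahead, however, requires the logically adjacent page to be physically adjacent as well: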
    pred_offset = fil_page_get_prev(frame);
    succ_offset = fil_page_get_next(frame);

    mutex_exit(&(buf_pool->mutex));

    if ((offset == low) && (succ_offset == offset + 1)) {

        /* This is ok, we can continue */
        new_offset = pred_offset; // descending case: condition met, continue

    } else if ((offset == high - 1) && (pred_offset == offset - 1)) {

        /* This is ok, we can continue */
        new_offset = succ_offset; // ascending case: condition met, continue
    } else {
        /* Successor or predecessor not in the right order */

        return(0);
    }
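Here is what happens: fil_page_get_prev and fil_page_get_next read the 4-byte predecessor and successor page numbers stored in the header of the page held in frame. If they satisfy the ordering condition, linear read-ahead continues from new_offset.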
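The direction decision can be written as a small pure function. This is only a sketch: next_read_ahead_start and the NO_READ_AHEAD sentinel are hypothetical names introduced here, and the predecessor and successor page numbers are passed in directly instead of being read from a page header.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sentinel meaning "do not read ahead"; not an InnoDB constant. */
#define NO_READ_AHEAD ULONG_MAX

/* Given a border page and the page numbers of its logical neighbours,
   return the page from which read-ahead should continue, or
   NO_READ_AHEAD if the neighbours are not physically adjacent. */
static unsigned long next_read_ahead_start(unsigned long offset,
                                           unsigned long low,
                                           unsigned long high,
                                           unsigned long pred_offset,
                                           unsigned long succ_offset)
{
    assert(low < high);

    if (offset == low && succ_offset == offset + 1) {
        /* Descending scan: continue with the page before this one. */
        return pred_offset;
    }
    if (offset == high - 1 && pred_offset == offset - 1) {
        /* Ascending scan: continue with the page after this one. */
        return succ_offset;
    }
    /* Successor or predecessor not in the right order. */
    return NO_READ_AHEAD;
}
```

For the area [128, 192), page 191 with predecessor 190 and successor 192 continues at 192, while page 128 with predecessor 127 and successor 129 continues at 127. If page 191 had predecessor 150, the chain would not be physically contiguous and nothing would be read ahead.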
    for (i = low; i < high; i++) {
        /* It is only sensible to do read-ahead in the non-sync
        aio mode: hence FALSE as the first parameter */

        if (!ibuf_bitmap_page(i)) {
            count += buf_read_page_low(
                &err, FALSE,
                ibuf_mode | OS_AIO_SIMULATED_WAKE_LATER,
                space, tablespace_version, i);
            if (err == DB_TABLESPACE_DELETED) {
                ut_print_timestamp(stderr);
                fprintf(stderr,
                        " InnoDB: Warning: in"
                        " linear readahead trying to access\n"
                        "InnoDB: tablespace %lu page %lu,\n"
                        "InnoDB: but the tablespace does not"
                        " exist or is just being dropped.\n",
                        (ulong) space, (ulong) i);
            }
        }
    }
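Linear read-ahead also goes through buf_read_page_low, just like random read-ahead, and it does so asynchronously.
That completes linear read-ahead.
Both random and linear read-ahead are skipped under certain conditions, for example when the system is under heavy load. That check is implemented like this: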
    if (buf_pool->n_pend_reads
        > buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT) {
        mutex_exit(&(buf_pool->mutex));

        return(0);
    }
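This says that read-ahead is skipped once the number of pending reads exceeds half of buf_pool->curr_size (that is, curr_size / BUF_READ_AHEAD_PEND_LIMIT). There are many similar conditions in the code, which I will not go through here.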
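As a quick illustration of the load check, here is a tiny standalone sketch. The function name too_many_pending_reads is made up, and the divisor is a parameter; passing 2 matches the "half" described above and is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>

/* True when pending reads exceed curr_size / pend_limit, in which
   case read-ahead should be skipped. Integer division is used, as
   in the excerpt above. */
static bool too_many_pending_reads(unsigned long n_pend_reads,
                                   unsigned long curr_size,
                                   unsigned long pend_limit)
{
    assert(pend_limit > 0);
    return n_pend_reads > curr_size / pend_limit;
}
```

For a pool of 1024 pages and a divisor of 2, exactly 512 pending reads still allows read-ahead, and 513 suppresses it.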