MySQL crash recovery恢复慢分析

mysql中的数据库比较多的时候，异常启动时间很慢

l_data/xb_doublewrite
2023-02-10T03:10:13.386776Z 0 [Note] InnoDB: Opened 3 undo tablespaces
2023-02-10T03:10:13.386799Z 0 [Note] InnoDB: 3 undo tablespaces made active
2023-02-10T03:10:13.387079Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2023-02-10T03:10:13.420026Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 5566077531443
2023-02-10T03:10:13.420054Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5566077531864
2023-02-10T03:10:13.420635Z 0 [Note] InnoDB: Database was not shutdown normally!
2023-02-10T03:10:13.420648Z 0 [Note] InnoDB: Starting crash recovery.
2023-02-10T03:10:13.474616Z 0 [Note] InnoDB: Created parallel doublewrite buffer at /data/mysql_data/xb_doublewrite, size 7864320 bytes
2023-02-10T03:10:13.705255Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
2023-02-10T03:10:14.318639Z 0 [Note] InnoDB: Apply batch completed
2023-02-10T03:10:14.318686Z 0 [Note] InnoDB: Last MySQL binlog file position 0 704652105, file name mysql-bin.004248
2023-02-10T03:18:00.657774Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2023-02-10T03:18:00.657826Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables

可以看到，在Apply batch completed后，等待了8分钟才开始清理临时表空间，看下为什么会怎么慢

在输出Last MySQL binlog file position后，会启动purge sys

恢复需要checkpoint后的trx list

然后需要创建检查系统表

接下来的表空间检查和存储表空间的最大id，这个在恢复的过程中比较耗时，打开每个表空间的时候，会读取每个header 页，验证字典

The 'validate' flag indicates that when a tablespace
            is opened, we also read the header page and validate
            the contents to the data dictionary. This is time
            consuming, especially for databases with lots of ibd
            files.  So only do it after a crash and not forcing
            recovery.  Open rw transactions at this point is not
            a good reason to validate. */
bool validate = recv_needed_recovery
&& srv_force_recovery == 0;
dict_check_tablespaces_and_store_max_id(validate);

在校验的时候，会扫描sys_tablespace和sys_table中的记录，sys_tablespace中的扫描会过滤掉file-per-table tablespaces
    /* Open all general tablespaces found in SYS_TABLESPACES. */
ulint   max1 = dict_check_sys_tablespaces(validate);
    /* Open all tablespaces referenced in SYS_TABLES.
    This will update SYS_TABLESPACES and SYS_DATAFILES if it
    finds any file-per-table tablespaces not already there. */
ulint   max2 = dict_check_sys_tables(validate);

在fil_ibd_open中，文件有三个位置，默认位置，数据字典位置，和远程位置

Datafile    df_default; /* default location */
Datafile    df_dict;    /* dictionary location */
RemoteDatafile  df_remote;  /* remote location */

在校验时候，针对这三个可能的位置文件，都进行了文件打开，进行了第一个页的校验。

/* Read and validate the first page of these three tablespace
    locations, if found. */
valid_tablespaces_found +=
(df_remote.validate_to_dd(id, flags, for_import)
== DB_SUCCESS) ? 1 : 0;
valid_tablespaces_found +=
(df_default.validate_to_dd(id, flags, for_import)
== DB_SUCCESS) ? 1 : 0;
valid_tablespaces_found +=
(df_dict.validate_to_dd(id, flags, for_import)
== DB_SUCCESS) ? 1 : 0;

这里可以进行优化吧，先判断这3个位置是不是同一个，相同的话，就执行一次第一个页的校验即可。另外这个函数中是顺序一个个表进行校验的，可以按库级别进行并行，加快恢复的速度。

秒客网

MySQL crash recovery恢复慢分析

相关文章