When a MySQL instance contains a large number of databases, startup after an abnormal shutdown (crash recovery) is very slow.
l_data/xb_doublewrite
2023-02-10T03:10:13.386776Z 0 [Note] InnoDB: Opened 3 undo tablespaces
2023-02-10T03:10:13.386799Z 0 [Note] InnoDB: 3 undo tablespaces made active
2023-02-10T03:10:13.387079Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2023-02-10T03:10:13.420026Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 5566077531443
2023-02-10T03:10:13.420054Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5566077531864
2023-02-10T03:10:13.420635Z 0 [Note] InnoDB: Database was not shutdown normally!
2023-02-10T03:10:13.420648Z 0 [Note] InnoDB: Starting crash recovery.
2023-02-10T03:10:13.474616Z 0 [Note] InnoDB: Created parallel doublewrite buffer at /data/mysql_data/xb_doublewrite, size 7864320 bytes
2023-02-10T03:10:13.705255Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
2023-02-10T03:10:14.318639Z 0 [Note] InnoDB: Apply batch completed
2023-02-10T03:10:14.318686Z 0 [Note] InnoDB: Last MySQL binlog file position 0 704652105, file name mysql-bin.004248
2023-02-10T03:18:00.657774Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2023-02-10T03:18:00.657826Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
As the log shows, after "Apply batch completed" the server waited almost 8 minutes before it started cleaning up the temporary tablespace. Let's look at why this step is so slow.
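For reference, the gap can be computed directly from the two timestamps quoted above ("Apply batch completed" at 03:10:14.318686 and the ibtmp1 cleanup at 03:18:00.657774). A trivial check, with the timestamps hard-coded:

#include <cstdio>

// Compute the gap between the two log timestamps above,
// just to verify the "almost 8 minutes" observation.
int main() {
    const double apply_done = 3 * 3600 + 10 * 60 + 14.318686; // 03:10:14.318686
    const double tmp_clean  = 3 * 3600 + 18 * 60 +  0.657774; // 03:18:00.657774
    const double gap = tmp_clean - apply_done;                // seconds
    std::printf("gap = %.1f s (~%.1f min)\n", gap, gap / 60.0);
    return 0;
}

This prints roughly 466 seconds, i.e. about 7.8 minutes.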
After printing "Last MySQL binlog file position", the server starts the purge subsystem (purge_sys),
then recovers the transaction (trx) list left over after the checkpoint,
and then creates and checks the data dictionary system tables.
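A rough, runnable sketch of this ordering is below. The function names are placeholders for the real InnoDB steps, not actual MySQL symbols; only the order and the validate condition follow the flow described here and in the source quoted next.

#include <cstdio>

// Illustrative stand-ins for the real InnoDB steps; each just logs
// the step it represents.
static void start_purge_sys()            { std::puts("start purge_sys"); }
static void rebuild_trx_list()           { std::puts("rebuild trx list from undo logs"); }
static void create_or_check_sys_tables() { std::puts("create/check dictionary system tables"); }
static void check_tablespaces(bool validate) {
    std::printf("check tablespaces and store max id (validate=%d)\n", (int) validate);
}

// Rough sketch of the work that runs after the
// "Last MySQL binlog file position" line is printed.
int main() {
    const bool recv_needed_recovery = true;  // the server crashed
    const int  srv_force_recovery   = 0;     // innodb_force_recovery = 0

    start_purge_sys();
    rebuild_trx_list();
    create_or_check_sys_tables();

    const bool validate = recv_needed_recovery && srv_force_recovery == 0;
    check_tablespaces(validate);             // the slow step when there are many .ibd files
    return 0;
}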
Next comes the tablespace check that also stores the maximum tablespace id (dict_check_tablespaces_and_store_max_id). This is the time-consuming part of recovery: when each tablespace is opened, its header page is read and validated against the data dictionary.
/* The 'validate' flag indicates that when a tablespace
is opened, we also read the header page and validate
the contents to the data dictionary. This is time
consuming, especially for databases with lots of ibd
files. So only do it after a crash and not forcing
recovery. Open rw transactions at this point is not
a good reason to validate. */
bool validate = recv_needed_recovery && srv_force_recovery == 0;

dict_check_tablespaces_and_store_max_id(validate);
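To get a feel for why this can dominate the 8-minute gap: with validate == true, every .ibd file is opened and its header page is read, which costs at least one random I/O per file. A back-of-envelope estimate, where the file count and per-file latency are made-up assumptions for illustration only, not measured values:

#include <cstdio>

// Back-of-envelope estimate of the header-page validation pass.
// Both numbers below are assumptions, not measurements.
int main() {
    const long   n_ibd_files   = 200000; // hypothetical: many databases, many .ibd files
    const double ms_per_file   = 2.0;    // hypothetical: open() + header page read + checks
    const double total_seconds = n_ibd_files * ms_per_file / 1000.0;
    std::printf("estimated validation time: %.0f s (~%.1f min)\n",
                total_seconds, total_seconds / 60.0);
    return 0;
}

Under these assumed numbers the pass alone takes around 400 seconds, which is the same order of magnitude as the gap seen in the log.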
During this validation, the records in SYS_TABLESPACES and SYS_TABLES are scanned; the SYS_TABLESPACES scan only opens general tablespaces and skips file-per-table tablespaces:

/* Open all general tablespaces found in SYS_TABLESPACES. */
ulint max1 = dict_check_sys_tablespaces(validate);

/* Open all tablespaces referenced in SYS_TABLES.
This will update SYS_TABLESPACES and SYS_DATAFILES if it
finds any file-per-table tablespaces not already there. */
ulint max2 = dict_check_sys_tables(validate);
In fil_ibd_open(), a datafile has three possible locations: the default location, the location recorded in the data dictionary, and a remote location:
Datafile        df_default;   /* default location */
Datafile        df_dict;      /* dictionary location */
RemoteDatafile  df_remote;    /* remote location */
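To make the three locations concrete, here is a small sketch of where the candidate paths for one table typically come from in 5.7. The paths and table name are made-up examples, not values from this instance:

#include <cstdio>
#include <string>

// Illustrative only: where the three candidate paths for one tablespace
// typically come from. All concrete paths below are invented examples.
int main() {
    const std::string datadir  = "/data/mysql_data/";
    const std::string db_table = "db1/t1";

    // df_default: <datadir>/<db>/<table>.ibd
    const std::string def_path = datadir + db_table + ".ibd";

    // df_dict: a path recorded in the data dictionary (e.g. SYS_DATAFILES),
    // typically present when the table was created with DATA DIRECTORY.
    const std::string dict_path = "/data2/mysql_remote/db1/t1.ibd";

    // df_remote: a path read from the .isl link file <datadir>/<db>/<table>.isl
    const std::string isl_file    = datadir + db_table + ".isl";
    const std::string remote_path = "/data2/mysql_remote/db1/t1.ibd"; // contents of the .isl file

    std::printf("default: %s\ndict:    %s\nremote:  %s (from %s)\n",
                def_path.c_str(), dict_path.c_str(),
                remote_path.c_str(), isl_file.c_str());
    return 0;
}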
During validation, the file at each of these three candidate locations is opened and its first page is read and checked:
/* Read and validate the first page of these three tablespace
locations, if found. */
valid_tablespaces_found +=
        (df_remote.validate_to_dd(id, flags, for_import) == DB_SUCCESS) ? 1 : 0;

valid_tablespaces_found +=
        (df_default.validate_to_dd(id, flags, for_import) == DB_SUCCESS) ? 1 : 0;

valid_tablespaces_found +=
        (df_dict.validate_to_dd(id, flags, for_import) == DB_SUCCESS) ? 1 : 0;
There is room for optimization here: first check whether the three locations actually resolve to the same file, and if they do, read and validate the first page only once. In addition, this function validates the tablespaces one by one sequentially; the validation could be parallelized at the database level to speed up recovery. A minimal sketch of the first idea is shown below.
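This is only a sketch of the proposed change, not MySQL code: validate_first_page() is a hypothetical stand-in for the Datafile::validate_to_dd() header-page check, and a real version would also canonicalize the candidate paths (e.g. with realpath()) before comparing them.

#include <cstdio>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the header-page check done by validate_to_dd():
// here it only simulates the expensive read.
static bool validate_first_page(const std::string& path) {
    std::printf("validating header page of %s\n", path.c_str());
    return true;
}

// Sketch of the proposed optimization: deduplicate the default/dict/remote
// candidates, then validate each distinct file only once.
static int validate_candidates(const std::vector<std::string>& candidates) {
    std::set<std::string> distinct;
    int valid = 0;
    for (const std::string& path : candidates) {
        if (path.empty() || !distinct.insert(path).second) {
            continue; // same file as an earlier candidate: skip the re-read
        }
        if (validate_first_page(path)) {
            ++valid;
        }
    }
    return valid;
}

int main() {
    // In the common case all three candidates point at the same .ibd file,
    // so only one header page is read instead of three.
    validate_candidates({"/data/mysql_data/db1/t1.ibd",
                         "/data/mysql_data/db1/t1.ibd",
                         "/data/mysql_data/db1/t1.ibd"});
    return 0;
}

The second idea, per-database parallelism, would partition the per-tablespace loop by schema and run each partition on its own worker thread; each worker then touches a disjoint set of .ibd files, although access to the shared dictionary structures would still need synchronization.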