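The streaming approach uses the database's streaming query feature. When the query succeeds, it returns an iterator rather than a full result set, and the loop pulls one row at a time from that iterator for processing. This keeps memory usage low. It is also efficient: unlike pagination, which rescans the table from the beginning for every page, each fetch continues from the position of the previous row.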
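To make the efficiency argument concrete, here is a rough cost model. It is only a sketch under stated assumptions: pagination is assumed to use `LIMIT offset, pageSize`, where the database has to walk past all skipped rows on every page, and "cost" is simply the number of rows touched. The class and method names are hypothetical and are not part of the project.

```java
/** Rough cost model for illustration only; it is an assumption, not a measurement. */
class ScanCostSketch {

    /**
     * Rows touched when paging with LIMIT offset, pageSize:
     * every page first walks past the rows that earlier pages already returned.
     */
    static long rowsTouchedByOffsetPaging(long totalRows, long pageSize) {
        long touched = 0;
        for (long offset = 0; offset < totalRows; offset += pageSize) {
            long rowsInPage = Math.min(pageSize, totalRows - offset);
            touched += offset + rowsInPage; // skipped rows + rows actually returned
        }
        return touched;
    }

    /** Rows touched by a forward-only stream: each row is read exactly once. */
    static long rowsTouchedByStreaming(long totalRows) {
        return totalRows;
    }

    public static void main(String[] args) {
        long totalRows = 1_000_000L; // hypothetical table size
        long pageSize = 2_000L;      // hypothetical page size
        System.out.println("offset paging: " + rowsTouchedByOffsetPaging(totalRows, pageSize));
        System.out.println("streaming:     " + rowsTouchedByStreaming(totalRows));
    }
}
```

Under this model, 1,000,000 rows read in pages of 2,000 cost 250,500,000 row reads with offset paging, against 1,000,000 with streaming. Real databases differ in the details, but the quadratic growth of offset paging is the point.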
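/**
 * Streaming cleanup.
 * Checks the data and deletes invalid backup records together with their backed-up files.
 * "Invalid" means the backed-up file no longer matches a source file,
 * i.e. the source file has been deleted.
 *
 * @param sourceId ID of the backup source to clean up
 */
@SneakyThrows
public void clearBySourceIdV2(Long sourceId) {
    // Get a connection from the dataSource bean
    @Cleanup Connection conn = dataSource.getConnection();
    // Select only the columns that are needed; bind sourceId as a parameter instead of concatenating it
    String sql = "SELECT id, source_file_path, target_file_path FROM backup_file WHERE backup_source_id = ?";
    @Cleanup PreparedStatement stmt = conn.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    // With MySQL Connector/J, a fetch size of Integer.MIN_VALUE asks the driver to stream rows one at a time
    stmt.setFetchSize(Integer.MIN_VALUE);
    stmt.setLong(1, sourceId);
    long start = System.currentTimeMillis();
    @Cleanup ResultSet rs = stmt.executeQuery();
    loopResultSetProcessClear(rs, sourceId);
    log.info("Streaming cleanup took {} s", (System.currentTimeMillis() - start) / 1000);
}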
/**
 * Reads the result set in a loop, processing one row at a time.
 *
 * @param rs       streaming result set over the backup_file rows
 * @param sourceId ID of the backup source being cleaned up
 * @return always 0
 */
@SneakyThrows
private Long loopResultSetProcessClear(ResultSet rs, Long sourceId) {
    // Load the backup source
    BackupSource source = backupSourceService.getById(sourceId);
    if (source == null) {
        throw new ClientException("The backup source to clean up does not exist");
    }
    // Buffers for the records and files waiting to be deleted
    List<Long> removeBackupFileIdList = new ArrayList<>();
    List<String> removeBackupTargetFilePathList = new ArrayList<>();
    // Total number of files for this source
    long totalFileNum = backupFileService.count(Wrappers.query(new BackupFile()).eq("backup_source_id", sourceId));
    // Number of files scanned so far
    long finishFileNum = 0;
    ClearStatistic clearStatistic = new ClearStatistic(0);
    long second = System.currentTimeMillis() / 1000;
    long curSecond;
    // Notify the front end that the cleanup has started
    ClearTask clearTask = ClearTask.builder()
            .id(snowFlakeUtil.nextId())
            .clearSourceRoot(source.getRootPath())
            .totalFileNum(totalFileNum)
            .finishFileNum(0L)
            .clearStatus(0)
            .clearNumProgress("0.0")
            .startTime(new DateTime())
            .clearTime(0L)
            .build();
    Map<String, Object> dataMap = new HashMap<>();
    dataMap.put("clearTask", clearTask);
    notify(WebsocketNoticeEnum.CLEAR_START, dataMap);
    // Fetch one row per iteration; rs.next() returns true while rows remain, false otherwise
    while (rs.next()) {
        // Read the columns of the current row
        long fileId = rs.getLong("id");
        String sourceFilePath = rs.getString("source_file_path");
        String targetFilePath = rs.getString("target_file_path");
        // One more file scanned
        finishFileNum++;
        // Check whether the source file still exists
        File sourceFile = new File(sourceFilePath);
        if (!sourceFile.exists()) {
            // The source file has been deleted, so queue its backup record and backed-up file for removal
            removeBackupFileIdList.add(fileId);
            removeBackupTargetFilePathList.add(targetFilePath);
        }
        // Flush the buffers once a batch of 2000 has accumulated
        if (removeBackupFileIdList.size() >= 2000) {
            clear(removeBackupFileIdList, removeBackupTargetFilePathList, clearStatistic);
        }
        curSecond = System.currentTimeMillis() / 1000;
        if (curSecond > second) {
            second = curSecond;
            // At most once per second, push the current progress to the front end
            clearTask.setFinishFileNum(finishFileNum);
            clearTask.setClearStatus(1);
            clearTask.setFinishDeleteFileNum(clearStatistic.finishDeleteFileNum);
            setClearProgress(clearTask, dataMap);
            notify(WebsocketNoticeEnum.CLEAR_PROCESS, dataMap);
        }
    }
    // Flush once more after the loop so that a final batch smaller than the batch size is not left behind
    clear(removeBackupFileIdList, removeBackupTargetFilePathList, clearStatistic);
    // Notify the front end that the cleanup has succeeded
    clearTask.setFinishFileNum(finishFileNum);
    clearTask.setClearStatus(2);
    clearTask.setFinishDeleteFileNum(clearStatistic.finishDeleteFileNum);
    setClearProgress(clearTask, dataMap);
    notify(WebsocketNoticeEnum.CLEAR_SUCCESS, dataMap);
    return 0L;
}
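/**
 * Performs the cleanup for one batch and then empties both buffers.
 *
 * @param removeBackupFileIdList         IDs of the invalid backup file records
 * @param removeBackupTargetFilePathList paths of the backed-up files to delete
 * @param clearStatistic                 running statistics of deleted files
 */
private void clear(List<Long> removeBackupFileIdList, List<String> removeBackupTargetFilePathList, ClearStatistic clearStatistic) {
    // Nothing buffered: skip, so the delete calls are never issued with empty ID lists
    if (removeBackupFileIdList.isEmpty()) {
        return;
    }
    // Batch-delete the invalid backup file records
    backupFileService.removeByIds(removeBackupFileIdList);
    // Delete the invalid backed-up files from disk
    for (String backupTargetFilePath : removeBackupTargetFilePathList) {
        File removeFile = new File(backupTargetFilePath);
        if (removeFile.exists()) {
            boolean delete = FileUtils.recursionDeleteFiles(removeFile, clearStatistic);
            if (!delete) {
                throw new ServiceException("The file could not be deleted");
            }
        }
    }
    // Batch-delete the backup history records that belong to the invalid backup files
    backupFileHistoryService.removeByFileIds(removeBackupFileIdList);
    removeBackupFileIdList.clear();
    removeBackupTargetFilePathList.clear();
}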
/**
 * Sends a notification to the front end.
 *
 * @param noticeEnum notification type, providing the code and message
 * @param dataMap    payload to send; the code and message are added to it
 */
private void notify(WebsocketNoticeEnum noticeEnum, Map<String, Object> dataMap) {
    dataMap.put("code", noticeEnum.getCode());
    dataMap.put("message", noticeEnum.getDetail());
    webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
}
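The loop above buffers deletions and flushes them in batches of 2000, with one extra flush after the loop. That pattern can be isolated into a small standalone sketch. This is not the project's code: `BatchFlushSketch`, `BatchBuffer`, and `batchSizesFor` are hypothetical names, and the sink here only records batch sizes where the real code would delete records and files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Standalone sketch of the "buffer, flush at threshold, flush once more at the end" pattern. */
class BatchFlushSketch {

    /** Collects items and hands them to the sink in batches of at most batchSize. */
    static final class BatchBuffer<T> {
        private final int batchSize;
        private final Consumer<List<T>> sink;
        private final List<T> pending = new ArrayList<>();

        BatchBuffer(int batchSize, Consumer<List<T>> sink) {
            this.batchSize = batchSize;
            this.sink = sink;
        }

        /** Buffers one item and flushes as soon as the batch is full. */
        void add(T item) {
            pending.add(item);
            if (pending.size() >= batchSize) {
                flush();
            }
        }

        /** Hands the buffered items to the sink; does nothing when the buffer is empty. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            sink.accept(new ArrayList<>(pending)); // pass a copy so the sink may keep it
            pending.clear();
        }
    }

    /** Feeds itemCount IDs through a buffer and returns the size of every batch the sink received. */
    static List<Integer> batchSizesFor(int itemCount, int batchSize) {
        List<Integer> sizes = new ArrayList<>();
        BatchBuffer<Long> buffer = new BatchBuffer<>(batchSize, batch -> sizes.add(batch.size()));
        for (long id = 0; id < itemCount; id++) {
            buffer.add(id);
        }
        buffer.flush(); // final flush for the remainder below the batch size
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(batchSizesFor(5000, 2000));
    }
}
```

Without the final `flush()`, the last partial batch would stay in the buffer and never be processed, which is the case the post-loop `clear(...)` call guards against.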
Testing
In testing, the improved program completed the cleanup in just 70 seconds, roughly 25 times faster than the original approach.