I'm using the following code to get an MD5 hash for several files with an approx. total size of 1 GB:
import hashlib

md5 = hashlib.md5()
with open(filename, 'rb') as f:
    for chunk in iter(lambda: f.read(128 * md5.block_size), b''):
        md5.update(chunk)
fileHash = md5.hexdigest()
For me it's pretty fast: it takes about 3 seconds to complete. But unfortunately for my users (who have old PCs), this method is very slow, and from my observations it may take about 4 minutes for some users to get all of the file hashes. This is a very annoying process for them, but at the same time I think this is the simplest and fastest way possible - am I right?
Would it be possible to speed up the hash-collecting process somehow?
1 Answer
#1
I have a fairly weak laptop as well, and I just tried it: I can md5 one GB in four seconds too. If it takes several minutes, I suspect the bottleneck is not the calculation but reading the file from the hard disk. Try reading in 1 MB blocks, i.e. f.read(2**20). That should need far fewer reads and increase the overall reading speed.
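For reference, a minimal sketch of that suggestion: it is the same hashing loop as in the question, only with a 1 MB chunk size instead of 128*md5.block_size (the helper name md5_of_file is just for illustration):

import hashlib

def md5_of_file(filename, chunk_size=2**20):
    # Hash the file in 1 MB chunks so far fewer read() calls hit the disk.
    md5 = hashlib.md5()
    with open(filename, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()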