Python: How to hash a string into an 8-digit number?

Is there any way that I can hash a random string into an 8-digit number without implementing any algorithms myself? Thanks.

3 solutions

#1

Yes, you can use the built-in hashlib module or the built-in hash() function. Then keep only the last eight digits, using either a modulo operation or string slicing on the integer form of the hash:

>>> s = 'she sells sea shells by the sea shore'

>>> # Use hashlib
>>> import hashlib
>>> int(hashlib.sha1(s).hexdigest(), 16) % (10 ** 8)
58097614L

>>> # Use hash()
>>> abs(hash(s)) % (10 ** 8)
82148974
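For reference, here is a minimal Python 3 sketch of the two reductions mentioned above, modulo and string slicing. The function names are made up for illustration, and the string is encoded as UTF-8 first because hashlib in Python 3 accepts only bytes:

```python
import hashlib


def last_eight_modulo(text):
    """Return the last eight decimal digits of the SHA-1 digest as an int."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return int(digest, 16) % 10**8


def last_eight_slice(text):
    """Return the last eight decimal digits of the SHA-1 digest as a str."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return str(int(digest, 16))[-8:]


s = 'she sells sea shells by the sea shore'
print(last_eight_modulo(s))
print(last_eight_slice(s))
```

Both pick out the same digits. The difference is that slicing returns a string and so keeps any leading zeros, while the modulo form returns an int and drops them.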

#2

Raymond's answer is great for Python 2 (though you don't need the abs(), nor the parens around 10 ** 8). However, for Python 3, there are important caveats. First, you'll need to make sure you are passing an encoded string (bytes). Second, these days it's usually better to shy away from SHA-1 and use something like SHA-256 instead. So, the hashlib approach would be:

>>> import hashlib
>>> s = 'your string'
>>> int(hashlib.sha256(s.encode('utf-8')).hexdigest(), 16) % 10**8
80262417
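If you need exactly eight characters every time, note that the modulo result can come out shorter than eight digits when the leading digits are zero. A small sketch of a wrapper that zero-pads the result follows; the name `digits_from_string` and its `digits` parameter are hypothetical, not part of hashlib:

```python
import hashlib


def digits_from_string(text, digits=8):
    """Map text to a fixed-width string of decimal digits via SHA-256."""
    # hashlib needs bytes in Python 3, so encode the string first.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    number = int(digest, 16) % 10**digits
    # zfill pads with leading zeros so the width is always `digits`.
    return str(number).zfill(digits)


print(digits_from_string("your string"))
print(digits_from_string("your string", digits=4))
```

Returning a string rather than an int is a deliberate choice here, since an int cannot carry leading zeros.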

If you want to use the hash() function instead, the important caveat is that, unlike in Python 2.x, in Python 3.x (3.3 and later, where string hash randomization is on by default) the result of hash() is only consistent within a process, not across Python invocations. See here:

$ python -V
Python 2.7.5
$ python -c 'print(hash("foo"))'
-4177197833195190597
$ python -c 'print(hash("foo"))'
-4177197833195190597

$ python3 -V
Python 3.4.2
$ python3 -c 'print(hash("foo"))'
5790391865899772265
$ python3 -c 'print(hash("foo"))'
-8152690834165248934

This means the hash()-based solution suggested, which can be shortened to just:

hash(s) % 10**8

will only return the same value within a given script run:

#Python 2:
$ python2 -c 's="your string"; print(hash(s) % 10**8)'
52304543
$ python2 -c 's="your string"; print(hash(s) % 10**8)'
52304543

#Python 3:
$ python3 -c 's="your string"; print(hash(s) % 10**8)'
12954124
$ python3 -c 's="your string"; print(hash(s) % 10**8)'
32065451

So, depending on whether this matters in your application (it did in mine), you'll probably want to stick to the hashlib-based approach.
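To check this on your own machine, here is a rough sketch that runs both approaches in separate child interpreters. It assumes Python 3.7+ (for `capture_output`) and uses the `PYTHONHASHSEED` environment variable, which controls string hash randomization; the helper name `run_snippet` is made up for this example:

```python
import os
import subprocess
import sys


def run_snippet(code, hash_seed):
    """Run code in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


BUILTIN = 'print(hash("your string") % 10**8)'
HASHLIB = (
    'import hashlib; '
    'print(int(hashlib.sha256("your string".encode("utf-8")).hexdigest(), 16) % 10**8)'
)

# Different seeds stand in for separate "python3 -c" invocations.
builtin_runs = [run_snippet(BUILTIN, seed) for seed in (1, 2)]
hashlib_runs = [run_snippet(HASHLIB, seed) for seed in (1, 2)]
# PYTHONHASHSEED=0 disables randomization, so hash() repeats across runs.
fixed_seed_runs = [run_snippet(BUILTIN, 0) for _ in range(2)]

print("hash():                 ", builtin_runs)
print("hashlib:                ", hashlib_runs)
print("hash(), PYTHONHASHSEED=0:", fixed_seed_runs)
```

Pinning `PYTHONHASHSEED` makes hash() repeatable, but the value can still differ between Python versions and builds, so hashlib remains the safer choice for anything you store or share.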

#3

Just to complete JJC's answer: in Python 3.5.3, hashlib returns the same digest across invocations when used this way:

$ python3 -c '
import hashlib
hash_object = hashlib.sha256(b"Caroline")
hex_dig = hash_object.hexdigest()
print(hex_dig)
'
739061d73d65dcdeb755aa28da4fea16a02b9c99b4c2735f2ebfa016f3e7fded
$ python3 -c '
import hashlib
hash_object = hashlib.sha256(b"Caroline")
hex_dig = hash_object.hexdigest()
print(hex_dig)
'
739061d73d65dcdeb755aa28da4fea16a02b9c99b4c2735f2ebfa016f3e7fded

$ python3 -V
Python 3.5.3
