This question already has an answer here:
- Simple Python Challenge: Fastest Bitwise XOR on Data Buffers (11 answers)
I need to XOR two bytes objects. I use this code:
def bxor(b1, b2):  # use xor for bytes
    result = b""
    for b1, b2 in zip(b1, b2):
        result += bytes([b1 ^ b2])
    return result
It works fine when the bytes objects are small, but if I XOR big objects (a few MB) it takes a very long time (a few hours). How can I make it faster?
4 Answers
#1 (5 votes)
When XORing bytes objects with one million elements each, this loop creates roughly one million temporary bytes objects and copies each byte, on average, roughly 500 thousand times from one temporary bytes object to the next. Note that the exact same problem exists for strings (in many other languages, too). The string solution is to create a list of string parts and use ''.join at the end to concatenate them efficiently. You can do the same thing with bytes:
def bxor(b1, b2):  # use xor for bytes
    parts = []
    for b1, b2 in zip(b1, b2):
        parts.append(bytes([b1 ^ b2]))
    return b''.join(parts)
Alternatively, you can use a bytearray, which is mutable and can therefore avoid the problem. It also lets you skip allocating a new bytes object on every iteration; you can just append the byte (an int).
def bxor(b1, b2):  # use xor for bytes
    result = bytearray()
    for b1, b2 in zip(b1, b2):
        result.append(b1 ^ b2)
    return result
You can alternatively return bytes(result) if you want or need a bytes object.
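As a quick sanity check (a minimal usage sketch, not part of the original answer), the bytearray version above XORs two short byte strings like this:

```python
def bxor(b1, b2):
    # XOR two bytes objects byte by byte (bytearray version from above)
    result = bytearray()
    for x, y in zip(b1, b2):
        result.append(x ^ y)
    return bytes(result)

print(bxor(b'\x01\x02\x03', b'\x03\x04\x05'))  # b'\x02\x06\x06'
```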
#2 (4 votes)
Using a bytearray is a lot faster already:
def bxor(b1, b2):
    result = bytearray(b1)
    for i, b in enumerate(b2):
        result[i] ^= b
    return bytes(result)
A quick timeit comparison:
>>> import timeit
>>> b1, b2 = b'abcdefg' * 10, b'aaaaaaa' * 10
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor as it', number=10000)
0.9230150280000089
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_ba as it', number=10000)
0.16270576599890774
This avoids creating new bytes objects for all the concatenations.
The b''.join() method proposed by delnan is not much better than the original version:
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_join as it', number=10000)
0.9936718749995634
And a re-run with bytestrings 100 times larger:
>>> b1, b2 = b'abcdefg' * 1000, b'aaaaaaa' * 1000
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor as it', number=1000)
11.032563796999966
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_join as it', number=1000)
9.242204494001271
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_ba as it', number=1000)
1.762020197998936
to show that bytes.join() is faster than repeated concatenation.
A final run on 7 million bytes, repeated 10 times, with just the bytearray version; I ran out of patience with the other versions:
>>> b1, b2 = b'abcdefg' * 1000000, b'aaaaaaa' * 1000000
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_ba as it', number=10)
16.18445999799951
#3 (4 votes)
Adding this in another answer, 'cause it is one:
If you want something faster than the "manual" methods given, there's always Numpy:
import numpy

def bxor_numpy(b1, b2):
    # frombuffer/tobytes replace the deprecated fromstring/tostring
    n_b1 = numpy.frombuffer(b1, dtype='uint8')
    n_b2 = numpy.frombuffer(b2, dtype='uint8')
    return (n_b1 ^ n_b2).tobytes()
and it's fast:
first_random = urandom(100000)
second_random = urandom(100000)
min(Timer(partial(bxor_inplace, first_random, second_random)).repeat(10, 100))
#>>> 1.5381054869794752
min(Timer(partial(bxor_append, first_random, second_random)).repeat(10, 100))
#>>> 1.5624085619929247
min(Timer(partial(bxor_numpy, first_random, second_random)).repeat(10, 100))
#>>> 0.009930026979418471
So it's 150x faster than the best alternatives posted here.
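If NumPy isn't available, a stdlib-only trick worth mentioning (a sketch, not from any of the answers above) is to XOR the buffers as arbitrary-precision integers; int.from_bytes and int.to_bytes do the per-byte work in C:

```python
def bxor_int(b1, b2):
    # Convert both equal-length buffers to big integers, XOR once, convert back.
    # to_bytes(len(b1), ...) preserves leading zero bytes in the result.
    n = len(b1)
    i1 = int.from_bytes(b1, 'big')
    i2 = int.from_bytes(b2, 'big')
    return (i1 ^ i2).to_bytes(n, 'big')

print(bxor_int(b'\x01\x02', b'\x03\x04'))  # b'\x02\x06'
```

This assumes both inputs have the same length, matching how the answers above use zip or enumerate.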
#4 (1 vote)
Martijn Pieters' timings are a bit different to mine:
def bxor_add(b1, b2):  # use xor for bytes
    result = b""
    for b1, b2 in zip(b1, b2):
        result += bytes([b1 ^ b2])
    return result

def bxor_inplace(b1, b2):
    result = bytearray(b1)
    for i, b in enumerate(b2):
        result[i] ^= b
    return bytes(result)

def bxor_join(b1, b2):  # use xor for bytes
    parts = []
    for b1, b2 in zip(b1, b2):
        parts.append(bytes([b1 ^ b2]))
    return b''.join(parts)

def bxor_append(b1, b2):  # use xor for bytes
    result = bytearray()
    for b1, b2 in zip(b1, b2):
        result.append(b1 ^ b2)
    return bytes(result)
#>>>
from os import urandom
from timeit import Timer
from functools import partial
first_random = urandom(200000)
second_random = urandom(200000)
Timer(partial(bxor_add, first_random, second_random)).timeit(1)
#>>> 1.3261873809969984
Timer(partial(bxor_inplace, first_random, second_random)).timeit(1)
#>>> 0.03055390200461261
Timer(partial(bxor_join, first_random, second_random)).timeit(1)
#>>> 0.15852201101370156
Timer(partial(bxor_append, first_random, second_random)).timeit(1)
#>>> 0.030534288001945242
first_random = urandom(10000000)
second_random = urandom(10000000)
Timer(partial(bxor_inplace, first_random, second_random)).timeit(1)
#>>> 1.5432947289955337
Timer(partial(bxor_join, first_random, second_random)).timeit(1)
#>>> 7.90503858300508
Timer(partial(bxor_append, first_random, second_random)).timeit(1)
#>>> 1.5145326450001448
I'd go with the append version for clarity and speed.
For clarification, I don't think the append method is meaningfully faster than the inplace version; I just think it's a tiny bit more straightforward.
Nevertheless, because it was requested:
first_random = urandom(100000)
second_random = urandom(100000)
min(Timer(partial(bxor_inplace, first_random, second_random)).repeat(10, 100))
#>>> 1.5381054869794752
min(Timer(partial(bxor_append, first_random, second_random)).repeat(10, 100))
#>>> 1.5196998479950707
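As a final sanity check (a minimal sketch, not present in the original answers), all four variants defined above should produce identical output on the same random input:

```python
from os import urandom

# The four variants from the answer above, condensed
def bxor_add(b1, b2):
    result = b""
    for x, y in zip(b1, b2):
        result += bytes([x ^ y])
    return result

def bxor_inplace(b1, b2):
    result = bytearray(b1)
    for i, b in enumerate(b2):
        result[i] ^= b
    return bytes(result)

def bxor_join(b1, b2):
    return b''.join(bytes([x ^ y]) for x, y in zip(b1, b2))

def bxor_append(b1, b2):
    result = bytearray()
    for x, y in zip(b1, b2):
        result.append(x ^ y)
    return bytes(result)

a, b = urandom(1024), urandom(1024)
outputs = {f(a, b) for f in (bxor_add, bxor_inplace, bxor_join, bxor_append)}
assert len(outputs) == 1  # all four agree
```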