Converting a binary string to a bytearray in Python 3

Despite the many related questions, I can't find any that match my problem. I'd like to change a binary string (for example, "0110100001101001") into a byte array (same example, b"hi").

I tried this:

bytes([int(i) for i in "0110100001101001"])

but I got:

b'\x00\x01\x01\x00\x01' #... and so on

What's the correct way to do this in Python 3?

3 Solutions

#1

Here's an example of the first approach that Patrick mentioned: convert the bit string to an int and take 8 bits at a time. The natural way to do that peels off the least significant byte first, so the bytes come out in reverse order. To get them back into the proper order I use extended slice notation on the bytearray with a step of -1: b[::-1].

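def bitstring_to_bytes(s):
    v = int(s, 2)
    b = bytearray()
    # Loop once per input byte rather than `while v`, so that
    # leading zero bytes (e.g. "00000000...") are not dropped.
    for _ in range((len(s) + 7) // 8):
        b.append(v & 0xff)  # take the lowest 8 bits
        v >>= 8             # then discard them
    return bytes(b[::-1])   # lowest byte came first, so reverse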

s = "0110100001101001"
print(bitstring_to_bytes(s))
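
To see why the reversal is needed, here is a small sketch that collects the bytes in the order the masking loop produces them. The list name `produced` is only illustrative and is not part of the answer's code.

```python
v = int("0110100001101001", 2)

produced = []
while v:
    produced.append(v & 0xff)  # lowest 8 bits first
    v >>= 8

print(produced)                   # [105, 104]
print(bytes(produced))            # b'ih'
print(bytes(reversed(produced)))  # b'hi'
```

The low byte (105, 'i') comes out before the high byte (104, 'h'), which is why the result has to be reversed before it is returned.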

Clearly, Patrick's second way is more compact. :)

However, there's a better way to do this in Python 3: use the int.to_bytes method:

def bitstring_to_bytes(s):
    # Round the byte count up so inputs whose length is not a multiple of 8 still fit.
    return int(s, 2).to_bytes((len(s) + 7) // 8, byteorder='big')
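
As a quick sanity check on the to_bytes approach, the sketch below converts a bit string to bytes and back again. The helper names `bits_to_bytes` and `bytes_to_bits` are made up for this illustration. It assumes the byte count should be rounded up when the input length is not a multiple of 8.

```python
def bits_to_bytes(s):
    # Round the byte count up so a 9-bit string still fits in 2 bytes.
    n_bytes = (len(s) + 7) // 8
    return int(s, 2).to_bytes(n_bytes, byteorder='big')

def bytes_to_bits(b):
    # Render the integer value in binary, zero-padded to 8 bits per byte.
    width = len(b) * 8
    return format(int.from_bytes(b, byteorder='big'), '0{}b'.format(width))

print(bits_to_bytes("0110100001101001"))  # b'hi'
print(bits_to_bytes("0000000001101001"))  # b'\x00i'
print(bytes_to_bits(b"hi"))               # 0110100001101001
```

Because the length is passed explicitly, leading zero bytes survive the conversion.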

#2

>>> zero_one_string = "0110100001101001"
>>> int(zero_one_string, 2).to_bytes((len(zero_one_string) + 7) // 8, 'big')
b'hi'

This returns a bytes object, which is an immutable sequence of bytes. If you want a bytearray (a mutable sequence of bytes), just call bytearray(b'hi').
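
A short sketch of the difference, reusing the expression from this answer. The names `data` and `buf` are illustrative only.

```python
zero_one_string = "0110100001101001"
data = int(zero_one_string, 2).to_bytes((len(zero_one_string) + 7) // 8, 'big')

buf = bytearray(data)  # mutable copy of the immutable bytes object
buf[1] = ord('o')      # item assignment works on a bytearray

print(buf)   # bytearray(b'ho')
print(data)  # b'hi'
```

Attempting the same item assignment on `data` itself would raise a TypeError, since bytes objects do not support it.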

#3

You have to either convert it to an int and take 8 bits at a time, or chop it into 8-character-long strings and then convert each of them into ints. In Python 3, as PM 2Ring's and J.F. Sebastian's answers show, the to_bytes() method of int lets you do the first method very efficiently. That method is not available in Python 2, so for people stuck with it, the second method may be more efficient. Here is an example:

>>> s = "0110100001101001"
>>> bytes(int(s[i : i + 8], 2) for i in range(0, len(s), 8))
b'hi'

To break this down: the range() call starts at index 0 and yields indices into the source string, advancing 8 at a time. Since s is 16 characters long, it gives us two indices:

>>> list(range(0, 50, 8))
[0, 8, 16, 24, 32, 40, 48]
>>> list(range(0, len(s), 8))
[0, 8]

(We use list() here to display the values, because range() in Python 3 returns a lazy range object rather than a list.)

We can then build on this to break the string apart by taking slices of it that are 8 characters long:

>>> [s[i : i + 8] for i in range(0, len(s), 8)]
['01101000', '01101001']

Then we can convert each of those into integers, base 2:

>>> list(int(s[i : i + 8], 2) for i in range(0, len(s), 8))
[104, 105]

And finally, we wrap the whole thing in bytes() to get the answer:

>>> bytes(int(s[i : i + 8], 2) for i in range(0, len(s), 8))
b'hi'
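
One caveat with slicing: if the string's length is not a multiple of 8, the last slice is short, and the result differs from what the int-based approach gives. The sketch below wraps the steps above in a function and left-pads the input with zeros first. The function name `bitstring_to_bytes_chunked` and the choice to pad on the left are assumptions made for this illustration, not part of the answer.

```python
def bitstring_to_bytes_chunked(s):
    # Left-pad with zeros to a multiple of 8 so every slice is a full byte.
    padded = s.zfill((len(s) + 7) // 8 * 8)
    # Convert each 8-character slice to an int, then pack the ints into bytes.
    return bytes(int(padded[i : i + 8], 2) for i in range(0, len(padded), 8))

print(bitstring_to_bytes_chunked("0110100001101001"))  # b'hi'
print(bitstring_to_bytes_chunked("101101001"))         # b'\x01i'
```

Padding on the left keeps the numeric value of the bit string unchanged, so the output matches the to_bytes result for the same input.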
