Converting an int to bytes in Python 3

Time: 2021-11-21 18:10:48

I was trying to build this bytes object in Python 3:

b'3\r\n'

so I tried the obvious (for me), and found a weird behaviour:

>>> bytes(3) + b'\r\n'
b'\x00\x00\x00\r\n'

Apparently:

>>> bytes(10)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

Reading the documentation, I could not find any explanation of why the bytes conversion works this way. However, I did find some surprised comments in this Python issue about adding format to bytes (see also Python 3 bytes formatting):

http://bugs.python.org/issue3982

This interacts even more poorly with oddities like bytes(int) returning zeroes now

and:

It would be much more convenient for me if bytes(int) returned the ASCIIfication of that int; but honestly, even an error would be better than this behavior. (If I wanted this behavior - which I never have - I'd rather it be a classmethod, invoked like "bytes.zeroes(n)".)

Can someone explain to me where this behaviour comes from?

9 answers

#1


83  

That's the way it was designed - and it makes sense because usually, you would call bytes on an iterable instead of a single integer:

>>> bytes([3])
b'\x03'

The docs state this, as well as the docstring for bytes:

 >>> help(bytes)
 ...
 bytes(int) -> bytes object of size given by the parameter initialized with null bytes
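
To make the difference concrete, here is a small sketch (the variable names are illustrative, not from the answer) contrasting three things one might mean by "convert 3 to bytes":

```python
# An int argument is treated as a length: three null bytes.
zero_filled = bytes(3)
# An iterable of ints gives one byte per element.
single_byte = bytes([3])
# The decimal digit as ASCII text, which is what the question wanted.
ascii_digit = str(3).encode("ascii")

print(zero_filled)  # b'\x00\x00\x00'
print(single_byte)  # b'\x03'
print(ascii_digit)  # b'3'
```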

#2


79  

From Python 3.2 onwards you can use int.to_bytes:

>>> (1024).to_bytes(2, byteorder='big')
b'\x04\x00'

https://docs.python.org/3/library/stdtypes.html#int.to_bytes

def int_to_bytes(x):
    return x.to_bytes((x.bit_length() + 7) // 8, 'big')

def int_from_bytes(xbytes):
    return int.from_bytes(xbytes, 'big')

Accordingly, x == int_from_bytes(int_to_bytes(x)).
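
Note that the helpers above only handle non-negative integers: a negative x raises OverflowError, and 0 is encoded as the empty bytes object b''. If negative values matter, a signed variant might look like the sketch below. The two function names are hypothetical, and the length formula is an assumption chosen to leave room for the sign bit and to use at least one byte.

```python
def int_to_bytes_signed(x: int) -> bytes:
    # Hypothetical helper: one extra bit for the sign, and never fewer than one byte.
    length = max(1, (x.bit_length() + 8) // 8)
    return x.to_bytes(length, "big", signed=True)


def int_from_bytes_signed(data: bytes) -> int:
    # Inverse of int_to_bytes_signed (two's complement, big-endian).
    return int.from_bytes(data, "big", signed=True)


print(int_to_bytes_signed(1024))  # b'\x04\x00'
print(int_to_bytes_signed(-1))    # b'\xff'
print(int_to_bytes_signed(0))     # b'\x00'
```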

#3


23  

You can use struct.pack from the struct module:

In [11]: struct.pack(">I", 1)
Out[11]: '\x00\x00\x00\x01'

The ">" is the byte-order (big-endian) and the "I" is the format character. So you can be specific if you want to do something else:

In [12]: struct.pack("<H", 1)
Out[12]: '\x01\x00'

In [13]: struct.pack("B", 1)
Out[13]: '\x01'

This works the same on both Python 2 and Python 3, except that Python 3 displays the results as bytes literals (for example b'\x00\x00\x00\x01').

Note: the inverse operation (bytes to int) can be done with unpack.
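
A minimal round-trip sketch on Python 3, assuming the same ">I" format as above (the variable names are illustrative):

```python
import struct

packed = struct.pack(">I", 1)            # 4-byte unsigned int, big-endian
(value,) = struct.unpack(">I", packed)   # unpack always returns a tuple

print(packed)  # b'\x00\x00\x00\x01'
print(value)   # 1

# struct formats have a fixed width, so out-of-range values raise struct.error.
try:
    struct.pack("B", 256)
except struct.error as exc:
    print("cannot pack:", exc)
```

Unlike int.to_bytes, the width is set by the format character, so it suits fixed-layout binary protocols better than arbitrary-size integers.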

#4


10  

Python 3.5+ introduces %-interpolation (printf-style formatting) for bytes:

>>> b'%d\r\n' % 3
b'3\r\n'

See PEP 0461 -- Adding % formatting to bytes and bytearray.

On earlier versions, you could format a str and then .encode('ascii') the result:

>>> s = '%d\r\n' % 3
>>> s.encode('ascii')
b'3\r\n'

Note: It is different from what int.to_bytes produces:

>>> n = 3
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big') or b'\0'
b'\x03'
>>> b'3' == b'\x33' != b'\x03'
True
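
The two approaches can be wrapped in a small helper for comparison. This is only a sketch: ascii_line is a hypothetical name, and the %-formatting line assumes Python 3.5 or later.

```python
def ascii_line(n: int) -> bytes:
    # Hypothetical helper: decimal digits of n as ASCII, followed by CRLF.
    return str(n).encode("ascii") + b"\r\n"


print(ascii_line(3))                   # b'3\r\n'
print(ascii_line(3) == b"%d\r\n" % 3)  # True
print(ascii_line(1024))                # b'1024\r\n'
print((1024).to_bytes(2, "big"))       # b'\x04\x00' (binary value, not digits)
```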

#5


9  

The documentation says:

bytes(int) -> bytes object of size given by the parameter
              initialized with null bytes

The sequence:

b'3\r\n'

It consists of the character '3' (decimal 51), the character '\r' (13), and the character '\n' (10).

Therefore, one way is to build it from those values, for example:

>>> bytes([51, 13, 10])
b'3\r\n'

>>> bytes('3', 'utf8') + b'\r\n'
b'3\r\n'

>>> n = 3
>>> bytes(str(n), 'ascii') + b'\r\n'
b'3\r\n'

Tested on IPython 1.1.0 & Python 3.2.3
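
The decimal values quoted above can be checked by iterating over the bytes object, since indexing or iterating bytes in Python 3 yields ints. A short sketch:

```python
data = b"3\r\n"

print(list(data))  # [51, 13, 10]
print(data[0])     # 51
print(bytes([ord("3"), ord("\r"), ord("\n")]) == data)  # True
```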

#6


5  

The ASCIIfication of 3 is "\x33" not "\x03"!

That is what python does for str(3) but it would be totally wrong for bytes, as they should be considered arrays of binary data and not be abused as strings.

The easiest way to achieve what you want is bytes((3,)), which is better than bytes([3]) because initializing a list is more expensive than initializing a tuple, so prefer tuples when either will do. You can convert bigger integers with int.to_bytes, which needs an explicit length, for example (3).to_bytes(1, "little").

Initializing bytes with a given length makes sense and is the most useful, as they are often used to create some type of buffer for which you need some memory of given size allocated. I often use this when initializing arrays or expanding some file by writing zeros to it.
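
As an illustration of the buffer use case, here is a sketch that preallocates a zero-filled buffer and writes an integer into it. It uses bytearray rather than bytes because the buffer has to be mutable; the size, offset, and value are arbitrary assumptions.

```python
import struct

buf = bytearray(8)                    # eight zero bytes, mutable
struct.pack_into(">I", buf, 4, 1024)  # write a 4-byte big-endian int at offset 4

print(bytes(buf))  # b'\x00\x00\x00\x00\x00\x00\x04\x00'
```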

#7


4  

An int (including Python 2's long) can be converted to bytes using the following function:

import codecs

def int2bytes(i):
    hex_value = '{0:x}'.format(i)
    # make length of hex_value a multiple of two
    hex_value = '0' * (len(hex_value) % 2) + hex_value
    return codecs.decode(hex_value, 'hex_codec')

The reverse conversion can be done with another function:

import codecs
import six  # should be installed via 'pip install six'

long = six.integer_types[-1]

def bytes2int(b):
    return long(codecs.encode(b, 'hex_codec'), 16)

Both functions work on both Python2 and Python3.
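
If Python 2 support is not needed, the same hex-based idea can be written without codecs or six. The sketch below is a Python 3-only variant with hypothetical function names; like the original, it assumes a non-negative integer.

```python
def int2bytes_py3(i: int) -> bytes:
    # Hypothetical Python 3-only variant; assumes i >= 0.
    hex_value = format(i, "x")
    if len(hex_value) % 2:
        hex_value = "0" + hex_value  # bytes.fromhex needs an even number of digits
    return bytes.fromhex(hex_value)


def bytes2int_py3(b: bytes) -> int:
    # Big-endian, unsigned: the inverse of int2bytes_py3.
    return int.from_bytes(b, "big")


print(int2bytes_py3(1024))         # b'\x04\x00'
print(bytes2int_py3(b"\x04\x00"))  # 1024
```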

#8


3  

The behaviour comes from the fact that in Python 2 (2.6 and later), bytes was just an alias for str. In Python 3.x, bytes is an immutable version of bytearray: a completely new type that is not backwards compatible.

#9


3  

From the bytes docs:

Accordingly, constructor arguments are interpreted as for bytearray().

Then, from the bytearray docs:

The optional source parameter can be used to initialize the array in a few different ways:

  • If it is an integer, the array will have that size and will be initialized with null bytes.

Note that this differs from the 2.x (where x >= 6) behavior, where bytes is simply str:

>>> bytes is str
True

PEP 3112:

The 2.6 str differs from 3.0’s bytes type in various ways; most notably, the constructor is completely different.
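
The constructor rule quoted from the bytearray docs can be checked directly on Python 3. A short sketch (the variable name n is illustrative):

```python
n = 3

print(bytearray(n))                     # bytearray(b'\x00\x00\x00')
print(bytes(n) == bytes(bytearray(n)))  # True: both treat an int as a length
print(bytes is str)                     # False on Python 3
```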
