String-Related Conversions in Python

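In Python 3.x, the default string type str holds Unicode text, so in everyday use you rarely need to think about encodings.
Use functions such as int(str) and float(str) to convert a string to a number.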
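A minimal sketch of these conversions, including the optional base argument of int() and the ValueError raised on malformed input. The helper name to_number is hypothetical and exists only for illustration.

```python
def to_number(text):
    """Hypothetical helper: parse text as int if possible, else as float.

    Returns None when the text is not a valid number.
    """
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None


print(int("42"))        # decimal string to int
print(int("ff", 16))    # the optional second argument gives the base
print(float("3.5"))     # string to float
print(str(42))          # and back to a string
print(to_number("1e3"), to_number("abc"))
```

Trying int() first keeps whole numbers as int; only inputs that int() rejects fall through to float().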
The % operator gives C printf-style string formatting. Python also offers the more powerful str.format() method, which is built on the string.Formatter class.
Examples:

>>> print('%(language)s has %(number)03d quote types.' %
...       {'language': "Python", "number": 2})
Python has 002 quote types.

>>> '{0}, {1}, {2}'.format('a', 'b', 'c')
'a, b, c'

>>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
'Coordinates: 37.24N, -115.81W'

>>> import datetime
>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
'2010-07-04 12:15:58'

>>> octets = [192, 168, 0, 1]
>>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)
'C0A80001'

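To convert between a str and the actual bytes that represent it, use
str = bytes.decode(encoding="utf-8", errors="strict") and
bytes = str.encode(encoding="utf-8", errors="strict").
Pay particular attention to the names and directions of these two functions: encode goes from str to bytes, decode goes from bytes to str.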
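The round trip can be sketched as follows. The sample text is arbitrary, and errors="replace" is shown only as one alternative to the default "strict" handler.

```python
text = "héllo, 世界"

# str -> bytes: encode
data = text.encode(encoding="utf-8", errors="strict")
print(type(data), len(text), len(data))  # character count vs byte count

# bytes -> str: decode
restored = data.decode(encoding="utf-8", errors="strict")
print(restored == text)

# With errors="strict", undecodable input raises UnicodeDecodeError
try:
    b"\xff\xfe".decode("utf-8")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)

# errors="replace" substitutes U+FFFD for each bad byte instead of raising
print(b"abc\xff".decode("utf-8", errors="replace"))
```

The byte count is larger than the character count because non-ASCII characters take more than one byte in UTF-8.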

bytes is Python's immutable byte sequence; bytearray is its mutable counterpart.
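A short sketch of the difference, using arbitrary sample values:

```python
frozen = bytes([1, 2, 3])
try:
    frozen[0] = 9            # bytes does not support item assignment
except TypeError as exc:
    print("bytes is immutable:", exc)

buf = bytearray(frozen)      # mutable copy of the same data
buf[0] = 9
buf.append(4)
print(buf)

# Convert back to an immutable snapshot when done
snapshot = bytes(buf)
print(snapshot)
```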
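To convert bytes into int and other types, use the functions of the struct module:
struct.unpack(fmt, buffer)
Here fmt is a string that describes the data layout. Some common format characters: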

Sym	C Type	Python Type	Standard size (bytes)
x	pad byte	no value	1
c	char	bytes of length 1	1
b	signed char	integer	1
B	unsigned char	integer	1
?	_Bool	bool	1
i	int	integer	4
I	unsigned int	integer	4
l	long	integer	4
L	unsigned long	integer	4
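The format characters in the table can be tried out as below. This is a sketch that assumes an explicit "<" prefix (little-endian, standard sizes, no padding) so the result does not depend on the machine. The values are arbitrary.

```python
import struct

# "<" selects little-endian byte order with standard sizes and no padding
fmt = "<bB?iI"
packed = struct.pack(fmt, -1, 255, True, -2, 4000000000)

print(struct.calcsize(fmt))        # 1 + 1 + 1 + 4 + 4 bytes
print(packed.hex())
print(struct.unpack(fmt, packed))  # always returns a tuple

# "c" yields a bytes object of length 1 rather than an integer
print(struct.unpack("<c", b"A"))
```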

Usage examples:

>>> from struct import *
>>> unpack('>hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
>>> calcsize('>hhl')
8

>>> import struct
>>> a = bytes([1, 55, 0xab])
>>> print(a)
b'\x017\xab'
>>> struct.unpack('3B', a)  # equivalently: struct.unpack(str(len(a)) + 'B', a)
(1, 55, 171)

There is also struct.pack(fmt, v1, v2, ...), which converts in the opposite direction:

>>> pack('>hhl', 1, 2, 3)
b'\x00\x01\x00\x02\x00\x00\x00\x03'

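Watch out for alignment. In the default native mode, pack inserts padding bytes so that each field is aligned as it would be in C. The output below is from a big-endian machine; on a little-endian machine the bytes of the int appear in reverse order.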

>>> pack('ci', b'*', 0x12131415)
b'*\x00\x00\x00\x12\x13\x14\x15'
>>> pack('ic', 0x12131415, b'*')
b'\x12\x13\x14\x15*'
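To see the padding without depending on one machine's output, the sizes of native and standard layouts can be compared. This is a sketch: the native sizes printed depend on the platform, while the "=" and ">" results are fixed.

```python
import struct

# Native mode ("@", the default) may insert padding so that each field is
# aligned as the C compiler would align it; "=" uses standard sizes, no padding.
for fmt in ("@ci", "@ic", "=ci", "=ic"):
    print(fmt, struct.calcsize(fmt))

# With an explicit byte order the layout is the same on every machine
packed = struct.pack(">ci", b"*", 0x12131415)
print(packed)
```

No padding is added after the last field, which is why the order of c and i changes the native size.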

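The c_char array object created by ctypes.create_string_buffer(b, size) behaves differently from bytes: indexing the array itself returns a bytes object of length 1, whereas its raw attribute is a bytes object, so raw[idx] yields an int.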

>>> import ctypes
>>> a = ctypes.create_string_buffer(b'\000', 2)
>>> b = a.raw[0]
>>> type(b)
<class 'int'>
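A slightly fuller sketch that puts the two ways of indexing side by side. The buffer contents and size here are arbitrary.

```python
import ctypes

buf = ctypes.create_string_buffer(b"AB", 4)   # 4-byte buffer, zero-padded

item = buf[0]          # indexing the array itself gives a 1-byte bytes object
raw_item = buf.raw[0]  # .raw is a bytes copy of the buffer; indexing it gives an int

print(type(item), item)
print(type(raw_item), raw_item)
print(buf.raw)    # all 4 bytes, including the trailing NULs
print(buf.value)  # bytes up to the first NUL
```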
