numpy boolean array with 1-bit entries

Time: 2022-06-28 01:48:52

Is there a way in numpy to create an array of booleans that uses just 1 bit for each entry?

The standard np.bool type takes 1 byte per entry, so I end up using 8 times the memory actually required.

On Google I found that C++ has std::vector<bool>.

3 solutions

#1


10  

You might like to take a look at bitstring.

If you create a ConstBitArray or ConstBitStream from a file, it will use mmap and not load the file into memory. In this case the object is not mutable, so if you want to make changes the data will have to be loaded into memory.

For example, to create one without loading into memory:

>>> a = bitstring.ConstBitArray(filename='your_file')

or

>>> b = bitstring.ConstBitStream(a_file_object)

#2


14  

You want a bitarray:

efficient arrays of booleans -- C extension

This module provides an object type which efficiently represents an array of booleans. Bitarrays are sequence types and behave very much like usual lists. Eight bits are represented by one byte in a contiguous block of memory. The user can select between two representations: little-endian and big-endian. All of the functionality is implemented in C. Methods for accessing the machine representation are provided. This can be useful when bit level access to binary files is required, such as portable bitmap image files (.pbm). Also, when dealing with compressed data which uses variable bit length encoding, you may find this module useful...
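To make the "eight bits per byte, little-endian or big-endian" idea concrete, here is a minimal sketch. It uses numpy's packbits rather than bitarray itself, so it only illustrates the two bit orders and is not bitarray's own API. It assumes NumPy 1.17 or later, where the bitorder and count arguments are available; the array contents are made up for illustration.

```python
import numpy as np

# Ten illustrative bits: they need two bytes once packed.
bits = np.array([1, 0, 0, 0, 0, 0, 0, 0, 1, 1], dtype=np.uint8)

# Big-endian bit order (numpy's default): the first entry becomes the
# most significant bit of the first byte.
packed_big = np.packbits(bits, bitorder="big")

# Little-endian bit order: the first entry becomes the least significant bit.
packed_little = np.packbits(bits, bitorder="little")

print(packed_big)     # [128 192]
print(packed_little)  # [1 3]

# Unpacking with the matching bit order recovers the original bits;
# count drops the zero padding added to fill the last byte.
restored = np.unpackbits(packed_little, bitorder="little", count=bits.size)
```

The same ten entries occupy the same two bytes either way; only the position of each entry inside its byte changes, which matters when the bytes are written to or read from a binary file.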

#3


6  

To do this you can use numpy's native packbits and unpackbits. packbits is straightforward to use, but reconstructing the original array takes extra steps, because the packed output is padded to a whole number of bytes and does not keep the original shape. Here is an example:

import numpy as np

# original boolean array
A1 = np.array([
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1],
], dtype=bool)

# packed data: the array is flattened (axis=None) and stored 8 entries per byte
A2 = np.packbits(A1, axis=None)

# checking the size
print(len(A1.tobytes()))  # 15 bytes
print(len(A2.tobytes()))  #  2 bytes (ceil(15/8))

# reconstructing from packed data: drop the padding bits, then reshape
A3 = np.unpackbits(A2, axis=None)[:A1.size].reshape(A1.shape).astype(bool)

# and the arrays are equal
print(np.array_equal(A1, A3))  # True
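If you want to read or change single entries without unpacking the whole array, one option is plain shift-and-mask arithmetic on the packed bytes. The sketch below is an illustration, not part of numpy: get_bit and set_bit are hypothetical helper names, and the code assumes the default big-endian bit order of packbits and a flattened (1-D) index.

```python
import numpy as np

def get_bit(packed, i):
    """Return entry i of the flattened boolean array stored in `packed`."""
    byte_index, offset = divmod(i, 8)
    # Default packbits order: entry 0 is the most significant bit of byte 0.
    return bool((packed[byte_index] >> (7 - offset)) & 1)

def set_bit(packed, i, value):
    """Set entry i of the packed array in place."""
    byte_index, offset = divmod(i, 8)
    mask = 1 << (7 - offset)
    if value:
        packed[byte_index] |= mask
    else:
        # 0xFF ^ mask has every bit set except the one being cleared.
        packed[byte_index] &= 0xFF ^ mask

# Illustrative data: ten flags stored in two bytes.
flags = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 1], dtype=bool)
packed = np.packbits(flags)

first_before = get_bit(packed, 0)   # entry 0 starts out False
set_bit(packed, 0, True)            # flip entry 0 on
set_bit(packed, 9, False)           # flip entry 9 off

# Unpack only to check the result; slice off the padding bits.
restored = np.unpackbits(packed)[:flags.size].astype(bool)
print(restored.tolist())
```

This keeps the 1-bit-per-entry storage at the cost of doing the index arithmetic yourself; for vectorized work it is usually simpler to unpack, operate on the boolean array, and pack again.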
