I have a binary file that contains records of the position of a plane. Each record looks like this:
0x00: Time, float32
0x04: X, float32 // X axis position
0x08: Y, float32 // Y axis position
0x0C: Elevation, float32
0x10: float32*4 = Quaternion (x,y,z axis and w scalar)
0x20: Distance, float32 (unused)
So each record is 36 bytes long (nine float32 values, offsets 0x00 through 0x23).
I would like to load these records into a NumPy array.
At offset 1859 there is an unsigned 32-bit integer (4 bytes) that gives the number of elements in the array: 12019 in my case.
I don't care (for now) about the header data (everything before offset 1859).
The array only starts at offset 1863 (= 1859 + 4).
I defined my own NumPy dtype like this:
dtype = np.dtype([
    ("time", np.float32),
    ("PosX", np.float32),
    ("PosY", np.float32),
    ("Alt", np.float32),
    ("Qx", np.float32),
    ("Qy", np.float32),
    ("Qz", np.float32),
    ("Qw", np.float32),
    ("dist", np.float32),
])
And I'm reading the file using fromfile:
a_bytes = np.fromfile(filename, dtype=dtype)
But I don't see any parameter of fromfile that lets me pass an offset.
2 answers
#1
12
You can open the file with Python's standard open, seek past the header, and then pass the file object to fromfile. Something like this:
import numpy as np
import os

dtype = np.dtype([
    ("time", np.float32),
    ("PosX", np.float32),
    ("PosY", np.float32),
    ("Alt", np.float32),
    ("Qx", np.float32),
    ("Qy", np.float32),
    ("Qz", np.float32),
    ("Qw", np.float32),
    ("dist", np.float32),
])

# Open in binary mode, skip the header and the 4-byte count, then read the records.
with open("myfile", "rb") as f:
    f.seek(1863, os.SEEK_SET)
    data = np.fromfile(f, dtype=dtype)

print(data)
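The seek-then-fromfile approach can be extended to use the record count stored at offset 1859, so that only that many records are read. The following is a minimal sketch, not a tested reader for the real file: it assumes the count and the floats are little-endian (the question does not say), and the helper name read_records, the synthetic demo file, and its path are made up for illustration.

```python
import os
import struct
import tempfile

import numpy as np

HEADER_SIZE = 1859  # offset of the uint32 record count, as stated in the question

# Assumption: all fields are little-endian float32 ("<f4").
record_dtype = np.dtype([
    ("time", "<f4"), ("PosX", "<f4"), ("PosY", "<f4"), ("Alt", "<f4"),
    ("Qx", "<f4"), ("Qy", "<f4"), ("Qz", "<f4"), ("Qw", "<f4"),
    ("dist", "<f4"),
])


def read_records(path):
    """Read the record count at HEADER_SIZE, then exactly that many records."""
    with open(path, "rb") as f:
        f.seek(HEADER_SIZE)
        # Assumption: the count is stored as a little-endian unsigned 32-bit int.
        (count,) = struct.unpack("<I", f.read(4))
        # The file position is now HEADER_SIZE + 4, where the array starts.
        return np.fromfile(f, dtype=record_dtype, count=count)


# Build a small synthetic file so the sketch runs without the real data.
demo = np.zeros(3, dtype=record_dtype)
demo["time"] = [0.0, 0.5, 1.0]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as f:
    f.write(b"\x00" * HEADER_SIZE)         # fake header
    f.write(struct.pack("<I", len(demo)))  # record count
    f.write(demo.tobytes())                # the records themselves
    f.write(b"trailing bytes that must not be parsed")

records = read_records(demo_path)
print(records["time"])
```

On NumPy 1.17 or later, np.fromfile also accepts an offset argument in bytes, so np.fromfile(path, dtype=record_dtype, count=count, offset=1863) is an alternative once the count is known.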
#2
3
I faced a similar problem, but none of the answers above satisfied me. I needed to implement something like a virtual table with a very large number of binary records, potentially occupying more memory than I can afford in a single NumPy array. So my question was how to read and write a small set of integers from/to a binary file: a subset of the file into a subset of a NumPy array.
This is a solution that worked for me:
import numpy as np

recordLen = 10  # number of int64's per record
recordSize = recordLen * 8  # size of a record in bytes
memArray = np.zeros(recordLen, dtype=np.int64)  # a buffer for 1 record

# Create a binary file (truncating any existing one) and open it for write+read
with open('BinaryFile.dat', 'w+b') as file:
    # Writing the array into the file as record recordNo:
    recordNo = 200  # the index of a target record in the file
    file.seek(recordSize * recordNo)
    raw = memArray.tobytes()  # named raw to avoid shadowing the built-in bytes
    file.write(raw)

    # Reading record recordNo from the file into memArray
    file.seek(recordSize * recordNo)
    raw = file.read(recordSize)
    memArray = np.frombuffer(raw, dtype=np.int64).copy()
    # Note: copy() is added to make memArray mutable
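A different technique for the same "table larger than memory" use case is np.memmap, which maps the file into an array-like object and reads or writes only the parts that are indexed. This is a minimal sketch of that alternative, not the method used in the answer above; the path and the table size n_records are made up for illustration.

```python
import os
import tempfile

import numpy as np

record_len = 10  # int64 values per record, as in the answer above
n_records = 201  # hypothetical table size, just enough for record index 200
path = os.path.join(tempfile.mkdtemp(), "BinaryFile.dat")  # hypothetical path

# mode="w+" creates (or overwrites) the file at its full size.
table = np.memmap(path, dtype=np.int64, mode="w+", shape=(n_records, record_len))

# Write one record by index; no manual seek arithmetic is needed.
table[200] = np.arange(record_len)
table.flush()  # push pending changes to disk
del table

# Reopen read-only and copy a single record into an ordinary, mutable array.
table = np.memmap(path, dtype=np.int64, mode="r", shape=(n_records, record_len))
record = np.array(table[200])
print(record)
```

np.memmap also takes an offset argument in bytes, so the same approach could skip a fixed-size header before the records.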