numpy genfromtxt collapses the recarray when the data file has only one row

Date: 2022-09-11 16:23:02

I am using the genfromtxt function to read data from a CSV file.

data = np.genfromtxt(file_name, dtype=np.dtype(input_vars), delimiter=",")

I can then access the array's columns by name, e.g.:

data["My column name"]

which returns a 1-dimensional vector. However, when the source file has exactly one data row, the array is collapsed: its shape is (), so data["My column name"] returns a single value rather than a vector, and some subsequent functions fail because they expect a vector.

What I need is for the result to always be a vector. That is, genfromtxt should not collapse the dimensionality of the array even if the data file has only one row.

For example, if the source data file has two rows, data.shape == (2,). But if it has only one row, data.shape == (), whereas I need it to be (1,). Then, if I am correct, data["My column name"] would return a vector (albeit with one element) and the subsequent functions would not fail.

How can I do this? data.reshape((1,)) and np.atleast_1d(data) do not work for me, and I am not sure why.

Update:

I made a simple example to illustrate my problem.

Suppose I have two files:

mydata1.csv, which has one row:

1,2,3

and mydata2.csv, which has two rows:

1,2,3
4,5,6

This is the code snippet (problem described in the comments):

import numpy as np
dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]
data2 = np.genfromtxt("mydata2.csv", dtype=dt, delimiter=",")
print(data2.shape)  # returns (2,)
data1 = np.genfromtxt("mydata1.csv", dtype=dt, delimiter=",")
print(data1.shape)  # returns () but I need it to return (1,)

data2["A"]  # returns a 1D vector with two values
data1["A"]  # returns a zero-dimensional value, but I need a 1D vector with one value

All the workarounds I can come up with are far too ugly and would require too much code refactoring. Ideally, genfromtxt would always return a 1-D recarray.

2 Answers

#1


0  

When the CSV file has only one line, genfromtxt returns data as a single record (an np.void-like, zero-dimensional structured value) rather than a 1-D array. You can force data to be a 1-D np.ndarray with:

data = np.atleast_1d(data)
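
As a minimal sketch of this approach, the wrapper below applies np.atleast_1d to whatever genfromtxt returns, so callers always see a 1-D structured array. The helper name read_records is hypothetical, and io.StringIO stands in for the CSV files from the question. Note that np.atleast_1d returns a new array rather than modifying its argument, so the result has to be assigned back.

```python
import io
import numpy as np

dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]

def read_records(source, dtype):
    """Hypothetical helper: read CSV data and always return a 1-D structured array."""
    data = np.genfromtxt(source, dtype=dtype, delimiter=",")
    # atleast_1d returns a new array; a 0-d result becomes shape (1,),
    # while an array that is already 1-D is passed through unchanged
    return np.atleast_1d(data)

# io.StringIO stands in for mydata1.csv and mydata2.csv (an assumption for this sketch)
one_row = read_records(io.StringIO("1,2,3\n"), dt)
two_rows = read_records(io.StringIO("1,2,3\n4,5,6\n"), dt)

print(one_row.shape)   # (1,)
print(one_row["A"])    # [1]
print(two_rows.shape)  # (2,)
```

Wrapping the call in one place means the rest of the code can keep indexing by column name without special-casing single-row files.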

#2


0  

Instead of genfromtxt, you could try loadtxt, with the argument ndmin=1. (Of course, this won't be an option if you are using some of the enhanced features of genfromtxt.)
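
A short sketch of this alternative, reusing the dtype from the question. io.StringIO again stands in for mydata1.csv, which is an assumption made so the example is self-contained.

```python
import io
import numpy as np

dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]

# io.StringIO stands in for mydata1.csv (a single data row)
data1 = np.loadtxt(io.StringIO("1,2,3\n"), dtype=dt, delimiter=",", ndmin=1)

print(data1.shape)       # (1,)
print(data1["A"].shape)  # (1,)
```

With ndmin=1 the single-row case keeps its leading dimension, so data1["A"] is a 1-D vector with one element.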
