numpy genfromtxt collapses the recarray when the data file has only one row

I am using the genfromtxt function to read data from a CSV file.

data = np.genfromtxt(file_name, dtype=np.dtype(input_vars), delimiter=",")

Then I can access the array columns with e.g.:

data["My column name"]

which returns a 1-dimensional vector. However, when the source file has exactly one data row, the array is collapsed: its shape is (), so data["My column name"] returns a single value rather than a vector, and some subsequent functions fail because they expect a vector.

What I need is for it to always be a vector. In other words, I need genfromtxt not to collapse the dimensionality of the array even if the data file has only one row.

For example, if the source data file has two rows, data.shape == (2,). But if the source data file has only one row, data.shape == (), whereas I need it to be (1,). Then, if I am correct, data["My column name"] would return a vector (albeit with one element) and the subsequent functions would not fail.

How can I do this? data.reshape((1,)) and np.atleast_1d(data) do not work for me for some reason, and I am not sure why.

Update:

I made a simple example to illustrate my problem.

Suppose I have two files:

mydata1.csv, which has one row:

1,2,3

and mydata2.csv, which has two rows:

1,2,3
4,5,6

This is the code snippet (problem described in the comments):

import numpy as np
dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]
data2 = np.genfromtxt("mydata2.csv", dtype=dt, delimiter=",")
print(data2.shape)  # returns (2,)
data1 = np.genfromtxt("mydata1.csv", dtype=dt, delimiter=",")
print(data1.shape)  # returns () but I need it to return (1,)

data2["A"]  # returns a 1D vector with two values
data1["A"]  # returns a value (zero-dimensional), but I need a 1D vector with one value

All the workarounds I can come up with are way too ugly and would require too much code refactoring. Ideally, genfromtxt would always return a 1-D recarray.

2 Solutions

#1

When you have only one line in the csv file, you get the data as a zero-dimensional array (effectively a single np.void record) rather than a 1-D array. You can force data to be at least one-dimensional with:

data = np.atleast_1d(data)
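
As a rough sketch of how this plays out with the example from the question: the snippet below reads the one-row data from an in-memory io.StringIO buffer (a stand-in for mydata1.csv, used here only so the example is self-contained). One possible reason the asker saw no effect is that np.atleast_1d returns a new array instead of modifying its argument, so the result has to be assigned back; that is a guess, since the failing code was not shown.

```python
import io

import numpy as np

dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]

# In-memory stand-in for the one-row mydata1.csv from the question.
one_row = io.StringIO("1,2,3\n")

data1 = np.genfromtxt(one_row, dtype=dt, delimiter=",")

# np.atleast_1d returns a new array and leaves its argument unchanged,
# so the result must be assigned back to the name that is used later.
data1 = np.atleast_1d(data1)

print(data1.shape)  # (1,)
print(data1["A"])   # [1]
```

Calling np.atleast_1d on a result that is already 1-D (the two-row case) leaves its shape unchanged, so the same line can be applied unconditionally after every genfromtxt call.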

#2

Instead of genfromtxt, you could try loadtxt with the argument ndmin=1. (Of course, this won't be an option if you rely on some of the enhanced features of genfromtxt.)
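
A minimal sketch of the loadtxt route, reusing the dtype from the question and again reading from in-memory io.StringIO buffers (stand-ins for mydata1.csv and mydata2.csv, introduced here only to keep the example self-contained):

```python
import io

import numpy as np

dt = [("A", "<i4"), ("B", "<i4"), ("C", "<i4")]

# In-memory stand-ins for the two files from the question.
one_row = io.StringIO("1,2,3\n")
two_rows = io.StringIO("1,2,3\n4,5,6\n")

# ndmin=1 asks loadtxt to return an array with at least one dimension.
data1 = np.loadtxt(one_row, dtype=dt, delimiter=",", ndmin=1)
data2 = np.loadtxt(two_rows, dtype=dt, delimiter=",", ndmin=1)

print(data1.shape)  # (1,)
print(data2.shape)  # (2,)
print(data1["A"])   # [1]
```

If I recall correctly, NumPy 1.23 and later also accept an ndmin argument in genfromtxt itself; check the documentation for your installed version before relying on it.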
