Is there an easier way to load an Excel file directly into a NumPy array?
I have looked at the numpy.genfromtxt autoloading function in the NumPy documentation, but it doesn't load Excel files directly.
array = np.genfromtxt("Stats.xlsx")
ValueError: Some errors were detected !
Line #3 (got 2 columns instead of 1)
Line #5 (got 5 columns instead of 1)
......
Right now I am using openpyxl.reader.excel to read the Excel file and then appending to NumPy 2D arrays. This seems inefficient. Ideally, I would like to have the Excel file loaded directly into a NumPy 2D array.
1 Answer
#1
11
Honestly, if you're working with heterogeneous data (as spreadsheets are likely to contain), using a pandas.DataFrame is a better choice than using numpy directly.
While pandas is in some sense just a wrapper around numpy, it handles heterogeneous data very nicely. (As well as a ton of other things... For "spreadsheet-like" data, it's the gold standard in the Python world.)
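To illustrate the heterogeneous-data point, here is a small sketch. The rows and column names are made up for the example and are not from the question's spreadsheet.

```python
import numpy as np
import pandas as pd

# Hypothetical spreadsheet-like rows: a name, a count, and a score.
rows = [("alice", 3, 2.5), ("bob", 7, 4.0)]

# A plain 2D NumPy array holds a single dtype, so the numbers
# are coerced to strings alongside the names.
as_array = np.array(rows)

# A DataFrame keeps a separate dtype for each column.
as_frame = pd.DataFrame(rows, columns=["name", "count", "score"])

print(as_array.dtype)   # a string dtype for the whole array
print(as_frame.dtypes)  # one dtype per column
```

With the array you would have to convert the numeric columns back before doing any arithmetic; with the DataFrame they stay numeric.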
If you decide to go that route, just use pandas.read_excel.
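If you still want a plain 2D array at the end, something along these lines should work. This is a minimal sketch: the helper names are my own, `header=None` assumes the sheet has no header row, and reading .xlsx files needs an Excel engine such as openpyxl installed.

```python
import numpy as np
import pandas as pd


def frame_to_array(df: pd.DataFrame) -> np.ndarray:
    """Convert an already-loaded DataFrame to a 2D NumPy array."""
    # to_numpy() picks one common dtype; mixed columns fall back to object.
    return df.to_numpy()


def load_excel_as_array(path: str, sheet_name=0) -> np.ndarray:
    """Read one worksheet and return its cells as a 2D NumPy array.

    Hypothetical helper for illustration, not part of pandas.
    """
    # header=None treats the first row as data rather than column names
    # (an assumption; drop it if the sheet has a header row).
    df = pd.read_excel(path, sheet_name=sheet_name, header=None)
    return frame_to_array(df)
```

For the file in the question that would be `array = load_excel_as_array("Stats.xlsx")`. If the sheet mixes text and numbers, expect an object-dtype array, which is one more reason to stay with the DataFrame.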