An elegant way to create an empty pandas DataFrame filled with float-type NaN

Date: 2021-11-25 20:22:09

I want to create a Pandas DataFrame filled with NaNs. During my research I found an answer:

import pandas as pd

df = pd.DataFrame(index=range(0,4),columns=['A'])

This code produces a DataFrame whose column has dtype "object", so the NaNs cannot be used later with, for example, the interpolate() method. I therefore created the DataFrame with this more complicated code (inspired by this answer):

import pandas as pd
import numpy as np

dummyarray = np.empty((4,1))
dummyarray[:] = np.nan

df = pd.DataFrame(dummyarray)

This results in a DataFrame filled with NaNs of dtype "float", so it can be used later with interpolate(). Is there a more elegant way to get the same result?

3 answers

#1


42  

Simply pass the desired fill value as a scalar first argument, such as 0, math.inf or, in this case, np.nan. The constructor then initializes the value array to the size specified by index and columns:

 >>> df = pd.DataFrame(np.nan, index=[0,1,2,3], columns=['A'])
 >>> df.dtypes
 A    float64
 dtype: object
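
As a quick check that a frame built this way works with interpolate(), here is a minimal sketch. The two known values written into the frame are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Build a 4x1 frame filled with the scalar np.nan; the column dtype is float64.
df = pd.DataFrame(np.nan, index=range(4), columns=["A"])

# Hypothetical sample data: set the first and last rows, leave the middle as NaN.
df.loc[0, "A"] = 1.0
df.loc[3, "A"] = 4.0

# Linear interpolation (the default) fills the gaps between the known values.
filled = df.interpolate()
print(filled)
```

Because the column is already float64, the assignments do not change its dtype and interpolate() can fill the two missing rows.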

#2


10  

You could specify the dtype directly when constructing the DataFrame:

>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A    float64
dtype: object

Specifying the dtype forces Pandas to try creating the DataFrame with that type, rather than trying to infer it.
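
To see the difference side by side, the following sketch builds the frame with and without the dtype argument. The variable names are only illustrative, and the dtype printed for the first frame may depend on the pandas version in use.

```python
import numpy as np
import pandas as pd

# No data and no dtype: pandas has nothing to infer a numeric type from.
inferred = pd.DataFrame(index=range(4), columns=["A"])

# Same shape, but the dtype is requested explicitly.
explicit = pd.DataFrame(index=range(4), columns=["A"], dtype="float")

print(inferred.dtypes["A"])  # the question reports "object" here
print(explicit.dtypes["A"])  # float64
```

Both frames contain only missing values; only the second one stores them as floating-point NaN.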

#3


2  

Hope this helps!

 num_rows = 4  # example value; use the number of rows you need
 pd.DataFrame(np.nan, index=np.arange(num_rows), columns=['A'])
