I have a numpy array (a):
array([[ 1. ,  5.1,  3.5,  1.4,  0.2],
       [ 1. ,  4.9,  3. ,  1.4,  0.2],
       [ 2. ,  4.7,  3.2,  1.3,  0.2],
       [ 2. ,  4.6,  3.1,  1.5,  0.2]])
I would like to make a pandas DataFrame (df) with the values from a, columns A, B, C, D, and the index taken from the first column of my numpy array. In the end it should look like this:
     A    B    C    D
1  5.1  3.5  1.4  0.2
1  4.9  3.0  1.4  0.2
2  4.7  3.2  1.3  0.2
2  4.6  3.1  1.5  0.2
I am trying this:
df = pd.DataFrame(a, index=a[:,0], columns=['A', 'B','C','D'])
and I get the following error:
ValueError: Shape of passed values is (5, 4), indices imply (4, 4)
Any help? Thanks
1 solution
#1
7
You passed the complete array as the data param. If you want just 4 columns from the array as the data, you also need to slice your array:
In [158]:
df = pd.DataFrame(a[:,1:], index=a[:,0], columns=['A', 'B','C','D'])
df
Out[158]:
     A    B    C    D
1  5.1  3.5  1.4  0.2
1  4.9  3.0  1.4  0.2
2  4.7  3.2  1.3  0.2
2  4.6  3.1  1.5  0.2
Also, having duplicate values in the index will make filtering/indexing problematic.
So here, with a[:,1:], I take all the rows but only the columns from index 1 onwards, as desired; see the docs.
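For anyone who wants to check the shapes themselves, here is a minimal self-contained sketch of the same idea, assuming the array from the question. The names labels, values and rows_for_1 are only for illustration and are not part of the original code. It also shows what the duplicate index labels mean in practice: a lookup by label returns several rows rather than one.

```python
import numpy as np
import pandas as pd

# The array from the question: first column holds the intended index labels.
a = np.array([[1.0, 5.1, 3.5, 1.4, 0.2],
              [1.0, 4.9, 3.0, 1.4, 0.2],
              [2.0, 4.7, 3.2, 1.3, 0.2],
              [2.0, 4.6, 3.1, 1.5, 0.2]])

labels = a[:, 0]   # all rows, column 0 only -> shape (4,)
values = a[:, 1:]  # all rows, columns 1 onwards -> shape (4, 4)

# The data now has 4 columns, matching the 4 column names.
df = pd.DataFrame(values, index=labels, columns=['A', 'B', 'C', 'D'])

# Because the labels 1.0 and 2.0 each appear twice, a label lookup
# returns a DataFrame with every matching row, not a single row.
rows_for_1 = df.loc[1.0]

print(df)
print(rows_for_1.shape)
```

If unique row labels matter for later filtering, one option is to leave the default integer index and keep the first column as ordinary data instead.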