How to multiply DataFrames in pandas?

I have two probability tables, p(B) and p(A|B):

import numpy as np
import pandas as pd

pB_array = np.array([[0.97], [0.01], [0.02]])
pB = pd.DataFrame(pB_array, index=['B=n', 'B=m', 'B=s'])
pA_B_array = np.array([[0.9, 0.8, 0.3], [0.1, 0.2, 0.7]])
pA_B = pd.DataFrame(pA_B_array, index=['A=F', 'A=T'], columns=['B=n', 'B=m', 'B=s'])

I want to multiply each column of p(A|B) by the corresponding entry of p(B):

fAB=pA_B.multiply(pB.T,axis='columns')

and get a result shaped like this:

     B=n  B=m  B=s
A=F  0.1  0.2  0.3
A=T  0.5  0.4  0.1

But this is all I get:

     B=n  B=m  B=s
0    NaN  NaN  NaN
A=F  NaN  NaN  NaN
A=T  NaN  NaN  NaN

How can I fix this?

1 Answer

#1

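The problem here is that pandas aligns the two DataFrames on their axis labels before multiplying. The column labels match, but the row label of pB.T (0) does not match the row labels of pA_B ('A=F', 'A=T'), so you get NaN values. Dropping the labels avoids this: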
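To see the alignment at work, here is a minimal sketch that rebuilds the two tables from the question and compares their row labels. The variable names `aligned` and `all_nan` are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the two tables from the question.
pB = pd.DataFrame(np.array([[0.97], [0.01], [0.02]]),
                  index=['B=n', 'B=m', 'B=s'])
pA_B = pd.DataFrame(np.array([[0.9, 0.8, 0.3], [0.1, 0.2, 0.7]]),
                    index=['A=F', 'A=T'], columns=['B=n', 'B=m', 'B=s'])

# The transposed table has a single row labelled 0,
# while the conditional table has rows 'A=F' and 'A=T'.
print(pB.T.index.tolist())
print(pA_B.index.tolist())

# DataFrame-by-DataFrame multiplication aligns on both axes.
# No row label is shared, so every cell of the result is NaN.
aligned = pA_B.multiply(pB.T)
all_nan = bool(aligned.isna().all().all())
print(aligned.shape, all_nan)
```

The result has three rows because it contains the union of both sets of row labels, which matches the NaN table shown in the question.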

In [173]:

fAB=pA_B.multiply(pB.T.squeeze().values,axis='columns')
fAB
Out[173]:
       B=n    B=m    B=s
A=F  0.873  0.008  0.006
A=T  0.097  0.002  0.014

We need to call squeeze here because the shape is wrong otherwise. Calling .values returns a plain NumPy array, which strips the labels so that alignment is no longer an issue. Without squeeze:
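As an alternative sketch (not part of the original answer), the labels can be kept instead of stripped: passing p(B) as a Series lets `multiply` with `axis='columns'` match the Series index against the column labels of p(A|B). The names `pB_series`, `fAB_labels`, and `fAB_values` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

pB = pd.DataFrame(np.array([[0.97], [0.01], [0.02]]),
                  index=['B=n', 'B=m', 'B=s'])
pA_B = pd.DataFrame(np.array([[0.9, 0.8, 0.3], [0.1, 0.2, 0.7]]),
                    index=['A=F', 'A=T'], columns=['B=n', 'B=m', 'B=s'])

# Take the single column of pB as a Series indexed by 'B=n', 'B=m', 'B=s'.
pB_series = pB.iloc[:, 0]

# Label-based: the Series index is aligned with the columns of pA_B.
fAB_labels = pA_B.multiply(pB_series, axis='columns')

# Position-based: the approach from the answer, using a bare 1-D array.
fAB_values = pA_B.multiply(pB.T.squeeze().values, axis='columns')

print(fAB_labels)
print(np.allclose(fAB_labels.values, fAB_values.values))
```

The label-based form keeps working if the columns of the two tables are in a different order, whereas the array form relies on the positions matching.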

fAB=pA_B.multiply(pB.T.values,axis='columns')

results in:

ValueError: Shape of passed values is (3, 1), indices imply (3, 2)

This is because the transposed frame is still two-dimensional:

In [176]:

print(pB.T.shape)
print(pB.T.squeeze().shape)
(1, 3)
(3,)

So squeeze flattens the 2-D object of shape (1, 3) into a 1-D object of shape (3,).
