How do I implement a probability marginalization function with a DataFrame in pandas?

Time: 2021-02-04 18:17:45

I have a probability table like this:

        import numpy as np
        import pandas as pd

        BC_array = [np.array(['B=n', 'B=m', 'B=s', 'B=n', 'B=m', 'B=s']), np.array(['C=F', 'C=F', 'C=F', 'C=T', 'C=T', 'C=T'])]
        pD_BC_array = np.array([[0.9, 0.8, 0.1, 0.3, 0.4, 0.01], [0.08, 0.17, 0.01, 0.05, 0.05, 0.01], [0.01, 0.01, 0.87, 0.05, 0.15, 0.97], [0.01, 0.02, 0.02, 0.6, 0.4, 0.01]])
        pD_BC = pd.DataFrame(pD_BC_array, index=['D=h', 'D=c', 'D=s', 'D=r'], columns=BC_array)
      B=n   B=m   B=s   B=n   B=m   B=s
      C=F   C=F   C=F   C=T   C=T   C=T
D=h  0.90  0.80  0.10  0.30  0.40  0.01
D=c  0.08  0.17  0.01  0.05  0.05  0.01
D=s  0.01  0.01  0.87  0.05  0.15  0.97
D=r  0.01  0.02  0.02  0.60  0.40  0.01

How can I marginalize out 'C' (add the 'C=F' and 'C=T' columns together for each value of B) and get the following table:

      B=n   B=m   B=s 
D=h  1.20  1.20  0.11  
D=c  0.13  0.22  0.02 
D=s  0.06  0.16  1.84 
D=r  0.61  0.42  0.03 

like this?

1 solution

#1


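You can call sum on the DataFrame and pass axis=1 to sum across the columns, together with level=0 to group the sum by the first column level (the B labels). Note that the level argument of sum was deprecated in pandas 1.3 and removed in pandas 2.0, so the call below only works on older versions: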

In [259]:

pD_BC.sum(axis=1, level=0)
Out[259]:
      B=m   B=n   B=s
D=h  1.20  1.20  0.11
D=c  0.22  0.13  0.02
D=s  0.16  0.06  1.84
D=r  0.42  0.61  0.03
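
For newer pandas versions, where sum no longer accepts level, one possible equivalent is to transpose the table, group the rows by the first index level, sum, and transpose back. The sketch below rebuilds the table from the question. The name pD_B and the final column reordering are my own additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the table from the question: rows are D, columns are (B, C) pairs.
BC_array = [
    np.array(['B=n', 'B=m', 'B=s', 'B=n', 'B=m', 'B=s']),
    np.array(['C=F', 'C=F', 'C=F', 'C=T', 'C=T', 'C=T']),
]
pD_BC_array = np.array([
    [0.90, 0.80, 0.10, 0.30, 0.40, 0.01],
    [0.08, 0.17, 0.01, 0.05, 0.05, 0.01],
    [0.01, 0.01, 0.87, 0.05, 0.15, 0.97],
    [0.01, 0.02, 0.02, 0.60, 0.40, 0.01],
])
pD_BC = pd.DataFrame(pD_BC_array,
                     index=['D=h', 'D=c', 'D=s', 'D=r'],
                     columns=BC_array)

# Transpose so the (B, C) pairs become the row index, group on level 0 (B),
# sum over the C entries within each group, then transpose back.
pD_B = pD_BC.T.groupby(level=0).sum().T

# groupby sorts the group keys, so restore the column order used in the question
# (pD_B is a hypothetical name chosen for this sketch).
pD_B = pD_B[['B=n', 'B=m', 'B=s']]

print(pD_B.round(2))
```

This only adds the C=F and C=T columns, as the question asks. If the entries are conditional probabilities P(D | B, C), a proper marginal over C would also need each column weighted by P(C | B) before summing, which the question does not supply.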
