Python: DataFrame manipulation and aggregation in pandas

Time: 2022-09-26 23:02:10

I have the following df:

import pandas as pd

dfdict = {'letter': ['a', 'a', 'a', 'b', 'b'], 'category': ['foo', 'foo', 'bar', 'bar', 'spam']}
df1 = pd.DataFrame(dfdict)

  category  letter
0   foo      a
1   foo      a
2   bar      a
3   bar      b
4   spam     b

I want to turn it into an aggregated count df like this:

     a    b
foo  2    0
bar  1    1
spam 0    1

This seems like it should be an easy operation. I have figured out how to use df1 = df1.groupby(['category','letter']).size() to get:

category  letter
bar       a         1
          b         1
foo       a         2
spam      b         1

This is closer, except now I need the letters a, b along the top and the counts coming down.

1 solution

#1


3  

You can use crosstab:

pd.crosstab(df1.category,df1.letter)
Out[554]: 
letter    a  b
category      
bar       1  1
foo       2  0
spam      0  1
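
As a self-contained sketch of the crosstab approach: the reindex step is not part of the answer above. It is an addition that assumes you want the rows in order of first appearance (foo, bar, spam), as in the desired output, rather than sorted alphabetically.

```python
import pandas as pd

df1 = pd.DataFrame({
    'letter': ['a', 'a', 'a', 'b', 'b'],
    'category': ['foo', 'foo', 'bar', 'bar', 'spam'],
})

# Count how often each (category, letter) pair occurs;
# pairs that never occur are reported as 0.
counts = pd.crosstab(df1['category'], df1['letter'])

# crosstab sorts the row labels, so reorder them by first appearance
# in the original column to match the layout asked for in the question.
counts = counts.reindex(df1['category'].unique())
print(counts)
```

If the sorted order is fine for you, drop the reindex line.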

To fix your code, add unstack:

df1.groupby(['category','letter']).size().unstack(fill_value=0)
Out[556]: 
letter    a  b
category      
bar       1  1
foo       2  0
spam      0  1
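
Why this works, as a rough sketch: size() returns a Series indexed by (category, letter) pairs, and unstack moves the innermost index level (letter) into the columns. Combinations that never occur, such as (foo, b), would otherwise become NaN, so fill_value=0 fills them in. The variable names below are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'letter': ['a', 'a', 'a', 'b', 'b'],
    'category': ['foo', 'foo', 'bar', 'bar', 'spam'],
})

# Long form: one count per (category, letter) pair that actually occurs.
sizes = df1.groupby(['category', 'letter']).size()

# Wide form: pivot the 'letter' level into columns, filling
# missing combinations with 0 instead of NaN.
wide = sizes.unstack(fill_value=0)

# Compare against crosstab to check that both routes give the same counts.
via_crosstab = pd.crosstab(df1['category'], df1['letter'])
same_counts = (wide.values == via_crosstab.values).all()
print(wide)
print(same_counts)
```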