I have the following DataFrame:
import pandas as pd
dfdict = {'letter': ['a', 'a', 'a', 'b', 'b'], 'category': ['foo', 'foo', 'bar', 'bar', 'spam']}
df1 = pd.DataFrame(dfdict)
  category letter
0      foo      a
1      foo      a
2      bar      a
3      bar      b
4     spam      b
I want to produce an aggregated count DataFrame like this:
      a  b
foo   2  0
bar   1  1
spam  0  1
This seems like it should be an easy operation. I have figured out how to use df1 = df1.groupby(['category','letter']).size() to get:
category  letter
bar       a         1
          b         1
foo       a         2
spam      b         1
This is closer, except now I need the letters a and b along the top as column headers, with the counts running down beneath them.
1 Answer
#1
3
You can use pd.crosstab:
pd.crosstab(df1.category,df1.letter)
Out[554]:
letter    a  b
category
bar       1  1
foo       2  0
spam      0  1
To fix your own code, add unstack, which pivots the inner index level (letter) into columns; fill_value=0 fills in the combinations that never occur:
df1.groupby(['category','letter']).size().unstack(fill_value=0)
Out[556]:
letter    a  b
category
bar       1  1
foo       2  0
spam      0  1
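For reference, here is a minimal self-contained sketch that runs both approaches side by side on the data from the question. The variable names via_crosstab, via_unstack and row_order are only illustrative. The final reindex step is not part of the answer above; it is an optional addition, assuming you want the rows in the foo, bar, spam order shown in the question rather than the alphabetical order both methods return.

```python
import pandas as pd

df1 = pd.DataFrame({
    'letter': ['a', 'a', 'a', 'b', 'b'],
    'category': ['foo', 'foo', 'bar', 'bar', 'spam'],
})

# Approach 1: crosstab counts each (category, letter) pair directly.
via_crosstab = pd.crosstab(df1['category'], df1['letter'])

# Approach 2: groupby + size yields a Series with a two-level index;
# unstack moves the inner level ('letter') into columns, and
# fill_value=0 supplies a count for pairs that never occur.
via_unstack = df1.groupby(['category', 'letter']).size().unstack(fill_value=0)

# Optional (assumption: first-appearance row order is wanted).
# Both results are sorted alphabetically by category, so reindex
# by the order in which categories first appear in df1.
row_order = df1['category'].unique()
result = via_crosstab.reindex(row_order)

print(result)
```

Both approaches give the same counts here; crosstab is the shorter spelling, while groupby(...).size().unstack() is handy when you already have the grouped result.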