I have the following code:
import pandas as pd
df1 = pd.DataFrame({'Counterparty':['Bank','Bank','GSE','PSE'],
'Sub Cat':['Tier1','Small','Small', 'Small'],
'Location':['US','US','UK','UK'],
'Amount':[50, 55, 65, 55],
'Amount1':[1,2,3,4]})
df2=df1.groupby(['Counterparty','Location'])[['Amount']].sum()
df2.dtypes
df1.dtypes
The df2 DataFrame does not have the columns that I am grouping by (Counterparty and Location). Any ideas why this is the case? Both Amount and Amount1 are numeric fields. I just want to sum across Amount and aggregate across Amount1.
2 Solutions
#1
7
The grouping keys end up in the index of the result. To get them back as columns, add the as_index=False parameter to groupby or call reset_index on the result:
df2=df1.groupby(['Counterparty','Location'])[['Amount']].sum().reset_index()
print (df2)
  Counterparty Location  Amount
0         Bank       US     105
1          GSE       UK      65
2          PSE       UK      55
df2=df1.groupby(['Counterparty','Location'], as_index=False)[['Amount']].sum()
print (df2)
  Counterparty Location  Amount
0         Bank       US     105
1          GSE       UK      65
2          PSE       UK      55
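To see where the keys went, it can help to inspect the index of the grouped result. The following is a minimal sketch that reuses df1 from the question; the names by_index, flat_a and flat_b are only illustrative and do not come from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'Counterparty': ['Bank', 'Bank', 'GSE', 'PSE'],
                    'Sub Cat': ['Tier1', 'Small', 'Small', 'Small'],
                    'Location': ['US', 'US', 'UK', 'UK'],
                    'Amount': [50, 55, 65, 55],
                    'Amount1': [1, 2, 3, 4]})

# Default behaviour: the group keys become a MultiIndex, not columns.
by_index = df1.groupby(['Counterparty', 'Location'])[['Amount']].sum()
print(list(by_index.index.names))  # names of the index levels
print(list(by_index.columns))      # only the aggregated column remains

# Two ways to turn the keys back into ordinary columns.
flat_a = by_index.reset_index()
flat_b = df1.groupby(['Counterparty', 'Location'], as_index=False)[['Amount']].sum()
print(list(flat_a.columns))
print(list(flat_b.columns))
```

Both flat_a and flat_b should carry Counterparty and Location as regular columns, so either form works; as_index=False simply avoids the extra step.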
If you aggregate all columns, nuisance columns are excluded automatically. Here the non-numeric column Sub Cat is omitted:
df2=df1.groupby(['Counterparty','Location']).sum().reset_index()
print (df2)
  Counterparty Location  Amount  Amount1
0         Bank       US     105        3
1          GSE       UK      65        3
2          PSE       UK      55        4
df2=df1.groupby(['Counterparty','Location'], as_index=False).sum()
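Whether Sub Cat is dropped automatically depends on the pandas version. As far as I know, from pandas 2.0 onward sum() no longer defaults to numeric columns only, so the strings in Sub Cat would be concatenated rather than omitted. A sketch that asks for the numeric-only behaviour explicitly, assuming a pandas version where sum accepts numeric_only:

```python
import pandas as pd

df1 = pd.DataFrame({'Counterparty': ['Bank', 'Bank', 'GSE', 'PSE'],
                    'Sub Cat': ['Tier1', 'Small', 'Small', 'Small'],
                    'Location': ['US', 'US', 'UK', 'UK'],
                    'Amount': [50, 55, 65, 55],
                    'Amount1': [1, 2, 3, 4]})

# numeric_only=True keeps only the numeric columns (Amount, Amount1),
# so the string column 'Sub Cat' is left out regardless of the default.
df2 = (df1.groupby(['Counterparty', 'Location'], as_index=False)
          .sum(numeric_only=True))
print(df2)
```

Selecting the columns explicitly, as in [['Amount', 'Amount1']], is another way to make the result independent of that default.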
#2
0
Remove the double brackets around 'Amount' and make them single brackets. You're telling it to only select one column.
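For comparison, here is a small sketch of what single and double brackets return, again reusing df1 from the question (grouped, as_series, as_frame and flat are illustrative names). Note that in both cases the grouping keys stay in the index, so reset_index or as_index=False is still needed to get them back as columns.

```python
import pandas as pd

df1 = pd.DataFrame({'Counterparty': ['Bank', 'Bank', 'GSE', 'PSE'],
                    'Sub Cat': ['Tier1', 'Small', 'Small', 'Small'],
                    'Location': ['US', 'US', 'UK', 'UK'],
                    'Amount': [50, 55, 65, 55],
                    'Amount1': [1, 2, 3, 4]})

grouped = df1.groupby(['Counterparty', 'Location'])

as_series = grouped['Amount'].sum()    # single brackets: a Series
as_frame = grouped[['Amount']].sum()   # double brackets: a one-column DataFrame
print(type(as_series).__name__, type(as_frame).__name__)

# Either way the keys are index levels; reset_index moves them to columns.
flat = as_series.reset_index()
print(list(flat.columns))
```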