Pandas groupby stored in a new DataFrame

Time: 2023-02-11 21:40:53

I have the following code:

import pandas as pd
df1 = pd.DataFrame({'Counterparty':['Bank','Bank','GSE','PSE'],
            'Sub Cat':['Tier1','Small','Small', 'Small'],
            'Location':['US','US','UK','UK'],
            'Amount':[50, 55, 65, 55],
            'Amount1':[1,2,3,4]})

df2=df1.groupby(['Counterparty','Location'])[['Amount']].sum()
df2.dtypes
df1.dtypes

The df2 DataFrame does not have the columns that I am grouping by (Counterparty and Location). Any idea why this is the case? Both Amount and Amount1 are numeric fields. I just want to sum across Amount and aggregate across Amount1.

2 Answers

#1


7  

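By default, groupby moves the grouping keys into the index of the result. To get them back as ordinary columns, pass as_index=False to groupby or call reset_index() on the result: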

df2=df1.groupby(['Counterparty','Location'])[['Amount']].sum().reset_index()
print (df2)
  Counterparty Location  Amount
0         Bank       US     105
1          GSE       UK      65
2          PSE       UK      55

df2=df1.groupby(['Counterparty','Location'], as_index=False)[['Amount']].sum()
print (df2)
  Counterparty Location  Amount
0         Bank       US     105
1          GSE       UK      65
2          PSE       UK      55
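As a quick check of the point above, the following sketch inspects where the grouping keys end up before and after each fix. The variable names (grouped, via_reset, via_flag) are only illustrative and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Counterparty': ['Bank', 'Bank', 'GSE', 'PSE'],
    'Sub Cat': ['Tier1', 'Small', 'Small', 'Small'],
    'Location': ['US', 'US', 'UK', 'UK'],
    'Amount': [50, 55, 65, 55],
    'Amount1': [1, 2, 3, 4],
})

# Default behaviour: the group keys form the index, so only
# 'Amount' shows up as a column.
grouped = df1.groupby(['Counterparty', 'Location'])[['Amount']].sum()
index_names = list(grouped.index.names)
grouped_columns = list(grouped.columns)

# Either fix turns the keys back into ordinary columns.
via_reset = grouped.reset_index()
via_flag = df1.groupby(['Counterparty', 'Location'], as_index=False)[['Amount']].sum()

print(index_names)
print(grouped_columns)
print(list(via_reset.columns))
print(list(via_flag.columns))
```

Both forms should produce the same three columns; which one to use is mostly a matter of taste.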

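If you aggregate all columns, nuisance (non-numeric) columns are excluded automatically, so the Sub Cat column is omitted. Note that this automatic exclusion applies to pandas versions before 2.0; in 2.0 and later, pass numeric_only=True to sum() to get the same result: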

df2=df1.groupby(['Counterparty','Location']).sum().reset_index()
print (df2)
  Counterparty Location  Amount  Amount1
0         Bank       US     105        3
1          GSE       UK      65        3
2          PSE       UK      55        4


df2=df1.groupby(['Counterparty','Location'], as_index=False).sum()
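The question asks for a sum of Amount and some aggregate of Amount1. One way this might be written is with named aggregation in agg. The sketch below assumes the mean as the aggregate for Amount1, since the question does not say which one is wanted; summary is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Counterparty': ['Bank', 'Bank', 'GSE', 'PSE'],
    'Sub Cat': ['Tier1', 'Small', 'Small', 'Small'],
    'Location': ['US', 'US', 'UK', 'UK'],
    'Amount': [50, 55, 65, 55],
    'Amount1': [1, 2, 3, 4],
})

# Named aggregation: output_column=(input_column, function).
# 'mean' for Amount1 is an assumption made for this example.
summary = (
    df1.groupby(['Counterparty', 'Location'], as_index=False)
       .agg(Amount=('Amount', 'sum'), Amount1=('Amount1', 'mean'))
)

print(summary)
```

Because each output column names its own input column and function, the non-numeric Sub Cat column is never touched, regardless of pandas version.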

#2


0  

Replace the double brackets around 'Amount' with single brackets. As written, you are telling it to select only one column.
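For comparison, here is a small sketch of what the bracket style changes. Single brackets return a Series and double brackets return a one-column DataFrame. In both cases the grouping keys stay in the index, so reset_index() or as_index=False is still needed to bring them back as columns. The names single and double are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Counterparty': ['Bank', 'Bank', 'GSE', 'PSE'],
    'Sub Cat': ['Tier1', 'Small', 'Small', 'Small'],
    'Location': ['US', 'US', 'UK', 'UK'],
    'Amount': [50, 55, 65, 55],
    'Amount1': [1, 2, 3, 4],
})

keys = ['Counterparty', 'Location']

single = df1.groupby(keys)['Amount'].sum()    # Series
double = df1.groupby(keys)[['Amount']].sum()  # one-column DataFrame

print(type(single).__name__)
print(type(double).__name__)

# The group keys live in the index in both results.
print(list(single.index.names))
print(list(double.index.names))
```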

