Pandas: Unstack one column of a DataFrame

Time: 2022-02-09 21:40:06

I want to unstack one column in my pandas DataFrame. The DataFrame is indexed by 'Date', and I want to unstack the 'Country' column so that each country becomes its own column. The DataFrame currently looks like this:

             Country   Product      Flow Unit  Quantity  
Date                                                         
2002-01-31   FINLAND  KEROSENE  TOTEXPSB  KBD    3.8129     
2002-01-31    TURKEY  KEROSENE  TOTEXPSB  KBD    0.2542     
2002-01-31  AUSTRALI  KEROSENE  TOTEXPSB  KBD   12.2787     
2002-01-31    CANADA  KEROSENE  TOTEXPSB  KBD    5.1161     
2002-01-31        UK  KEROSENE  TOTEXPSB  KBD   12.2013     

When I use df.pivot I get the following error: "ReshapeError: Index contains duplicate entries, cannot reshape". That makes sense, since every country reports on the same dates, so each date appears several times in the index. What I would like is to unstack the 'Country' column so that only one Date is shown for each month.

The DataFrame headers would look like this, with Date still as the index:

Date        FINLAND TURKEY  AUSTRALI  CANADA Flow      Unit
2002-01-31  3.8129  0.2542  12.2787   5.1161 TOTEXPSB   KBD

I have worked on this for a while and I'm not getting anywhere so any direction or insight would be great.

Also, note that you are only seeing the head of the DataFrame; years of data are in this format.

Thanks,

Douglas

1 Solution

#1


2  

If you can drop Product, Unit, and Flow then it should be as easy as

df.reset_index().pivot(columns='Country', index='Date', values='Quantity')

to give

Country     AUSTRALI  CANADA  FINLAND  TURKEY       UK
Date
2002-01-31   12.2787  5.1161   3.8129  0.2542  12.2013
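
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the five rows shown in the question and applies the pivot above. The variable names `wide` and `wide_keep` are mine, not from the question. The second variant is an assumption about what the asker wanted: it keeps Product, Flow and Unit by moving them into the index before unstacking 'Country', rather than dropping them.

```python
import pandas as pd

# Sample rows copied from the head of the DataFrame in the question.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2002-01-31"] * 5),
        "Country": ["FINLAND", "TURKEY", "AUSTRALI", "CANADA", "UK"],
        "Product": ["KEROSENE"] * 5,
        "Flow": ["TOTEXPSB"] * 5,
        "Unit": ["KBD"] * 5,
        "Quantity": [3.8129, 0.2542, 12.2787, 5.1161, 12.2013],
    }
).set_index("Date")

# The pivot from the answer: one column per country, other columns dropped.
wide = df.reset_index().pivot(index="Date", columns="Country", values="Quantity")

# Hypothetical variant that keeps Product, Flow and Unit: put them in the
# index together with Date and Country, unstack only Country, then move the
# descriptive levels back into ordinary columns so Date remains the index.
wide_keep = (
    df.reset_index()
    .set_index(["Date", "Product", "Flow", "Unit", "Country"])["Quantity"]
    .unstack("Country")
    .reset_index(["Product", "Flow", "Unit"])
)

print(wide)
print(wide_keep)
```

Both versions need each combination of index values and Country to be unique. If the full data has, say, several products per Date and Country, the plain pivot will raise a duplicate-entries error again; in that case either keep Product in the index as in the second variant, or use pivot_table with an aggregation function.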
