I want to unstack one column in my Pandas DataFrame. The DataFrame is indexed by the 'Date' and I want to unstack the 'Country' column so each Country is its own column. The current pandas DF looks like this:
Country Product Flow Unit Quantity
Date
2002-01-31 FINLAND KEROSENE TOTEXPSB KBD 3.8129
2002-01-31 TURKEY KEROSENE TOTEXPSB KBD 0.2542
2002-01-31 AUSTRALI KEROSENE TOTEXPSB KBD 12.2787
2002-01-31 CANADA KEROSENE TOTEXPSB KBD 5.1161
2002-01-31 UK KEROSENE TOTEXPSB KBD 12.2013
When I use df.pivot I get the following error: "ReshapeError: Index contains duplicate entries, cannot reshape". This is true, since every country reports on the same dates, so each Date appears several times in the index. What I would like is to unstack the 'Country' column so that only one Date shows for each month.
The DataFrame headers would look like this, with Date still as the index:
Date FINLAND TURKEY AUSTRALI CANADA Flow Unit
2002-01-31 3.8129 0.2542 12.2787 5.1161 TOTEXPSB KBD
I have worked on this for a while and I'm not getting anywhere so any direction or insight would be great.
Also, note that you are only seeing the head of the DataFrame; there are years of data in this format.
Thanks,
Douglas
1 Solution
#1
2
If you can drop Product, Unit, and Flow, then it should be as easy as
df.reset_index().pivot(columns='Country', index='Date', values='Quantity')
to give
Country AUSTRALI CANADA FINLAND TURKEY UK
Date
2002-01-31 12.2787 5.1161 3.8129 0.2542 12.2013
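For anyone who wants to try this, here is a minimal sketch that rebuilds the five rows shown in the question and runs the pivot. It also shows one possible way to keep Flow and Unit, which the question asked for, by moving them into the row index before unstacking Country. The second option is my own suggestion, not part of the answer above, and it assumes each (Date, Flow, Unit, Country) combination occurs only once. If several products share a date, that assumption fails and you would need to add Product to the index or aggregate first.

```python
import pandas as pd

# Sample rows copied from the question; the real frame covers many years.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2002-01-31"] * 5),
        "Country": ["FINLAND", "TURKEY", "AUSTRALI", "CANADA", "UK"],
        "Product": ["KEROSENE"] * 5,
        "Flow": ["TOTEXPSB"] * 5,
        "Unit": ["KBD"] * 5,
        "Quantity": [3.8129, 0.2542, 12.2787, 5.1161, 12.2013],
    }
).set_index("Date")

# Option 1: drop Product, Flow and Unit, as in the answer above.
wide = df.reset_index().pivot(index="Date", columns="Country", values="Quantity")

# Option 2 (assumption: Date/Flow/Unit/Country rows are unique):
# put Flow and Unit in the row index, then unstack only Country.
wide_keep = (
    df.reset_index()
      .set_index(["Date", "Flow", "Unit", "Country"])["Quantity"]
      .unstack("Country")
      .reset_index(["Flow", "Unit"])  # Date stays as the index
)

print(wide)
print(wide_keep)
```

With option 2, Date remains the index and Flow and Unit come back as ordinary columns next to one column per country.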