I have following pandas Dataframe:
我有以下pandas Dataframe:
ID Year Jan_salary Jan_days Feb_salary Feb_days Mar_salary Mar_days
1 2016 4500 22 4200 18 4700 24
2 2016 3800 23 3600 19 4400 23
3 2016 5500 21 5200 17 5300 23
I want to convert this dataframe to following dataframe:
我想将此数据帧转换为以下数据帧:
ID Year month salary days
1 2016 01 4500 22
1 2016 02 4200 18
1 2016 03 4700 24
2 2016 01 3800 23
2 2016 02 3600 19
2 2016 03 4400 23
3 2016 01 5500 21
3 2016 02 5200 17
3 2016 03 5300 23
I tried use pandas.DataFrame.stack
but couldn't get the expected outcome. I am using Python 2.7 Please guide me to reshape this Pandas dataframe. Thanks.
我尝试使用pandas.DataFrame.stack但无法获得预期的结果。我正在使用Python 2.7请指导我重塑这个Pandas数据帧。谢谢。
2 个解决方案
#1
#2
1
I love pd.melt
so that's what I used in this long-winded approach:
我喜欢pd.melt,这就是我在这个冗长的方法中使用的:
ldf = pd.melt(df,id_vars=['ID','Year'],
value_vars=['Jan_salary','Feb_salary','Mar_salary'],
var_name='month',value_name='salary')
rdf = pd.melt(df,id_vars=['ID','Year'],
value_vars=['Jan_days','Feb_days','Mar_days'],
value_name='days')
rdf.drop(['ID','Year','variable'],inplace=True,axis=1)
cdf = pd.concat([ldf,rdf],axis=1)
cdf['month'] = cdf['month'].str.replace('_salary','')
import calendar
def mapper(month_abbr):
# from http://*.com/a/3418092/42346
d = {v: str(k).zfill(2) for k,v in enumerate(calendar.month_abbr)}
return d[month_abbr]
cdf['month'] = cdf['month'].apply(mapper)
Result:
结果:
>>> cdf
ID Year month salary days
0 1 2016 01 4500 22
1 2 2016 01 3800 23
2 3 2016 01 5500 21
3 1 2016 02 4200 18
4 2 2016 02 3600 19
5 3 2016 02 5200 17
6 1 2016 03 4700 24
7 2 2016 03 4400 23
8 3 2016 03 5300 23
#1
4
df = df.set_index(['ID', 'Year'])
df.columns = df.columns.str.split('_', expand=True).rename('month', level=0)
df = df.stack(0).reset_index()
md = dict(Jan='01', Feb='02', Mar='03')
df.month = df.month.map(md)
df[['ID', 'Year', 'month', 'salary', 'days']]
#2
1
I love pd.melt
so that's what I used in this long-winded approach:
我喜欢pd.melt,这就是我在这个冗长的方法中使用的:
ldf = pd.melt(df,id_vars=['ID','Year'],
value_vars=['Jan_salary','Feb_salary','Mar_salary'],
var_name='month',value_name='salary')
rdf = pd.melt(df,id_vars=['ID','Year'],
value_vars=['Jan_days','Feb_days','Mar_days'],
value_name='days')
rdf.drop(['ID','Year','variable'],inplace=True,axis=1)
cdf = pd.concat([ldf,rdf],axis=1)
cdf['month'] = cdf['month'].str.replace('_salary','')
import calendar
def mapper(month_abbr):
# from http://*.com/a/3418092/42346
d = {v: str(k).zfill(2) for k,v in enumerate(calendar.month_abbr)}
return d[month_abbr]
cdf['month'] = cdf['month'].apply(mapper)
Result:
结果:
>>> cdf
ID Year month salary days
0 1 2016 01 4500 22
1 2 2016 01 3800 23
2 3 2016 01 5500 21
3 1 2016 02 4200 18
4 2 2016 02 3600 19
5 3 2016 02 5200 17
6 1 2016 03 4700 24
7 2 2016 03 4400 23
8 3 2016 03 5300 23