I have a table like the following:
mark name total point
70 bala 100 10
80 bala 100 10
80 bala 100 10
100 karthik 100 5
100 karthik 150 5
100 karthik 150 5
50 abdul 80 10
50 abdul 80 5
50 abdul 80 6
I want to collapse this table as follows: keep a single row per name, and join the distinct values of every other column with commas.
mark name total point
70,80 bala 100 10
100 karthik 100,150 5
50 abdul 80 10,5,6
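For anyone who wants to reproduce this, here is a minimal sketch that builds the sample table as a pandas DataFrame. The variable name `df` is an assumption, chosen only because the answers refer to the data by that name; the integer column types are also assumed from the values shown.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Column order matches the question: mark, name, total, point.
df = pd.DataFrame({
    "mark":  [70, 80, 80, 100, 100, 100, 50, 50, 50],
    "name":  ["bala"] * 3 + ["karthik"] * 3 + ["abdul"] * 3,
    "total": [100, 100, 100, 100, 150, 150, 80, 80, 80],
    "point": [10, 10, 10, 5, 5, 5, 10, 5, 6],
})

print(df)
```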
3 Solutions
#1
3
Use groupby with apply:
In [858]: (df.astype(str).groupby('name', as_index=False, sort=False)
.apply(lambda x: pd.Series({v: ','.join(x[v].unique()) for v in x})))
Out[858]:
mark name total point
0 70,80 bala 100 10
1 100 karthik 100,150 5
2 50 abdul 80 10,5,6
Or,
In [863]: (df.astype(str).groupby('name', as_index=False, sort=False)
.apply(lambda x: x.apply(lambda x: ','.join(x.unique()))))
Out[863]:
mark name total point
0 70,80 bala 100 10
1 100 karthik 100,150 5
2 50 abdul 80 10,5,6
#2
2
With the help of a pivot table:
df.pivot_table(index='name',aggfunc=lambda x : ','.join(x.unique().astype(str))).reset_index()
Output:
name mark point total
0 abdul 50 10,5,6 80
1 bala 70,80 10 100
2 karthik 100 5 100,150
#3
2
Use DataFrameGroupBy.agg:
df = (df.astype(str)
.groupby('name', as_index=False, sort=False)
.agg(lambda x: ','.join(x.unique())))
print (df)
name mark total point
0 bala 70,80 100 10
1 karthik 100 100,150 5
2 abdul 50 80 10,5,6
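The agg approach above can be sketched as a self-contained script. The helper names `join_unique` and `collapse_by_name` are hypothetical, introduced only for illustration, and the sample data is retyped from the question. Casting to `str` first is what lets `','.join` work on the numeric columns, and `sort=False` keeps the names in their original order.

```python
import pandas as pd


def join_unique(values):
    """Join the distinct values of a Series with commas.

    Series.unique() keeps values in order of first appearance,
    so "10, 5, 6" stays "10,5,6" rather than being sorted.
    """
    return ",".join(values.unique())


def collapse_by_name(frame, key="name"):
    """Return one row per `key`, with every other column comma-joined."""
    return (frame.astype(str)  # str.join needs strings, not ints
                 .groupby(key, as_index=False, sort=False)
                 .agg(join_unique))


# Sample data retyped from the question.
df = pd.DataFrame({
    "mark":  [70, 80, 80, 100, 100, 100, 50, 50, 50],
    "name":  ["bala"] * 3 + ["karthik"] * 3 + ["abdul"] * 3,
    "total": [100, 100, 100, 100, 150, 150, 80, 80, 80],
    "point": [10, 10, 10, 5, 5, 5, 10, 5, 6],
})

result = collapse_by_name(df)
print(result)
```

Note that every value in the result is a string, including single values such as "100", so convert back with `astype` if numeric columns are needed afterwards.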