Concatenate column values in a table - pandas

Time: 2022-08-16 09:18:03

I have a table like the following:

mark     name  total  point
  70     bala    100     10
  80     bala    100     10
  80     bala    100     10
 100  karthik    100      5
 100  karthik    150      5
 100  karthik    150      5
  50    abdul     80     10
  50    abdul     80      5
  50    abdul     80      6

I want to collapse this table as follows: keep one row per name, drop the duplicate values within each column, and join the remaining unique values with commas.

mark      name     total    point
70,80     bala     100        10
100       karthik  100,150     5
50        abdul    80       10,5,6
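
To try the answers below, the sample table can be rebuilt as a DataFrame. This is only a sketch: the values are copied from the table above, and the variable name df is an assumption chosen to match what the answers use.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "mark":  [70, 80, 80, 100, 100, 100, 50, 50, 50],
    "name":  ["bala"] * 3 + ["karthik"] * 3 + ["abdul"] * 3,
    "total": [100, 100, 100, 100, 150, 150, 80, 80, 80],
    "point": [10, 10, 10, 5, 5, 5, 10, 5, 6],
})

print(df)
```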

3 Solutions

#1


3  

Use groupby with apply, joining the unique values of each column:

In [858]: (df.astype(str).groupby('name', as_index=False, sort=False)
             .apply(lambda x: pd.Series({v: ','.join(x[v].unique()) for v in x})))
Out[858]:
    mark     name    total   point
0  70,80     bala      100      10
1    100  karthik  100,150       5
2     50    abdul       80  10,5,6

Or,

In [863]: (df.astype(str).groupby('name', as_index=False, sort=False)
             .apply(lambda g: g.apply(lambda col: ','.join(col.unique()))))
Out[863]:
    mark     name    total   point
0  70,80     bala      100      10
1    100  karthik  100,150       5
2     50    abdul       80  10,5,6

#2


2  

With the help of pivot_table:

df.pivot_table(index='name',aggfunc=lambda x : ','.join(x.unique().astype(str))).reset_index()

Output:

    name   mark   point    total
0    abdul     50  10,5,6       80
1     bala  70,80      10      100
2  karthik    100       5  100,150
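
Note that the rows and columns in this output come back sorted alphabetically rather than in the order of the question. If the original order matters, one option is to reindex afterwards. This is a sketch, not part of the answer above: the names out and name_order are my own, and the sample df is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "mark":  [70, 80, 80, 100, 100, 100, 50, 50, 50],
    "name":  ["bala"] * 3 + ["karthik"] * 3 + ["abdul"] * 3,
    "total": [100, 100, 100, 100, 150, 150, 80, 80, 80],
    "point": [10, 10, 10, 5, 5, 5, 10, 5, 6],
})

# Same aggregation as in the answer: join the unique values of each column.
out = df.pivot_table(index="name",
                     aggfunc=lambda x: ",".join(x.unique().astype(str)))

# Names in order of first appearance in the original table.
name_order = pd.Index(df["name"].unique(), name="name")

# Restore the original row order, then the original column order.
out = out.reindex(name_order).reset_index()[list(df.columns)]
print(out)
```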

#3


2  

Use DataFrameGroupBy.agg:

df = (df.astype(str)
       .groupby('name', as_index=False, sort=False)
       .agg(lambda x: ','.join(x.unique())))
print(df)
      name   mark    total   point
0     bala  70,80      100      10
1  karthik    100  100,150       5
2    abdul     50       80  10,5,6
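
For anyone who wants to run this end to end, here is a self-contained version of the agg approach against the sample data from the question. Treat it as a sketch: the variable name result and the final column reordering (to get back the mark, name, total, point layout the question asked for) are my additions, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "mark":  [70, 80, 80, 100, 100, 100, 50, 50, 50],
    "name":  ["bala"] * 3 + ["karthik"] * 3 + ["abdul"] * 3,
    "total": [100, 100, 100, 100, 150, 150, 80, 80, 80],
    "point": [10, 10, 10, 5, 5, 5, 10, 5, 6],
})

result = (df.astype(str)                         # join() needs strings
            .groupby("name", sort=False)         # keep first-appearance order
            .agg(lambda s: ",".join(s.unique())) # unique values per column
            .reset_index()[list(df.columns)])    # back to the original column order
print(result)
```

Keeping the result in a separate variable leaves the original df untouched, which is handy when comparing the different answers.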
