How do I join 2 columns of a Pandas DataFrame with a comma?

Time: 2021-08-22 09:12:37

I'd like to join 2 columns of a Pandas DataFrame with a comma, i.e. "abc" in column 1 joins with "123" in column 2 to become "abc, 123".

For example:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'IDx': ['a','b',np.nan,'C'], 'IDy':['1','','2','D']})
>>> df
   IDx  IDy
0    a    1
1    b     
2  NaN    2
3    C    D

The following do not work:

>>> ', '.join([df['IDx'],df['IDy']])
>>> df.apply(lambda x: ', '.join([x['IDx'],x['IDy']]))

This is the desired result:

>>> df = pd.DataFrame({'ID': ['a, 1', 'b', '2', 'C, D']})
>>> df
     ID
0  a, 1
1     b
2     2
3  C, D

1 Solution

#1


2  

You can use apply with fillna to replace NaN with an empty string, map the values to strings, join them with a comma, and then strip any leftover leading or trailing comma:

df['ID'] = df[['IDx', 'IDy']].apply(lambda x: ','.join(x.fillna('').map(str)), axis=1)
df['ID'] = df['ID'].str.strip(',')
print(df)
   IDx IDy   ID
0    a   1  a,1
1    b        b
2  NaN   2    2
3    C   D  C,D
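
The output above uses "," as the separator, while the question asks for ", ". A minimal sketch of one way to get the requested form is to drop the missing and empty values before joining, so nothing has to be stripped afterwards. The helper name join_non_empty is made up for this illustration and is not part of pandas:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'IDx': ['a', 'b', np.nan, 'C'], 'IDy': ['1', '', '2', 'D']})

def join_non_empty(row, sep=', '):
    # Hypothetical helper: keep only values that are neither NaN nor ''.
    parts = [str(v) for v in row if pd.notna(v) and str(v) != '']
    return sep.join(parts)

# axis=1 passes one row at a time to the helper.
df['ID'] = df[['IDx', 'IDy']].apply(join_non_empty, axis=1)
print(df)
```

Because empty values are filtered out first, the same helper should also cope with more than two columns without producing doubled separators.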

Or use fillna to replace NaN with an empty string, astype to convert to string, concatenate with +, and then strip:

df['ID'] = df['IDx'].fillna('').astype(str) + ',' + df['IDy'].fillna('').astype(str)
df['ID'] = df['ID'].str.strip(',')
print(df)
   IDx IDy   ID
0    a   1  a,1
1    b        b
2  NaN   2    2
3    C   D  C,D
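
The same concatenation can probably also be written with Series.str.cat, which takes the separator as an argument instead of adding it with +. This is only a sketch of an equivalent spelling, not something the original answer used; the names left and right are chosen here for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'IDx': ['a', 'b', np.nan, 'C'], 'IDy': ['1', '', '2', 'D']})

# Fill NaN first so that astype(str) never turns a missing value into 'nan'.
left = df['IDx'].fillna('').astype(str)
right = df['IDy'].fillna('').astype(str)

# Concatenate element-wise with ',' and strip leading/trailing commas.
df['ID'] = left.str.cat(right, sep=',').str.strip(',')
print(df)
```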

EDIT: If the dtype of your columns is already string, you can omit map or astype:

df['ID'] = df[['IDx', 'IDy']].apply(lambda x: ','.join(x.fillna('')), axis=1)
df['ID'] = df['ID'].str.strip(',')

Or:

df['ID'] = df['IDx'].fillna('') + ',' + df['IDy'].fillna('')
df['ID'] = df['ID'].str.strip(',')
print(df)
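
For context, here is a small sketch of why the two attempts in the question fail, and why fillna and axis=1 are needed. The helper error_name is hypothetical and only exists to capture the exception type; the third call (the question's apply with axis=1 added) is an extra check, not something from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'IDx': ['a', 'b', np.nan, 'C'], 'IDy': ['1', '', '2', 'D']})

def error_name(func):
    # Hypothetical helper: run func and return the name of the exception
    # it raises, or None if it completes without error.
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None

# str.join expects an iterable of strings, but each item here is a whole Series.
plain_join = error_name(lambda: ', '.join([df['IDx'], df['IDy']]))

# Without axis=1, apply hands the lambda one column at a time, so the labels
# 'IDx' and 'IDy' are looked up in the row index (0..3) and are not found.
column_wise = error_name(
    lambda: df.apply(lambda x: ', '.join([x['IDx'], x['IDy']])))

# With axis=1 the lambda sees one row at a time, but row 2 holds NaN
# (a float), which str.join rejects unless it is filled or converted first.
row_wise = error_name(
    lambda: df.apply(lambda x: ', '.join([x['IDx'], x['IDy']]), axis=1))

print(plain_join, column_wise, row_wise)
```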
