I'd like to join 2 columns of a Pandas Data Frame with a comma, i.e.: "abc" in column 1 joins with "123" in column 2 to become "abc, 123".
For example:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'IDx': ['a','b',np.nan,'C'], 'IDy':['1','','2','D']})
>>> df
   IDx IDy
0    a   1
1    b
2  NaN   2
3    C   D
The following do not work:
>>> ', '.join([df['IDx'],df['IDy']])
>>> df.apply(lambda x: ', '.join([x['IDx'],x['IDy']]))
This is the desired result:
>>> df = pd.DataFrame({'ID': ['a, 1', 'b', '2', 'C, D']})
>>> df
     ID
0  a, 1
1     b
2     2
3  C, D
1 solution
#1
You can use apply with fillna to replace NaN with an empty string, map to convert the values to strings, and str.strip to remove any leftover leading or trailing comma:
df['ID'] = df[['IDx', 'IDy']].apply(lambda x: ','.join(x.fillna('').map(str)), axis=1)
df['ID'] = df['ID'].str.strip(',')
print(df)
   IDx IDy   ID
0    a   1  a,1
1    b        b
2  NaN   2    2
3    C   D  C,D
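Note that the output above uses "," while the question asked for ", " as the separator. A minimal sketch of one way to get the exact requested format is to skip empty and missing values before joining, so no stripping is needed afterwards. The helper name join_non_empty is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'IDx': ['a', 'b', np.nan, 'C'],
                   'IDy': ['1', '', '2', 'D']})

def join_non_empty(row, sep=', '):
    # Hypothetical helper: keep only real, non-empty strings.
    # NaN is a float, so the isinstance check drops it; '' is falsy.
    parts = [v for v in row if isinstance(v, str) and v]
    return sep.join(parts)

# axis=1 passes each row (not each column) to the helper.
df['ID'] = df[['IDx', 'IDy']].apply(join_non_empty, axis=1)
print(df['ID'].tolist())
```

Because empty parts are filtered out before joining, a value that legitimately starts or ends with a comma is left untouched, which strip(',') would not guarantee.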
Or use fillna to replace NaN with an empty string, astype to convert the columns to strings, and str.strip:
df['ID'] = df['IDx'].fillna('').astype(str) + ',' + df['IDy'].fillna('').astype(str)
df['ID'] = df['ID'].str.strip(',')
print(df)
   IDx IDy   ID
0    a   1  a,1
1    b        b
2  NaN   2    2
3    C   D  C,D
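As a possible variation on the concatenation above, Series.str.cat can do the fill and the join in one call. This is only a sketch and assumes both columns already hold strings (or NaN), since the .str accessor is used:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'IDx': ['a', 'b', np.nan, 'C'],
                   'IDy': ['1', '', '2', 'D']})

# na_rep='' makes str.cat treat missing values as empty strings,
# so no separate fillna call is needed.
joined = df['IDx'].str.cat(df['IDy'], sep=',', na_rep='')

# Rows with an empty side end up as 'b,' or ',2'; strip the stray comma.
df['ID'] = joined.str.strip(',')
print(df['ID'].tolist())
```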
EDIT: If the dtype of your columns is already string, you can omit map or astype:
df['ID'] = df[['IDx', 'IDy']].apply(lambda x: ','.join(x.fillna('')), axis=1)
df['ID'] = df['ID'].str.strip(',')
Or:
df['ID'] = df['IDx'].fillna('') + ',' + df['IDy'].fillna('')
df['ID'] = df['ID'].str.strip(',')
print(df)
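To illustrate why the conversion step matters when a column is not a string column, here is a small sketch. The frame df_mixed and its integer IDy column are made up for the example:

```python
import pandas as pd

# Hypothetical frame: IDy holds integers, so it is not a string column.
df_mixed = pd.DataFrame({'IDx': ['a', 'b'], 'IDy': [1, 2]})

def join_row(row):
    # str.join only accepts strings, so an integer in the row
    # makes this raise TypeError.
    return ','.join(row)

try:
    df_mixed[['IDx', 'IDy']].apply(join_row, axis=1)
    omitted_ok = True
except TypeError:
    omitted_ok = False

# Converting with astype(str) first makes the concatenation safe.
joined = df_mixed['IDx'].astype(str) + ',' + df_mixed['IDy'].astype(str)
print(omitted_ok, joined.tolist())
```

So the shorter form is only safe when every value in both columns is a string or NaN.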