I'm trying to add the values from one column of a dataframe to another dataframe by matching values from one column or another.
我正在尝试通过匹配来自一列或另一列的值来将数据帧的一列中的值添加到另一个数据帧。
For example:
I have 2 df's with different length and df2 does not have all the pairs listed in df1:
我有2个不同长度的df,而df2没有df1中列出的所有对:
df1
Year Territory Pair_ID
1 1999 BGD 1 5
2 2000 TAR 6 2
3 2001 JAM 3 7
4 2002 TER 9 2
df2
ID1 ID2 pair pair1 type detail
1 1 5 1 5 5 1 PO N/A
2 2 6 2 6 6 2 SB N/A
3 3 7 3 7 7 3 PO N/A
4 4 8 4 8 8 4 SB N/A
5 4 3 4 3 3 4 SB N/A
I want this:
我要这个:
Year Territory Pair_ID type
1 1999 BGD 1 5 PO
2 2000 TAR 6 2 SB
3 2001 JAM 3 7 PO
4 2002 TER 9 2 N/A
I don't want to completely merge the 2 dataframes. I just want to add the "type" column from df2 to df1 by matching the "Pair" column from df1 to either the "pair" column or "pair1" column in df2. I would also like it to fill in with "N/A" for Pairs that are not found in df2.
我不想完全合并2个数据帧。我只想通过将df1中的“Pair”列与df2中的“pair”列或“pair1”列相匹配,将df2中的“type”列添加到df1。我还希望用df2中没有找到的对“N / A”填写。
I could not find anything that addresses this specific problem.
我找不到解决这个特定问题的任何东西。
I've tried this:
我试过这个:
df1$type <- df2$type[match(df1$Pairs, c(df2$pair,df2$pair1))]
But it only matches with the "pair" column and ignores the "pair1" column.
但它只与“对”列匹配,并忽略“pair1”列。
2 个解决方案
#1
3
Good case for sqldf
:
sqldf的好例子:
library(sqldf)
sqldf("select df1.Year
,df1.Territory
,df1.Pair_ID
,df2.type
from df1
left join df2
on df1.Pair_ID = df2.pair
or df1.Pair_ID = df2.pair1
")
Results
Year Territory Pair_ID type
1 1999 BGD 1 5 PO
2 2000 TAR 6 2 SB
3 2001 JAM 3 7 PO
4 2002 TER 9 2 <NA>
#2
1
Try something like
尝试类似的东西
typeA <- df2$type[match(df1$Pairs, df2$pair)]
typeB <- df2$type[match(df1$Pairs, df2$pair1)]
df1$type <- ifelse(is.na(typeA), typeB, typeA)
#1
3
Good case for sqldf
:
sqldf的好例子:
library(sqldf)
sqldf("select df1.Year
,df1.Territory
,df1.Pair_ID
,df2.type
from df1
left join df2
on df1.Pair_ID = df2.pair
or df1.Pair_ID = df2.pair1
")
Results
Year Territory Pair_ID type
1 1999 BGD 1 5 PO
2 2000 TAR 6 2 SB
3 2001 JAM 3 7 PO
4 2002 TER 9 2 <NA>
#2
1
Try something like
尝试类似的东西
typeA <- df2$type[match(df1$Pairs, df2$pair)]
typeB <- df2$type[match(df1$Pairs, df2$pair1)]
df1$type <- ifelse(is.na(typeA), typeB, typeA)