I have the following data frame and am trying to merge the two columns into one, replacing the NAs with the numeric values.
ID A B
1 3 NA
2 NA 2
3 NA 4
4 1 NA
The result I want is:
ID New
1 3
2 2
3 4
4 1
Thanks in advance!
6 solutions
#1
12
Another very simple solution in this case is to use the rowSums function.
df$New <- rowSums(df[, c("A", "B")], na.rm = TRUE)
df <- df[, c("ID", "New")]
Update: Thanks to @Artem Klevtsov for pointing out that this method only works with numeric data.
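One more caveat worth checking before relying on rowSums here: it adds up whatever is present, so it only matches the desired result when each row has exactly one non-NA value. A minimal sketch with made-up data (the three-row data frame below is hypothetical, not the one from the question):

```r
# Hypothetical data: row 2 has no value at all, row 3 has a value in both columns
df <- data.frame(ID = 1:3, A = c(3, NA, 5), B = c(NA, NA, 2))

# rowSums adds whatever is present, so this is a sum, not a "first non-NA" pick
summed <- rowSums(df[, c("A", "B")], na.rm = TRUE)
summed
```

Here the all-NA row comes back as 0 and the row with two values comes back as their sum (7), so the shortcut is probably only safe for data shaped like the example in the question.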
#2
11
You can also do: with(d, ifelse(is.na(A), B, A))
where d is your data frame.
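To get all the way to the requested ID/New layout, the expression above can be assigned to a new column. A small sketch, assuming d holds the data from the question (the construction of d is my own, since the original post only shows the printed table):

```r
# Rebuild the question's data frame (assumed construction)
d <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

# Take B where A is missing, otherwise keep A
d$New <- with(d, ifelse(is.na(A), B, A))

# Keep only the columns asked for
d <- d[, c("ID", "New")]
d
```

Unlike a row sum, this keeps A when both columns have a value and leaves NA when both are missing.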
#3
9
You can use unite from tidyr:
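library(tidyr)
# Replace NA with empty strings (this turns A and B into character columns)
df[is.na(df)] <- ""
# Paste A and B together; the resulting column is character, not numeric
unite(df, new, A:B, sep = "")
# ID new
#1 1 3
#2 2 2
#3 3 4
#4 4 1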
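Because the NAs are swapped for empty strings, the united column ends up as text. If numbers are needed afterwards, one option is to convert it back. A sketch under the assumption that df is the question's data frame and that every row has exactly one value (the column name New and the variable out are my own choices):

```r
library(tidyr)

# Rebuild the question's data frame (assumed construction)
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

df[is.na(df)] <- ""                      # A and B become character
out <- unite(df, New, A:B, sep = "")     # paste the two columns together
out$New <- as.numeric(out$New)           # convert the pasted text back to numbers
out
```

If a row had values in both columns, the two would be pasted into one string (for example "12"), so this route is best kept for data where that cannot happen.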
#4
7
This probably didn't exist when the other answers were written, but since I came here with the same question and found a better solution, here it is for future Googlers:
What you want is the coalesce() function from dplyr:
library(dplyr)
y <- c(1, 2, NA, NA, 5)
z <- c(NA, NA, 3, 4, 5)
# For each position, take the first non-NA value
coalesce(y, z)
# [1] 1 2 3 4 5
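Applied to the data frame from the question, the same idea might look like the sketch below. The construction of df and the name result are assumptions on my part; coalesce(), mutate() and select() are the regular dplyr functions.

```r
library(dplyr)

# Rebuild the question's data frame (assumed construction)
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

result <- df %>%
  mutate(New = coalesce(A, B)) %>%   # first non-NA of A, then B
  select(ID, New)                    # drop the original columns
result
```

coalesce() takes any number of vectors, so this should extend to more than two columns by listing them in order of priority.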
#5
6
You could try
New <- do.call(pmax, c(df1[-1], na.rm = TRUE))
Or
New <- df1[-1][cbind(seq_len(nrow(df1)), max.col(!is.na(df1[-1])))]
d1 <- data.frame(ID=df1$ID, New)
d1
# ID New
#1 1 3
#2 2 2
#3 3 4
#4 4 1
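The two variants agree on the question's data, but they can differ once a row has values in both columns: pmax returns the larger one, while max.col breaks ties at random unless told otherwise. A sketch with hypothetical data (row 3 is filled in both columns) that pins the tie-breaking to the first column:

```r
# Hypothetical data: row 3 has a value in both A and B
df1 <- data.frame(ID = 1:3, A = c(3, NA, 1), B = c(NA, 2, 5))
vals <- df1[-1]

# Variant 1: row-wise maximum, ignoring NA
by_max <- do.call(pmax, c(vals, na.rm = TRUE))

# Variant 2: index of the first non-NA column per row, ties resolved deterministically
first_col <- max.col(!is.na(vals), ties.method = "first")
by_first <- vals[cbind(seq_len(nrow(vals)), first_col)]
```

With this input by_max picks 5 for the third row and by_first picks 1, so which variant fits depends on what "merge" should mean when both columns are filled.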
#6
5
Assuming exactly one of A or B is NA in each row, this works just fine:
# creating initial data frame (actually data.table in this case)
library(data.table)
x <- as.data.table(list(ID = c(1, 2, 3, 4), A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA)))
x
# ID A B
#1: 1 3 NA
#2: 2 NA 2
#3: 3 NA 4
#4: 4 1 NA
# solution: per ID, keep the single non-NA value, then drop A and B
x[, New := na.omit(c(A, B)), by = ID][, c("A", "B") := NULL]
x
# ID New
#1: 1 3
#2: 2 2
#3: 3 4
#4: 4 1
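For what it's worth, newer data.table releases also ship fcoalesce(), which avoids the by-ID grouping. A sketch assuming a data.table version that includes it and the same table as above:

```r
library(data.table)

# Same table as above
x <- data.table(ID = c(1, 2, 3, 4), A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

# Take A where present, otherwise B, then drop the source columns by reference
x[, New := fcoalesce(A, B)][, c("A", "B") := NULL]
x
```

This version does not require exactly one NA per row: it keeps A when both are filled and leaves NA when both are missing.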