I'm trying to replace elements of a data.frame containing "#N/A" with "NULL", and I'm running into problems:
foo <- data.frame("day"= c(1, 3, 5, 7), "od" = c(0.1, "#N/A", 0.4, 0.8))
indices_of_NAs <- which(foo == "#N/A")
replace(foo, indices_of_NAs, "NULL")
Error in `[<-.data.frame`(`*tmp*`, list, value = "NULL") : new columns would leave holes after existing columns
I think the problem is that my index treats the data.frame as a vector while the replace function treats it differently somehow, but I'm not sure what the issue is.
3 Solutions
#1
19
NULL really means "nothing", not "missing", so it cannot take the place of an actual value. For missing values, R uses NA.
You can use the replacement method of is.na to update the selected elements directly; this works with a logical result. (Using which for indices will only work with is.na; direct use of [ invokes list access, which is the cause of your error.)
foo <- data.frame("day"= c(1, 3, 5, 7), "od" = c(0.1, "#N/A", 0.4, 0.8))
NAs <- foo == "#N/A"
## by replace method
is.na(foo)[NAs] <- TRUE
## or directly
foo[NAs] <- NA
But you are already dealing with strings in your od column (a factor by default in R versions before 4.0.0, a character vector since then), because c() coerced the mixed values when the column was created, so you might need to treat columns individually. A numeric column, for example, will never match the string "#N/A".
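As a minimal sketch of treating the od column on its own, the snippet below marks "#N/A" as NA and then converts the column back to numeric. It assumes the data frame from the question; stringsAsFactors = FALSE is set explicitly (an assumption, not part of the original code) so that od is a character vector in every R version.

```r
foo <- data.frame(day = c(1, 3, 5, 7),
                  od  = c(0.1, "#N/A", 0.4, 0.8),
                  stringsAsFactors = FALSE)  # assumption: keep od as character

# c() coerced the mixed values to strings, so compare against the string
foo$od[foo$od == "#N/A"] <- NA

# with the placeholder gone, the column converts cleanly back to numeric
foo$od <- as.numeric(foo$od)

str(foo)
```

If the data came from a file instead of being built inline (an assumption; the question does not say), the na.strings argument of read.csv can mark "#N/A" as missing at import time.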
#2
12
Why not
x$col[is.na(x$col)]<-value
?
You won't have to change your data frame.
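A short worked version of this one-liner, as a sketch: the data frame x, its column name col, and the replacement value 0 are all hypothetical stand-ins, and the column is assumed to already hold real NA values rather than the string "#N/A".

```r
# hypothetical data: one real NA in the column named col
x <- data.frame(day = c(1, 3, 5, 7), col = c(0.1, NA, 0.4, 0.8))
value <- 0  # hypothetical replacement value

# is.na() gives a logical vector; only the TRUE positions are overwritten
x$col[is.na(x$col)] <- value

x
```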
#3
1
The replace function expects a vector and you're supplying a data.frame.
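To make the mismatch concrete, here is a small sketch using the data frame from the question. stringsAsFactors = FALSE is added as an assumption so that the od column is a character vector regardless of R version.

```r
foo <- data.frame(day = c(1, 3, 5, 7),
                  od  = c(0.1, "#N/A", 0.4, 0.8),
                  stringsAsFactors = FALSE)  # assumption: keep od as character

# foo == "#N/A" is a 4 x 2 logical matrix; which() returns a single
# column-major position in that matrix, not a (row, column) pair
idx <- which(foo == "#N/A")

# replace(foo, idx, "NULL") would then run foo[idx] <- "NULL", which a
# data.frame reads as a column index, and foo has only two columns.
# Calling replace() on the column vector avoids that:
foo$od <- replace(foo$od, foo$od == "#N/A", NA)

foo
```

Passing a single column keeps replace in the vector setting it was written for, so the logical index lines up with the elements being replaced.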
You should really try to use NA and NULL instead of the character values that you're currently using. Otherwise you won't be able to take advantage of all of R's functionality to handle missing values.
Edit
You could use an apply function, or do something like this:
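foo <- data.frame(day = c(1, 3, 5, 7), od = c(0.1, NA, 0.4, 0.8))
# which(..., arr.ind = TRUE) returns one (row, column) pair per NA
idx <- which(is.na(foo), arr.ind = TRUE)
# loop over the pairs so that more than one NA is handled
for (k in seq_len(nrow(idx))) foo[idx[k, 1], idx[k, 2]] <- "NULL"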
You cannot assign a real NULL value in this case, because it has length zero. It is important to understand the difference between NA and NULL, so I recommend that you read ?NA and ?NULL.
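The difference can be seen in a few lines. This is only an illustrative sketch; the names v_na, v_null, and df are hypothetical.

```r
length(NA)    # 1: NA is a placeholder that still occupies a slot
length(NULL)  # 0: NULL is the absence of any value

v_na   <- c(1, NA, 3)    # keeps three elements
v_null <- c(1, NULL, 3)  # NULL vanishes, leaving two elements

mean(v_na, na.rm = TRUE) # NA-aware functions can skip the missing slot

# Assigning NULL to a data.frame column removes the column entirely
df <- data.frame(day = c(1, 3), od = c(0.1, NA))
df$od <- NULL
names(df)
```

This is why a cell in a data frame can hold NA but not NULL: every column must keep one value per row.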