I have a large dataset with ~200 columns of various types. I need to replace NA values with "", but only in character columns.
Using this dummy data table:
library(data.table)
DT <- data.table(x = c(1, NA, 2),
                 y = c("a", "b", NA))
> DT
    x    y
1:  1    a
2: NA    b
3:  2 <NA>
> str(DT)
Classes ‘data.table’ and 'data.frame': 3 obs. of 2 variables:
 $ x: num 1 NA 2
 $ y: chr "a" "b" NA
I have tried the following for-loop with a condition, but it doesn't work.
for (i in names(DT)) {
  if (class(DT$i) == "character") {
    DT[is.na(i), i := ""]
  }
}
The loop runs with no errors, but doesn't change DT.
The expected output I am looking for is this:
    x y
1:  1 a
2: NA b
3:  2
The solution doesn't necessarily have to involve a loop, but I couldn't think of one.
2 Solutions
#1
DT[, lapply(.SD, function(x){if(is.character(x)) x[is.na(x)] <- ' '; x})]
Or, if you don't like typing function(x):
library(purrr)
DT[, map(.SD, ~{if(is.character(.x)) .x[is.na(.x)] <- ' '; .x})]
To replace the columns in DT itself (by reference) rather than return a modified copy:
DT[, names(DT) := map(.SD, ~{if(is.character(.x)) .x[is.na(.x)] <- ' '; .x})]
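For reference, the loop in the question never fires because DT$i looks for a column literally named "i" and returns NULL, so class(DT$i) is "NULL" and the condition is always FALSE. A loop-based alternative could look like the sketch below, which uses data.table's set() and writes "" as the question asks (swap in " " if a space is wanted). The names char_cols, col and na_rows are chosen here for illustration only.

```r
library(data.table)

DT <- data.table(x = c(1, NA, 2),
                 y = c("a", "b", NA))

# Names of the character columns only
char_cols <- names(DT)[vapply(DT, is.character, logical(1))]

for (col in char_cols) {
  # Row indices where this column is NA; DT[[col]] looks the column up by name
  na_rows <- which(is.na(DT[[col]]))
  if (length(na_rows) > 0) {
    # set() updates DT by reference, without copying the table
    set(DT, i = na_rows, j = col, value = "")
  }
}

print(DT)
```

Because only the character columns are visited, the numeric NA in x is left untouched.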
#2
One option if you don't mind using dplyr:
library(dplyr)
na_to_space <- function(x) ifelse(is.na(x), " ", x)
> DT %>% mutate_if(.predicate = is.character, .funs = na_to_space)
   x y
1  1 a
2 NA b
3  2
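In dplyr 1.0.0 and later, mutate_if() is superseded by across(). A roughly equivalent sketch, assuming a recent dplyr is installed, is shown below; res is just an illustrative name for the result.

```r
library(data.table)
library(dplyr)

DT <- data.table(x = c(1, NA, 2),
                 y = c("a", "b", NA))

# Apply the replacement only to columns for which is.character() is TRUE
res <- DT %>%
  mutate(across(where(is.character), ~ ifelse(is.na(.x), " ", .x)))

print(res)
```

Like the mutate_if() version, this returns a modified copy and leaves DT itself unchanged, so assign the result if you want to keep it.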