How can I remove NA values from a vector?
I have a huge vector that has a couple of NA values, and I'm trying to find the max value in that vector (the vector is all numbers), but I can't do this because of the NA values.
How can I remove the NA values so that I can compute the max?
5 answers
#1
196
Try ?max and you'll see that it actually has a na.rm = argument, set by default to FALSE. (That's the common default for many other R functions, including sum(), mean(), etc.)
Setting na.rm=TRUE does just what you're asking for:
d <- c(1, 100, NA, 10)
max(d, na.rm=TRUE)
If you do want to remove all of the NAs, use this idiom instead:
d <- d[!is.na(d)]
A final note: Other functions (e.g. table(), lm(), and sort()) have NA-related arguments that use different names (and offer different options). So if NAs cause you problems in a function call, it's worth checking for a built-in solution among the function's arguments. I've found there's usually one already there.
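As a rough illustration of those differently named arguments, here is a small sketch using useNA for table() and na.last for sort() (lm() takes an na.action argument instead). The vector x is made up for this example and is not from the question.

```r
# Hypothetical example vector, not from the original question
x <- c(3, NA, 1, 3, NA)

# table() leaves NAs out by default; useNA = "ifany" counts them too
counts <- table(x, useNA = "ifany")
print(counts)

# sort() drops NAs by default; na.last = TRUE keeps them at the end
sorted_default <- sort(x)
sorted_keep <- sort(x, na.last = TRUE)
print(sorted_default)
print(sorted_keep)
```

The point is only that the argument name and its allowed values differ per function, so the help page (e.g. ?table, ?sort) is the place to look.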
#2
62
The na.omit function is what a lot of the regression routines use internally:
vec <- 1:1000
vec[runif(200, 1, 1000)] <- NA
max(vec)
#[1] NA
max( na.omit(vec) )
#[1] 1000
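One thing worth knowing about na.omit() on a vector is that the result carries an "na.action" attribute recording which positions were dropped. A minimal sketch, using a made-up vector v:

```r
# Hypothetical example vector, not from the original answer
v <- c(5, NA, 2, NA, 9)

cleaned <- na.omit(v)

# Positions of the removed NAs are stored in the "na.action" attribute
dropped <- as.integer(attr(cleaned, "na.action"))
print(dropped)

# as.vector() strips the attributes if a plain numeric vector is wanted
plain <- as.vector(cleaned)
print(plain)

# Both approaches give the same maximum
print(max(cleaned))
print(max(v, na.rm = TRUE))
```

If the extra attribute gets in the way, v[!is.na(v)] gives the same values without it.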
#3
11
?max shows you that there is an extra parameter na.rm that you can set to TRUE.
Apart from that, if you really want to remove the NAs, just use something like:
myvec[!is.na(myvec)]
#4
10
You can call max(vector, na.rm = TRUE). More generally, you can use the na.omit() function.
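To illustrate the "more generally" part: na.omit() also works on data frames, where it drops every row that contains at least one NA. A small sketch with a made-up data frame df:

```r
# Hypothetical data frame, not from the original answer
df <- data.frame(
  a = c(1, NA, 3),
  b = c("x", "y", NA)
)

# Rows 2 and 3 each contain an NA, so only row 1 survives
complete <- na.omit(df)
print(complete)
print(nrow(complete))
```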
#5
10
Just in case someone new to R wants a simplified answer to the original question
How can I remove NA values from a vector?
Here it is:
Assume you have a vector foo as follows:
foo = c(1:10, NA, 20:30)
Running length(foo) gives 22.
nona_foo = foo[!is.na(foo)]
length(nona_foo) is 21, because the NA value has been removed.
Remember that is.na(foo) returns a logical vector of the same length as foo, so indexing foo with the negation of that vector, !is.na(foo), gives you all the elements that are not NA.
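For anyone who wants to see what the indexing is doing, here is a short sketch that inspects the result of is.na() on the same foo. For a plain vector it is a logical vector, one TRUE/FALSE per element; the name mask is just chosen for this example.

```r
foo <- c(1:10, NA, 20:30)

# One logical value per element: TRUE where the element is NA
mask <- is.na(foo)
print(is.logical(mask))
print(length(mask))

# Position(s) of the NA entries
na_positions <- which(mask)
print(na_positions)

# Keep only the elements where mask is FALSE
nona_foo <- foo[!mask]
print(length(nona_foo))
```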