How do I remove only one instance of a duplicated value in R?

Time: 2021-03-28 07:57:20

Let's consider a vector of numeric values "x". Some values may be duplicates. I need to remove the max value one by one until x is empty.

The problem is that if I use:

x <- x[x != max(x)]

This removes every element equal to the maximum. I want to remove only one of them. So far, I do:

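# all copies of the maximum
max.x <- x[x == max(x)]
# drop one copy (the original 1:length(max.x) - 1 parses as (1:n) - 1 and only works by accident)
max.x <- max.x[-1]
# put the remaining copies back after the non-maximum values
x <- c(x[x != max(x)], max.x)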

But this is far from computationally efficient, and I'm not good enough at R to find the right way to do this. Does anyone have a better trick?
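
For illustration, here is how the two behaviours described above differ on a small vector. The vector is made up for this sketch, and the names `all_removed` and `one_removed` are hypothetical.

```r
# hypothetical example vector; 7 is the maximum and appears twice
x <- c(3, 7, 5, 7, 1)

# removes every element equal to the maximum
all_removed <- x[x != max(x)]

# the workaround from the question: keep all copies of the maximum but one
max.x <- x[x == max(x)]
max.x <- max.x[-1]
one_removed <- c(x[x != max(x)], max.x)

print(all_removed)  # 3 5 1
print(one_removed)  # 3 5 1 7
```

Note that the workaround also moves the surviving copy of the maximum to the end of the vector.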

Thanks

3 solutions

#1


1  

It isn't entirely clear what the scope of your problem is, so I'll just give the first suggestion that comes to mind. Use the sort function to get the values in decreasing order.

sorted <- sort(x,decreasing=TRUE,index.return=TRUE)

You can now iteratively remove the highest item from x. Re-using the sort function over and over on your subset data is inefficient - better to keep a permanent copy of x and do the removals from that, if possible.

Consider this approach

# random set of data with duplicates
x <- floor(runif(50)*15)
# sort with index.return returns a sorted x in sorted$x and the 
# indices of the sorted values from the original x in sorted$ix
sorted <- sort(x,decreasing=TRUE,index.return=TRUE)

for (i in seq_along(x))
{
  # drop the i largest values from the untouched copy of x
  newX <- x[-sorted$ix[1:i]]
  print(sort(newX, decreasing = TRUE))
}
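
A compact variant of the same idea, as a sketch only: it uses `order()` instead of `sort(..., index.return = TRUE)` and stores each intermediate vector rather than printing it. The example vector and the name `steps` are made up for illustration.

```r
# hypothetical example vector with a duplicated maximum
x <- c(3, 7, 5, 7, 1)

# positions of x, from the largest value to the smallest
ord <- order(x, decreasing = TRUE)

# steps[[i]] is x with its i largest values removed;
# x itself is never modified, so it is sorted only once
steps <- lapply(seq_along(ord), function(i) x[-ord[seq_len(i)]])

print(steps[[1]])  # exactly one of the two 7s is gone
print(steps[[2]])  # both 7s are gone
```

Which of the tied maxima goes first depends on how the sort breaks ties, so this sketch only relies on one copy being removed per step.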

#2


2  

Just for fun,
x <- x[ -which.max(x)]

rinse, lather, repeat.
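
Spelled out as a loop, the "repeat" part might look like the sketch below. It assumes x contains no NA values; the example vector and the name `removed` are made up for illustration.

```r
# hypothetical example vector with a duplicated maximum
x <- c(3, 7, 5, 7, 1)
removed <- numeric(0)

# which.max() returns the position of the first maximum only,
# so each pass drops exactly one element
while (length(x) > 0) {
  i <- which.max(x)
  removed <- c(removed, x[i])
  x <- x[-i]
}

print(removed)  # 7 7 5 3 1
```

Growing `removed` with `c()` is fine for a short vector; for a long one, sorting once is likely cheaper than rescanning for the maximum on every pass.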

dagnabit howcome 4 spaces isn't causing code coloration?

#3


0  

The way I understand your question,

 ?unique

might give you what you want.
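
Whether that fits depends on how the question is read. As a small sketch on a made-up vector, `unique()` drops every repeated value, whereas removing a single maximum leaves the other duplicates in place (the names `u` and `one_removed` are hypothetical):

```r
# hypothetical example vector; both 3 and 7 appear twice
x <- c(3, 7, 3, 7, 1)

# unique() keeps only the first occurrence of each distinct value
u <- unique(x)
print(u)  # 3 7 1

# for comparison: removing just one copy of the maximum
one_removed <- x[-which.max(x)]
print(one_removed)  # 3 3 7 1
```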

Rgds, Rainer
