Ruby -查找两个数组中不常见的元素

时间:2021-03-14 12:47:25

I've been thinking about a following problem - there are two arrays, and I need to find elements not common for them both, for example:

我一直在思考一个问题——有两个数组,我需要找到它们不常见的元素,例如:

a = [1,2,3,4]
b = [1,2,4]

And the expected answer is [3].

期望答案是[3]。

So far I've been doing it like this:

到目前为止,我一直这样做:

a.select { |elem| !b.include?(elem) }

But it gives me O(N ** 2) time complexity. I'm sure it can be done faster ;)

但它给了我O(N * 2)时间复杂度。我相信可以做得更快;

Also, I've been thinking about getting it somehow like this (using some method opposite to & which gives common elements of 2 arrays):

另外,我一直在考虑如何获得它(使用与&相反的方法,给出两个数组的公共元素):

a !& b  #=> doesn't work of course

Another way might be to add two arrays and find the unique element with some method similar to uniq, so that:

另一种方法可能是添加两个数组,并使用与uniq类似的方法找到惟一的元素,以便:

[1,1,2,2,3,4,4].some_method #=> would return 3

4 个解决方案

#1


18  

The simplest (in terms of using only the arrays already in place and stock array methods, anyway) solution is the union of the differences:

最简单的解决方案(仅使用已经存在的数组和股票数组方法)解决方案是差异的结合:

a = [1,2,3,4]
b = [1,2,4]
(a-b) | (b-a)
=> [3]

This may or may not be better than O(n**2). There are other options which are likely to give better peformance (see other answers/comments).

这可能比O(n**2)好,也可能不好。还有其他可能提供更好性能的选项(请参阅其他答案/注释)。

Edit: Here's a quick-ish implementation of the sort-and-iterate approach (this assumes no array has repeated elements; otherwise it will need to be modified depending on what behavior is wanted in that case). If anyone can come up with a shorter way to do it, I'd be interested. The limiting factor is the sort used. I assume Ruby uses some sort of Quicksort, so complexity averages O(n log n) with possible worst-case of O(n**2); if the arrays are already sorted, then of course the two calls to sort can be removed and it will run in O(n).

编辑:这里是排序迭代方法的快速实现(假设没有数组具有重复元素;否则,它将需要根据需要的行为进行修改)。如果有人能想出一个更短的方法,我会很感兴趣的。限制因素是所使用的排序。我假设Ruby使用了某种快速排序,复杂度为O(n log n),可能最坏的情况是O(n**2);如果数组已经被排序,那么当然这两个调用可以被删除,并且它将在O(n)中运行。

def diff a, b
  a = a.sort
  b = b.sort
  result = []
  bi = 0
  ai = 0
  while (ai < a.size && bi < b.size)
    if a[ai] == b[bi]
      ai += 1
      bi += 1
    elsif a[ai]<b[bi]
      result << a[ai]
      ai += 1
    else
      result << b[bi]
      bi += 1
    end
  end
  result += a[ai, a.size-ai] if ai<a.size
  result += b[bi, b.size-bi] if bi<b.size
  result
end

#2


14  

As @iamnotmaynard noted in the comments, this is traditionally a set operation (called the symmetric difference). Ruby's Set class includes this operation, so the most idiomatic way to express it would be with a Set:

正如@iamnotmaynard在评论中指出的,这通常是一个set操作(称为对称差异)。Ruby的Set类包含了这个操作,所以最常用的表达方式是集合:

Set.new(a) ^ b

That should give O(n) performance (since a set membership test is constant-time).

这应该会带来O(n)性能(因为集合成员测试是常量时间的)。

#3


10  

a = [1, 2, 3]
b = [2, 3, 4]
a + b - (a & b)
# => [1, 4]

#4


0  

The solution for Array divergences is like:

阵列发散的解为:

a = [1, 2, 3]
b = [2, 3, 4]
(a - b) | (b - a)
# => [1, 4]

You can also read a blog post about Array coherences

您还可以阅读关于数组一致性的博客文章。

#1


18  

The simplest (in terms of using only the arrays already in place and stock array methods, anyway) solution is the union of the differences:

最简单的解决方案(仅使用已经存在的数组和股票数组方法)解决方案是差异的结合:

a = [1,2,3,4]
b = [1,2,4]
(a-b) | (b-a)
=> [3]

This may or may not be better than O(n**2). There are other options which are likely to give better peformance (see other answers/comments).

这可能比O(n**2)好,也可能不好。还有其他可能提供更好性能的选项(请参阅其他答案/注释)。

Edit: Here's a quick-ish implementation of the sort-and-iterate approach (this assumes no array has repeated elements; otherwise it will need to be modified depending on what behavior is wanted in that case). If anyone can come up with a shorter way to do it, I'd be interested. The limiting factor is the sort used. I assume Ruby uses some sort of Quicksort, so complexity averages O(n log n) with possible worst-case of O(n**2); if the arrays are already sorted, then of course the two calls to sort can be removed and it will run in O(n).

编辑:这里是排序迭代方法的快速实现(假设没有数组具有重复元素;否则,它将需要根据需要的行为进行修改)。如果有人能想出一个更短的方法,我会很感兴趣的。限制因素是所使用的排序。我假设Ruby使用了某种快速排序,复杂度为O(n log n),可能最坏的情况是O(n**2);如果数组已经被排序,那么当然这两个调用可以被删除,并且它将在O(n)中运行。

def diff a, b
  a = a.sort
  b = b.sort
  result = []
  bi = 0
  ai = 0
  while (ai < a.size && bi < b.size)
    if a[ai] == b[bi]
      ai += 1
      bi += 1
    elsif a[ai]<b[bi]
      result << a[ai]
      ai += 1
    else
      result << b[bi]
      bi += 1
    end
  end
  result += a[ai, a.size-ai] if ai<a.size
  result += b[bi, b.size-bi] if bi<b.size
  result
end

#2


14  

As @iamnotmaynard noted in the comments, this is traditionally a set operation (called the symmetric difference). Ruby's Set class includes this operation, so the most idiomatic way to express it would be with a Set:

正如@iamnotmaynard在评论中指出的,这通常是一个set操作(称为对称差异)。Ruby的Set类包含了这个操作,所以最常用的表达方式是集合:

Set.new(a) ^ b

That should give O(n) performance (since a set membership test is constant-time).

这应该会带来O(n)性能(因为集合成员测试是常量时间的)。

#3


10  

a = [1, 2, 3]
b = [2, 3, 4]
a + b - (a & b)
# => [1, 4]

#4


0  

The solution for Array divergences is like:

阵列发散的解为:

a = [1, 2, 3]
b = [2, 3, 4]
(a - b) | (b - a)
# => [1, 4]

You can also read a blog post about Array coherences

您还可以阅读关于数组一致性的博客文章。