What is the best way to remove elements that are repeated in an array? For example, from the array
a = [4, 3, 3, 1, 6, 6]
I need to get
a = [4, 1]
My method works too slowly with a large number of elements.
arr = [4, 3, 3, 1, 6, 6]
puts arr.join(" ")

nouniq = []
l = arr.length
uniq = nil
# O(n^2): every element is compared against every other element
for i in 0..(l-1)
  for j in 0..(l-1)
    if (arr[j] == arr[i]) and (i != j)
      nouniq << arr[j]
    end
  end
end
arr = (arr - nouniq).compact
puts arr.join(" ")
5 Answers
#1 (4 votes)
a = [4, 3, 3, 1, 6, 6]
a.select{|b| a.count(b) == 1}
#=> [4, 1]
A more complicated but faster solution (roughly O(n log n), since the sort dominates):
a = [4, 3, 3, 1, 6, 6]
ar = []
# in a sorted window of three, the middle element is unique
# exactly when all three values differ
add = proc { |to, from| to << from[1] if from.uniq.size == from.size }
a.sort!.each_cons(3) { |b| add.call(ar, b) }
# the first and last elements have only one neighbor to check
ar << a[0] if a[0] != a[1]
ar << a[-1] if a[-1] != a[-2]
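For what it's worth, on Ruby 2.7 and later the built-in Enumerable#tally counts every element in one pass, which makes a genuinely O(n) version a one-liner:

a = [4, 3, 3, 1, 6, 6]
a.tally.select { |_value, count| count == 1 }.keys  #=> [4, 1]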
#2 (4 votes)
arr = [4, 3, 3, 1, 6, 6]
arr.
  group_by { |e| e }.              # { 4 => [4], 3 => [3, 3], 1 => [1], 6 => [6, 6] }
  map { |e, es| [e, es.length] }.  # pair each value with its count
  reject { |e, count| count > 1 }. # drop values that appear more than once
  map(&:first)
# [4, 1]
#3 (2 votes)
Without introducing the need for a separate copy of the original array, and using inject:
[4, 3, 3, 1, 6, 6].inject({}) { |s, v| s[v] ? s.merge(v => s[v] + 1) : s.merge(v => 1) }.select { |k, v| v == 1 }.keys
=> [4, 1]
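One caveat: every call to Hash#merge above builds a brand-new hash, so this does quadratic work in allocations. A variant using Hash#update (the mutating alias of merge!) keeps the inject style while updating the accumulator in place:

[4, 3, 3, 1, 6, 6].inject({}) { |s, v| s.update(v => s.fetch(v, 0) + 1) }.select { |k, v| v == 1 }.keys
=> [4, 1]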
#4 (1 vote)
I needed something like this myself, so I tested a few different approaches. These all return an array of the items that are duplicated in the original array:
module Enumerable
  # count occurrences in a hash, keep the keys seen more than once
  def dups
    inject({}) { |h, v| h[v] = h[v].to_i + 1; h }.reject { |k, v| v == 1 }.keys
  end

  # O(n^2): count() rescans the whole collection for every element
  def only_duplicates
    duplicates = []
    self.each { |each| duplicates << each if self.count(each) > 1 }
    duplicates.uniq
  end

  # like dups, but a default-valued hash avoids the to_i dance
  def dups_ej
    inject(Hash.new(0)) { |h, v| h[v] += 1; h }.reject { |k, v| v == 1 }.keys
  end

  # nil out the first occurrence of each value, then collapse the rest
  def dedup
    duplicates = self.dup
    self.uniq.each { |v| duplicates[self.index(v)] = nil }
    duplicates.compact.uniq
  end
end
Benchmark results for 100,000 iterations, first with an array of integers, then an array of strings. Performance will vary depending on the number of duplicates found, but these tests use a fixed proportion (roughly half the array entries are duplicates):
test_benchmark_integer
user system total real
Enumerable.dups 2.560000 0.040000 2.600000 ( 2.596083)
Enumerable.only_duplicates 6.840000 0.020000 6.860000 ( 6.879830)
Enumerable.dups_ej 2.300000 0.030000 2.330000 ( 2.329113)
Enumerable.dedup 1.700000 0.020000 1.720000 ( 1.724220)
test_benchmark_strings
user system total real
Enumerable.dups 4.650000 0.030000 4.680000 ( 4.722301)
Enumerable.only_duplicates 47.060000 0.150000 47.210000 ( 47.478509)
Enumerable.dups_ej 4.060000 0.030000 4.090000 ( 4.123402)
Enumerable.dedup 3.290000 0.040000 3.330000 ( 3.334401)
..
Finished in 73.190988 seconds.
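The test driver itself isn't shown above; assuming the module above is loaded, a minimal harness along these lines (the data shape is my assumption) would produce timings of this form:

require 'benchmark'

ITERATIONS = 100_000
ints = Array.new(20) { rand(10) }  # small array, roughly half duplicates

Benchmark.bm(28) do |x|
  %i[dups only_duplicates dups_ej dedup].each do |m|
    x.report("Enumerable.#{m}") { ITERATIONS.times { ints.send(m) } }
  end
end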
So of these approaches, it seems the Enumerable.dedup algorithm is the best (traced below on the example input):
- dup the original array so the original is not mutated
- get the uniq array elements
- for each unique element, nil out its first occurrence in the duplicated array
- compact the result
If only (array - array.uniq) worked correctly! (it doesn't - it removes everything)
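To make the steps concrete, here is dedup traced on the question's array:

a = [4, 3, 3, 1, 6, 6]
copy = a.dup
a.uniq.each { |v| copy[a.index(v)] = nil }  # blank the first 4, 3, 1, 6
copy               # => [nil, nil, 3, nil, nil, 6]
copy.compact.uniq  # => [3, 6] - the duplicated values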
#5 (0 votes)
Here's my spin on a solution used by Perl programmers, using a hash to accumulate counts for each element in the array:
ary = [4, 3, 3, 1, 6, 6]
ary.inject({}) { |h, a|
  h[a] ||= 0   # first sighting: seed the counter
  h[a] += 1
  h            # inject needs the hash back as the next accumulator
}.select { |k, v| v == 1 }.keys # => [4, 1]
It could be on one line, if that's at all important, by judicious use of semicolons between the lines in the block.
A slightly different way is:
ary.inject({}) { |h,a| h[a] ||= 0; h[a] += 1; h }.map{ |k,v| k if (v==1) }.compact # => [4, 1]
It replaces the select{...}.keys with map{...}.compact, so it's not really an improvement, and, to me, it is a bit harder to understand.
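If the trailing "; h" in the one-liner bothers you, each_with_object carries the accumulator for you and returns it automatically:

ary = [4, 3, 3, 1, 6, 6]
ary.each_with_object(Hash.new(0)) { |a, h| h[a] += 1 }.select { |k, v| v == 1 }.keys # => [4, 1]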