Find the most common string in an array

Time: 2022-08-28 21:29:17

I have this array, for example (the size is variable):

   x = ["1.111", "1.122", "1.250", "1.111"]

and I need to find the most common value ("1.111" in this case).

Is there an easy way to do that?

Thanks in advance!


EDIT #1: Thank you all for the answers!


EDIT #2: I've changed my accepted answer based on Z.E.D.'s information. Thank you all again!

6 Answers

#1


43  

Ruby < 2.2

#!/usr/bin/ruby1.8

def most_common_value(a)
  a.group_by do |e|
    e
  end.values.max_by(&:size).first
end

x = ["1.111", "1.122", "1.250", "1.111"]
p most_common_value(x)    # => "1.111"

Note: Enumerable#max_by is new in Ruby 1.9, but it has been backported to 1.8.7.

Ruby >= 2.2

Ruby 2.2 introduces the Object#itself method, with which we can make the code more concise:

def most_common_value(a)
  a.group_by(&:itself).values.max_by(&:size).first
end
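
On newer Rubies the counting step is built in. The following is a minimal sketch, assuming Ruby 2.7 or later, where Enumerable#tally is available. The method name most_common_value_tally is only illustrative and is not part of the original answer.

```ruby
# Assumes Ruby >= 2.7 for Enumerable#tally (and >= 2.3 for the &. operator).
def most_common_value_tally(a)
  # tally builds a hash of element => count,
  # e.g. {"1.111"=>2, "1.122"=>1, "1.250"=>1}
  # max_by returns the [element, count] pair with the highest count,
  # or nil for an empty array, hence the safe navigation before .first.
  a.tally.max_by { |_value, count| count }&.first
end

x = ["1.111", "1.122", "1.250", "1.111"]
p most_common_value_tally(x)    # => "1.111"
```

This avoids building an array of duplicates per group, since only the counts are kept.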

As a monkey patch

Or as Enumerable#mode:

Enumerable.class_eval do
  def mode
    group_by do |e|
      e
    end.values.max_by(&:size).first
  end
end

["1.111", "1.122", "1.250", "1.111"].mode
# => "1.111"

#2


5  

One pass through the array accumulates the counts in a hash. Then use .max to find the hash entry with the largest value.

#!/usr/bin/ruby

a = Hash.new(0)
["1.111", "1.122", "1.250", "1.111"].each { |num|
  a[num] += 1
}

a.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]

or, roll it all into one line:

ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]

If you only want the item back, add .first:

ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first # => "1.111"

The first sample I used is how it would be done in Perl usually. The second is more Ruby-ish. Both work with older versions of Ruby. I wanted to compare them, plus see how Wayne's solution would speed things up so I tested with benchmark:

#!/usr/bin/env ruby

require 'benchmark'

ary = ["1.111", "1.122", "1.250", "1.111"] * 1000 

def most_common_value(a)
  a.group_by { |e| e }.values.max_by { |values| values.size }.first
end

n = 1000
Benchmark.bm(20) do |x|
  x.report("Hash.new(0)") do
    n.times do 
      a = Hash.new(0)
      ary.each { |num| a[num] += 1 }
      a.max{ |a,b| a[1] <=> b[1] }.first
    end 
  end

  x.report("inject:") do
    n.times do
      ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first
    end
  end

  x.report("most_common_value():") do
    n.times do
      most_common_value(ary)
    end
  end
end

Here are the results:

                          user     system      total        real
Hash.new(0)           2.150000   0.000000   2.150000 (  2.164180)
inject:               2.440000   0.010000   2.450000 (  2.451466)
most_common_value():  1.080000   0.000000   1.080000 (  1.089784)

#3


4  

You could sort the array and then loop over it once. In the loop, keep track of the current item and the number of times it has been seen. Once the list ends or the item changes, set max_count = count if count > max_count. And of course keep track of which item has the max_count.
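
The sort-then-scan idea described above could be sketched in Ruby roughly as follows. The method name most_common_by_sorting and the variable names are hypothetical, and the sketch assumes the items are mutually comparable so that sort works.

```ruby
# Sort-then-scan sketch; assumes all items can be compared with <=>.
def most_common_by_sorting(items)
  best_item = nil
  max_count = 0
  current = nil
  count = 0

  items.sort.each do |item|
    if item == current
      count += 1          # still inside the same run of equal items
    else
      current = item      # a new run starts here
      count = 1
    end

    # Checking after every step also covers the last run in the list.
    if count > max_count
      max_count = count
      best_item = current
    end
  end

  best_item               # nil for an empty array
end

p most_common_by_sorting(["1.111", "1.122", "1.250", "1.111"])    # => "1.111"
```

Sorting costs O(n log n) comparisons, so this mainly trades the extra hash for a sorted copy of the array.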

#4


2  

You could create a hashmap that stores the array items as keys, with each value being the number of times that item appears in the array.

Pseudo code, written as Ruby:

counts = {}
["1.111", "1.122", "1.250", "1.111"].each { |num|
  count = counts[num]
  if count.nil?
    counts[num] = 1           # first occurrence of this item
  else
    counts[num] = count + 1   # seen before, bump the count
  end
}

As already mentioned, sorting might be faster.

#5


2  

Using the default value feature of hashes:

>> x = ["1.111", "1.122", "1.250", "1.111"]
>> h = Hash.new(0)
>> x.each{|i| h[i] += 1 }
>> h.max{|a,b| a[1] <=> b[1] }
["1.111", 2]
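
The same default-value approach can also be written without the explicit comparison block. This is only a sketch, assuming Ruby 1.9 or later for each_with_object; the variable names counts, item, and count are illustrative.

```ruby
x = ["1.111", "1.122", "1.250", "1.111"]

# Hash.new(0) makes every missing key start at 0, so += 1 works on first sight.
counts = x.each_with_object(Hash.new(0)) { |value, h| h[value] += 1 }

# max_by compares only the counts and returns one [value, count] pair.
item, count = counts.max_by { |_value, n| n }

p item     # => "1.111"
p count    # => 2
```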

#6


0  

This returns the most popular value in the array:

x.group_by{|a| a }.sort_by{|_, v| -v.size }.first[0]

For example:

x = ["1.111", "1.122", "1.250", "1.111"]
# Most popular (groups are sorted by descending size)
x.group_by{|a| a }.sort_by{|_, v| -v.size }.first[0]
#=> "1.111"
# How many times
x.group_by{|a| a }.sort_by{|_, v| -v.size }.first[1].size
#=> 2
