Hey I have an array where each element is a hash containing a few values and a count.
result = [
  {"count" => 3, "name" => "user1"},
  {"count" => 10, "name" => "user2"},
  {"count" => 10, "name" => "user3"},
  {"count" => 2, "name" => "user4"}
]
I can sort the array by count as follows:
result = result.sort_by do |r|
  r["count"]
end
Now I want to retrieve the top n entries based on count, including ties (so not just first(n)). Is there an elegant way to do this? As an example, with n = 1 I would expect this result set:
[{"count" => 10, "name" => "user2"}, {"count" => 10, "name" => "user3"}]
since I asked for all entries with the highest score. If I asked for the top 2 highest scores, I'd get:
[{"count" => 10, "name" => "user2"}, {"count" => 10, "name" => "user3"}, {"count" => 3, "name" => "user1"}]
4 Answers
#1
24
Enumerable#group_by to the rescue (as usual):
result.group_by { |r| r["count"] }
      .sort_by { |k, v| -k }
      .first(2)
      .map(&:last)
      .flatten
Most of the work is done by the group_by. The sort_by simply lines things up so that first(2) will pick off the groups you want. Then map with last will extract the count/name hashes that you started with, and the final flatten will clean up the extra leftover arrays.
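To see what each stage of that chain produces, here is a step-by-step sketch on the question's sample data. It assumes the missing "name" keys are filled in, and the intermediate variable names (groups, ordered, top, top_entries) are made up for illustration only.

```ruby
result = [
  { "count" => 3,  "name" => "user1" },
  { "count" => 10, "name" => "user2" },
  { "count" => 10, "name" => "user3" },
  { "count" => 2,  "name" => "user4" }
]

# Step 1: bucket the hashes by their count.
# groups is a Hash mapping each count to the array of hashes that have it.
groups = result.group_by { |r| r["count"] }

# Step 2: order the [count, hashes] pairs from highest count to lowest.
ordered = groups.sort_by { |count, _hashes| -count }

# Step 3: keep the groups for the n highest counts (n = 2 here).
top = ordered.first(2)

# Step 4: drop the count keys, then merge the per-group arrays into one.
top_entries = top.map(&:last).flatten

p top_entries
```

Since map followed by flatten only removes one level of nesting here, flat_map(&:last) would likely do the same job in a single call.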
#2
2
This solution is not elegant in terms of being concise, but it has better time complexity. In other words, it should execute a lot faster for a very large number of hashes.
You will need to install the "algorithms" gem (gem install algorithms) in order to use the Heap data structure.
Heaps are an efficient data structure when you need to find the largest or smallest elements in a group. This particular type of heap is optimal if the value of "n" is much smaller than the total number of pairs.
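require 'algorithms'

# Returns every entry whose count is among the n highest distinct counts.
def take_highest(result, n)
  return [] if result.empty? || n < 1

  # Max-heap: the block returns true when x should be popped before y.
  max_heap = Containers::Heap.new(result) { |x, y| (x["count"] <=> y["count"]) == 1 }
  last = max_heap.pop
  count = 0 # how many times the count value has changed so far
  highest = [last]
  loop do
    top = max_heap.pop
    break if top.nil?
    count += (top["count"] == last["count"] ? 0 : 1)
    break if count == n # the n highest distinct counts are all collected
    highest << top
    last = top
  end
  highest
end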
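If installing a gem is not an option, a similar result can be sketched with the standard library alone. To be clear, this is not the heap approach above and makes no claim to match its performance: it scans the whole array and then sorts the selected entries. The name take_highest_stdlib is hypothetical, and Array#max with an argument assumes Ruby 2.2 or later.

```ruby
# Hypothetical helper: standard library only, not the heap-based method.
def take_highest_stdlib(result, n)
  # Distinct count values, then the n largest of them.
  top_counts = result.map { |r| r["count"] }.uniq.max(n)

  # Keep every entry whose count is among those values, highest first.
  result
    .select { |r| top_counts.include?(r["count"]) }
    .sort_by { |r| -r["count"] }
end

result = [
  { "count" => 3,  "name" => "user1" },
  { "count" => 10, "name" => "user2" },
  { "count" => 10, "name" => "user3" },
  { "count" => 2,  "name" => "user4" }
]

top_one = take_highest_stdlib(result, 1)
top_two = take_highest_stdlib(result, 2)

p top_one.map { |r| r["name"] }.sort # => ["user2", "user3"]
p top_two.size                       # => 3
```

Because sort_by is not guaranteed to be stable, the relative order of entries with equal counts may differ from their order in the input.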
#3
2
Starting in Ruby 2.2.0, max_by takes an extra argument that lets you ask for a certain number of top elements instead of just getting one. Using this, we can improve on mu is too short's answer:
result = [
  {count: 3, name: 'user1'},
  {count: 10, name: 'user2'},
  {count: 10, name: 'user3'},
  {count: 2, name: 'user4'}
]
p result.group_by { |r| r[:count] }
        .max_by(2, &:first)
        .flat_map(&:last)
        .sort_by { |r| -r[:count] }
# => [{:count=>10, :name=>"user2"}, {:count=>10, :name=>"user3"}, {:count=>3, :name=>"user1"}]
The docs don't say whether the array returned by max_by is sorted. If it turns out to be, we could just use reverse in the last step instead of sorting.
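One way to look into the ordering question on a particular Ruby is to check it empirically. The sketch below assumes Ruby 2.2 or later and reuses the sample data; the variable names are illustrative. It prints the group keys in the order max_by(2) returns them. A single run only shows what that interpreter does on that input and is not a guarantee from the documentation.

```ruby
result = [
  { count: 3,  name: 'user1' },
  { count: 10, name: 'user2' },
  { count: 10, name: 'user3' },
  { count: 2,  name: 'user4' }
]

groups = result.group_by { |r| r[:count] }

# Ask for the two [count, hashes] pairs with the highest counts.
top_groups = groups.max_by(2, &:first)

# Look at the order of the returned keys instead of assuming it.
keys = top_groups.map(&:first)
p keys

# true means the keys came back highest-first for this input.
p keys == keys.sort.reverse
```

If the second line prints true on every Ruby version you need to support, the final sort_by in the chain above could probably be dropped; otherwise keeping it is the safer choice.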
#4
2
new_result = result.
  sort_by { |r| -r["count"] }.
  chunk { |r| r["count"] }.
  take(2).
  flat_map(&:last)
#=> [{"count"=>10, "name"=>"user3"},
#    {"count"=>10, "name"=>"user2"},
#    {"count"=> 3, "name"=>"user1"}]