This question already has an answer here:
- python+numpy: efficient way to take the min/max n values and indices from a matrix (2 answers)
I need to find just the n-th smallest element in a 1D numpy.array.
For example:
a = np.array([90,10,30,40,80,70,20,50,60,0])
I want to get the 5th smallest element, so my desired output is 40.
My current solution is this:
result = np.max(np.partition(a, 5)[:5])
However, finding the 5 smallest elements and then taking the largest of them seems a little clumsy to me. Is there a better way to do it? Am I missing a single function that would achieve my goal?
There are questions with similar titles to this one, but I did not see anything that answered my question.
Edit:
I should have mentioned it originally, but performance is very important for me; therefore the heapq solution, though nice, would not work for me.
import heapq
from timeit import timeit

import numpy as np

def find_nth_smallest_old_way(a, n):
    return np.max(np.partition(a, n)[:n])

# Solution suggested by Jaime and HYRY
def find_nth_smallest_proper_way(a, n):
    return np.partition(a, n-1)[n-1]

def find_nth_smallest_heapq(a, n):
    return heapq.nsmallest(n, a)[-1]

n_iterations = 10000
a = np.arange(1000)
np.random.shuffle(a)

t1 = timeit('find_nth_smallest_old_way(a, 100)', 'from __main__ import find_nth_smallest_old_way, a', number = n_iterations)
print('time taken using partition old_way: {}'.format(t1))
t2 = timeit('find_nth_smallest_proper_way(a, 100)', 'from __main__ import find_nth_smallest_proper_way, a', number = n_iterations)
print('time taken using partition proper way: {}'.format(t2))
t3 = timeit('find_nth_smallest_heapq(a, 100)', 'from __main__ import find_nth_smallest_heapq, a', number = n_iterations)
print('time taken using heapq : {}'.format(t3))
Result:
time taken using partition old_way: 0.255564928055
time taken using partition proper way: 0.129678010941
time taken using heapq : 7.81094002724
3 Answers
#1
13
Unless I am missing something, what you want to do is:
>>> a = np.array([90,10,30,40,80,70,20,50,60,0])
>>> np.partition(a, 4)[4]
40
np.partition(a, k) returns a partitioned copy of a in which the element that would be at index k in sorted order sits at index k, with smaller values in [:k] and larger values in [k+1:] of that copy. The only thing to be aware of is that, because of the 0-based indexing, the fifth smallest element is at index 4.
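The partition property described above can be checked directly on the array from the question. This is only a small sketch; the variable names (p, kth, left_ok, right_ok, same_as_sort) are chosen here for illustration and are not from the original answer.

```python
import numpy as np

a = np.array([90, 10, 30, 40, 80, 70, 20, 50, 60, 0])
k = 4  # zero-based index, i.e. the 5th smallest element

# np.partition returns a partitioned copy; a itself is left unchanged.
p = np.partition(a, k)

kth = p[k]
# Everything before index k is no larger than p[k] ...
left_ok = bool(np.all(p[:k] <= kth))
# ... and everything after index k is no smaller.
right_ok = bool(np.all(p[k + 1:] >= kth))
# The value at index k matches what a full sort would put there.
same_as_sort = bool(kth == np.sort(a)[k])
```

Note that the order of the values on either side of index k is not specified, so only p[k] should be relied on.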
#2
4
You can use heapq.nsmallest:
>>> import numpy as np
>>> import heapq
>>>
>>> a = np.array([90,10,30,40,80,70,20,50,60,0])
>>> heapq.nsmallest(5, a)[-1]
40
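As a rough cross-check, the heapq approach and the np.partition approach should agree on the same input. The sketch below assumes the array from the question; the names smallest, via_heapq and via_partition are made up for illustration.

```python
import heapq

import numpy as np

a = np.array([90, 10, 30, 40, 80, 70, 20, 50, 60, 0])
n = 5

# heapq.nsmallest returns the n smallest values as a list in ascending order,
# so the last entry is the n-th smallest.
smallest = heapq.nsmallest(n, a)
via_heapq = smallest[-1]

# The same value via np.partition, using the zero-based index n - 1.
via_partition = np.partition(a, n - 1)[n - 1]
```

heapq.nsmallest iterates over the array element by element in Python, which is consistent with the much slower timing reported in the question.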
#3
0
You don't need to call numpy.max():
def nsmall(a, n):
    # n is a zero-based rank here: n=4 returns the 5th smallest element
    return np.partition(a, n)[n]
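Since the question counts from 1 ("5th smallest") while np.partition counts from 0, a thin wrapper can make the off-by-one explicit. This is a sketch, not part of any answer: the function name nth_smallest and its argument checks are assumptions added for illustration.

```python
import numpy as np

def nth_smallest(a, n):
    """Return the n-th smallest value of a 1D array, with n counted from 1."""
    a = np.asarray(a)
    if a.ndim != 1:
        raise ValueError("expected a 1D array")
    if not 1 <= n <= a.size:
        raise ValueError("n must be between 1 and a.size")
    # Shift to the zero-based index that np.partition expects.
    return np.partition(a, n - 1)[n - 1]

a = np.array([90, 10, 30, 40, 80, 70, 20, 50, 60, 0])
result = nth_smallest(a, 5)
```

With the array from the question, nth_smallest(a, 5) gives 40, matching the desired output.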