How do I count the values in a NumPy array that fall within a certain range?

Date: 2021-04-15 21:34:55

I have a NumPy array of values. I want to count how many of these values are in a specific range, say x<100 and x>25. I have read about the counter, but it seems to be valid only for specific values, not ranges of values. I have searched but have not found anything that addresses my specific problem. If someone could point me toward the proper documentation, I would appreciate it. Thank you.

I have tried this:

   X = array(X)
   for X in range(25, 100):
       print(X)

But it just gives me the numbers in between 25 and 99.

EDIT: The data I am using was created by another program. I then used a script to read the data and store it as a list. I then took the list and turned it into an array using array(r).

EDIT: Here is the result of running a[0:10]:

 >>> a[0:10]
 array(['29.63827346', '40.61488812', '25.48300065', '26.22910525',
   '42.41172923', '20.15013315', '34.95323355', '13.03604098',
   '29.71097606', '9.53222141'], 
  dtype='<U11')

5 answers

#1


54  

If your array is called a, the number of elements fulfilling 25 < x < 100 is

((25 < a) & (a < 100)).sum()

The expression (25 < a) & (a < 100) results in a Boolean array with the same shape as a with the value True for all elements that satisfy the condition. Summing over this Boolean array treats True values as 1 and False values as 0.
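
As a minimal sketch of that idea, the snippet below builds the mask and sums it on a small made-up array. The sample values and the variable name mask are illustrative assumptions, not data from the question.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
a = np.array([10.0, 25.0, 30.5, 99.9, 100.0, 150.0])

# Element-wise comparisons give boolean arrays; & combines them element-wise.
mask = (25 < a) & (a < 100)
print(mask)        # [False False  True  True False False]

# True counts as 1 and False as 0, so the sum is the number of matches.
count = int(mask.sum())
print(count)       # 2
```

The parentheses matter: & binds more tightly than <, so writing 25 < a & a < 100 without them would not be parsed as two separate comparisons.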

#2


8  

You could use histogram. Here's a basic usage example:

>>> import numpy
>>> a = numpy.random.random(size=100) * 100 
>>> numpy.histogram(a, bins=(0.0, 7.3, 22.4, 55.5, 77, 79, 98, 100))
(array([ 8, 14, 34, 31,  0, 12,  1]), 
 array([   0. ,    7.3,   22.4,   55.5,   77. ,   79. ,   98. ,  100. ]))

In your particular case, it would look something like this:

>>> numpy.histogram(a, bins=(25, 100))
(array([73]), array([ 25, 100]))
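
One caveat, offered as a hedged aside: numpy.histogram treats every bin as half-open except the last one, which also includes its right edge. With bins=(25, 100) the single bin therefore counts 25 <= x <= 100, which differs from the strict 25 < x < 100 test when values sit exactly on an edge. The made-up array below (an assumption for illustration) shows the difference:

```python
import numpy as np

# Hypothetical data with values exactly on both bin edges.
a = np.array([10.0, 25.0, 30.5, 99.9, 100.0, 150.0])

# One bin from 25 to 100; as the last bin it is closed on both sides.
counts, edges = np.histogram(a, bins=(25, 100))
print(counts[0])                              # 4

# The strict comparison excludes the two edge values.
strict = int(((25 < a) & (a < 100)).sum())
print(strict)                                 # 2
```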

Additionally, when you have a list of strings, you have to specify the type explicitly, so that numpy knows to produce an array of floats instead of an array of strings.

>>> strings = [str(i) for i in range(10)]
>>> numpy.array(strings)
array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], 
      dtype='|S1')
>>> numpy.array(strings, dtype=float)
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
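
Applied to the string array shown in the question, the conversion might look like the sketch below. It uses the ten values from the question's a[0:10] output. The name a_float is an assumption introduced here, and astype(float) is used as an alternative to passing dtype=float when the array is built.

```python
import numpy as np

# The first ten entries from the question, stored as strings (dtype '<U11').
a = np.array(['29.63827346', '40.61488812', '25.48300065', '26.22910525',
              '42.41172923', '20.15013315', '34.95323355', '13.03604098',
              '29.71097606', '9.53222141'])

# Convert the strings to floats before comparing.
a_float = a.astype(float)

count = int(((25 < a_float) & (a_float < 100)).sum())
print(count)  # 7
```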

#3


7  

Building on Sven's good approach, you can also use the more direct:

numpy.count_nonzero((25 < a) & (a < 100))

This first creates an array of booleans, with one boolean for each input number in array a, and then counts the non-False (i.e. True) values, which gives the number of matching numbers.

Note, however, that when I measured it this approach was twice as slow as Sven's .sum() approach on an array of 100k numbers (NumPy 1.6.1, Python 2.7.3): about 300 µs versus 150 µs.
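
Those timings come from an old NumPy release and may not carry over to current versions, so it is worth measuring on your own setup. A rough sketch using the standard timeit module follows. The array size matches the 100k figure above, while the random data, the seed, the helper function names, and the repeat count are arbitrary assumptions.

```python
import timeit
import numpy as np

# Arbitrary test data: 100k random floats in [0, 200).
rng = np.random.default_rng(0)
a = rng.random(100_000) * 200

def count_with_sum():
    return int(((25 < a) & (a < 100)).sum())

def count_with_count_nonzero():
    return int(np.count_nonzero((25 < a) & (a < 100)))

# Both approaches must agree on the result.
assert count_with_sum() == count_with_count_nonzero()

# Time 200 calls of each and report the average per call in microseconds.
n = 200
for fn in (count_with_sum, count_with_count_nonzero):
    seconds = timeit.timeit(fn, number=n)
    print(f"{fn.__name__}: {seconds / n * 1e6:.1f} µs per call")
```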

#4


4  

Sven's answer is the way to do it if you don't wish to further process matching values.
The following two examples return copies with only the matching values:

np.compress((25 < a) & (a < 100), a).size

Or:

a[(25 < a) & (a < 100)].size

Example interpreter session:

>>> import numpy as np
>>> a = np.random.randint(200,size=100)
>>> a
array([194, 131,  10, 100, 199, 123,  36,  14,  52, 195, 114, 181, 138,
       144,  70, 185, 127,  52,  41, 126, 159,  39,  68, 118, 124, 119,
        45, 161,  66,  29, 179, 194, 145, 163, 190, 150, 186,  25,  61,
       187,   0,  69,  87,  20, 192,  18, 147,  53,  40, 113, 193, 178,
       104, 170, 133,  69,  61,  48,  84, 121,  13,  49,  11,  29, 136,
       141,  64,  22, 111, 162, 107,  33, 130,  11,  22, 167, 157,  99,
        59,  12,  70, 154,  44,  45, 110, 180, 116,  56, 136,  54, 139,
        26,  77, 128,  55, 143, 133, 137,   3,  83])
>>> np.compress((25 < a) & (a < 100),a).size
34
>>> a[(25 < a) & (a < 100)].size
34
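
If the matching values themselves are needed, the same mask can be reused for both the selection and the count. A small sketch on made-up integers (the array and the names mask and matches are assumptions for illustration):

```python
import numpy as np

# Hypothetical sample data.
a = np.array([194, 36, 14, 52, 100, 25, 61, 99])

mask = (25 < a) & (a < 100)

# Boolean indexing returns a new array holding only the matching values.
matches = a[mask]
print(matches)                # [36 52 61 99]
print(matches.size)           # 4

# np.compress selects the same elements.
compressed = np.compress(mask, a)
print(compressed)             # [36 52 61 99]
```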

The above examples use a bitwise AND (&) to combine, element by element, the two boolean arrays produced by the comparisons.
Another way to write Sven's excellent answer, for example, is:

np.bitwise_and(25 < a, a < 100).sum() 

The boolean arrays contain True where the condition matches and False where it doesn't.
A bonus aspect of boolean values is that True is equivalent to 1 and False to 0, which is why .sum() yields the number of matches.
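
A related point that often trips people up: Python's chained comparison 25 < a < 100 and the keyword and do not work element-wise on arrays. The sketch below, on made-up data, shows np.logical_and as an equivalent spelling for boolean arrays and demonstrates that the chained form raises ValueError for an array with more than one element. The variable names are assumptions for illustration.

```python
import numpy as np

# Hypothetical sample data.
a = np.array([10, 30, 60, 120])

# For boolean arrays, logical_and and bitwise_and give the same result.
n_logical = int(np.logical_and(25 < a, a < 100).sum())
n_bitwise = int(np.bitwise_and(25 < a, a < 100).sum())
print(n_logical, n_bitwise)   # 2 2

# The chained comparison needs a single truth value for (25 < a),
# which is ambiguous for a multi-element array.
try:
    25 < a < 100
    chained_failed = False
except ValueError:
    chained_failed = True
print(chained_failed)         # True
```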

#5


2  

I think @Sven Marnach's answer is quite nice, because it operates on the NumPy array itself, which is fast and efficient (C implementation).

I like to put the test into a single condition like 25 < x < 100, so I would probably do something like this:

len([x for x in a.ravel() if 25 < x < 100])
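
For comparison, here is a sketch showing that the pure-Python loop and the vectorized expression agree on a made-up 2-D array (an assumption for illustration). ravel() flattens the array so the loop visits every element. Because the loop runs in the Python interpreter, it is generally slower on large arrays than the vectorized form.

```python
import numpy as np

# Hypothetical 2-D sample data.
a = np.array([[10, 30, 60],
              [99, 100, 26]])

# Pure-Python count: the chained comparison works here because x is a scalar.
count_loop = len([x for x in a.ravel() if 25 < x < 100])

# Vectorized count over the whole array.
count_vec = int(((25 < a) & (a < 100)).sum())

print(count_loop, count_vec)   # 4 4
```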
