I have a NumPy array of values. I want to count how many of these values fall in a specific range, say x < 100 and x > 25. I have read about the counter, but it seems to be valid only for specific values, not ranges of values. I have searched but have not found anything on this specific problem. If someone could point me towards the proper documentation, I would appreciate it. Thank you.
I have tried this
X = array(X)
for X in range(25, 100):
But it just gives me the numbers in between 25 and 99.
EDIT: The data I am using was created by another program. I then used a script to read the data and store it as a list. I then took the list and turned it into an array using array(r).
The result of running
>>> a[0:10]
array(['29.63827346', '40.61488812', '25.48300065', '26.22910525',
'42.41172923', '20.15013315', '34.95323355', '13.03604098',
'29.71097606', '9.53222141'],
5 Solutions
If your array is called a, the number of elements fulfilling 25 < x < 100 is
((25 < a) & (a < 100)).sum()
The expression (25 < a) & (a < 100) results in a Boolean array with the same shape as a, with the value True for all elements that satisfy the condition. Summing over this Boolean array treats True values as 1 and False values as 0.
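Since the a[0:10] output in the question shows an array of strings, the comparison only makes sense after converting to floats. The following is a minimal sketch of the whole flow, using the ten sample values from the question; the names raw, mask and count are just illustrative.

```python
import numpy as np

# Sample values copied from the question's a[0:10] output (stored as strings).
raw = np.array(['29.63827346', '40.61488812', '25.48300065', '26.22910525',
                '42.41172923', '20.15013315', '34.95323355', '13.03604098',
                '29.71097606', '9.53222141'])

# Convert to floats first: comparing strings with numbers is not meaningful.
a = raw.astype(float)

mask = (25 < a) & (a < 100)   # Boolean array with the same shape as a
count = int(mask.sum())       # True counts as 1, False as 0
print(count)                  # 7 of the ten sample values lie in (25, 100)
```

The parentheses around each comparison are needed because & binds more tightly than < in Python.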
You could use histogram. Here's a basic usage example:
>>> import numpy
>>> a = numpy.random.random(size=100) * 100
>>> numpy.histogram(a, bins=(0.0, 7.3, 22.4, 55.5, 77, 79, 98, 100))
(array([ 8, 14, 34, 31, 0, 12, 1]),
array([ 0. , 7.3, 22.4, 55.5, 77. , 79. , 98. , 100. ]))
In your particular case, it would look something like this:
>>> numpy.histogram(a, bins=(25, 100))
(array([73]), array([ 25, 100]))
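One detail worth checking when using histogram for this: the last (here, the only) bin includes both of its edges, so bins=(25, 100) counts 25 <= x <= 100 rather than the strict 25 < x < 100. A small sketch with made-up values that sit exactly on the edges shows the difference:

```python
import numpy as np

# Made-up values, including two that sit exactly on the bin edges.
a = np.array([10.0, 25.0, 30.0, 99.5, 100.0, 150.0])

counts, edges = np.histogram(a, bins=(25, 100))
inclusive = int(counts[0])                   # counts 25 <= x <= 100
strict = int(((25 < a) & (a < 100)).sum())   # counts 25 < x < 100

print(inclusive, strict)                     # 4 2
```

For continuous float data the two counts will usually agree, but with integers or rounded values the edge handling can matter.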
Additionally, when you have a list of strings, you have to explicitly specify the type, so that numpy knows to produce an array of floats instead of an array of strings.
>>> strings = [str(i) for i in range(10)]
>>> numpy.array(strings)
array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
>>> numpy.array(strings, dtype=float)
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
Building on Sven's good approach, you can also do the more direct:
numpy.count_nonzero((25 < a) & (a < 100))
This first creates an array of booleans with one boolean for each input number in array a, and then counts the number of non-False (i.e. True) values (which gives the number of matching numbers).
Note, however, that this approach is twice as slow as Sven's .sum() approach on an array of 100k numbers (NumPy 1.6.1, Python 2.7.3): about 300 µs versus 150 µs.
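Those timings are tied to the versions quoted, so it may be worth measuring on your own setup. Here is a rough sketch using the standard timeit module; the array size matches the 100k mentioned above, while the random data, the seed and the repeat count are arbitrary assumptions.

```python
import timeit
import numpy as np

# 100k random floats in [0, 200); the seed is arbitrary.
rng = np.random.default_rng(0)
a = rng.random(100_000) * 200

def count_with_sum():
    return int(((25 < a) & (a < 100)).sum())

def count_with_count_nonzero():
    return int(np.count_nonzero((25 < a) & (a < 100)))

# Total time for 200 calls of each variant, in seconds.
t_sum = timeit.timeit(count_with_sum, number=200)
t_cnz = timeit.timeit(count_with_count_nonzero, number=200)

print("sum:           %.1f us per call" % (t_sum / 200 * 1e6))
print("count_nonzero: %.1f us per call" % (t_cnz / 200 * 1e6))
```

Both functions return the same count; only the relative speed is in question, and it may differ between NumPy versions.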
Sven's answer is the way to do it if you don't wish to further process matching values.
The following two examples return copies with only the matching values:
np.compress((25 < a) & (a < 100), a).size
a[(25 < a) & (a < 100)].size
Example interpreter session:
>>> import numpy as np
>>> a = np.random.randint(200,size=100)
>>> a
array([194, 131, 10, 100, 199, 123, 36, 14, 52, 195, 114, 181, 138,
144, 70, 185, 127, 52, 41, 126, 159, 39, 68, 118, 124, 119,
45, 161, 66, 29, 179, 194, 145, 163, 190, 150, 186, 25, 61,
187, 0, 69, 87, 20, 192, 18, 147, 53, 40, 113, 193, 178,
104, 170, 133, 69, 61, 48, 84, 121, 13, 49, 11, 29, 136,
141, 64, 22, 111, 162, 107, 33, 130, 11, 22, 167, 157, 99,
59, 12, 70, 154, 44, 45, 110, 180, 116, 56, 136, 54, 139,
26, 77, 128, 55, 143, 133, 137, 3, 83])
>>> np.compress((25 < a) & (a < 100),a).size
>>> a[(25 < a) & (a < 100)].size
The above examples use a "bit-wise and" (&) to do an element-wise computation along the two boolean arrays which you create for comparison purposes.
Another way to write Sven's excellent answer, for example, is:
np.bitwise_and(25 < a, a < 100).sum()
The boolean arrays contain True values when the condition matches, and False when it doesn't. A bonus aspect of boolean values is that True is equivalent to 1 and False to 0.
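To make the equivalence concrete, here is a small sketch (with made-up values) showing that the & operator, np.bitwise_and and np.logical_and all give the same count on boolean arrays:

```python
import numpy as np

a = np.array([10, 26, 50, 99, 100, 150])   # made-up sample values

lower = 25 < a    # array([False,  True,  True,  True,  True,  True])
upper = a < 100   # array([ True,  True,  True,  True, False, False])

n_operator = int((lower & upper).sum())
n_bitwise = int(np.bitwise_and(lower, upper).sum())
n_logical = int(np.logical_and(lower, upper).sum())

print(n_operator, n_bitwise, n_logical)    # 3 3 3
```

On boolean arrays the three spellings are interchangeable, so the choice is mostly a matter of readability.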
I think @Sven Marnach's answer is quite nice, because it operates on the numpy array itself, which will be fast and efficient (C implementation).
I like to put the test into one condition like 25 < x < 100, so I would probably do it something like this:
len([x for x in a.ravel() if 25 < x < 100])
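Note that the chained form only works element by element in plain Python. Applied to the whole array, 25 < a < 100 raises a ValueError, because Python expands it to (25 < a) and (a < 100) and a multi-element array has no single truth value. A short sketch with made-up values:

```python
import numpy as np

a = np.array([[10, 26, 50], [99, 100, 150]])   # made-up 2-D sample

# The chained comparison on the whole array fails.
try:
    25 < a < 100
    chained_ok = True
except ValueError:
    chained_ok = False

# Element-by-element in Python versus vectorized in NumPy.
n_python = len([x for x in a.ravel() if 25 < x < 100])
n_numpy = int(((25 < a) & (a < 100)).sum())

print(chained_ok, n_python, n_numpy)           # False 3 3
```

The list comprehension loops in Python, so it will generally be slower than the vectorized version on large arrays.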