Let's say we have a numpy array:
import numpy as np
arr = np.array([[ 5, 9],[14, 23],[26, 4],[ 5, 26]])
I want to replace each element with its number of occurrences:
unique0, counts0= np.unique(arr.flatten(), return_counts=True)
print (unique0, counts0)
(array([ 4, 5, 9, 14, 23, 26]), array([1, 2, 1, 1, 1, 2]))
So 4 should be replaced by 1, 5 by 2, and so on, to get:
[[ 2, 1],[1, 1],[2, 1],[ 2, 2]]
Is there any way to achieve this in numpy?
2 solutions
#1
3
Use the other optional argument return_inverse with np.unique to tag every element with the index of its unique value, then index the counts with those tags to get the desired output, like so:
_, idx, counts0 = np.unique(arr, return_counts=True,return_inverse=True)
out = counts0[idx].reshape(arr.shape)
Sample run:
In [100]: arr
Out[100]:
array([[ 5,  9],
       [14, 23],
       [26,  4],
       [ 5, 26]])
In [101]: _, idx, counts0 = np.unique(arr, return_counts=True, return_inverse=True)
In [102]: counts0[idx].reshape(arr.shape)
Out[102]:
array([[2, 1],
       [1, 1],
       [2, 1],
       [2, 2]])
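To see why indexing the counts with idx works, it may help to look at the intermediate arrays. The following is a minimal sketch using the same arr; the names uniques, flat_idx and reconstructed are mine, and ravel() is applied to idx on the assumption that its shape can differ between NumPy versions.

```python
import numpy as np

arr = np.array([[5, 9], [14, 23], [26, 4], [5, 26]])

uniques, idx, counts0 = np.unique(arr, return_counts=True, return_inverse=True)

# Flatten the inverse indices so the rest behaves the same whether idx
# comes back 1-D or shaped like arr (assumed to vary by NumPy version).
flat_idx = idx.ravel()

# uniques holds the sorted distinct values; counts0[i] is how often uniques[i] occurs.
# flat_idx[k] is the position in uniques of the k-th element of arr (row-major order),
# so indexing uniques with it rebuilds arr, and indexing counts0 gives the counts.
reconstructed = uniques[flat_idx].reshape(arr.shape)
out = counts0[flat_idx].reshape(arr.shape)

print(flat_idx)
print(out)
```

Because reconstructed equals arr, the same index array applied to counts0 places each value's count exactly where that value sits.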
#2
1
This is an alternative solution, since @Divakar's answer relies on return_counts, which is not available in NumPy versions < 1.9:
In [1]: import numpy as np
In [2]: arr = np.array([[ 5, 9],[14, 23],[26, 4],[ 5, 26]])
In [3]: np.bincount(arr.flatten())[arr]
Out[3]:
array([[2, 1],
       [1, 1],
       [2, 1],
       [2, 2]])
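One caveat: np.bincount only accepts non-negative integers, and it allocates one bin for every value up to the maximum. If the array may contain negative integers, one possible workaround is to shift the values by the minimum first. This is only a sketch; replace_bincount_shifted is a hypothetical helper, not part of the answer above, and it assumes a non-empty integer array.

```python
import numpy as np

def replace_bincount_shifted(arr):
    # Hypothetical helper: shift values so the smallest one maps to bin 0,
    # which keeps every index passed to bincount non-negative.
    shifted = arr - arr.min()
    # Count occurrences per shifted value, then look up each element's count.
    return np.bincount(shifted.ravel())[shifted]

arr_neg = np.array([[-3, 2], [2, 0], [-3, -3]])
result_neg = replace_bincount_shifted(arr_neg)
print(result_neg)
```

The shift does not help when the values are spread over a very wide range, since the number of bins still grows with max minus min; in that case the np.unique approach is likely the safer choice.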
To test for speed (with a 10000 x 2 array of random integers):
def replace_unique(arr):
    _, idx, counts0 = np.unique(arr, return_counts=True, return_inverse=True)
    return counts0[idx].reshape(arr.shape)
def replace_bincount(arr):
    return np.bincount(arr.flatten())[arr]
# random_integers is deprecated; randint(1, 31) draws from the same range, 1 to 30 inclusive
arr = np.random.randint(1, 31, size=[10000, 2])
%timeit -n 1000 replace_bincount(arr)
# 1000 loops, best of 3: 68.3 µs per loop
%timeit -n 1000 replace_unique(arr)
# 1000 loops, best of 3: 922 µs per loop
So the bincount method is ~14 times faster than the unique method.
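The %timeit lines above only run inside IPython. For a plain Python script, a rough equivalent could look like the sketch below. It uses the standard timeit module and np.random.default_rng, neither of which appears in the original answer, and the seed and the loop count of 100 are arbitrary choices of mine. Timings will vary by machine and NumPy version, so no expected numbers are given.

```python
import timeit

import numpy as np

def replace_unique(arr):
    _, idx, counts0 = np.unique(arr, return_counts=True, return_inverse=True)
    return counts0[idx].reshape(arr.shape)

def replace_bincount(arr):
    return np.bincount(arr.ravel())[arr]

# Integers from 1 to 30 inclusive, matching the range used in the answer.
rng = np.random.default_rng(0)
arr = rng.integers(1, 31, size=(10000, 2))

# Check that both methods agree before comparing their speed.
same = np.array_equal(replace_unique(arr), replace_bincount(arr))
print("results match:", same)

loops = 100
for fn in (replace_unique, replace_bincount):
    seconds = timeit.timeit(lambda: fn(arr), number=loops)
    print(f"{fn.__name__}: {seconds / loops * 1e6:.1f} µs per call")
```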