Replace the elements of a multidimensional numpy array according to a rule

Time: 2021-10-04 21:21:52

Let's say we have a numpy array:

import numpy as np
arr = np.array([[ 5,  9],[14, 23],[26,  4],[ 5, 26]])

I want to replace each element with the number of times it occurs in the array. I can get the counts with:

 unique0, counts0= np.unique(arr.flatten(), return_counts=True)
 print (unique0, counts0)

(array([ 4, 5, 9, 14, 23, 26]), array([1, 2, 1, 1, 1, 2]))

So 4 should be replaced by 1, 5 by 2, and so on, to get:

[[ 2, 1],[1, 1],[2, 1],[ 2, 2]]

Is there any way to achieve this in numpy?

2 Answers

#1


3  

Use np.unique's other optional argument, return_inverse, to tag every element with the index of its unique value, then index the counts with those tags to get the desired output, like so -

_, idx, counts0 = np.unique(arr, return_counts=True,return_inverse=True)
out = counts0[idx].reshape(arr.shape)

Sample run -

In [100]: arr
Out[100]: 
array([[ 5,  9],
       [14, 23],
       [26,  4],
       [ 5, 26]])

In [101]: _, idx, counts0 = np.unique(arr, return_counts=True,return_inverse=True)

In [102]: counts0[idx].reshape(arr.shape)
Out[102]: 
array([[2, 1],
       [1, 1],
       [2, 1],
       [2, 2]])
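
To make the mapping step easier to follow, here is a small sketch that prints the intermediate arrays for the same input. The variable names (uniques, inverse, out) are mine, not from the answer, and the ravel() call is an extra precaution I added because the shape of the inverse array differs between NumPy versions.

```python
import numpy as np

arr = np.array([[5, 9], [14, 23], [26, 4], [5, 26]])

# uniques: sorted distinct values
# inverse: for every element of arr, its index into `uniques`
# counts:  how often each distinct value occurs
uniques, inverse, counts = np.unique(arr, return_inverse=True, return_counts=True)

# Flatten so the code does not depend on whether `inverse`
# comes back 1-D or shaped like `arr`.
inverse = inverse.ravel()

print(uniques)  # [ 4  5  9 14 23 26]
print(inverse)  # [1 2 3 4 5 0 1 5]
print(counts)   # [1 2 1 1 1 2]

# Look up each element's count through its index, then restore the 2-D shape.
out = counts[inverse].reshape(arr.shape)
print(out)
```

Since uniques[inverse] reconstructs the flattened input, counts[inverse] is the same lookup applied to the counts instead of the values.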

#2


1  

This is an alternative solution, since @Divakar's answer does not work in NumPy versions <1.9 (np.unique only gained the return_counts argument in 1.9):

 In [1]: import numpy as np

 In [2]: arr = np.array([[ 5,  9],[14, 23],[26,  4],[ 5, 26]])

 In [3]: np.bincount(arr.flatten())[arr]
 Out[3]: 
 array([[2, 1],
        [1, 1],
        [2, 1],
        [2, 2]])
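
One caveat worth spelling out: np.bincount only accepts non-negative integers, and it allocates one slot per value from 0 up to the maximum, so it suits small, dense integer ranges. The sketch below is one possible way to handle negative values by shifting them first. The helper name replace_with_counts and the shifting trick are my own illustration, not part of the answer, and it assumes a non-empty integer array.

```python
import numpy as np

def replace_with_counts(arr):
    """Replace each element of an integer array with its occurrence count.

    Hypothetical helper: shifts the values so the smallest one maps to 0,
    because np.bincount rejects negative input.
    Assumes `arr` is non-empty and has an integer dtype.
    """
    arr = np.asarray(arr)
    shifted = arr - arr.min()              # smallest value becomes 0
    counts = np.bincount(shifted.ravel())  # counts[k] = occurrences of k + arr.min()
    return counts[shifted]                 # fancy indexing keeps arr's shape

arr = np.array([[5, 9], [14, 23], [26, 4], [5, 26]])
print(replace_with_counts(arr))

neg = np.array([[-3, 7], [7, -3], [0, 7]])
print(replace_with_counts(neg))
```

Memory use still grows with the spread between the smallest and largest value, so for sparse or very large values the np.unique approach is probably the safer choice.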

To test for speed (with a 10000 x 2 array of random integers):

def replace_unique(arr):
    _, idx, counts0 = np.unique(arr,return_counts=True,return_inverse=True)
    return counts0[idx].reshape(arr.shape)

def replace_bincount(arr):
    return np.bincount(arr.flatten())[arr]

# random_integers(30) is deprecated; randint(1, 31) draws from the same range [1, 30]
arr = np.random.randint(1, 31, size=[10000,2])


%timeit -n 1000 replace_bincount(arr)
# 1000 loops, best of 3: 68.3 µs per loop
%timeit -n 1000 replace_unique(arr)
# 1000 loops, best of 3: 922 µs per loop

So the bincount method is ~14 times faster than the unique method on this input.
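
For anyone who wants to rerun the comparison outside IPython, here is a rough sketch using the standard timeit module. The seed, the repeat counts and the use of np.random.default_rng (which needs a reasonably recent NumPy) are my own choices, and the timings will differ from the figures above depending on machine and NumPy version.

```python
import timeit
import numpy as np

def replace_unique(arr):
    _, idx, counts0 = np.unique(arr, return_counts=True, return_inverse=True)
    return counts0[idx].reshape(arr.shape)

def replace_bincount(arr):
    return np.bincount(arr.ravel())[arr]

rng = np.random.default_rng(0)              # seed chosen arbitrarily
arr = rng.integers(1, 31, size=(10000, 2))  # integers in [1, 30]

# Check that both approaches agree before timing them.
same = np.array_equal(replace_unique(arr), replace_bincount(arr))
print("results agree:", same)

for fn in (replace_bincount, replace_unique):
    # Best of 3 runs of 100 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(arr), number=100, repeat=3)) / 100
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per call")
```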

