Label regions with unique combinations of values in two numpy arrays?

I have two labelled 2D numpy arrays, a and b, with identical shapes. I would like to relabel array b using something similar to a GIS geometric union of the two arrays, such that the cells with each unique combination of values in a and b are assigned a new unique ID:

I'm not concerned with the specific numbering of the regions in the output, so long as the values are all unique. I have attached sample arrays and desired outputs below: my real datasets are much larger, with both arrays having integer labels which range from "1" to "200000". So far I've experimented with concatenating the array IDs to form unique combinations of values, but ideally I would like to output a simple set of new IDs in the form of 1, 2, 3..., etc.

import numpy as np
import matplotlib.pyplot as plt

# Example labelled arrays a and b
input_a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
                    [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
                    [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
                    [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
                    [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
                    [0, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 0],
                    [0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
                    [0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

input_b = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
                    [0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
                    [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
                    [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
                    [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
                    [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

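# Plot inputs, one figure each ("spectral" was removed from Matplotlib; "nipy_spectral" is its current name)
plt.figure()
plt.imshow(input_a, cmap="nipy_spectral", interpolation='nearest')
plt.figure()
plt.imshow(input_b, cmap="nipy_spectral", interpolation='nearest')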

# Desired output, union of a and b
output = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 0, 0],
                   [0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 0, 0],
                   [0, 0, 1, 1, 1, 4, 7, 7, 7, 7, 0, 0],
                   [0, 0, 5, 5, 5, 6, 7, 7, 7, 7, 0, 0],
                   [0, 0, 5, 5, 5, 6, 7, 7, 7, 7, 0, 0],
                   [0, 0, 5, 5, 5, 6, 7, 7, 7, 7, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Plot desired output
plt.figure()
plt.imshow(output, cmap="nipy_spectral", interpolation='nearest')
plt.show()

2 Answers

#1

If I understood correctly, you want a unique label for each pairing of values from a and b. So 1 from a with 1 from b gets one unique tag in the output, and 1 from a with 3 from b gets another. Also, judging from the desired output in the question, there is an additional condition: wherever b is zero, the output is zero as well, regardless of the pairing.

The following implementation tries to solve all of that -

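import numpy as np
# Encode each (a, b) pair as a single integer, using b.max()+1 as the base
c = a*(b.max()+1) + b
# Wherever b is zero the output must be zero too
c[b==0] = 0
# Inverse indices of the sorted unique codes are the compact labels 0, 1, 2, ...
_, idx = np.unique(c, return_inverse=True)
out = idx.reshape(b.shape)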
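
One caveat for the dataset sizes mentioned in the question: with labels up to 200000, the product a*(b.max()+1) can reach about 4e10, which does not fit in a 32-bit integer. The following is a minimal sketch of the same idea with an explicit cast to int64; the function name label_pairs and the demo arrays are made up for illustration and are not part of the original answer.

```python
import numpy as np

def label_pairs(a, b):
    """Relabel cells by unique (a, b) pair; cells where b == 0 become 0."""
    # Cast first so the encoded value cannot overflow a 32-bit dtype
    a64 = a.astype(np.int64)
    b64 = b.astype(np.int64)
    combined = a64 * (b64.max() + 1) + b64
    combined[b64 == 0] = 0
    # Inverse indices of the sorted unique codes give labels 0, 1, 2, ...
    _, inverse = np.unique(combined, return_inverse=True)
    return inverse.reshape(b.shape)

# Hypothetical int32 inputs with labels at the top of the stated range
demo_a = np.array([[200000, 200000], [1, 0]], dtype=np.int32)
demo_b = np.array([[200000, 1], [1, 0]], dtype=np.int32)
demo_out = label_pairs(demo_a, demo_b)
print(demo_out)
```

Note that the labels start at 0 in sorted order of the encoded values, so 0 is only reserved for the background when at least one cell has b equal to zero.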

Sample run -

In [21]: a
Out[21]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
       [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
       [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
       [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
       [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
       [0, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 0],
       [0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
       [0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [22]: b
Out[22]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
       [0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
       [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
       [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
       [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
       [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [23]: out
Out[23]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 3, 5, 5, 5, 5, 0, 0],
       [0, 0, 1, 1, 1, 3, 5, 5, 5, 5, 0, 0],
       [0, 0, 1, 1, 1, 2, 4, 4, 4, 4, 0, 0],
       [0, 0, 6, 6, 6, 7, 4, 4, 4, 4, 0, 0],
       [0, 0, 6, 6, 6, 7, 4, 4, 4, 4, 0, 0],
       [0, 0, 6, 6, 6, 7, 4, 4, 4, 4, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
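
Since the numbering differs from the desired output in the question (which is fine, as only uniqueness matters), a quick way to check a result is to confirm that labels and (a, b) pairs correspond one-to-one. Below is a rough sketch of such a check; the helper name labels_match_pairs and the small arrays are invented for illustration.

```python
import numpy as np

def labels_match_pairs(a, b, out):
    """True if, where b != 0, labels and (a, b) pairs map one-to-one,
    and every cell where b == 0 is labelled 0."""
    mask = b != 0
    triples = np.stack([a[mask], b[mask], out[mask]], axis=1)
    n_triples = len(np.unique(triples, axis=0))
    n_pairs = len(np.unique(triples[:, :2], axis=0))
    n_labels = len(np.unique(triples[:, 2]))
    # Equal counts mean no pair has two labels and no label covers two pairs
    one_to_one = n_triples == n_pairs == n_labels
    return bool(one_to_one and not out[~mask].any())

# Small hypothetical example
small_a = np.array([[1, 1, 2], [3, 3, 2]])
small_b = np.array([[1, 3, 3], [1, 2, 0]])
good = np.array([[1, 2, 3], [4, 5, 0]])
bad = np.array([[1, 1, 3], [4, 5, 0]])   # pairs (1, 1) and (1, 3) share label 1

print(labels_match_pairs(small_a, small_b, good))
print(labels_match_pairs(small_a, small_b, bad))
```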

Sample plot -

# Plot inputs ("spectral" was removed from Matplotlib; "nipy_spectral" is its current name)
plt.figure()
plt.imshow(a, cmap="nipy_spectral", interpolation='nearest')
plt.figure()
plt.imshow(b, cmap="nipy_spectral", interpolation='nearest')

# Plot output
plt.figure()
plt.imshow(out, cmap="nipy_spectral", interpolation='nearest')

#2

Here is a way to do it conceptually in terms of a set union, though not a GIS geometric union, since that was mentioned after I answered.

Make a list of all possible unique 2-tuples of values, with the first value from a and the second from b. Map each tuple in that list to its index in the list. Create the union array using that map.

For example, say a and b are arrays that each contain values in range(4) and, for simplicity, have the same shape. Then:

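import numpy as np
from itertools import permutations

v = range(4)
# Note: permutations(v, 2) yields only pairs of distinct values, so a cell where
# a and b hold the same value raises KeyError below; itertools.product(v, repeat=2)
# would cover those pairs as well.
p = list(permutations(v, 2))
# Map each tuple to its index in the list
m = {x: i for i, x in enumerate(p)}
union = np.empty_like(a)
# Look up the ID of every (a, b) pair, cell by cell
for i, x in np.ndenumerate(a):
    union[i] = m[(x, b[i])]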

For demonstration, generating a and b with

np.random.randint(4, size=(3, 3))

produced:

a = array([[3, 0, 3],
           [1, 3, 2],
           [0, 0, 3]])

b = array([[1, 3, 1],
           [0, 0, 1],
           [2, 3, 0]])

m = {(0, 1): 0,
     (0, 2): 1,
     (0, 3): 2,
     (1, 0): 3,
     (1, 2): 4,
     (1, 3): 5,
     (2, 0): 6,
     (2, 1): 7,
     (2, 3): 8,
     (3, 0): 9,
     (3, 1): 10,
     (3, 2): 11}

union = array([[10,  2, 10],
               [ 3,  9,  7],
               [ 1,  2,  9]])

In this case, the property that a union should be at least as large as its components is reflected in larger numerical values rather than in a larger number of elements.
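
The demo arrays above happen to contain no cell where a and b hold the same value. Because permutations(v, 2) leaves out pairs such as (2, 2), such a cell would fail the dictionary lookup. A variant built on itertools.product covers every pair; this is a sketch under that assumption, with the name pair_to_id chosen here for illustration, and with b changed in its last cell so that one (3, 3) pair occurs.

```python
import numpy as np
from itertools import product

values = range(4)
# product(..., repeat=2) includes pairs of equal values, e.g. (3, 3)
pair_to_id = {pair: i for i, pair in enumerate(product(values, repeat=2))}

a = np.array([[3, 0, 3],
              [1, 3, 2],
              [0, 0, 3]])
b = np.array([[1, 3, 1],
              [0, 0, 1],
              [2, 3, 3]])   # last cell matches a, giving the pair (3, 3)

union = np.empty_like(a)
for idx, x in np.ndenumerate(a):
    # Convert to plain ints so the keys match the tuples from product()
    union[idx] = pair_to_id[(int(x), int(b[idx]))]

print(union)
```

The IDs produced this way are unique per pair but not consecutive; passing union through np.unique with return_inverse=True would compact them to 0, 1, 2, and so on.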
