Python: Populating a NumPy 2D array from triplet data

Date: 2021-04-07 21:20:31

I have a lot of data in a database in (x, y, value) triplet form.
I would like to dynamically create a 2D numpy array from this data by setting each value at the coordinates (x, y) of the array.

For instance, if I have:

(0,0,8)
(0,1,5)
(0,2,3)
(1,0,4)
(1,1,0)
(1,2,0)
(2,0,1)
(2,1,2)
(2,2,5)

The resulting array should be:

Array([[8,5,3],[4,0,0],[1,2,5]])

I'm new to numpy. Is there a method in numpy to do this? If not, what approach would you advise?

3 Answers

#1


3  

Extending the answer from @MaxU, in case the coordinates are not ordered in a grid fashion (or in case some coordinates are missing), you can create your array as follows:

import numpy as np

a = np.array([(0,0,8),(0,1,5),(0,2,3),
              (1,0,4),(1,1,0),(1,2,0),
              (2,0,1),(2,1,2),(2,2,5)])

Here a represents your coordinates. It is an (N, 3) array, where N is the number of coordinates (it doesn't have to contain ALL the coordinates). The first column of a (a[:, 0]) contains the Y positions (row indices), while the second column (a[:, 1]) contains the X positions (column indices). The last column (a[:, 2]) contains your values.

Then you can extract the maximum dimensions of your target array:

# Maximum Y and X coordinates
ymax = a[:, 0].max()
xmax = a[:, 1].max()

# Target array
target = np.zeros((ymax+1, xmax+1), a.dtype)

And finally, fill the array with data from your coordinates:

target[a[:, 0], a[:, 1]] = a[:, 2]

The line above sets values in target at a[:, 0] (all Y) and a[:, 1] (all X) locations to their corresponding a[:, 2] value (your value).

>>> target
array([[8, 5, 3],
       [4, 0, 0],
       [1, 2, 5]])
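
If it helps to see the steps above in one piece, here is a minimal sketch that runs them on the same nine triplets in shuffled order. The names rows, cols and values are only for illustration and are not part of the original answer. The point is that the assignment does not depend on the order of the triplets.

```python
import numpy as np

# Same triplets as above, deliberately shuffled: (row, col, value)
a = np.array([(2, 1, 2), (0, 0, 8), (1, 2, 0),
              (0, 2, 3), (2, 2, 5), (1, 0, 4),
              (0, 1, 5), (2, 0, 1), (1, 1, 0)])

rows, cols, values = a[:, 0], a[:, 1], a[:, 2]

# Size the target from the largest row and column index seen
target = np.zeros((rows.max() + 1, cols.max() + 1), dtype=a.dtype)

# Integer-array indexing pairs rows[i] with cols[i] for every i
target[rows, cols] = values
print(target)
```

The printed array should match the one shown above, even though the input order is different.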

Additionally, if you have missing coordinates, and you want to replace those missing values by some number, you can initialize the array as:

default_value = -1
target = np.full((ymax+1, xmax+1), default_value, a.dtype)

This way, the coordinates not present in your list will be filled with -1 in the target array.
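
One thing to watch for when the values are not integers: np.array puts everything into a single float array, and float arrays cannot be used as indices. A possible way around this, sketched below with made-up records, is to cast the two coordinate columns to an integer type before indexing. The records list and the variable names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical query result: integer coordinates, float values,
# and two cells of the 3x3 grid, (1, 1) and (1, 2), absent.
records = [(0, 0, 8.5), (0, 1, 5.0), (0, 2, 3.0),
           (1, 0, 4.0),
           (2, 0, 1.0), (2, 1, 2.0), (2, 2, 5.5)]

arr = np.array(records, dtype=float)   # everything becomes float here
rows = arr[:, 0].astype(np.intp)       # indices must be integers
cols = arr[:, 1].astype(np.intp)
values = arr[:, 2]

default_value = -1.0
filled = np.full((rows.max() + 1, cols.max() + 1), default_value,
                 dtype=values.dtype)
filled[rows, cols] = values            # absent cells keep default_value
print(filled)
```

The two absent cells in the middle row stay at -1.0, and the float values are kept as they are.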

#2


2  

Is this what you want?

In [37]: a = np.array([(0,0,8)
   ....:              ,(0,1,5)
   ....:              ,(0,2,3)
   ....:              ,(1,0,4)
   ....:              ,(1,1,0)
   ....:              ,(1,2,0)
   ....:              ,(2,0,1)
   ....:              ,(2,1,2)
   ....:              ,(2,2,5)])

In [38]: a
Out[38]:
array([[0, 0, 8],
       [0, 1, 5],
       [0, 2, 3],
       [1, 0, 4],
       [1, 1, 0],
       [1, 2, 0],
       [2, 0, 1],
       [2, 1, 2],
       [2, 2, 5]])

In [39]: a[:, 2].reshape(3,len(a)//3)
Out[39]:
array([[8, 5, 3],
       [4, 0, 0],
       [1, 2, 5]])

Or, a bit more flexible (after your comment):

In [48]: a[:, 2].reshape([int(len(a) ** .5)] * 2)
Out[48]:
array([[8, 5, 3],
       [4, 0, 0],
       [1, 2, 5]])

Explanation:

This gives you the 3rd column (the values):

In [42]: a[:, 2]
Out[42]: array([8, 5, 3, 4, 0, 0, 1, 2, 5])

This computes the side length of the grid (assuming the grid is square) and repeats it to get the target shape:


In [49]: [int(len(a) ** .5)]
Out[49]: [3]

In [50]: [int(len(a) ** .5)] * 2
Out[50]: [3, 3]
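
Note that the reshape approach relies on the triplets being sorted by row and then by column, and on every cell of the grid being present. If only the ordering is in doubt, one option is to sort first with np.lexsort and then reshape, as in this sketch. The names n_rows, n_cols, order and grid are mine, and the length check is only a cheap sanity test, not a full validation.

```python
import numpy as np

# Same triplets, shuffled: (row, col, value)
a = np.array([(2, 1, 2), (0, 0, 8), (1, 2, 0),
              (0, 2, 3), (2, 2, 5), (1, 0, 4),
              (0, 1, 5), (2, 0, 1), (1, 1, 0)])

n_rows = a[:, 0].max() + 1
n_cols = a[:, 1].max() + 1

# A complete grid has exactly n_rows * n_cols triplets
assert len(a) == n_rows * n_cols, "grid looks incomplete; reshape would misplace values"

# lexsort uses the LAST key as the primary one: sort by row, then by column
order = np.lexsort((a[:, 1], a[:, 0]))
grid = a[order, 2].reshape(n_rows, n_cols)
print(grid)
```

Using the maximum indices for the shape also drops the assumption that the grid is square.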

#3


2  

Why not use sparse matrices? (That is pretty much the format of your triplets.)

First split the triplets in rows, columns, and data using numpy.hsplit(). (Use numpy.squeeze() to convert the resulting 2d arrays to 1d arrays.)

>>> row, col, data = [np.squeeze(splt) for splt
...                   in np.hsplit(triplets, triplets.shape[-1])]
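
To make the split step a bit more concrete, here is a small sketch of what hsplit and squeeze each return. The triplets array is assumed to be the same (N, 3) array as in the question, and the transpose unpacking at the end is an alternative I am adding, not something from the original answer.

```python
import numpy as np

triplets = np.array([(0, 0, 8), (0, 1, 5), (0, 2, 3),
                     (1, 0, 4), (1, 1, 0), (1, 2, 0),
                     (2, 0, 1), (2, 1, 2), (2, 2, 5)])

# hsplit cuts the (N, 3) array into three (N, 1) column arrays
parts = np.hsplit(triplets, triplets.shape[-1])
print([p.shape for p in parts])   # three arrays of shape (9, 1)

# squeeze drops the length-1 axis, leaving 1D arrays of length N
row, col, data = [np.squeeze(p) for p in parts]

# Unpacking the transpose yields the same three 1D arrays
row_t, col_t, data_t = triplets.T
```

With a single triplet, squeeze would return 0-d arrays, while unpacking the transpose keeps them 1D, so the second form may be the safer one for very small inputs.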

Use the sparse matrix in coordinate format, and convert it to an array.

>>> from scipy.sparse import coo_matrix
>>> coo_matrix((data, (row, col))).toarray()
array([[8, 5, 3],
       [4, 0, 0],
       [1, 2, 5]])
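
If some coordinates are missing, the sparse route fills them with 0, and the array size is inferred from the largest indices unless you pass shape= explicitly. A small sketch with made-up data (the three sample triplets and the (3, 4) shape are assumptions for illustration):

```python
import numpy as np
from scipy.sparse import coo_matrix

# Only three cells are given; all others are absent
row = np.array([0, 0, 2])
col = np.array([0, 2, 1])
data = np.array([8, 3, 2])

# Without shape=, the dimensions come from the largest indices: (3, 3)
inferred = coo_matrix((data, (row, col))).toarray()

# shape= pads the grid out to a known size; absent cells become 0
padded = coo_matrix((data, (row, col)), shape=(3, 4)).toarray()

print(inferred)
print(padded)
```

Passing shape= is worth considering when the last row or column of the real grid might have no entries at all.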
