How to return a view of several columns in a numpy structured array

I can see several columns (fields) at once in a numpy structured array by indexing with a list of the field names, for example

import numpy as np

a = np.array([(1.5, 2.5, (1.0,2.0)), (3.,4.,(4.,5.)), (1.,3.,(2.,6.))],
        dtype=[('x',float), ('y',float), ('value',float,(2,2))])

print(a[['x','y']])
#[(1.5, 2.5) (3.0, 4.0) (1.0, 3.0)]

print(a[['x','y']].dtype)
#[('x', '<f8'), ('y', '<f8')]

But the problem is that it seems to be a copy rather than a view:

b = a[['x','y']]
b[0] = (9.,9.)

print(b)
#[(9.0, 9.0) (3.0, 4.0) (1.0, 3.0)]

print(a[['x','y']])
#[(1.5, 2.5) (3.0, 4.0) (1.0, 3.0)]

If I only select one column, it's a view:

c = a['y']
c[0] = 99.

print(c)
#[ 99.  4.   3. ]

print(a['y'])
#[ 99.  4.   3. ]

Is there any way I can get the view behavior for more than one column at once?

I have two workarounds: one is to loop through the columns; the other is to create a hierarchical dtype, so that a single column actually returns a structured array with the two (or more) fields that I want. Unfortunately, zip also returns a copy, so I can't do:

x = a['x']; y = a['y']
z = zip(x,y)
z[0] = (9.,9.)
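
For illustration, the hierarchical-dtype workaround mentioned above could look roughly like the sketch below. The array name `nested` and the grouping field name `xy` are made up for this example; the idea is simply that single-field indexing returns a view, so grouping 'x' and 'y' under one parent field gives view semantics for both at once.

```python
import numpy as np

# Hypothetical layout: 'xy' is itself a structured field holding 'x' and 'y'.
nested = np.zeros(3, dtype=[('xy', [('x', float), ('y', float)]),
                            ('value', float, (2, 2))])

xy = nested['xy']       # single-field indexing, so this is a view
xy[0] = (9.0, 9.0)      # write through the view

print(np.shares_memory(xy, nested))  # checks that no copy was made
print(nested['xy'][0])               # the parent array sees the new values
```

The drawback is that the grouping has to be decided when the dtype is defined, so it does not help for arbitrary field combinations chosen later.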

4 Answers

#1

You can create a dtype object that contains only the fields you want, and use numpy.ndarray() to create a view of the original array:

import numpy as np
strc = np.zeros(3, dtype=[('x', int), ('y', float), ('z', int), ('t', "i8")])

def fields_view(arr, fields):
    dtype2 = np.dtype({name:arr.dtype.fields[name] for name in fields})
    return np.ndarray(arr.shape, dtype2, arr, 0, arr.strides)

v1 = fields_view(strc, ["x", "z"])
v1[0] = 10, 100

v2 = fields_view(strc, ["y", "z"])
v2[1:] = [(3.14, 7)]

v3 = fields_view(strc, ["x", "t"])

v3[1:] = [(1000, 2**16)]

print(strc)

Here is the output:

[(10, 0.0, 100, 0L) (1000, 3.14, 7, 65536L) (1000, 3.14, 7, 65536L)]

#2

Building on @HYRY's answer, you could also use ndarray's method getfield:

import numpy

def fields_view(array, fields):
    # Build a dtype holding only the requested fields at their original offsets
    return array.getfield(numpy.dtype(
        {name: array.dtype.fields[name] for name in fields}
    ))
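
As a quick check that the getfield approach really yields a view, here is a minimal sketch. It repeats the helper so that it runs on its own; the array `strc` and its field names are borrowed from answer #1, and the values written are arbitrary.

```python
import numpy as np

def fields_view(array, fields):
    # Same helper as above: keep only the requested fields, at their original offsets.
    return array.getfield(np.dtype(
        {name: array.dtype.fields[name] for name in fields}
    ))

strc = np.zeros(3, dtype=[('x', int), ('y', float), ('z', int)])

v = fields_view(strc, ['x', 'z'])
v['x'][0] = 10      # write through the view, field by field
v['z'][0] = 100

print(np.shares_memory(v, strc))  # checks that v reuses strc's buffer
print(strc[0])                    # 'x' and 'z' changed, 'y' untouched
```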

#3

I don't think there is an easy way to achieve what you want. In general, you cannot take an arbitrary view into an array. Try the following:

>>> a
array([(1.5, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
       (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
       (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
>>> a.view(float)
array([ 1.5,  2.5,  1. ,  2. ,  1. ,  2. ,  3. ,  4. ,  4. ,  5. ,  4. ,
        5. ,  1. ,  3. ,  2. ,  6. ,  2. ,  6. ])

The float view of your record array shows you how the actual data is stored in memory. A view into this data has to be expressible as a combination of a shape, strides and offset into the above data. So if you wanted, for instance, a view of 'x' and 'y' only, you could do the following:

>>> from numpy.lib.stride_tricks import as_strided
>>> b = as_strided(a.view(float), shape=a.shape + (2,),
                   strides=a.strides + a.view(float).strides)
>>> b
array([[ 1.5,  2.5],
       [ 3. ,  4. ],
       [ 1. ,  3. ]])

The as_strided call does the same thing as the following, which is perhaps easier to understand:

>>> bb = a.view(float).reshape(a.shape + (-1,))[:, :2]
>>> bb
array([[ 1.5,  2.5],
       [ 3. ,  4. ],
       [ 1. ,  3. ]])

Either of these is a view into a:

>>> b[0, 0] = 0
>>> a
array([(0.0, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
       (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
       (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
>>> bb[2, 1] = 0
>>> a
array([(0.0, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
       (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
       (1.0, 0.0, [[2.0, 6.0], [2.0, 6.0]])], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])

It would be nice if either of these could be converted into a record array, but numpy refuses to do so, for reasons that are not entirely clear to me:

>>> b.view([('x',float), ('y',float)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: new type not compatible with array.

Of course what works (sort of) for 'x' and 'y' would not work, for instance, for 'x' and 'value', so in general the answer is: it cannot be done.

#4

The code you propose was scheduled to return a view starting with NumPy 1.13 (the change was later postponed and, as far as I know, shipped in NumPy 1.16). See 'NumPy 1.12.0 Release Notes->Future Changes->Multiple-field manipulation of structured arrays' on this page:

https://docs.scipy.org/doc/numpy-dev/release.html
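
A minimal sketch of that behaviour, assuming a NumPy release in which multi-field indexing returns a view (to my knowledge 1.16 or later); on older releases the same code operates on a copy and the parent array stays unchanged. The array mirrors the one in the question, filled with zeros for brevity.

```python
import numpy as np

a = np.zeros(3, dtype=[('x', float), ('y', float), ('value', float, (2, 2))])

b = a[['x', 'y']]    # multi-field index; a view on recent NumPy versions
b[0] = (9.0, 9.0)    # write through the view

print(np.shares_memory(a, b))  # checks whether b reuses a's buffer
print(a[['x', 'y']])           # reflects the write when b is a view
```

Note that such a view keeps the original field offsets, so its itemsize matches the parent array; `numpy.lib.recfunctions.repack_fields` can be used when a packed copy is wanted instead.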
