How does numpy order array slice indices?

I have an np.array data of shape (28,8,20), and I only need certain entries from it, so I'm taking a slice:

In [41]: index = np.array([ 5,  6,  7,  8,  9, 10, 11, 17, 18, 19])
In [42]: extract = data[:,:,index]
In [43]: extract.shape
Out[43]: (28, 8, 10)

So far so good, everything as it should be. But now I want to look at just the first two entries on the last index for the first line:

In [45]: extract[0,:,np.array([0,1])].shape
Out[45]: (2, 8)

Wait, that should be (8,2). It switched the indices around, even though it did not when I sliced the last time! According to my understanding, the following should act the same way:

In [46]: extract[0,:,:2].shape
Out[46]: (8, 2)

... but it gives me exactly what I wanted! As long as I have a 3D-array, though, both methods seem to be equivalent:

In [47]: extract[:,:,np.array([0,1])].shape
Out[47]: (28, 8, 2)

In [48]: extract[:,:,:2].shape
Out[48]: (28, 8, 2)

So what do I do if I want not just the first two entries but an irregular list? I could of course transpose the matrix after the operation but this seems very counter-intuitive. A better solution to my problem is this (though there might be a more elegant one):

In [64]: extract[0][:,[0,1]].shape
Out[64]: (8, 2)

Which brings us to the actual question:

I wonder what the reason for this behaviour is? Whoever decided that this is how it should work probably knew more about programming than I do and thought that this is consistent in some way that I am entirely missing. And I will likely keep hitting my head on this unless I have a way to make sense of it.

2 solutions

#1

This is a case of (advanced) partial indexing. There are 2 index arrays and 1 slice.

If the indexing subspaces are separated (by slice objects), then the broadcasted indexing space is first, followed by the sliced subspace of x.

http://docs.scipy.org/doc/numpy-1.8.1/reference/arrays.indexing.html#advanced-indexing

The advanced indexing example notes that, when the ind_1, ind_2 broadcastable subspace has shape (2,3,4):

However, x[:,ind_1,:,ind_2] has shape (2,3,4,10,30,50) because there is no unambiguous place to drop in the indexing subspace, thus it is tacked-on to the beginning. It is always possible to use .transpose() to move the subspace anywhere desired.

In other words, this indexing is not the same as x[:, ind_1][:, ind_2]. The 2 arrays operate jointly to define a (2,3,4) subspace.
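The documentation example quoted above can be reproduced directly. This is a minimal sketch: the array contents and the all-zero index arrays are placeholders chosen only to get the shapes right, and np.moveaxis is used here as one convenient alternative to the .transpose() call the documentation mentions.

```python
import numpy as np

# Same shapes as the documentation example; int8 keeps the array small in memory.
x = np.zeros((10, 20, 30, 40, 50), dtype=np.int8)
ind_1 = np.zeros((2, 3, 4), dtype=int)  # placeholder indices into axis 1 (length 20)
ind_2 = np.zeros((2, 3, 4), dtype=int)  # placeholder indices into axis 3 (length 40)

# The two index arrays are separated by a slice, so their joint (2, 3, 4)
# block is placed at the front of the result.
joint = x[:, ind_1, :, ind_2]
print(joint.shape)  # (2, 3, 4, 10, 30, 50)

# Move the sliced axis of length 10 back to the front if that layout is preferred.
moved = np.moveaxis(joint, 3, 0)
print(moved.shape)  # (10, 2, 3, 4, 30, 50)
```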

In your example, extract[0,:,np.array([0,1])] is understood to mean, select a (2,) subspace (the [0] and [0,1] act jointly, not sequentially), and combine that in some way with the middle dimension.
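To see what "jointly" means for the question's array, the sketch below checks each row of the result against the element it is built from. The contents of extract are made up with np.arange; only the shape (28, 8, 10) is taken from the question.

```python
import numpy as np

# Stand-in for the question's array: any data of shape (28, 8, 10) will do.
extract = np.arange(28 * 8 * 10).reshape(28, 8, 10)
idx = np.array([0, 1])

# The scalar 0 and idx are broadcast together into a (2,) index block,
# which ends up in front of the sliced middle axis.
joint = extract[0, :, idx]
print(joint.shape)  # (2, 8)

# Row k of the result is the length-8 vector extract[0, :, idx[k]].
for k in range(len(idx)):
    assert np.array_equal(joint[k], extract[0, :, idx[k]])
```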

A more elaborate example would be extract[[1,0],:,[[0,1],[1,0]]], which produces a (2,2,8) array. This is a (2,2) subspace of the 1st and last dimensions, plus the middle one. On the other hand, extract[[1,0]][:,:,[[0,1],[1,0]]] produces a (2,8,2,2) array, selecting from the 1st and last dimensions separately.
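Both shapes from this example can be verified, along with which elements each form picks. A small sketch, again using a made-up extract of the right shape; the names rows, cols and rows_b are introduced here only for illustration.

```python
import numpy as np

extract = np.arange(28 * 8 * 10).reshape(28, 8, 10)  # stand-in data
rows = np.array([1, 0])
cols = np.array([[0, 1], [1, 0]])

joint = extract[rows, :, cols]          # rows and cols are paired up
sequential = extract[rows][:, :, cols]  # rows first, then cols on that result
print(joint.shape)       # (2, 2, 8)
print(sequential.shape)  # (2, 8, 2, 2)

# Joint: rows is broadcast to the (2, 2) shape of cols, then matched entry by entry.
rows_b = np.broadcast_to(rows, cols.shape)
assert np.array_equal(joint[0, 1], extract[rows_b[0, 1], :, cols[0, 1]])

# Sequential: every selected row is combined with every entry of cols.
assert np.array_equal(sequential[1, :, 0, 1], extract[rows[1], :, cols[0, 1]])
```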

The key difference is whether the indexed selections operate sequentially or jointly. The [...][...] syntax is already available to operate sequentially. Advanced indexing gives you a way of indexing jointly.
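For the original goal of an (8, n) result from an irregular list, a few equivalent spellings are sketched below. The list idx is a hypothetical selection, and np.take is offered as one more option that was not part of the question.

```python
import numpy as np

extract = np.arange(28 * 8 * 10).reshape(28, 8, 10)  # stand-in data
idx = [0, 3, 7]  # hypothetical irregular selection on the last axis

chained = extract[0][:, idx]              # sequential: pick row 0, then columns
transposed = extract[0, :, idx].T         # joint indexing, then swap the axes back
taken = np.take(extract[0], idx, axis=1)  # explicit axis, no index broadcasting

print(chained.shape)  # (8, 3)
assert np.array_equal(chained, transposed)
assert np.array_equal(chained, taken)
```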

#2

You're right, that's weird. I can only hazard a guess here. I think it's related to the fact that a[[0,1],[0,1],[0,1]].shape is (2,) rather than (2,2,2) and that a[0,1,[0,1,2]] really means a[[0,0,0],[1,1,1],[0,1,2]] which evaluates to array([a[0,1,0],a[0,1,1],a[0,1,2]]). That is, you step through lists-as-indices for each dimension in parallel, with length-one lists and scalars being broadcast to match the longest.
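The two claims in this paragraph are easy to check on a small array. A minimal sketch, where the array a and its shape (2, 3, 4) are arbitrary choices:

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)  # arbitrary small 3D array

# Three index lists are stepped through in parallel, giving 2 elements, not 2*2*2.
paired = a[[0, 1], [0, 1], [0, 1]]
print(paired.shape)  # (2,)
assert np.array_equal(paired, [a[0, 0, 0], a[1, 1, 1]])

# Scalars are broadcast against the list, so these two spellings agree.
short = a[0, 1, [0, 1, 2]]
spelled_out = a[[0, 0, 0], [1, 1, 1], [0, 1, 2]]
assert np.array_equal(short, spelled_out)
assert np.array_equal(short, [a[0, 1, 0], a[0, 1, 1], a[0, 1, 2]])
```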

Conceptually, that would make your extract[0,:,[0,1]] equivalent to extract[[0,0],[slice(None),slice(None)],[0,1]] (that syntax isn't accepted if you specify it manually, though). After stepping through the indices, that would evaluate to array([extract[0,slice(None),0],extract[0,slice(None),1]]). Each of the inner extracts evaluates to a shape (8,) array, so the full result is shape (2,8).
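The "stepping through" picture can be written out by hand and compared with the real result. This sketch only confirms that the two agree in shape and content for a made-up extract; it says nothing about how numpy implements the lookup internally.

```python
import numpy as np

extract = np.arange(28 * 8 * 10).reshape(28, 8, 10)  # stand-in data

# Build the result one index pair at a time, as described above.
stacked = np.array([extract[0, slice(None), 0],
                    extract[0, slice(None), 1]])
direct = extract[0, :, [0, 1]]

print(stacked.shape)  # (2, 8)
assert np.array_equal(stacked, direct)
```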

So to conclude I think it is a side-effect of the broadcasting that is done to make all the dimensions have an index list of the same length, which leads to : being broadcast too. That is my hypothesis, but I haven't looked at the inner workings of how numpy does this. Perhaps an expert will come along with a better explanation.

This hypothesis does not explain why extract[:,:,[0,1]] does not result in the same behavior. I would have to postulate that the case of only leading ":" is special-cased to avoid participating in the list index logic.
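As far as the numpy indexing documentation describes it, the deciding factor is whether the advanced indices (scalars count once a list or array is present) sit next to each other or are separated by a slice: adjacent ones keep their position in the result, separated ones move to the front. The shapes below illustrate that rule on a zero-filled stand-in for extract.

```python
import numpy as np

extract = np.zeros((28, 8, 10))  # stand-in with the question's shape

# Only one advanced index: its axis stays where it was.
print(extract[:, :, [0, 1]].shape)  # (28, 8, 2)

# Scalar and list are adjacent: the joint (2,) axis stays in their position.
print(extract[:, 0, [0, 1]].shape)  # (28, 2)

# Scalar and list are separated by a slice: the joint (2,) axis moves to the front.
print(extract[0, :, [0, 1]].shape)  # (2, 8)

# Indexing in two steps avoids the joint lookup altogether.
print(extract[0][:, [0, 1]].shape)  # (8, 2)
```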
