Say I have a 3-dimensional numpy array:
np.random.seed(1145)
A = np.random.random((5,5,5))
and I have two lists of indices corresponding to the 2nd and 3rd dimensions:
second = [1,2]
third = [3,4]
and I want to select the elements in the numpy array corresponding to A[:][second][third], so the shape of the sliced array would be (5,2,2) and A[:][second][third].flatten() would be equivalent to:
In [226]: for i in range(5):
     ...:     for j in second:
     ...:         for k in third:
     ...:             print(A[i][j][k])
0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658
Is there a way to slice a numpy array in this way? So far, when I try A[:][second][third], I get IndexError: index 3 is out of bounds for axis 0 with size 2, because the [:] for the first dimension seems to be ignored.
3 Answers
#1
8
NumPy supports multidimensional indexing, so instead of A[1][2][3], you can (and should) use A[1,2,3].
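To see why the chained form in the question fails, it may help to look at the intermediate shapes. The following is a small sketch, assuming an `arange`-filled array in place of the random one; the names `step1` and `step2` are only for illustration.

```python
import numpy as np

A = np.arange(125).reshape(5, 5, 5)
second = [1, 2]

# A[:] is just a view of the whole array, so its shape is unchanged.
step1 = A[:]
print(step1.shape)   # (5, 5, 5)

# Each further pair of brackets indexes axis 0 of the previous result,
# so [second] picks entries 1 and 2 along the FIRST axis, not the second.
step2 = step1[second]
print(step2.shape)   # (2, 5, 5)

# Indexing step2 with [3, 4] asks for index 3 on an axis of length 2.
try:
    step2[[3, 4]]
except IndexError as exc:
    print("IndexError:", exc)
```

So the `[:]` is not really ignored; it simply returns the full array, and every later bracket starts again at axis 0.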
You might then think you could do A[:, second, third], but the numpy indices are broadcast, and broadcasting second and third (two one-dimensional sequences) ends up being the numpy equivalent of zip, so the result has shape (5, 2).
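A quick way to check the zip analogy is to build the paired result by hand and compare. This is only a sketch; `paired` and `expected` are names introduced here for illustration.

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# Two 1-D index lists are paired element by element.
paired = a[:, second, third]
print(paired.shape)  # (5, 2)

# Column n holds a[:, second[n], third[n]], exactly what zip would pair up.
expected = np.stack([a[:, j, k] for j, k in zip(second, third)], axis=1)
print(np.array_equal(paired, expected))  # True
```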
What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second, into a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).
For example:
In [8]: import numpy as np
In [9]: a = np.arange(125).reshape(5,5,5)
In [10]: second = [1,2]
In [11]: third = [3,4]
In [12]: s = a[:, np.array(second).reshape(-1,1), third]
In [13]: s.shape
Out[13]: (5, 2, 2)
Note that, in this specific example, the values in second and third are sequential. If that is typical, you can simply use slices:
In [14]: s2 = a[:, 1:3, 3:5]
In [15]: s2.shape
Out[15]: (5, 2, 2)
In [16]: np.all(s == s2)
Out[16]: True
There are a couple of very important differences between those two methods.
- The first method also works with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
- In the first method (using broadcasting and "fancy indexing"), the data is a copy of the original array. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change in one will change them both.
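The copy-versus-view difference can be checked directly. Here is a minimal sketch using `np.shares_memory`, reusing the names `a`, `s` and `s2` from the example above:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

s = a[:, np.array(second).reshape(-1, 1), third]   # fancy indexing -> copy
s2 = a[:, 1:3, 3:5]                                # slices only -> view

print(np.shares_memory(a, s))    # False
print(np.shares_memory(a, s2))   # True

s[0, 0, 0] = -1     # modifies only the copy
print(a[0, 1, 3])   # 8

s2[0, 0, 0] = -2    # writes through to a
print(a[0, 1, 3])   # -2
```

If you want the slice result to be independent of `a`, calling `.copy()` on it is the usual way to get that.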
#2
3
One way would be to use np.ix_:
>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True
The downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
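Such a wrapper might look like the sketch below. The function name `select` and the convention of passing `None` for "take the whole axis" are assumptions made here for illustration, not part of numpy.

```python
import numpy as np

def select(arr, *indices):
    """Hypothetical helper: index arr with the outer product of indices.

    Pass None for an axis to keep every position along it.
    """
    if len(indices) != arr.ndim:
        raise ValueError("need one entry per axis")
    # Replace each None with the full range for that axis.
    full = [range(n) if idx is None else idx
            for idx, n in zip(indices, arr.shape)]
    return arr[np.ix_(*full)]

A = np.arange(125).reshape(5, 5, 5)
out = select(A, None, [1, 2], [3, 4])
print(out.shape)  # (5, 2, 2)
```

Like any other fancy-indexing result, the array returned here is a copy of the selected data.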
#3
1
I think there are three problems with your approach:
- Both second and third should be slices
- Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
- Instead of A[:][second][third], you should use A[:,second,third]
Try this:
>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482,  0.80820122,  0.64878266,  0.62689481,  0.01298507,
        0.42112921,  0.23104051,  0.34601169,  0.24838564,  0.66162209,
        0.96115751,  0.07338851,  0.33109539,  0.55168356,  0.33925748,
        0.2353348 ,  0.91254398,  0.44692211,  0.60975602,  0.64610556])