I want to understand how ndarray.sum(axis=) works. I know that axis=0 is for columns and axis=1 is for rows, but with 3 dimensions (3 axes) it's difficult to interpret the results below.
import numpy as np
arr = np.arange(0,30).reshape(2,3,5)
arr
Out[1]:
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])
arr.sum(axis=0)
Out[2]:
array([[15, 17, 19, 21, 23],
       [25, 27, 29, 31, 33],
       [35, 37, 39, 41, 43]])
arr.sum(axis=1)
Out[8]:
array([[15, 18, 21, 24, 27],
       [60, 63, 66, 69, 72]])
arr.sum(axis=2)
Out[3]:
array([[ 10,  35,  60],
       [ 85, 110, 135]])
In this example, a 3-axis array of shape (2,3,5), there are 3 rows and 5 columns. But if I look at the array as a whole, it seems to have only two rows (each with 3 array elements).
Can anyone please explain how this sum works on arrays of 3 or more axes (dimensions)?
6 Answers
#1
3
numpy displays a (2,3,5) array as 2 blocks of 3x5 arrays (3 rows, 5 columns). Or call them 'planes' (MATLAB would show it as 5 blocks of 2x3).
The numpy display also matches a nested list: a list of two sublists, each with 3 sublists. Each of those is 5 elements long.
In the 3x5 2d case, axis 0 sums along the size 3 dimension, resulting in a 5 element array. The descriptions 'sum over rows' or 'sum along columns' are a little vague in English. Focus on the results, the change in shape, and which values are being summed, not on the description.
Back to the 3d case:
With axis=0, it sums along the 1st dimension, effectively removing it, leaving us with a 3x5 array. 0+15=15, 1+16=17, etc.
Axis 1 condenses the size 3 dimension; the result is 2x5. 0+5+10=15, etc.
Axis 2 condenses the size 5 dimension; the result is 2x3. The first entry is sum((0,1,2,3,4)) = 10.
Your example is good, since the 3 dimensions are different, and it is easier to see which one was eliminated during the sum.
With 2d there's some ambiguity; 'sum over rows' - does that mean the rows are eliminated or retained? With 3d there's no ambiguity; with axis=0, you can only remove it, leaving the other 2.
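One way to check the three cases above is to rebuild each result by hand from slices of the question's arr and compare it with what sum returns. This is only a minimal sketch; the names by_axis0, by_axis1 and by_axis2 are illustrative and not part of numpy.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

# axis=0: add the two 3x5 blocks element by element.
by_axis0 = arr[0] + arr[1]

# axis=1: within each block, add the three rows together.
by_axis1 = arr[:, 0, :] + arr[:, 1, :] + arr[:, 2, :]

# axis=2: within each row, add the five elements together.
by_axis2 = (arr[:, :, 0] + arr[:, :, 1] + arr[:, :, 2]
            + arr[:, :, 3] + arr[:, :, 4])

print(by_axis0.shape, by_axis1.shape, by_axis2.shape)  # (3, 5) (2, 5) (2, 3)
print(np.array_equal(by_axis0, arr.sum(axis=0)))       # True
print(np.array_equal(by_axis1, arr.sum(axis=1)))       # True
print(np.array_equal(by_axis2, arr.sum(axis=2)))       # True
```

In each case the axis that is indexed with a fixed number is the one that disappears from the result's shape.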
#2
4
If you want to keep the dimensions you can specify keepdims:
>>> arr = np.arange(0,30).reshape(2,3,5)
>>> arr.sum(axis=0, keepdims=True)
array([[[15, 17, 19, 21, 23],
        [25, 27, 29, 31, 33],
        [35, 37, 39, 41, 43]]])
Otherwise the axis you sum along is removed from the shape. An easy way to keep track of this is using the numpy.ndarray.shape property:
>>> arr.shape
(2, 3, 5)
>>> arr.sum(axis=0).shape
(3, 5) # the first entry (index = axis = 0) dimension was removed
>>> arr.sum(axis=1).shape
(2, 5) # the second entry (index = axis = 1) was removed
You can also sum along multiple axes if you want (reducing the dimensionality by the number of specified axes):
>>> arr.sum(axis=(0, 1))
array([75, 81, 87, 93, 99])
>>> arr.sum(axis=(0, 1)).shape
(5,) # the first and second entries were removed
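A common reason to reach for keepdims is that the result still broadcasts against the original array. The sketch below assumes you want each row of arr divided by its own total; row_totals and fractions are illustrative names.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

# Sum over the last axis but keep it as a length-1 axis: shape (2, 3, 1).
row_totals = arr.sum(axis=2, keepdims=True)

# (2, 3, 5) / (2, 3, 1) broadcasts, so every row is divided by its own total.
fractions = arr / row_totals

print(row_totals.shape)       # (2, 3, 1)
print(fractions.shape)        # (2, 3, 5)
print(np.allclose(fractions.sum(axis=2), 1.0))  # True
```

Without keepdims=True the totals would have shape (2, 3), which does not broadcast against (2, 3, 5) in the intended way.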
#3
2
Here is another way to interpret this. You can consider a multi-dimensional array as a tensor, T[i][j][k], where i, j, k index axes 0, 1, 2 respectively.
T.sum(axis = 0) is mathematically equivalent to:
$$\mathrm{result}_{jk} = \sum_{i} T_{ijk}$$
Similarly, T.sum(axis = 1):
$$\mathrm{result}_{ik} = \sum_{j} T_{ijk}$$
And T.sum(axis = 2):
$$\mathrm{result}_{ij} = \sum_{k} T_{ijk}$$
In other words, the index belonging to the given axis is the one summed over. For instance, with axis = 0 the first index is summed over. Written as a for loop:
result[j][k] = sum(T[i][j][k] for i in range(T.shape[0])) for all j,k
for axis = 1:
result[i][k] = sum(T[i][j][k] for j in range(T.shape[1])) for all i,k
etc.
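The loop description above can be turned into a runnable check. This is a sketch for illustration only (explicit Python loops are far slower than sum itself); result0 and result1 are names chosen here.

```python
import numpy as np

T = np.arange(30).reshape(2, 3, 5)
n_i, n_j, n_k = T.shape

# axis=0: the index i is summed away, leaving result0[j, k].
result0 = np.zeros((n_j, n_k), dtype=T.dtype)
for j in range(n_j):
    for k in range(n_k):
        result0[j, k] = sum(T[i, j, k] for i in range(n_i))

# axis=1: the index j is summed away, leaving result1[i, k].
result1 = np.zeros((n_i, n_k), dtype=T.dtype)
for i in range(n_i):
    for k in range(n_k):
        result1[i, k] = sum(T[i, j, k] for j in range(n_j))

print(np.array_equal(result0, T.sum(axis=0)))  # True
print(np.array_equal(result1, T.sum(axis=1)))  # True
```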
#4
0
The axis you specify is the one that is effectively removed. So given a shape of (2,3,5), axis 0 gives (3,5), axis 1 gives (2,5), etc. This extends to any number of dimensions.
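To see that the rule holds beyond 3 dimensions, here is a small sketch on a 4-axis array. shape_after_sum is a hypothetical helper written for this example, not a numpy function.

```python
import numpy as np

def shape_after_sum(shape, axis):
    """Hypothetical helper: drop the entry at position `axis` from `shape`."""
    axis = axis % len(shape)  # also accept negative axes such as -1
    return shape[:axis] + shape[axis + 1:]

a = np.ones((2, 3, 5, 7))
for axis in range(a.ndim):
    # The predicted shape matches what sum actually returns.
    assert a.sum(axis=axis).shape == shape_after_sum(a.shape, axis)
    print(axis, a.sum(axis=axis).shape)
# 0 (3, 5, 7)
# 1 (2, 5, 7)
# 2 (2, 3, 7)
# 3 (2, 3, 5)
```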
#5
0
You seem to be confused by the output style of numpy arrays. The "row" of the output is almost always the last index, not the first. Example:
x=np.arange(1,4)
y=np.arange(10,31,10)
z=np.arange(100,301,100)
xy=x[:,None]+y[None,:]
xy
Out[100]:
array([[11, 21, 31],
       [12, 22, 32],
       [13, 23, 33]])
Notice that the tens digit increments along each printed row, not down the columns, even though y is the second index.
xyz=x[:,None,None]+y[None,:,None]+z[None,None,:]
xyz
Out[102]:
array([[[111, 211, 311],
        [121, 221, 321],
        [131, 231, 331]],

       [[112, 212, 312],
        [122, 222, 322],
        [132, 232, 332]],

       [[113, 213, 313],
        [123, 223, 323],
        [133, 233, 333]]])
Now the hundreds digit increments along each printed row, even though z is the last index. This can be somewhat counter-intuitive to beginners.
Thus when you do np.sum(x, axis=-1) you will always sum over the "rows" as shown in the np.array([]) format. Looking at arr.sum(axis=2)[0,0], that's 0+1+2+3+4=10.
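A short check of that statement, using the arr from the question. It is only a sketch; last is a name chosen here.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

# axis=-1 means the last axis, which is axis 2 for a 3-D array.
last = arr.sum(axis=-1)

print(np.array_equal(last, arr.sum(axis=2)))  # True
print(arr[0, 0])    # [0 1 2 3 4], the first printed row
print(last[0, 0])   # 10
```

Each entry of last is the total of one printed row, so the (2, 3) result holds one number for each of the 2 x 3 rows in the display.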
#6
0
Think of a multi-dimensional array as a tree. Each dimension is a level in the tree. Each grouping at that level is a node. A sum along a specific axis (say axis=4) means coalescing (overlaying) all nodes at that level into a single node (under their respective parents). Sub-trees rooted at the overlaid nodes at that level are stacked on top of each other. All overlapping nodes' values are added together.
Picture: https://ibb.co/dg3P3w