Sum along axis in a numpy array

Time: 2022-08-09 21:23:36

I want to understand how ndarray.sum(axis=) works. I know that axis=0 is for columns and axis=1 is for rows. But with 3 dimensions (3 axes) it's difficult to interpret the results below.

import numpy as np
arr = np.arange(0,30).reshape(2,3,5)

arr
Out[1]: 
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])

arr.sum(axis=0)
Out[2]: 
array([[15, 17, 19, 21, 23],
       [25, 27, 29, 31, 33],
       [35, 37, 39, 41, 43]])


arr.sum(axis=1)
Out[8]: 
array([[15, 18, 21, 24, 27],
       [60, 63, 66, 69, 72]])

arr.sum(axis=2)
Out[3]: 
array([[ 10,  35,  60],
       [ 85, 110, 135]])

In this example of a 3-axis array of shape (2,3,5), there are 3 rows and 5 columns. But if I look at the array as a whole, it seems to have only two rows (each with 3 array elements).

Can anyone please explain how this sum works on arrays of 3 or more axes (dimensions)?

6 Solutions

#1


3  

numpy displays a (2,3,5) array as 2 blocks of 3x5 arrays (3 rows, 5 columns). Or call them 'planes' (MATLAB would show it as 5 blocks of 2x3).

The numpy display also matches a nested list - a list of two sublists; each with 3 sublists. Each of those is 5 elements long.
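
To make the nested-list picture concrete, here is a small sketch using the array from the question. The variable name `nested` is just for illustration; `tolist()` converts the array into plain nested Python lists, and the length at each nesting level should line up with one axis.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)
nested = arr.tolist()  # plain nested Python lists

# Each nesting level corresponds to one axis of the array.
print(len(nested))        # length of axis 0
print(len(nested[0]))     # length of axis 1
print(len(nested[0][0]))  # length of axis 2
```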

In the 3x5 2d case, axis 0 sums along the size 3 dimension, resulting in a 5 element array. The descriptions 'sum over rows' or 'sum along columns' are a little vague in English. Focus on the results, the change in shape, and which values are being summed, not on the description.
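
A quick way to check the 2d case is to print the shapes. This is only a sketch; the 3x5 array `a` and the names `col_totals` and `row_totals` are made up for illustration.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

col_totals = a.sum(axis=0)  # collapses the size-3 axis, 5 values remain
row_totals = a.sum(axis=1)  # collapses the size-5 axis, 3 values remain

print(col_totals.shape, col_totals)
print(row_totals.shape, row_totals)
```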

Back to the 3d case:

With axis=0, it sums along the 1st dimension, effectively removing it, leaving us with a 3x5 array. 0+15=15, 1+16=17 etc.

Axis 1 condenses the size 3 dimension; the result is 2x5. 0+5+10=15, etc.

Axis 2 condenses the size 5 dimension; the result is 2x3, e.g. sum((0,1,2,3,4)) = 10.
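
The three cases can also be checked by adding slices by hand. This is a sketch on the array from the question; the names `by_axis0`, `by_axis1` and `by_axis2` are only for illustration.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

# axis=0: add the two 3x5 blocks element by element.
by_axis0 = arr[0] + arr[1]
# axis=1: within each block, add the three rows.
by_axis1 = arr[:, 0, :] + arr[:, 1, :] + arr[:, 2, :]
# axis=2: within each row, add the five entries.
by_axis2 = sum(arr[:, :, k] for k in range(arr.shape[2]))

print(np.array_equal(by_axis0, arr.sum(axis=0)))
print(np.array_equal(by_axis1, arr.sum(axis=1)))
print(np.array_equal(by_axis2, arr.sum(axis=2)))
```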

Your example is good, since the 3 dimensions are different, and it is easier to see which one was eliminated during the sum.

With 2d there's some ambiguity; 'sum over rows' - does that mean the rows are eliminated or retained? With 3d there's no ambiguity; with axis=0, you can only remove it, leaving the other 2.

#2


4  

If you want to keep the dimensions you can specify keepdims:

>>> import numpy as np
>>> arr = np.arange(0,30).reshape(2,3,5)
>>> arr.sum(axis=0, keepdims=True)
array([[[15, 17, 19, 21, 23],
        [25, 27, 29, 31, 33],
        [35, 37, 39, 41, 43]]])
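
One place where keepdims tends to be handy is when the result has to be combined with the original array again, because the retained size-1 axis broadcasts cleanly. The following is only a sketch of that idea; dividing each row by its total and the names `totals` and `fractions` are assumptions for illustration, not part of the question.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

totals = arr.sum(axis=2, keepdims=True)  # shape (2, 3, 1)
fractions = arr / totals                 # broadcasts against (2, 3, 5)

print(totals.shape)
print(fractions.sum(axis=2))  # every entry should be close to 1.0
```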

Otherwise the axis you sum along is removed from the shape. An easy way to keep track of this is using the numpy.ndarray.shape property:

>>> arr.shape
(2, 3, 5)

>>> arr.sum(axis=0).shape
(3, 5)  # the first entry (index = axis = 0) was removed

>>> arr.sum(axis=1).shape
(2, 5)  # the second entry (index = axis = 1) was removed

You can also sum along multiple axes if you want (reducing the dimensionality by the number of specified axes):

>>> arr.sum(axis=(0, 1))
array([75, 81, 87, 93, 99])
>>> arr.sum(axis=(0, 1)).shape
(5,)  # the first and second entries were removed
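
As a sanity check, summing over a tuple of axes should give the same values as summing one axis at a time. A minimal sketch (the names `both` and `stepwise` are just for illustration):

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

both = arr.sum(axis=(0, 1))
# After axis 0 is summed away, the old axis 1 becomes the new axis 0.
stepwise = arr.sum(axis=0).sum(axis=0)

print(both)
print(np.array_equal(both, stepwise))
```

Note that the axis numbers shift after each reduction, which is why the second call uses axis=0 again.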

#3


2  

Here is another way to interpret this. You can consider a multi-dimensional array as a tensor T[i][j][k], where i, j, k index axes 0, 1, 2 respectively.

T.sum(axis = 0) is mathematically equivalent to:

$$\mathrm{result}_{jk} = \sum_{i} T_{ijk}$$

Similarly, T.sum(axis = 1):

$$\mathrm{result}_{ik} = \sum_{j} T_{ijk}$$

And T.sum(axis = 2):

$$\mathrm{result}_{ij} = \sum_{k} T_{ijk}$$

In other words, the specified axis is the one that is summed over; for instance, with axis = 0 the first index is summed over. Written as a for loop:

result[j][k] = sum(T[i][j][k] for i in range(T.shape[0])) for all j,k

for axis = 1:

result[i][k] = sum(T[i][j][k] for j in range(T.shape[1])) for all i,k

etc.
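
The loop description above can be turned into a runnable sketch and compared against NumPy. The array T is the one from the question; `res0` and `res1` are illustrative names.

```python
import numpy as np

T = np.arange(30).reshape(2, 3, 5)
I, J, K = T.shape

# axis=0: the index i is summed over; j and k survive.
res0 = np.zeros((J, K), dtype=T.dtype)
for j in range(J):
    for k in range(K):
        res0[j, k] = sum(T[i, j, k] for i in range(I))

# axis=1: the index j is summed over; i and k survive.
res1 = np.zeros((I, K), dtype=T.dtype)
for i in range(I):
    for k in range(K):
        res1[i, k] = sum(T[i, j, k] for j in range(J))

print(np.array_equal(res0, T.sum(axis=0)))
print(np.array_equal(res1, T.sum(axis=1)))
```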

#4


0  

The axis you specify is the one that is effectively removed. So given a shape of (2,3,5), axis 0 gives (3,5), axis 1 gives (2,5), etc. This extends to any number of dimensions.
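
To see that the rule carries over to more dimensions, here is a sketch with a 4d array. The shape (2, 3, 4, 5) is an arbitrary choice for illustration.

```python
import numpy as np

a = np.zeros((2, 3, 4, 5))

for axis in range(a.ndim):
    # Expected shape: the original shape with entry `axis` dropped.
    expected = a.shape[:axis] + a.shape[axis + 1:]
    assert a.sum(axis=axis).shape == expected
    print(axis, a.sum(axis=axis).shape)
```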

#5


0  

You seem to be confused by the output style of numpy arrays. The "row" of the output is almost always the last index, not the first. Example:

x=np.arange(1,4)
y=np.arange(10,31,10)
z=np.arange(100,301,100)
xy=x[:,None]+y[None,:]

xy
Out[100]: 
array([[11, 21, 31],
       [12, 22, 32],
       [13, 23, 33]])

Notice that the tens digit changes as you move along a printed row, not down a column, even though y is the second index.

xyz=x[:,None,None]+y[None,:,None]+z[None,None,:]
xyz
Out[102]: 
array([[[111, 211, 311],
        [121, 221, 321],
        [131, 231, 331]],

       [[112, 212, 312],
        [122, 222, 322],
        [132, 232, 332]],

       [[113, 213, 313],
        [123, 223, 323],
        [133, 233, 333]]])

Now the hundreds digit changes along each printed row, because z is the last index, while x (the first index) selects the block. This can be somewhat counter-intuitive to beginners.

Thus when you do np.sum(x, axis=-1) you will always sum over the "rows" as shown in the np.array([]) format. Looking at arr.sum(axis=2)[0,0], that's 0+1+2+3+4=10.
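
A short sketch on the array from the question, showing that axis=-1 refers to the last axis (axis=2 here) and that each result is the total of one printed row. The name `last` is just for illustration.

```python
import numpy as np

arr = np.arange(30).reshape(2, 3, 5)

last = arr.sum(axis=-1)  # same as axis=2 for a 3d array

print(np.array_equal(last, arr.sum(axis=2)))
print(arr[0, 0], last[0, 0])  # the first printed row and its total
```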

#6


0  

Think of a multi-dimensional array as a tree. Each dimension is a level in the tree. Each grouping at that level is a node. A sum along a specific axis (say axis=4) means coalescing (overlaying) all nodes at that level into a single node (under their respective parents). Sub-trees rooted at the overlaid nodes at that level are stacked on top of each other. All overlapping nodes' values are added together.
Picture: https://ibb.co/dg3P3w
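
The tree picture can be sketched with plain nested lists, where each nesting level is one level of the tree. The helpers `sum_along` and `add_trees` below are hypothetical and written only to illustrate the idea; they are not NumPy functions, and NumPy is used here only to check the result.

```python
import numpy as np


def add_trees(children):
    """Overlay sibling sub-trees and add the values that land on top of each other."""
    if not isinstance(children[0], list):
        return sum(children)  # the children are leaves: just add them
    return [add_trees(list(group)) for group in zip(*children)]


def sum_along(nested, axis):
    """Sum a rectangular nested list along `axis` (illustrative helper)."""
    if axis == 0:
        # Coalesce all nodes at this level into a single node.
        return add_trees(nested)
    # Otherwise descend one level and apply the same rule under each parent.
    return [sum_along(child, axis - 1) for child in nested]


arr = np.arange(30).reshape(2, 3, 5)
nested = arr.tolist()

for axis in range(arr.ndim):
    print(axis, sum_along(nested, axis) == arr.sum(axis=axis).tolist())
```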
