Strange behavior of numpy array operations

Explain this:

>>> a = np.arange(10)
>>> a[2:]
array([2, 3, 4, 5, 6, 7, 8, 9])
>>> a[:-2]
array([0, 1, 2, 3, 4, 5, 6, 7])
>>> a[2:] - a[:-2]
array([2, 2, 2, 2, 2, 2, 2, 2])
>>> a[2:] -= a[:-2]
>>> a
array([0, 1, 2, 2, 2, 3, 4, 4, 4, 5])

The expected result is of course array([0, 1, 2, 2, 2, 2, 2, 2, 2, 2]).

I'm going to guess this is something to do with numpy parallelising things and not being smart enough to work out that it needs to make a temporary copy of the data first (or do the operation in the correct order).

In other words I suspect it is doing something naive like this:

for i in range(2, len(a)):
    a[i] -= a[i-2]

For reference it works in Matlab and Octave:

a = 0:9
a(3:end) = a(3:end) - a(1:end-2)

a =

  0  1  2  3  4  5  6  7  8  9

a =

  0  1  2  2  2  2  2  2  2  2

And actually it works fine if you do:

a[2:] = a[2:] - a[:-2]

So presumably this means that a -= b is not the same as a = a - b for numpy!

Actually now that I come to think of it, I think Mathworks gave this as one of the reasons for not implementing the +=, -=, /= and *= operators!

2 answers

#1

When you slice a numpy array as you are doing in the example, you get a view of the data rather than a copy.
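
A quick way to check this for the arrays in the question is sketched below. It relies on `np.shares_memory` and the `.base` attribute, both standard NumPy; the names `tail`, `head` and `head_copy` are only illustrative.

```python
import numpy as np

a = np.arange(10)
tail = a[2:]    # basic slice: a view onto a's buffer
head = a[:-2]   # another view onto the same buffer

# Both views are backed by a, and they overlap each other.
print(tail.base is a)                # True
print(np.shares_memory(tail, head))  # True

# Writing through a view changes the original array.
tail[0] = 100
print(a[2])                          # 100

# An explicit copy owns its own memory.
head_copy = a[:-2].copy()
print(np.shares_memory(a, head_copy))  # False
```

So in `a[2:] -= a[:-2]` the array being written and the array being read refer to overlapping memory.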

See:

http://scipy-lectures.github.io/advanced/advanced_numpy/#example-inplace-operations-caveat-emptor

#2

The unexpected behavior is due to array aliasing: as @JoshAdel stated in his answer, slicing returns a view rather than a copy of the array. Your example of the "naive" loop already explains how the result is computed, but I'll add two points to your explanation:

First, the unexpected behavior is not due to numpy parallelizing operations. If the operation were parallelized, then you shouldn't expect to [consistently] see the result of the naive loop (since that result depends on ordered execution of the loop). If you repeat your experiment several times - even for large arrays - you should see the same result.
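
To see that a strictly ordered, single-pass update is enough to reproduce the output in the question, here is a minimal sketch in plain Python. The helper `inplace_sub_in_order` is hypothetical and written only for illustration; it is not a claim about how NumPy is implemented internally.

```python
import numpy as np

def inplace_sub_in_order(a, shift):
    """Hypothetical helper: emulate a[shift:] -= a[:-shift] element by
    element, front to back, with no temporary copy."""
    out = a.copy()                # leave the caller's array untouched
    for i in range(shift, len(out)):
        out[i] -= out[i - shift]  # may read a value already overwritten
    return out

a = np.arange(10)
results = [inplace_sub_in_order(a, 2).tolist() for _ in range(5)]
print(results[0])   # [0, 1, 2, 2, 2, 3, 4, 4, 4, 5]
# Every run gives the same answer because the order is fixed.
print(all(r == results[0] for r in results))  # True
```

From index 4 onward each element subtracts a value that was already modified earlier in the same pass, which is where the 3, 4, 4, 4, 5 tail comes from.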

Second, while your presumption is true in general, I would state it this way:

a -= b is the same as a = a - b for two numpy arrays when a and b are not aliased.
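
One way to make the in-place form safe is to break the aliasing explicitly before the update. A small sketch follows, using only `.copy()` and `np.shares_memory`; the names `b` and `c` are illustrative. As far as I know, NumPy 1.13 and later detect this kind of overlap in ufunc operations and make the temporary copy themselves, so on a recent install the plain `a[2:] -= a[:-2]` may already give the expected result.

```python
import numpy as np

a = np.arange(10)
b = a[:-2].copy()             # copy: b no longer aliases a
assert not np.shares_memory(a, b)

a[2:] -= b                    # in-place update with a non-aliased operand
print(a)                      # [0 1 2 2 2 2 2 2 2 2]

# The out-of-place form evaluates the right-hand side into a temporary first.
c = np.arange(10)
c[2:] = c[2:] - c[:-2]
print(np.array_equal(a, c))   # True
```

The explicit copy costs one extra allocation of the operand, which is the price of making the result independent of evaluation order.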
