Accumulating one array into another using an index array

Date: 2021-08-07 19:18:24

My question is about a specific array operation that I want to express using numpy.

I have an array of floats w and an array of indices idx of the same length as w. I want to sum all entries of w that share the same idx value and collect the sums in an array v. Written as a loop, it looks like this:

for i, x in enumerate(w):
    v[idx[i]] += x

Is there a way to do this with array operations? My guess was v[idx] += w but that does not work, since idx contains the same index multiple times.

Thanks!

2 answers

#1


15  

numpy.bincount was introduced for this purpose:

tmp = np.bincount(idx, w)
v[:len(tmp)] += tmp

I think as of NumPy 1.6 you can also pass a minlength argument to bincount.
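
As a minimal sketch of the minlength variant, assuming a NumPy version that supports it and assuming every value in idx is a valid non-negative index into v (the sample data below is made up for illustration):

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
w = np.array([0.5, 1.5, 2.0, 3.0])
idx = np.array([0, 2, 2, 0])
v = np.zeros(5)

# minlength pads the result with zeros so it has at least len(v) bins.
# If idx held a value >= len(v), the result would be longer than v and
# the in-place addition would fail, hence the assumption stated above.
v += np.bincount(idx, weights=w, minlength=len(v))
print(v)
```

With minlength the temporary array and the slice v[:len(tmp)] are no longer needed, because the result already has the same length as v.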

#2


4  

This is a known behavior and, though somewhat unfortunate, does not have a numpy-level workaround. (bincount can be used for this if you twist its arm.) Doing the loop yourself is really your best bet.
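
The behavior in question can be reproduced with a small sketch (the arrays are made-up sample data): with a repeated index, the fancy-indexed += keeps only one contribution per slot, while the explicit loop accumulates all of them.

```python
import numpy as np

# Hypothetical sample data: index 0 appears twice.
w = np.array([1.0, 2.0, 3.0])
idx = np.array([0, 0, 1])

# Fancy-indexed in-place add: computed as buffered[idx] = buffered[idx] + w,
# so the two writes to slot 0 overwrite each other instead of summing.
buffered = np.zeros(2)
buffered[idx] += w

# Explicit loop: every contribution is added.
looped = np.zeros(2)
for i, x in zip(idx, w):
    looped[i] += x

print(buffered[0], looped[0])  # the first value is not the full sum of 3.0
```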

Note that your code might have been a bit clearer without re-using the name w and without introducing another set of indices, for example:

for i, w_thing in zip(idx, w):
    v[i] += w_thing
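
For readers on a NumPy release that provides ufunc.at (1.8 or later, which this answer does not assume), the loop above can probably be replaced by np.add.at, which performs the addition unbuffered so that repeated indices accumulate. A minimal sketch with made-up sample data:

```python
import numpy as np

# Hypothetical sample data: index 0 appears twice.
w = np.array([1.0, 2.0, 3.0])
idx = np.array([0, 0, 1])
v = np.zeros(2)

# Unbuffered in-place add: equivalent to
#     for i, w_thing in zip(idx, w): v[i] += w_thing
np.add.at(v, idx, w)
print(v)
```

Depending on the NumPy version, np.add.at may be slower than bincount on large inputs, so it is worth timing both before choosing one.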

If you need to speed up this loop, you might have to drop down to C. Cython makes this relatively easy.
