My question is about a specific array operation that I want to express using numpy.
I have an array of floats w and an array of indices idx of the same length as w, and I want to sum up all entries of w that share the same idx value and collect the sums in an array v. As a loop, this looks like this:
for i, x in enumerate(w):
    v[idx[i]] += x
Is there a way to do this with array operations? My guess was v[idx] += w, but that does not work, since idx contains the same index multiple times.
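To make the problem concrete, here is a small sketch with made-up sample values (the names v_fancy and v_loop are only for illustration). With a repeated index, the fancy-indexed in-place add writes each target slot once instead of accumulating, so its result differs from the loop:

```python
import numpy as np

# Hypothetical sample data: index 0 appears twice.
w = np.array([1.0, 2.0, 3.0])
idx = np.array([0, 0, 1])

# Attempt with fancy indexing: v[idx] + w is computed first, then assigned,
# so the two contributions to slot 0 are not added together.
v_fancy = np.zeros(2)
v_fancy[idx] += w

# Reference result from the explicit loop.
v_loop = np.zeros(2)
for i, x in zip(idx, w):
    v_loop[i] += x

print("fancy indexing:", v_fancy)
print("explicit loop: ", v_loop)
```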
Thanks!
2 Answers
#1
15
numpy.bincount was introduced for this purpose:
tmp = np.bincount(idx, w)
v[:len(tmp)] += tmp
I think as of NumPy 1.6 you can also pass a minlength to bincount.
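As a rough sketch of the minlength variant, assuming v is a float array, idx holds non-negative integers, and every index is smaller than len(v) (the helper name accumulate_bincount and the sample values are made up for illustration):

```python
import numpy as np

def accumulate_bincount(v, idx, w):
    """Add each w[i] into v[idx[i]], summing over repeated indices.

    Assumes v has a float dtype, idx contains non-negative integers,
    and idx.max() < len(v).
    """
    # minlength pads the result with zeros so it matches len(v)
    # and no slicing of v is needed.
    v += np.bincount(idx, weights=w, minlength=len(v))
    return v

# Hypothetical sample data.
w = np.array([0.5, 1.5, 2.0, 3.0])
idx = np.array([0, 2, 2, 5])
v = accumulate_bincount(np.zeros(8), idx, w)
print(v)
```

If an index is larger than or equal to len(v), bincount returns a longer array and the in-place add fails with a shape error, much like the loop would raise an IndexError.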
#2
4
This is a known behavior and, though somewhat unfortunate, does not have a numpy-level workaround. (bincount can be used for this if you twist its arm.) Doing the loop yourself is really your best bet.
Note that your code might have been a bit clearer without re-using the name w and without introducing another set of indices, like
for i, w_thing in zip(idx, w):
    v[i] += w_thing
If you need to speed up this loop, you might have to drop down to C. Cython makes this relatively easy.
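For readers on newer NumPy releases (1.8 and later), the ufunc method np.add.at performs unbuffered in-place addition, which accumulates repeated indices the same way the loop does. This is not part of the original answer; the sketch below uses a hypothetical helper name and made-up sample values:

```python
import numpy as np

def accumulate_add_at(v, idx, w):
    """Add each w[i] into v[idx[i]] in place, accumulating repeated indices."""
    # Unlike v[idx] += w, np.add.at applies every (index, value) pair.
    np.add.at(v, idx, w)
    return v

# Hypothetical sample data: index 1 appears three times.
w = np.array([1.0, 2.0, 3.0, 4.0])
idx = np.array([1, 1, 0, 1])
v = accumulate_add_at(np.zeros(3), idx, w)
print(v)
```

Whether this is faster than bincount or a compiled loop depends on the NumPy version and the array sizes, so it may be worth timing on your own data.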