How to efficiently concatenate many arange calls in numpy?

Time: 2021-12-12 12:32:22

I'd like to vectorize calls like numpy.arange(0, cnt_i) over a vector of cnt values and concatenate the results like this snippet:

import numpy
cnts = [1,2,3]
numpy.concatenate([numpy.arange(cnt) for cnt in cnts])

array([0, 0, 1, 0, 1, 2])

Unfortunately, the code above is very memory-inefficient because of the temporary arrays and the list-comprehension loop.

Is there a way to do this more efficiently in numpy?

3 Answers

#1


4  

Here's a completely vectorized function:

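import numpy as np

def multirange(counts):
    counts = np.asarray(counts)
    # Remove the following line if counts is always strictly positive.
    counts = counts[counts != 0]
    if counts.size == 0:
        # Nothing to generate; also avoids indexing an empty incr below.
        return np.empty(0, dtype=int)

    # Output positions where a new range starts (all ranges but the first).
    counts1 = counts[:-1]
    reset_index = np.cumsum(counts1)

    # Step of +1 everywhere, except at each reset, where the step
    # 1 - count brings the running sum back to 0.
    incr = np.ones(counts.sum(), dtype=int)
    incr[0] = 0
    incr[reset_index] = 1 - counts1

    # Reuse the incr array for the final result.
    incr.cumsum(out=incr)
    return incr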
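
To see why the cumulative sum produces the ranges, it may help to trace the intermediate arrays for counts = [1, 2, 3]. The sketch below only replays the steps of multirange with the values written out in comments; it assumes strictly positive counts and is not a replacement for the function.

```python
import numpy as np

counts = np.asarray([1, 2, 3])

counts1 = counts[:-1]              # [1, 2]
reset_index = np.cumsum(counts1)   # [1, 3]: output positions where a new range starts

incr = np.ones(counts.sum(), dtype=int)  # [1, 1, 1, 1, 1, 1]
incr[0] = 0                              # [0, 1, 1, 1, 1, 1]
incr[reset_index] = 1 - counts1          # [0, 0, 1, -1, 1, 1]

# Just before a reset the running sum is count - 1, so adding
# 1 - count brings it back to 0 and the next range starts over.
result = np.cumsum(incr)                 # [0, 0, 1, 0, 1, 2]
print(result)
```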

Here's a variation of @Developer's answer that only calls arange once:

def multirange_loop(counts):
    counts = np.asarray(counts)
    ranges = np.empty(counts.sum(), dtype=int)
    seq = np.arange(counts.max())
    starts = np.zeros(len(counts), dtype=int)
    starts[1:] = np.cumsum(counts[:-1])
    for start, count in zip(starts, counts):
        ranges[start:start + count] = seq[:count]
    return ranges

And here's the original version, written as a function:

def multirange_original(counts):
    ranges = np.concatenate([np.arange(count) for count in counts])
    return ranges

Demo:

In [296]: multirange_original([1,2,3])
Out[296]: array([0, 0, 1, 0, 1, 2])

In [297]: multirange_loop([1,2,3])
Out[297]: array([0, 0, 1, 0, 1, 2])

In [298]: multirange([1,2,3])
Out[298]: array([0, 0, 1, 0, 1, 2])

Compare timing using a larger array of counts:

In [299]: counts = np.random.randint(1, 50, size=50)

In [300]: %timeit multirange_original(counts)
10000 loops, best of 3: 114 µs per loop

In [301]: %timeit multirange_loop(counts)
10000 loops, best of 3: 76.2 µs per loop

In [302]: %timeit multirange(counts)
10000 loops, best of 3: 26.4 µs per loop
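
A different fully vectorized formulation, not taken from this answer, subtracts each range's start offset from one global arange using np.repeat. The name multirange_repeat is made up for this sketch, and no timing is claimed for it; measure it on your own data before relying on it.

```python
import numpy as np

def multirange_repeat(counts):
    # Hypothetical helper name; an alternative sketch, not the answer's method.
    counts = np.asarray(counts, dtype=int)
    # Offset in the output at which each range starts: [0, c0, c0 + c1, ...].
    starts = np.cumsum(counts) - counts
    # Global position minus the start of the range that position belongs to.
    return np.arange(counts.sum()) - np.repeat(starts, counts)

print(multirange_repeat([1, 2, 3]))   # [0 0 1 0 1 2]
print(multirange_repeat([2, 0, 3]))   # zero counts need no special handling
```

Because np.repeat simply emits nothing for a zero count, this variant does not need the filtering step that multirange uses.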

#2


3  

Try the following to address the memory problem; the speed is almost the same.

out = np.empty(sum(cnts), dtype=int)  # preallocate; dtype=int so the result matches arange
k = 0
for cnt in cnts:
    out[k:k+cnt] = np.arange(cnt)
    k += cnt

This way, no concatenation is used.

#3


1  

If the counts are simply 0, 1, ..., c-1, np.tril_indices pretty much does this for you:

In [28]: def f(c):
   ....:     return np.tril_indices(c, -1)[1]

In [29]: f(10)
Out[29]:
array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1,
       2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 8])

In [33]: %timeit multirange(range(10))
10000 loops, best of 3: 93.2 us per loop

In [34]: %timeit f(10)
10000 loops, best of 3: 68.5 us per loop

This is somewhat faster than @Warren Weckesser's multirange when the dimension is small.

But it becomes much slower (about 12x in this run) when the dimension is larger (@hpaulj, you have a very good point):

In [36]: %timeit multirange(range(1000))
100 loops, best of 3: 5.62 ms per loop

In [37]: %timeit f(1000)
10 loops, best of 3: 68.6 ms per loop
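
Note that f(c) covers only the special case where the counts are range(c). A quick check, written here as a sketch with an arbitrary small c, confirms that it matches the concatenation from the question for those counts:

```python
import numpy as np

def f(c):
    # Column indices of the strictly lower triangle of a c x c matrix.
    return np.tril_indices(c, -1)[1]

c = 6
# The same counts 0, 1, ..., c-1 fed through the question's original approach.
expected = np.concatenate([np.arange(n) for n in range(c)])
print(np.array_equal(f(c), expected))  # True
```

For arbitrary counts such as [3, 1, 4], one of the general functions above is needed.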
