How do I take a dot product along specific dimensions in numpy?

Time: 2021-09-20 21:25:18

I have two arrays. One is n by p and the other is d by p by r. I would like my output to be d by n by r, which I can achieve easily with the loop that builds the tensor B below. However, I would like to do this without that loop.

import numpy

X = numpy.array([[1,2,3],[3,4,5],[5,6,7],[7,8,9]]) # n x p
betas = numpy.array([[[1,2],[1,2],[1,2]], [[5,6],[5,6],[5,6]]]) # d x p x r

print(X.shape)
print(betas.shape)

B = numpy.zeros((betas.shape[0],X.shape[0],betas.shape[2]))
print(B.shape)

for i in range(B.shape[0]):
    B[i,:,:] = numpy.dot(X, betas[i])

print "B",B

C = numpy.tensordot(X, betas, axes=([1],[0]))
print(C.shape)

I have tried in various ways to get C to match B, but so far I have been unsuccessful. Is there a way that does not involve a call to reshape?

3 Solutions

#1

We can use np.tensordot and then permute the axes:

B = np.tensordot(betas, X, axes=(1,1)).swapaxes(1,2)
# Or np.tensordot(X, betas, axes=(1,1)).swapaxes(0,1)

Related post to understand tensordot.
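As a quick sanity check (a minimal sketch using the question's arrays), the raw tensordot output has its axes in order (d, r, n), which is why the swapaxes call is needed:

import numpy as np

X = np.array([[1,2,3],[3,4,5],[5,6,7],[7,8,9]])     # (n, p) = (4, 3)
betas = np.array([[[1,2],[1,2],[1,2]],
                  [[5,6],[5,6],[5,6]]])              # (d, p, r) = (2, 3, 2)

# contracting betas' axis 1 (p) against X's axis 1 (p) leaves the
# uncontracted axes in order: betas' (d, r) first, then X's (n)
raw = np.tensordot(betas, X, axes=(1, 1))
print(raw.shape)                  # (2, 2, 4), i.e. (d, r, n)
print(raw.swapaxes(1, 2).shape)   # (2, 4, 2), i.e. (d, n, r)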

#2

Since the dot rule is 'last axis of A with second-to-last axis of B', you can do X.dot(betas) and get an (n, d, r) array (this sums over the shared p dimension). Then you just need a transpose to get (d, n, r):

In [200]: X.dot(betas).transpose(1,0,2)
Out[200]: 
array([[[  6,  12],
        [ 12,  24],
        [ 18,  36],
        [ 24,  48]],

       [[ 30,  36],
        [ 60,  72],
        [ 90, 108],
        [120, 144]]])

We can also write the einsum version directly from the dimension specification:

np.einsum('np,dpr->dnr', X, betas)

So does matmul (this does dot on the last two axes, while d comes along for the ride):

X@betas
  • If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
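A minimal sketch (reusing the question's X and betas) to confirm what that broadcasting buys us: X @ betas comes out as (d, n, r) directly, matching the einsum result, so no transpose is needed:

# betas is treated as a stack of d matrices of shape (p, r),
# so X @ betas broadcasts straight to shape (d, n, r)
M = X @ betas
print(M.shape)                                                 # (2, 4, 2)
print(np.array_equal(M, np.einsum('np,dpr->dnr', X, betas)))   # True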

#3

Here is another approach using numpy.dot(). It avoids reshape (np.rollaxis returns a view of the dot result) and, most importantly, is more than 4x faster than the tensordot approach for small arrays. However, np.tensordot is far faster than plain np.dot() for reasonably large arrays. See the timings below.

In [108]: X.shape
Out[108]: (4, 3)

In [109]: betas.shape
Out[109]: (2, 3, 2)

# use `np.dot` and roll the second axis to first position
In [110]: dot_prod = np.rollaxis(np.dot(X, betas), 1)

In [111]: dot_prod.shape
Out[111]: (2, 4, 2)

# @Divakar's approach
In [113]: B = np.tensordot(betas, X, axes=(1,1)).swapaxes(1,2)

# sanity check :)
In [115]: np.all(np.equal(dot_prod, B))
Out[115]: True
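As a small check of the view claim above (a sketch reusing the question's arrays), the rolled result shares memory with the intermediate dot output:

tmp = np.dot(X, betas)                  # shape (n, d, r)
rolled = np.rollaxis(tmp, 1)            # shape (d, n, r)
print(np.shares_memory(rolled, tmp))    # True: rollaxis returned a view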

Now, the performance of these approaches:

  • For small arrays, np.dot() is more than 4x faster than np.tensordot():


# @Divakar's approach
In [117]: %timeit B = np.tensordot(betas, X, axes=(1,1)).swapaxes(1,2)
10.6 µs ± 2.1 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

# @hpaulj's approach
In [151]: %timeit esum_dot = np.einsum('np, dpr -> dnr', X, betas)
4.16 µs ± 235 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# proposed approach: more than 4x faster!!
In [118]: %timeit dot_prod = np.rollaxis(np.dot(X, betas), 1)
2.47 µs ± 11.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

  • For reasonably large arrays, np.tensordot() is much faster than np.dot():


In [129]: X = np.random.randint(1, 10, (600, 500))
In [130]: betas = np.random.randint(1, 7, (300, 500, 300))

In [131]: %timeit B = np.tensordot(betas, X, axes=(1,1)).swapaxes(1,2)
18.2 s ± 2.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [132]: %timeit dot_prod = np.rollaxis(np.dot(X, betas), 1)
52.8 s ± 14.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
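For completeness, a minimal sketch (back on the small arrays from the question) confirming that all four loop-free approaches produce the same (d, n, r) result:

A1 = np.tensordot(betas, X, axes=(1, 1)).swapaxes(1, 2)   # tensordot + swapaxes
A2 = X.dot(betas).transpose(1, 0, 2)                       # dot + transpose
A3 = np.einsum('np,dpr->dnr', X, betas)                    # einsum
A4 = np.rollaxis(np.dot(X, betas), 1)                      # dot + rollaxis
print(all(np.array_equal(A1, a) for a in (A2, A3, A4)))    # True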
