Vectorized way of calculating row-wise dot product of two matrices with Scipy

Time: 2021-08-11 21:22:09

I want to calculate the row-wise dot product of two matrices of the same dimensions as fast as possible. This is how I am doing it:

import numpy as np
a = np.array([[1,2,3], [3,4,5]])
b = np.array([[1,2,3], [1,2,3]])
result = np.array([])
# zip pairs a[i] with b[i]; iterating over `a, b` directly would pair a[0] with a[1]
for row1, row2 in zip(a, b):
    result = np.append(result, np.dot(row1, row2))
print(result)

and of course the output is:

[14. 26.]

4 solutions

#1


20  

Check out numpy.einsum for another method:

In [52]: a
Out[52]: 
array([[1, 2, 3],
       [3, 4, 5]])

In [53]: b
Out[53]: 
array([[1, 2, 3],
       [1, 2, 3]])

In [54]: einsum('ij,ij->i', a, b)
Out[54]: array([14, 26])
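
To make the subscript string concrete: in 'ij,ij->i' both operands are indexed by row i and column j, and because j does not appear in the output it is summed over. Below is a minimal sketch, using the arrays from the question, that checks einsum against a per-row sum written out by hand. The names row_dots and expected are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[1, 2, 3], [1, 2, 3]])

# 'ij,ij->i': multiply a[i, j] * b[i, j] and sum over j,
# the index that is missing from the output
row_dots = np.einsum('ij,ij->i', a, b)

# The same computation spelled out row by row, for comparison
expected = np.array([sum(x * y for x, y in zip(ra, rb)) for ra, rb in zip(a, b)])

print(row_dots)  # [14 26]
```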

Looks like einsum is a bit faster than inner1d:

In [94]: %timeit inner1d(a,b)
1000000 loops, best of 3: 1.8 us per loop

In [95]: %timeit einsum('ij,ij->i', a, b)
1000000 loops, best of 3: 1.6 us per loop

In [96]: a = random.randn(10, 100)

In [97]: b = random.randn(10, 100)

In [98]: %timeit inner1d(a,b)
100000 loops, best of 3: 2.89 us per loop

In [99]: %timeit einsum('ij,ij->i', a, b)
100000 loops, best of 3: 2.03 us per loop

#2


14  

A straightforward way to do that is:

import numpy as np
a=np.array([[1,2,3],[3,4,5]])
b=np.array([[1,2,3],[1,2,3]])
np.sum(a*b, axis=1)

This avoids the Python loop and is faster in cases like:

def npsumdot(x, y):
    return np.sum(x*y, axis=1)

def loopdot(x, y):
    result = np.empty((x.shape[0]))
    for i in range(x.shape[0]):
        result[i] = np.dot(x[i], y[i])
    return result

timeit npsumdot(np.random.rand(500000,50),np.random.rand(500000,50))
# 1 loops, best of 3: 861 ms per loop
timeit loopdot(np.random.rand(500000,50),np.random.rand(500000,50))
# 1 loops, best of 3: 1.58 s per loop
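
The timings above include the cost of generating the random arrays on every call. Here is a rough sketch of a comparison that builds the inputs once and uses the standard-library timeit module instead of the IPython magic. The array size, seed, and repeat counts are arbitrary choices for illustration, and the numbers will vary by machine.

```python
import timeit
import numpy as np

def npsumdot(x, y):
    return np.sum(x * y, axis=1)

def loopdot(x, y):
    result = np.empty(x.shape[0])
    for i in range(x.shape[0]):
        result[i] = np.dot(x[i], y[i])
    return result

rng = np.random.default_rng(0)  # seed picked arbitrarily for repeatability
x = rng.random((5000, 50))      # far fewer rows than the 500000 used above, to keep the run short
y = rng.random((5000, 50))

# Both functions should agree up to floating-point rounding
same = np.allclose(npsumdot(x, y), loopdot(x, y))

for fn in (npsumdot, loopdot):
    # best of 3 repeats, 5 calls each, reported per call
    t = min(timeit.repeat(lambda: fn(x, y), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {t * 1e3:.3f} ms per call")
```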

#3


13  

Played around with this and found inner1d the fastest:

[Plot: runtime of np.sum(a*b, axis=1), einsum, and inner1d against len(a), len(b), on log-log axes]

The plot was created with perfplot (a small project of mine).

import numpy
# inner1d comes from an internal NumPy test module, not the public API
from numpy.core.umath_tests import inner1d
import perfplot

perfplot.show(
    setup=lambda n: (numpy.random.rand(n, 3), numpy.random.rand(n, 3)),
    n_range=[2**k for k in range(1, 18)],
    kernels=[
        lambda data: numpy.sum(data[0] * data[1], axis=1),
        lambda data: numpy.einsum('ij, ij->i', data[0], data[1]),
        lambda data: inner1d(data[0], data[1])
        ],
    labels=['np.sum(a*b, axis=1)', 'einsum', 'inner1d'],
    logx=True,
    logy=True,
    xlabel='len(a), len(b)'
    )

#4


4  

You'll do better avoiding the append, but I can't think of a way to avoid the Python loop. A custom ufunc perhaps? I don't think numpy.vectorize will help you here.

import numpy as np
a=np.array([[1,2,3],[3,4,5]])
b=np.array([[1,2,3],[1,2,3]])
# preallocate the output instead of appending inside the loop
result=np.empty((2,))
for i in range(2):
    result[i] = np.dot(a[i],b[i])
print(result)

EDIT

Based on this answer, it looks like inner1d might work if the vectors in your real-world problem are 1D.

from numpy.core.umath_tests import inner1d
inner1d(a,b)  # array([14, 26])
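
Since inner1d is imported from an internal test module, it may not be importable in every NumPy installation. A possible workaround, sketched here as an assumption rather than a recommendation from the answers above, is to fall back to an einsum expression that sums products along the last axis:

```python
import numpy as np

try:
    # Internal test module; not part of the public NumPy API and may be missing
    from numpy.core.umath_tests import inner1d
except ImportError:
    def inner1d(x, y):
        # Fallback: sum of elementwise products along the last axis
        return np.einsum('...j,...j->...', x, y)

a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[1, 2, 3], [1, 2, 3]])

print(inner1d(a, b))  # [14 26]
```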
