Speeding up Python code that computes matrix cofactors

Time: 2020-12-04 02:44:20

As part of a complex task, I need to compute matrix cofactors. I did this in a straightforward way using this nice code for computing matrix minors. Here is my code:


import numpy as np

def matrix_cofactor(matrix):
    # Cofactor matrix: C[i, j] = (-1)**(i+j) times the determinant of the
    # minor obtained by deleting row i and column j.
    C = np.zeros(matrix.shape)
    nrows, ncols = C.shape
    for row in range(nrows):
        for col in range(ncols):
            # Select every row except `row` and every column except `col`.
            minor = matrix[np.array(list(range(row)) + list(range(row + 1, nrows)))[:, np.newaxis],
                           np.array(list(range(col)) + list(range(col + 1, ncols)))]
            C[row, col] = (-1) ** (row + col) * np.linalg.det(minor)
    return C

It turns out that this matrix cofactor code is the bottleneck, and I would like to optimize the code snippet above. Any ideas as to how to do this?


2 solutions

#1 (13 votes)



If your matrix is invertible, the cofactor is related to the inverse:


def matrix_cofactor(matrix):
    # C = det(A) * inv(A).T, since A @ C.T = det(A) * I (the adjugate identity)
    return np.linalg.inv(matrix).T * np.linalg.det(matrix)

This gives large speedups (~ 1000x for 50x50 matrices). The main reason is fundamental: this is an O(n^3) algorithm, whereas the minor-det-based one is O(n^5) (it computes n^2 minors, each requiring an O(n^3) determinant).

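The formula is easy to sanity-check against the defining property of the cofactor matrix (a minimal check, assuming the matrix_cofactor above is defined; the 5x5 random matrix is just an illustration):

import numpy as np

A = np.random.rand(5, 5)        # a random matrix is almost surely invertible
C = matrix_cofactor(A)
# The cofactor matrix satisfies A @ C.T == det(A) * I.
assert np.allclose(A @ C.T, np.linalg.det(A) * np.eye(5))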

This probably means that there is also some clever way to calculate the cofactor for non-invertible matrices (i.e., not using the mathematical formula you use above, but some other equivalent definition).



If you stick with the det-based approach, what you can do is the following:


The majority of the time seems to be spent inside det. (Check out line_profiler to find this out yourself.) You can try to speed that part up by linking Numpy with the Intel MKL, but other than that, there is not much that can be done.

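For reference, here is a minimal sketch of how line_profiler is typically run (assuming it is installed, e.g. via pip install line_profiler; the file name is hypothetical):

# profile_cofactor.py  (hypothetical file name)
import numpy as np

@profile  # kernprof injects this decorator at run time; no import needed
def matrix_cofactor(matrix):
    C = np.zeros(matrix.shape)
    nrows, ncols = C.shape
    for row in range(nrows):
        for col in range(ncols):
            minor = matrix[np.array(list(range(row)) + list(range(row + 1, nrows)))[:, np.newaxis],
                           np.array(list(range(col)) + list(range(col + 1, ncols)))]
            C[row, col] = (-1) ** (row + col) * np.linalg.det(minor)
    return C

if __name__ == "__main__":
    matrix_cofactor(np.random.rand(50, 50))

# Run:  kernprof -l -v profile_cofactor.py
# The per-line report shows where the time goes; the np.linalg.det line should dominate.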

You can speed up the other part of the code like this:


# Preallocate one (n-1) x (n-1) buffer and fill it with four slice
# assignments instead of rebuilding index arrays on every iteration.
minor = np.zeros([nrows - 1, ncols - 1])
for row in range(nrows):
    for col in range(ncols):
        minor[:row, :col] = matrix[:row, :col]
        minor[row:, :col] = matrix[row + 1:, :col]
        minor[:row, col:] = matrix[:row, col + 1:]
        minor[row:, col:] = matrix[row + 1:, col + 1:]
        ...

This saves some 10-50% of total runtime, depending on the size of your matrices. The original code has Python range and list manipulations, which are slower than direct slice indexing. You could also try to be more clever and copy only the parts of the minor that actually change --- however, already after the above change, close to 100% of the time is spent inside numpy.linalg.det, so further optimization of the other parts does not make much sense.

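Putting the two pieces together, here is a sketch of the full det-based cofactor function with the preallocated minor buffer (the determinant line is unchanged from the original code):

import numpy as np

def matrix_cofactor(matrix):
    nrows, ncols = matrix.shape
    C = np.zeros((nrows, ncols))
    minor = np.zeros((nrows - 1, ncols - 1))  # reused buffer
    for row in range(nrows):
        for col in range(ncols):
            # Copy the four blocks around (row, col) into the buffer.
            minor[:row, :col] = matrix[:row, :col]
            minor[row:, :col] = matrix[row + 1:, :col]
            minor[:row, col:] = matrix[:row, col + 1:]
            minor[row:, col:] = matrix[row + 1:, col + 1:]
            C[row, col] = (-1) ** (row + col) * np.linalg.det(minor)
    return C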

#2 (2 votes)



The calculation of the row index array np.array(list(range(row)) + list(range(row + 1, nrows)))[:, np.newaxis] does not depend on col, so you could move it outside the inner loop and cache the value. Depending on the number of columns you have, this might give a small optimization.

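A sketch of that change applied to the original loop (only the row index array moves out of the inner loop; everything else is as in the question):

import numpy as np

def matrix_cofactor(matrix):
    C = np.zeros(matrix.shape)
    nrows, ncols = C.shape
    for row in range(nrows):
        # This selector depends only on `row`, so build it once per outer iteration.
        row_idx = np.array(list(range(row)) + list(range(row + 1, nrows)))[:, np.newaxis]
        for col in range(ncols):
            col_idx = np.array(list(range(col)) + list(range(col + 1, ncols)))
            minor = matrix[row_idx, col_idx]
            C[row, col] = (-1) ** (row + col) * np.linalg.det(minor)
    return C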
