Can the covariance of A be used to compute A'*A?

Date: 2020-12-07 02:54:49

I am doing a benchmarking test in Python on different ways to calculate A'*A, with A being an N x M matrix. One of the fastest ways was to use numpy.dot().

I was curious whether I can obtain the same result using numpy.cov() (which gives the covariance matrix) by somehow varying the weights or pre-processing the A matrix, but I had no success. Does anyone know if there is any relation between the product A'*A and the covariance of A, where A is a matrix with N rows/observations and M columns/variables?

1 solution

#1

Have a look at the cov source. Near the end of the function it does this:

c = dot(X, X_T.conj())

This is basically the dot product you want to perform. However, there are all kinds of other operations as well: checking inputs, subtracting the mean, normalization, ...

In short, np.cov will never ever be faster than np.dot(A.T, A) because internally it contains exactly that operation.

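A quick way to confirm this is to time both calls directly, for example with timeit. This is a minimal sketch; the matrix size is arbitrary and the exact numbers will depend on your machine and BLAS build:

import timeit
import numpy as np

A = np.random.rand(2000, 50)   # arbitrary N x M test matrix

t_dot = timeit.timeit(lambda: np.dot(A.T, A), number=100)
t_cov = timeit.timeit(lambda: np.cov(A.T), number=100)

print(f"np.dot(A.T, A): {t_dot:.4f} s (100 runs)")
print(f"np.cov(A.T):    {t_cov:.4f} s (100 runs)")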

That said, the covariance matrix is computed as

cov(A) = (A - mean(A))' * (A - mean(A)) / (N - 1)

Or in Python:

import numpy as np

a = np.random.rand(10, 3)                        # N = 10 observations, M = 3 variables

m = np.mean(a, axis=0, keepdims=True)            # column means, shape (1, 3)
x = np.dot((a - m).T, a - m) / (a.shape[0] - 1)  # centred dot product, normalized by N - 1

y = np.cov(a.T)                                  # np.cov expects variables as rows

assert np.allclose(x, y)  # check they are equivalent

As you can see, the covariance matrix is equivalent to the raw dot product if you subtract the mean of each variable and divide the result by the number of samples (minus one).

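Going the other way: if you really do want A'*A from np.cov, you can add the mean term back in, since (A - mean(A))' * (A - mean(A)) = A'*A - N * mean(A)' * mean(A). A minimal sketch of that idea (it will not be faster than np.dot, since np.cov already performs the dot product internally):

import numpy as np

A = np.random.rand(10, 3)
N = A.shape[0]

m = np.mean(A, axis=0, keepdims=True)             # 1 x M row of column means
ata_from_cov = (N - 1) * np.cov(A.T) + N * np.dot(m.T, m)

assert np.allclose(ata_from_cov, np.dot(A.T, A))  # same as the direct product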
