Converting Matlab sparse to Python scipy csr_matrix

Time: 2021-04-20 21:20:13

I am new to both Matlab and Python and am converting some Matlab code to its Python equivalent. The issue I am facing is with converting from sparse(i,j,v,m,n) to csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)]).

In this code, i, j and row_ind, col_ind will be passed an index array, idx, of size (124416, 1), while v and data will be passed a 2D array, D22, of size (290, 434).
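
As a small illustration of the mapping (a sketch with made-up values, not the data from this question): Matlab's sparse(i, j, v, m, n) places v(k) at row i(k), column j(k) using 1-based indices, while csr_matrix((data, (row_ind, col_ind)), shape=(M, N)) places data[k] at row row_ind[k], column col_ind[k] using 0-based indices. The arrays i, j and v below are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 1-based Matlab-style triplets: v[k] goes to row i[k], column j[k].
i = np.array([1, 2, 4])
j = np.array([2, 3, 1])
v = np.array([10.0, 20.0, 30.0])
m, n = 4, 3

# Same matrix in SciPy: keep the (row, column) order, shift indices to 0-based.
S = csr_matrix((v, (i - 1, j - 1)), shape=(m, n))
dense = S.toarray()
```

Both constructors take the row indices first and the column indices second, so only the index base changes, not the argument order.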

Matlab:

...
H = 288;
W = 432;
N = (H+2)*(W+2);
mask = zeros(H+2, W+2);
mask(2:end-1, 2:end-1) = 1;

idx = find(mask==1);
% idx = [292, ..., 579, 582, ..., 869, ..., 125282, ..., 125569]

A = sparse(idx, idx+1, -D22(idx), N, N);
B = sparse(idx, idx-1, -D22(idx), N, N);
C = sparse(idx, idx+H+2, -D22(idx-1), N, N);
D = sparse(idx, idx-H-2, -D22(idx-1), N, N);
...

spy(A) first entry is m(293, 292) - (idx,idx+1), which was what I expected.

spy(B) m(292, 293) - (idx,idx-1). I was expecting it to be m(291, 292), believing that idx-1 would return an array [291, ..., 578, 581 ..., 868, ... , 125281, ..., 125568]

spy(C) - m(582, 292) - (idx,idx+H+2)

spy(D) - m(292, 582) - (idx,idx-H-2)

Hence, given that understanding of the indexing order, I translated the code into Python in this form:

Python:

...
H = 288
W = 432
N = (H+2) * (W+2)
mask = np.zeros([H+2, W+2])
mask[1:-1,1:-1] = 1

idx = np.nonzero(mask.transpose() == 1)                                 
idx = np.vstack((idx[1], idx[0]))                                        
idx = np.ravel_multi_index(idx, ((H+2),(W+2)), order='F').copy()     # Linear Indexing as per Matlab
>>> idx
array([291, ..., 578, 581 ..., 868, ... , 125281, ..., 125568])

idx_ = np.unravel_index(idx, ((H+2),(W+2)), order='F')               # Back from linear indices to (row, col) subscripts
idx_ = np.column_stack((idx_[0], idx_[1]))                           # Combine the tuple of 2 arrays into an (n, 2) array
idx_H_2 = np.unravel_index(idx-H-2, ((H+2),(W+2)), order='F')
idx_H_2 = np.column_stack((idx_H_2[0], idx_H_2[1]))

A = sp.csr_matrix((-D22[idx_[:,0], idx_[:,1]], (idx+1,idx)), shape = (N,N))
B = sp.csr_matrix((-D22[idx_[:,0], idx_[:,1]], (idx-1,idx)), shape = (N,N))
C = sp.csr_matrix((-D11[idx_[:,0], idx_[:,1]], (idx+H+2,idx)), shape = (N,N)) 
D = sp.csr_matrix((-D11[idx_H_2[:,0], idx_H_2[:,1]], (idx-H-2,idx)), shape = (N,N)) 
...
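
A shorter way to obtain the same column-major linear indices, together with the matching values, is sketched below. It assumes D22 has shape (H+2, W+2) as stated above. The random D22 is only a stand-in for the real data, and small H and W are used so the result is easy to inspect.

```python
import numpy as np

H, W = 3, 4                       # small stand-ins for 288 and 432
mask = np.zeros((H + 2, W + 2))
mask[1:-1, 1:-1] = 1

# Column-major (Matlab-style) linear indices of the interior, 0-based.
idx = np.flatnonzero(mask.ravel(order='F') == 1)

# Placeholder data; the real D22 comes from the surrounding code.
rng = np.random.default_rng(0)
D22 = rng.random((H + 2, W + 2))

# Counterpart of Matlab's D22(idx): index the column-major flattening.
vals = D22.ravel(order='F')[idx]

# Cross-check against explicit (row, col) subscripts.
rows, cols = np.unravel_index(idx, mask.shape, order='F')
same = np.array_equal(vals, D22[rows, cols])
```

With H = 288 the first index is 291, which matches the idx array printed above.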

For A, the first entry is p(292, 291), from (idx+1, idx). Since Python indexing starts at zero, this corresponds to Matlab m(293, 292).

However, for B the first entry is p(290, 291), from (idx-1, idx), which is what I had expected (the Matlab equivalent should be m(291, 292)). As mentioned earlier, though, the Matlab code returns (292, 293) instead.

C: p(581, 291), from (idx+H+2, idx)

D: p(1, 291), from (idx-H-2, idx)

Could anyone kindly explain what I might have misunderstood, and how I should revise my Python code to reflect the Matlab code more accurately?


Oh, and just one more question :)

Matlab:

A = A(idx,idx);

Python:

A = A[idx,:][:,idx]

Are they equivalent?
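
A quick way to check this on a small case is sketched below. The 6-by-6 matrix and the idx values are hypothetical, and the dense np.ix_ block is used as the reference for what Matlab's A(idx,idx) selects (all combinations of the given rows and columns).

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small stand-in matrix; the real A is N-by-N.
dense = np.arange(36.0).reshape(6, 6)
A = csr_matrix(dense)
idx = np.array([1, 2, 4])          # hypothetical 0-based indices

# Chained indexing: pick rows first, then columns of the result.
sub_chained = A[idx, :][:, idx]

# Reference block taken from the dense array: rows idx crossed with columns idx.
sub_block = dense[np.ix_(idx, idx)]

equivalent = np.array_equal(sub_chained.toarray(), sub_block)
```

In this small case the two selections agree, provided idx holds 0-based indices on the Python side.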

Thank you so much for all your kind help and time.

2 Answers

#1


These lines are confusing:

spy(A) first entry is m(293, 292) - (idx,idx+1), which was what I expected.

spy(B) m(292, 293) - (idx,idx-1). I was expecting it to be m(291, 292), believing that idx-1 would return an array [291, ..., 578, 581 ..., 868, ... , 125281, ..., 125568]

spy(C) - m(582, 292) - (idx,idx+H+2)

spy(D) - m(292, 582) - (idx,idx-H-2)

What is m(293,292)? Why are the coordinates reversed? Is that because of how spy plots the axes? p(...) for the numpy code is equally confusing. In my (smaller) samples, A, B etc. all have nonzeros where I expect.

By the way, are there any zeros in D22(idx)?

In any case, you've created 4 sparse matrices, each with values along one diagonal or another, with periodic zero gaps.

A(idx, idx+1) has the same nonzero values as A, but contiguously on the main diagonal.

A condensed version of the numpy code is:

In [159]: idx=np.where(mask.ravel()==1)[0]
In [160]: A=sparse.csr_matrix((np.ones_like(idx),(idx,idx+1)),shape=(N,N))

I'm ignoring the F vs. C order and the D22 array. If I had the D22 matrix, I'd try to use D22.ravel()[idx] (to match how I create and index mask). I don't think those details matter when comparing the overall generation of the matrices and their indexing.

A.tocoo().row and A.tocoo().col are a handy way of seeing the row and column indexes of the nonzero elements. A.nonzero() does this as well (with virtually the same code).
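
For instance, on a tiny matrix with hypothetical entries, both approaches report the same (row, column) pairs:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Three made-up nonzeros in a 4-by-4 matrix.
rows = np.array([0, 1, 3])
cols = np.array([1, 2, 0])
A = csr_matrix((np.array([5.0, 6.0, 7.0]), (rows, cols)), shape=(4, 4))

# Row/column indices via the COO view of the matrix.
coo = A.tocoo()
pairs_coo = sorted(zip(coo.row.tolist(), coo.col.tolist()))

# The same information from nonzero().
nz_rows, nz_cols = A.nonzero()
pairs_nz = sorted(zip(nz_rows.tolist(), nz_cols.tolist()))
```

Reading the indices this way avoids having to interpret plot coordinates, and the pairs are always in (row, column) order.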

Yes, A[idx,:][:,idx+1] produces the same submatrix.

A[idx, idx+1] gives a 1d vector of those diagonal values.

You need to transform the first index array into a 'column' vector to select a block (as the MATLAB version does):

A[np.ix_(idx,idx+1)]  # or with
A[idx[:,None],idx+1]
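
The difference between the two forms can be seen in a small sketch. The idx and vals arrays here are hypothetical stand-ins, and np.asarray(...).ravel() is used only to flatten whatever container the pointwise lookup returns.

```python
import numpy as np
from scipy.sparse import csr_matrix

N = 8
idx = np.array([1, 2, 4, 5])                 # hypothetical 0-based indices
vals = np.array([1.0, 2.0, 3.0, 4.0])
A = csr_matrix((vals, (idx, idx + 1)), shape=(N, N))

# Pointwise indexing: one value per (idx[k], idx[k]+1) pair.
pointwise = np.asarray(A[idx, idx + 1]).ravel()

# Block indexing: all combinations of rows idx and columns idx+1.
block = A[idx[:, None], idx + 1].toarray()
```

Here the block is 4-by-4 with the values of A sitting contiguously on its main diagonal, while the pointwise form returns just those four values.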

#2


Seems fine to me; the only difference I can spot is:

MATLAB:

A = sparse(idx, idx+1, -D22(idx), N, N);
B = sparse(idx, idx-1, -D22(idx), N, N);

Python:

A = sp.csr_matrix((-D22[idx_[:,0], idx_[:,1]], (idx+1,idx)), shape = (N,N))
B = sp.csr_matrix((-D22[idx_[:,0], idx_[:,1]], (idx,idx-1)), shape = (N,N))

Note that in Python, for matrix B you change the index along the second dimension, whereas for matrix A you change it along the first dimension.

This difference is not present in your Matlab code, whereas all other lines are "symmetric".
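
To illustrate why the order of the two index arrays matters (a sketch with placeholder values instead of D22, on a small grid): both Matlab's sparse and SciPy's csr_matrix take rows first and columns second, so passing (idx+1, idx) as (row_ind, col_ind) builds the transpose of what (idx, idx+1) builds.

```python
import numpy as np
from scipy.sparse import csr_matrix

H, W = 3, 4                                        # small stand-ins for 288 and 432
N = (H + 2) * (W + 2)
mask = np.zeros((H + 2, W + 2))
mask[1:-1, 1:-1] = 1
idx = np.flatnonzero(mask.ravel(order='F') == 1)   # 0-based, column-major
vals = np.arange(1.0, idx.size + 1)                # placeholder for -D22(idx)

# Same argument order as Matlab's sparse(idx, idx+1, v, N, N): rows, then columns.
A_rows_first = csr_matrix((vals, (idx, idx + 1)), shape=(N, N))

# The two index arrays swapped, as in (idx+1, idx).
A_swapped = csr_matrix((vals, (idx + 1, idx)), shape=(N, N))

is_transpose = np.array_equal(A_swapped.toarray(), A_rows_first.T.toarray())
```

If the goal is to reproduce the Matlab matrices entry for entry, keeping the same (row, column) order in every csr_matrix call and only shifting to 0-based indices is the simplest thing to try.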
