I am trying to align the RGB and depth images from a Kinect using MATLAB, following the algorithm from this page.
Here is the code I have written so far:
depth = imread('depth_00500.png');
color = imread('rgb_00500.png');
rotationMat=[9.9984628826577793e-01 1.2635359098409581e-03 -1.7487233004436643e-02;
-1.4779096108364480e-03 9.9992385683542895e-01 -1.2251380107679535e-02;
1.7470421412464927e-02 1.2275341476520762e-02 9.9977202419716948e-01 ];
translationMat=[1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ];
%parameters for color matrix
fx_rgb= 5.2921508098293293e+02;
fy_rgb= 5.2556393630057437e+02;
cx_rgb= 3.2894272028759258e+02;
cy_rgb= 2.6748068171871557e+02;
k1_rgb= 2.6451622333009589e-01;
k2_rgb= -8.3990749424620825e-01;
p1_rgb= -1.9922302173693159e-03;
p2_rgb= 1.4371995932897616e-03;
k3_rgb= 9.1192465078713847e-01;
%parameters for depth matrix
fx_d= 5.9421434211923247e+02;
fy_d= 5.9104053696870778e+02;
cx_d= 3.3930780975300314e+02;
cy_d= 2.4273913761751615e+02;
k1_d= -2.6386489753128833e-01;
k2_d =9.9966832163729757e-01;
p1_d =-7.6275862143610667e-04;
p2_d =5.0350940090814270e-03;
k3_d =-1.3053628089976321e+00;
row_num=480;
col_num=640;
for row=1:row_num
    for col=1:col_num
        pixel3D(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
        pixel3D(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
        pixel3D(row,col,3) = depth(row,col);
    end
end
pixel3D(:,:,1)=rotationMat*pixel3D(:,:,1)+translationMat;
pixel3D(:,:,2)=rotationMat*pixel3D(:,:,2)+translationMat;
pixel3D(:,:,3)=rotationMat*pixel3D(:,:,3)+translationMat;
P2Drgb_x = fx_rgb*pixel3D(:,:,1)/pixel3D(:,:,3)+cx_rgb;
P2Drgb_y = fy_rgb*pixel3D(:,:,2)/pixel3D(:,:,3)+cy_rgb;
In particular, I don't understand why we assign the value of the depth pixel to the x, y and z dimensions of 3D space. Shouldn't we assign the (x, y, z) dimensions to the depth pixel value instead?
I mean this part:
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
Also, I'm not sure whether I can represent 3D space using a matrix. I am trying to do so in my code, but it certainly has the wrong size, because multiplying it by the 3x3 rotation matrix is impossible.
Thank you very much for every suggestion and help!
1 Solution
#1
This is quite a complex topic to explain in a short answer. In my opinion, the code is correct. Please read about intrinsic and extrinsic camera matrices. Reading about perspective projection will also help you understand 2D-to-3D projection.
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
In the line above, depth(x_d, y_d) gives you the depth value at a pixel of the depth image. It is multiplied by (x_d - cx_d), which is the offset along the x-axis of the current pixel from the centre point of the depth map. Finally, the result is divided by fx_d, which is the focal length of the depth camera.
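To make the formula concrete, here is a minimal single-pixel sketch. The intrinsics are the depth-camera values from the question. The pixel position and the depth value are made up for illustration, and the depth is assumed to be in metres. It back-projects one pixel to a 3D point and then applies the pinhole projection again, which should return the pixel it started from.

```matlab
% Depth-camera intrinsics taken from the question.
fx_d = 5.9421434211923247e+02;
fy_d = 5.9104053696870778e+02;
cx_d = 3.3930780975300314e+02;
cy_d = 2.4273913761751615e+02;

% Hypothetical pixel and depth value, chosen only for illustration.
x_d = 400;   % horizontal pixel coordinate (column direction)
y_d = 300;   % vertical pixel coordinate (row direction)
z   = 1.5;   % depth at that pixel, assumed to be in metres

% Back-projection: pixel + depth -> 3D point in the depth camera frame.
P3D = [(x_d - cx_d) * z / fx_d;
       (y_d - cy_d) * z / fy_d;
       z];

% Forward pinhole projection of that 3D point recovers the pixel.
x_back = fx_d * P3D(1) / P3D(3) + cx_d;
y_back = fy_d * P3D(2) / P3D(3) + cy_d;

fprintf('P3D = (%.4f, %.4f, %.4f)\n', P3D);
fprintf('reprojected pixel = (%.2f, %.2f)\n', x_back, y_back);
```

The reprojected pixel matches (400, 300) up to floating-point rounding. So the depth value is not assigned to x, y and z. It only scales the normalised pixel offsets, so that the point lies on the viewing ray at the measured distance.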
The following two references will help you understand the mathematics in more detail, if you are interested:
- Mueller, K., Smolic, A., Dix, K., Merkle, P., Kauff, P., & Wiegand, T. (2008). View synthesis for advanced 3D video systems. EURASIP Journal on Image and Video Processing, 2008(1), 1-11.
- Daribo, I., & Saito, H. (2011). A novel inpainting-based layered depth video for 3DTV. IEEE Transactions on Broadcasting, 57(2), 533-541.
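On the question of representing 3D space with a matrix: one possible arrangement is to store the points as a 3-by-N matrix with one column per pixel, so that the 3x3 rotation matrix can multiply it directly. The sketch below makes several assumptions that are not from the original post: a synthetic flat depth map in metres stands in for the real image, pixel coordinates are taken as 0-based, and lens distortion is ignored. The variable names `R`, `t`, `P` and `valid` are chosen for illustration.

```matlab
% Extrinsics and intrinsics copied from the question (distortion ignored).
R = [ 9.9984628826577793e-01  1.2635359098409581e-03 -1.7487233004436643e-02;
     -1.4779096108364480e-03  9.9992385683542895e-01 -1.2251380107679535e-02;
      1.7470421412464927e-02  1.2275341476520762e-02  9.9977202419716948e-01];
t = [1.9985242312092553e-02; -7.4423738761617583e-04; -1.0916736334336222e-02];
fx_rgb = 5.2921508098293293e+02;  fy_rgb = 5.2556393630057437e+02;
cx_rgb = 3.2894272028759258e+02;  cy_rgb = 2.6748068171871557e+02;
fx_d   = 5.9421434211923247e+02;  fy_d   = 5.9104053696870778e+02;
cx_d   = 3.3930780975300314e+02;  cy_d   = 2.4273913761751615e+02;

% Assumption: synthetic flat depth map in metres. With real data this would
% be something like double(imread('depth_00500.png')), converted to the
% same unit as the translation vector t.
rows = 480;
cols = 640;
depth = 2.0 * ones(rows, cols);

% Pixel grid: x runs along columns, y along rows (0-based, an assumption).
[x_d, y_d] = meshgrid(0:cols-1, 0:rows-1);

% Back-project every pixel; P is 3-by-N, one column per pixel.
Z = depth(:)';
P = [(x_d(:)' - cx_d) .* Z / fx_d;
     (y_d(:)' - cy_d) .* Z / fy_d;
     Z];

% Rigid transform into the color camera frame.
P_rgb = R * P + repmat(t, 1, size(P, 2));

% Project into the color image; note the element-wise division.
x_rgb = reshape(fx_rgb * P_rgb(1, :) ./ P_rgb(3, :) + cx_rgb, rows, cols);
y_rgb = reshape(fy_rgb * P_rgb(2, :) ./ P_rgb(3, :) + cy_rgb, rows, cols);

% Depth pixels whose nearest color pixel lies inside the color image.
u = round(x_rgb);
v = round(y_rgb);
valid = depth > 0 & u >= 0 & u <= cols - 1 & v >= 0 & v <= rows - 1;
fprintf('%d of %d depth pixels map inside the color image\n', ...
        nnz(valid), rows * cols);
```

Here `x_rgb(r, c)` and `y_rgb(r, c)` give the color-image position for the depth pixel in row `r` and column `c`. Pixels where `valid` is false have no counterpart in the color image.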