第六篇 视觉slam中的优化问题梳理及雅克比推导

时间:2021-10-16 15:50:49

优化问题定义以及求解

通用定义

解决问题的开始一定是定义清楚问题。这里引用g2o的定义。

\[
\begin{aligned}
\mathbf{F}(\mathbf{x})&=\sum_{k\in \mathcal{C}} \underbrace{\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k)^\top \Omega_k\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k)}_{\mathbf{F}_k} \\
\mathbf{x}^* &= \underset{\mathbf{x}}{\operatorname{argmin}}\mathbf{F}(\mathbf{x})
\end{aligned} \tag{1}
\]

  • \(\mathbf{x}=(\mathbf{x}_1^\top,\dots,\mathbf{x}_n^\top)^\top\),\(\mathbf{x}_i\in \mathbf{x}\)为向量,表示一组参数;
  • \(\mathbf{x}_k=(\mathbf{x}_{k_1}^\top,\dots,\mathbf{x}_{k_q}^\top)^\top\subset \mathbf{x}\),第k次约束参数子集;
  • \(\mathbf{z}_k\)可以当做观测向量,\(\Omega_k\)可以认为是观测协方差矩阵,是个对称矩阵;
  • \(\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k)\)是误差函数;

\(\mathbf{F}(\mathbf{x})\)其实就是总测量误差的平方和,这里简单起见假设\(\Omega_k=\begin{bmatrix}\sigma_1^2&0 \\ 0 & \sigma_2^2\end{bmatrix}\),
可以把\(\mathbf{F}_k(\mathbf{x})\)当做单次测量误差平方和,假设\(\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k)=(e_1,e_2)^\top\),展开看

\[
\begin{aligned}
\mathbf{F}_k(\mathbf{x})&=\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k)^\top \Omega_k\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k) \\
&=\sigma_1^2e_1^2+\sigma_2^2e_2^2
\end{aligned}
\]
问题就是求使得测量误差平方和最小的参数的值。

求解最优问题

简化误差方程定义:\(\mathbf{e}_k(\mathbf{x}_k,\mathbf{z}_k) \overset{def.}{=} \mathbf{e}_k(\mathbf{x}_k) \overset{def.}{=} \mathbf{e}_k(\mathbf{x})\)。误差方程在值\(\breve{\mathbf{x}}\)处进行一阶泰勒级数近似展开:

\[\begin{aligned}
\mathbf{e}_k(\breve{\mathbf{x}}_k+\Delta\mathbf{x}_k) &=\mathbf{e}_k(\breve{\mathbf{x}}+\Delta\mathbf{x}) \\
&\simeq \mathbf{e}_k(\breve{\mathbf{x}})+\mathbf{J}_k\Delta\mathbf{x}
\end{aligned} \tag{2}
\]
其中\(\mathbf{J}_k\)是\(\mathbf{e}_k(\mathbf{x})\)在\(\breve{\mathbf{x}}\)处的雅克比矩阵,代入(1)中得:

\[
\begin{aligned}
\mathbf{F}_k(\breve{\mathbf{x}}+\Delta\mathbf{x}) &= \mathbf{e}_k(\breve{\mathbf{x}}+\Delta\mathbf{x})^\top\Omega_k\mathbf{e}_k(\breve{\mathbf{x}}+\Delta\mathbf{x}) \\
&\simeq (\mathbf{e}_k(\breve{\mathbf{x}})+\mathbf{J}_k\Delta\mathbf{x})^\top\Omega_k(\mathbf{e}_k(\breve{\mathbf{x}})+\mathbf{J}_k\Delta\mathbf{x}) \\
&=\underbrace{(\mathbf{e}_k(\breve{\mathbf{x}})^\top+(\mathbf{J}_k\Delta\mathbf{x})^\top)}_{A^\top+B^\top = (A+B)^\top}\Omega_k(\mathbf{e}_k(\breve{\mathbf{x}})+\mathbf{J}_k\Delta\mathbf{x}) \\
&= \mathbf{e}_k(\breve{\mathbf{x}})^\top\Omega_k\mathbf{e}_k(\breve{\mathbf{x}})+\underbrace{\mathbf{e}_k(\breve{\mathbf{x}})^\top\Omega_k\mathbf{J}_k\Delta\mathbf{x}+(\mathbf{J}_k\Delta\mathbf{x})^\top\Omega_k\mathbf{e}_k(\breve{\mathbf{x}})}_{当A^TB为标量时,A^TB=B^TA}+\Delta\mathbf{x}^\top\mathbf{J}_k^\top\Omega_k\mathbf{J}_k\Delta\mathbf{x} \\
&=\underbrace{\mathbf{e}_k(\breve{\mathbf{x}})^\top\Omega_k\mathbf{e}_k(\breve{\mathbf{x}})}_{标量c_k}+2\underbrace{\mathbf{e}_k(\breve{\mathbf{x}})^\top\Omega_k\mathbf{J}_k}_{向量\mathbf{b}_k^\top}\Delta\mathbf{x}+\Delta\mathbf{x}^\top\underbrace{\mathbf{J}_k^\top\Omega_k\mathbf{J}_k}_{矩阵\mathbf{H}_k}\Delta\mathbf{x} \\
&=c_k+2\mathbf{b}_k^\top\Delta\mathbf{x}+\Delta\mathbf{x}^\top\mathbf{H}_k\Delta\mathbf{x}
\end{aligned} \tag{3}
\]
因此

\[
\begin{aligned}
\mathbf{F}(\breve{\mathbf{x}}+\Delta\mathbf{x}) &=\sum_{k\in \mathcal{C}} \mathbf{F}_k(\breve{\mathbf{x}}+\Delta\mathbf{x}) \\
&\simeq \sum_{k\in \mathit{C}} c_k+2\mathbf{b}_k\Delta\mathbf{x}+\Delta\mathbf{x}^\top\mathbf{H}_k\Delta\mathbf{x} \\
&= c+2\mathbf{b}^\top\Delta\mathbf{x}+\Delta\mathbf{x}^\top\mathbf{H}\Delta\mathbf{x}
\end{aligned} \tag{4}
\]
问题转化为求(4)的最小值,求标量\(\mathbf{F}(\breve{\mathbf{x}}+\Delta\mathbf{x})\)的微分

\[
\begin{aligned}
d\mathbf{F}(\breve{\mathbf{x}}+\Delta\mathbf{x}) &= 2\mathbf{b}^\top d(\Delta\mathbf{x}) + \underbrace{d(\Delta\mathbf{x}^\top)\mathbf{H}\Delta\mathbf{x}}_{d(X^T) = (dX)^T}+\Delta\mathbf{x}^\top\mathbf{H}d(\Delta\mathbf{x}) \\
&= 2\mathbf{b}^\top d(\Delta\mathbf{x}) + \underbrace{(d(\Delta\mathbf{x}))^\top\mathbf{H}\Delta\mathbf{x}}_{当A^TB为标量时,A^TB=B^TA} + \Delta\mathbf{x}^\top\mathbf{H}d(\Delta\mathbf{x}) \\
&= 2\mathbf{b}^\top d(\Delta\mathbf{x}) + \underbrace{\Delta\mathbf{x}^\top\mathbf{H}^\top d(\Delta\mathbf{x}) + \Delta\mathbf{x}^\top\mathbf{H}d(\Delta\mathbf{x})}_{\Omega_k为对称阵,因此H为对称阵} \\
&= 2(\mathbf{b}^\top + \Delta\mathbf{x}^\top\mathbf{H}^\top)d(\Delta\mathbf{x}) \\
&= 2(\mathbf{b} + \mathbf{H}\Delta\mathbf{x})^\top d(\Delta\mathbf{x})
\end{aligned}
\]
对照\(d\mathbf{F}=\frac{\partial \mathbf{F}}{\partial \Delta\mathbf{x}}^Td(\Delta\mathbf{x})\),得\(\frac{\partial \mathbf{F}}{\partial \Delta\mathbf{x}}=\mathbf{b} + \mathbf{H}\Delta\mathbf{x}\)

求\(\frac{\partial \mathbf{F}}{\partial \Delta\mathbf{x}}=0\),注意因为\(\mathbf{F}\)非负,所以极值处为极小值。

问题又转为求解线性方程 \(\mathbf{H}\Delta\mathbf{x} = -\mathbf{b}\),所得到的解为\(\Delta\mathbf{x}^*\),增量更新\(\mathbf{x}^*=\breve{\mathbf{x}}+\Delta\mathbf{x}^*\)。以次方式不断迭代求最优问题。

优化库

在实际的工程中,我们会使用优化库求解这些优化问题。在使用这些优化库的时候,我们只需要定义好误差函数\(\mathbf{e}_k\)计算误差,误差函数在某值处的雅克比矩阵\(\mathbf{J}_k\),定义好观测的协方差矩阵\(\Omega_k\),优化库便可以帮我们求解最优问题。优化库有很多种,Ceres,g2o,gtsam等,Ceres自身有自动求导甚至不需要我们计算雅克比矩阵,但是搞清楚他们的优化原理还是很有必要的。

视觉SLAM中的优化问题

相机投影模型

已知相机内参\(\mathbf{K}=\begin{bmatrix}f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1\end{bmatrix}\),相机坐标系下空间点\(\mathbf{p}_{c}=[x_c,y_c,z_c]^\top\in \mathbb{R}^3\)投影到像平面点\(\mathbf{p}_{I}=[u,v]^\top\in \mathbb{R}^2\)的函数为:

\[
\begin{aligned}
\text{proj}(\mathbf{p}_{c})&=[\frac{1}{z_c}\mathbf{K}\mathbf{p}_{c}]_{1:2} \\
&= \begin{bmatrix}f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x_c/z_c \\ y_c/z_c \\ 1 \end{bmatrix}_{1:2} \\
&= \begin{bmatrix}f_x*x_c/z_c+c_x \\ f_y*y_c/z_c+c_y \end{bmatrix}
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial \text{proj}(\mathbf{p}_{c})}{\partial \mathbf{p}_{c}}&= \begin{bmatrix}\frac{\partial u}{\partial x_c} & \frac{\partial u}{\partial y_c} & \frac{\partial u}{\partial z_c} \\ \frac{\partial v}{\partial x_c} & \frac{\partial v}{\partial y_c} & \frac{\partial v}{\partial z_c} \end{bmatrix}\\
&= \begin{bmatrix}f_x/z_c & 0 & -f_x*x_c/z_c^2 \\ 0 & f_y/z_c & -f_y*y_c/z_c^2 \end{bmatrix}
\end{aligned} \tag{5}
\]

立体视觉观测函数

假设双目相机的基线为\(b\),相机坐标系下空间点\(\mathbf{p}_{c}=[x_c,y_c,z_c]^\top\in \mathbb{R}^3\)投影到左右相机平面的坐标为\([u_l,v_l]^\top,[u_r,v_r]^\top\),假设是水平双目,则有\(u_l-u_r=\frac{bf_x}{z_c}\),那么

\[
u_r=u_l-\frac{bf_x}{z_c}=f_x*x_c/z_c+c_x - \frac{bf_x}{z_c}
\]
\[
\begin{aligned}
\frac{\partial u_r}{\partial \mathbf{p}_{c}} &= \begin{bmatrix}\frac{\partial u_r}{\partial x_c} & \frac{\partial u_r}{\partial y_c} & \frac{\partial u_r}{\partial z_c} \end{bmatrix} \\
&= \begin{bmatrix}f_x/z_c & 0 & -f_x*(x_c-b)/z_c^2\end{bmatrix}
\end{aligned}
\]
整合起来有

\[
\begin{aligned}
\mathbf{z}_{stereo}&=\binom{\text{proj}(\mathbf{p}_{c})}{u_r} \\
&= \begin{bmatrix}f_x*x_c/z_c+c_x \\ f_y*y_c/z_c+c_y \\ f_x*x_c/z_c+c_x - \frac{bf_x}{z_c} \end{bmatrix}
\end{aligned}
\]
\[
\frac{\partial \mathbf{z}_{stereo}}{\partial \mathbf{p}_{c}} = \begin{bmatrix}f_x/z_c & 0 & -f_x*x_c/z_c^2 \\ 0 & f_y/z_c & -f_y*y_c/z_c^2 \\ f_x/z_c & 0 & -f_x*(x_c-b)/z_c^2\end{bmatrix} \tag{6}
\]

SO3、SE3定义及指数映射

\[
SO(3) = \begin{Bmatrix}
\mathbf{R}\in\mathbb{R}^{3\times 3}|\mathbf{R}\mathbf{R}^\top=\mathbf{I},\text{det}(\mathbf{R})=1
\end{Bmatrix}
\]

\[
\mathfrak{s0}(3) = \begin{Bmatrix}
\omega^\wedge=\left.\begin{matrix}\begin{bmatrix}0 & -\omega_3 & \omega_2\\\omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0\end{bmatrix}\end{matrix}\right|\omega=[\omega_1,\omega_2,\omega_3]^\top\in\mathbb{R}^3
\end{Bmatrix}
\]

\(\text{exp}(\omega^\wedge)\in SO(3)\),证明见罗德里格斯公式。

\[
SE(3) = \begin{Bmatrix}
\mathbf{T}=\begin{bmatrix}\mathbf{R} & \mathbf{t} \\ \mathbf{0}^\top & 1\end{bmatrix}\in\mathbb{R}^{4\times 4}|\mathbf{R}\in SO(3),\mathbf{t}\in\mathbb{R}^3
\end{Bmatrix}
\]

\[
\mathfrak{se}(3) = \begin{Bmatrix}
\epsilon^\wedge=\left.\begin{matrix}\begin{bmatrix}\omega^\wedge & v\\ 0^\top & 0\end{bmatrix}\end{matrix}\right|\omega\in\mathbb{R}^3,v\in\mathbb{R}^3,\epsilon=[v,\omega]^\top
\end{Bmatrix}
\]

\[
\begin{aligned}
\text{exp}(\epsilon^\wedge) &= \underbrace{\text{exp}{\begin{bmatrix}\omega^\wedge & v\\ 0^\top & 0\end{bmatrix}}}_{泰勒级数展开} \\
&= \mathbf{I} + \begin{bmatrix}\omega^\wedge & v\\ 0^\top & 0\end{bmatrix} + \frac{1}{2!}\begin{bmatrix}\omega^{\wedge2} & \omega^\wedge v\\ 0^\top & 0\end{bmatrix} + \frac{1}{3!}\begin{bmatrix}\omega^{\wedge3} & \omega^{\wedge2} v\\ 0^\top & 0\end{bmatrix} + \dots \\
&= \begin{bmatrix}\text{exp}(\omega^\wedge) & \mathbf{V}v\\ 0^\top & 0\end{bmatrix} \in SE(3) ,\mathbf{V}=\mathbf{I}+\frac{1}{2!}\omega^{\wedge} + \frac{1}{3!}\omega^{\wedge2} + \dots
\end{aligned}
\]
实际上

\[
\mathbf{V} = \left\{\begin{matrix}
\mathbf{I}+\frac{1}{2}\omega^{\wedge}+\frac{1}{6}\omega^{\wedge2} = \mathbf{I}, & \theta \rightarrow 0 \\
\mathbf{I}+\frac{1-cos(\theta)}{\theta^2}\omega^{\wedge}+\frac{\theta-sin(\theta)}{\theta^3}\omega^{\wedge2}, & else
\end{matrix}\right. \: \: \: with \:\:\theta=\left\|\omega\right\|_2
\]

首先从最简单的位姿优化开始。

位姿优化

已知图像特征点在图像中的坐标集合\(\mathcal{P}_I=\left\{\mathbf{p}_{I_1}, \mathbf{p}_{I_2}, \ldots, \mathbf{p}_{I_n}\right\},\mathbf{p}_{I_i}\in \mathbb{R}^2\), 以及对应的空间坐标\(\mathcal{P}_w=\left\{\mathbf{p}_{w_1}, \mathbf{p}_{w_2}, \ldots, \mathbf{p}_{w_n}\right\},\mathbf{p}_{w_i}\in \mathbb{R}^3\),求解世界坐标系到相机的变换矩阵\(\mathbf{T}_{cw}^*=\begin{bmatrix} \mathbf{R}_{cw}^* & \mathbf{t}_{cw}^* \\ 0^\top & 1 \end{bmatrix}\)的最优值。

定义误差函数

假设变换矩阵的初始值为\(\mathbf{T}_{cw}=\begin{bmatrix} \mathbf{R}_{cw} & \mathbf{t}_{cw} \\ 0^\top & 1 \end{bmatrix}=\text{exp}(\xi_0^\wedge ),\xi^\wedge_0\in{\mathfrak{se}(3)}\),加在该初值的左扰动为\(\text{exp}(\epsilon^\wedge )\)。

单目误差

\[
\mathbf{e}_k(\xi)=\mathbf{p}_{I_k} - \text{proj}(\text{exp}(\xi^\wedge )\cdot\mathbf{p}_{w_k})
\]
\[
\begin{aligned}
\mathbf{J}_k=\frac{\partial \mathbf{e}_k}{\partial \epsilon} = -\frac{\partial \text{proj}(\mathbf{p}_{c})}{\partial \mathbf{p}_{c}}\cdot \left.\begin{matrix} \frac{\partial \text{exp}(\epsilon^\wedge )\text{exp}(\xi^\wedge )\cdot\mathbf{p}_{w_k}}{\partial \epsilon}\end{matrix}\right|_{\epsilon=0}
\end{aligned}
\]

\[
\begin{aligned}
\left.\begin{matrix}
\frac{\partial \text{exp}(\epsilon^\wedge )\text{exp}(\xi^\wedge )\cdot\mathbf{p}_{w_k}}{\partial \epsilon}
\end{matrix}\right|_{\xi=\xi_0, \epsilon=0}
&\approx \left.\begin{matrix}\frac{\partial\underbrace{(I+\epsilon^\wedge )}_{泰勒展开近似}\text{exp}(\xi_0^\wedge )\cdot\mathbf{p}_{w_k}}{\partial \epsilon}\end{matrix}\right|_{\xi=\xi_0, \epsilon=0} \\
&=\left.\begin{matrix}\frac{\partial\epsilon^\wedge \text{exp}(\xi_0^\wedge )\cdot\mathbf{p}_{w_k}}{\partial \epsilon}\end{matrix}\right|_{\xi=\xi_0, \epsilon=0} \\
&=\left.\begin{matrix}\frac{\partial \begin{bmatrix}\omega^\wedge & v \\ 0^\top & 0 \end{bmatrix}\begin{bmatrix}\underbrace{\mathbf{R}_{cw}*\mathbf{p}_{w_k}+\mathbf{t}_{cw}}_{\mathbf{p}_c} \\ 1\end{bmatrix}}{\partial \epsilon}\end{matrix}\right|_{\xi=\xi_0, \epsilon=0} \\
&=\left.\begin{matrix}\frac{\partial \begin{bmatrix}\omega^\wedge\mathbf{p}_c+v \end{bmatrix}_{3\times 1}}{\partial \epsilon}\end{matrix}\right|_{\epsilon=0} \\
&=\left.\begin{matrix}\frac{\partial \begin{bmatrix}-\mathbf{p}_c^\wedge\omega+v \end{bmatrix}_{3\times 1}}{\partial \epsilon}\end{matrix}\right|_{\epsilon=0} \\
&=\left.\begin{matrix}\frac{\partial -\begin{bmatrix}0 & -z_c & y_c \\ z_c & 0 & -x_c \\ -y_c & x_c & 0 \end{bmatrix}\begin{bmatrix}\omega_1 \\ \omega_2 \\ \omega_3 \end{bmatrix}+\begin{bmatrix}v_1 \\ v_2 \\ v_3 \end{bmatrix}}{\partial \epsilon}\end{matrix}\right|_{\epsilon=0} \\
&=\left.\begin{matrix}\frac{\partial -\begin{bmatrix}-z_c*\omega_2+y_c*\omega_3+v_1 \\ z_c*\omega_1-x_c*\omega_3+v_2 \\ -y_c*\omega_1+x_c*\omega_2+v_3 \end{bmatrix}}{\partial \epsilon}\end{matrix}\right|_{\epsilon=0} \\
&= -\begin{bmatrix}\mathbf{I}_{3\times 3} & \mathbf{p}_c^\wedge\end{bmatrix}
\end{aligned}
\]
结合(5)有

\[
\begin{aligned}
\mathbf{J}_k=\begin{bmatrix}f_x/z_c & 0 & -f_x*x_c/z_c^2 \\ 0 & f_y/z_c & -f_y*y_c/z_c^2 \end{bmatrix} \cdot -\begin{bmatrix}\mathbf{I}_{3\times 3} & \mathbf{p}_c^\wedge\end{bmatrix}
\end{aligned}
\]

双目误差

\[
\mathbf{e}_k(\xi)=\mathbf{p}_{I_k} - \mathbf{z}_{stereo}(\text{exp}(\xi^\wedge )\cdot\mathbf{p}_{w_k})
\]
\[
\begin{aligned}
\mathbf{J}_k=\frac{\partial \mathbf{e}_k}{\partial \epsilon} &= -\frac{\partial \mathbf{z}_{stereo}(\mathbf{p}_{c})}{\partial \mathbf{p}_{c}}\cdot \left.\begin{matrix} \frac{\partial \text{exp}(\epsilon^\wedge )\text{exp}(\xi^\wedge )\cdot\mathbf{p}_{w_k}}{\partial \epsilon}\end{matrix}\right|_{\epsilon=0} \\
&= \begin{bmatrix}f_x/z_c & 0 & -f_x*x_c/z_c^2 \\ 0 & f_y/z_c & -f_y*y_c/z_c^2 \\ f_x/z_c & 0 & -f_x*(x_c-b)/z_c^2\end{bmatrix}\cdot -\begin{bmatrix}\mathbf{I}_{3\times 3} & \mathbf{p}_c^\wedge\end{bmatrix}
\end{aligned}
\]

参考

  1. Giorgio Grisetti, Rainer Kummerle. g2o: A general Framework for (Hyper) Graph Optimization. 2017
  2. 高翔. 视觉SLAM十四讲. 2017
  3. Strasdat H. Local accuracy and global consistency for efficient visual SLAM[D]. Department of Computing, Imperial College London, 2012.
  4. Lie Groups for 2D and 3D Transformations