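I can never keep many mathematical properties in my head and end up re-deriving them every time I need them. To stop wasting time this way, I have decided to write each one up properly whenever I use it, both for my own reference and for the benefit of readers.
Everything in this post has been carefully checked and derived, but given the limits of my knowledge, mistakes are still possible. If you spot a problem, please point it out. Thank you!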
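1. Definition of the trace
In linear algebra, the sum of the entries on the main diagonal of a square matrix ${\bf A}$ of order $n$ (i.e., an $n\times n$ matrix) is called the trace of ${\bf A}$, written ${\rm tr}({\bf A})$.
Note that the trace is defined only for square matrices; a non-square matrix has no trace. In MATLAB, the trace of a square matrix A can be obtained directly with the trace function (code: trace(A)), but calling trace on a non-square matrix raises an error stating that the matrix must be square.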
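As a rough illustration of the definition outside MATLAB, the following Python sketch sums the main diagonal and rejects non-square input. The function name `trace` and the list-of-rows representation are choices made for this example only.

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix.

    `matrix` is assumed to be a list of rows (an assumption of this sketch).
    """
    n_rows = len(matrix)
    if any(len(row) != n_rows for row in matrix):
        # The trace is only defined for square matrices.
        raise ValueError("matrix must be square")
    return sum(matrix[i][i] for i in range(n_rows))


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(trace(A))  # 1 + 5 + 9 = 15
```

Calling `trace([[1, 2, 3], [4, 5, 6]])` raises `ValueError`, which plays the role of the MATLAB error described above.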
2. Basic properties of the trace
(1) Transposition does not change the trace:
$$
{\rm tr}({\bf A}^{\rm T}) = {\rm tr}({\bf A})
$$
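(2) The trace is a linear operation. For scalars $a$, $b$ and square matrices ${\bf A}$, ${\bf B}$ of the same order: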
$$
{\rm tr}(a{\bf A}+b{\bf B}) = a\cdot{\rm tr}({\bf A}) + b\cdot{\rm tr}({\bf B})
$$
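(3) Cyclically permuting the factors of a matrix product does not change the trace, provided every product involved is defined and square (for example, ${\bf A}$ of size $m\times n$ and ${\bf B}$ of size $n\times m$). A non-cyclic reordering such as ${\rm tr}({\bf ACB})$ generally gives a different value: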
$$
{\rm tr}({\bf AB})={\rm tr}({\bf BA})
$$
$$
{\rm tr}({\bf ABC})={\rm tr}({\bf CAB})={\rm tr}({\bf BCA})
$$
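The three properties can be spot-checked numerically. The sketch below uses small integer matrices so that every comparison is exact; the helper names (`matmul`, `transpose`, `lincomb`) and the sample matrices are assumptions of this example. A check on one example is not a proof.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    # zip(*y) iterates over the columns of y
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def lincomb(a, x, b, y):
    # Entry-wise a*x + b*y for matrices of the same shape
    return [[a * p + b * q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [1, 3]]

# (1) Transposition does not change the trace.
assert trace(transpose(A)) == trace(A)

# (2) Linearity, here with a = 3 and b = -2.
assert trace(lincomb(3, A, -2, B)) == 3 * trace(A) - 2 * trace(B)

# (3) Cyclic permutations leave the trace unchanged.
assert trace(matmul(A, B)) == trace(matmul(B, A))
ABC = trace(matmul(matmul(A, B), C))
assert ABC == trace(matmul(matmul(C, A), B)) == trace(matmul(matmul(B, C), A))

# A non-cyclic reordering is a different quantity in general.
ACB = trace(matmul(matmul(A, C), B))
print(ABC, ACB)  # 2 16

# tr(PQ) = tr(QP) also holds for non-square P (2 x 3) and Q (3 x 2).
P = [[1, 2, 3], [4, 5, 6]]
Q = [[1, 0], [0, 1], [2, 2]]
assert trace(matmul(P, Q)) == trace(matmul(Q, P))
```

Here `ABC` and `ACB` differ, which shows that property (3) covers cyclic shifts only, not arbitrary reorderings.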
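3. Common combinations of the trace with partial derivatives
Throughout this section, $\frac{\partial f}{\partial {\bf A}}$ denotes the matrix of the same shape as ${\bf A}$ whose $(i,j)$ entry is $\frac{\partial f}{\partial a_{ij}}$, where the entries $a_{ij}$ of ${\bf A}$ are treated as independent variables.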
(1)
$$
\frac{\partial{\rm tr}({\bf AB})}{\partial {\bf A}}= {\bf B}^{\rm T},
\qquad
\frac{\partial{\rm tr}({\bf AB})}{\partial {\bf B}}= {\bf A}^{\rm T}
$$
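Result (1) can be checked with finite differences. The sketch below assumes the entry-wise convention in which the $(i,j)$ entry of the derivative is the partial derivative with respect to the $(i,j)$ entry of the matrix; the helper `numeric_grad` and the sample matrices are introduced here for illustration. Because ${\rm tr}({\bf AB})$ is linear in each factor, a central difference with integer step 1 recovers the derivative exactly.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def numeric_grad(f, X):
    """Central differences of f with step 1, one entry of X at a time.

    Exact for polynomials of degree <= 2 in the entries of X; with integer
    inputs the difference f(X+E) - f(X-E) is an even integer, so // 2 is exact.
    """
    grad = [[0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += 1
            minus[i][j] -= 1
            grad[i][j] = (f(plus) - f(minus)) // 2
    return grad

A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
B = [[1, 0], [0, 1], [2, 2]]    # 3 x 2

grad_A = numeric_grad(lambda X: trace(matmul(X, B)), A)
grad_B = numeric_grad(lambda X: trace(matmul(A, X)), B)

assert grad_A == transpose(B)   # d tr(AB) / dA = B^T
assert grad_B == transpose(A)   # d tr(AB) / dB = A^T
```

Note that `grad_A` has the shape of `A` (2 x 3) and `grad_B` the shape of `B` (3 x 2), as the convention requires.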
(2)
$$
\frac{\partial {\rm tr}( {\bf A} {\bf A}^{\rm T} )}{\partial {\bf A}}= 2{\bf A}
$$
Proof: Apply the product rule, differentiating with respect to one occurrence of ${\bf A}$ at a time while the other occurrence, marked with the subscript "fixed", is held constant:
$$
\begin{aligned}
\frac{\partial {\rm tr}( {\bf A} {\bf A}^{\rm T} )}{\partial {\bf A}}
&= \frac{\partial {\rm tr}( {\bf A}_{\rm fixed} {\bf A}^{\rm T} )}{\partial {\bf A}}+\frac{\partial {\rm tr}( {\bf A} {\bf A}^{\rm T}_{\rm fixed} )}{\partial {\bf A}} \\
&= 2\frac{\partial {\rm tr}( {\bf A} {\bf A}^{\rm T}_{\rm fixed} )}{\partial {\bf A}} \\
&= 2{\bf A}
\end{aligned}
$$
The second equality uses property (1) of Section 2, which gives
$$
\frac{\partial {\rm tr}( {\bf A}_{\rm fixed} {\bf A}^{\rm T} )}{\partial {\bf A}}=\frac{\partial {\rm tr}( {\bf A} {\bf A}^{\rm T}_{\rm fixed} )}{\partial {\bf A}}
$$
and the last equality uses result (1) of Section 3 with ${\bf B} = {\bf A}^{\rm T}_{\rm fixed}$.
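The result can also be cross-checked entry by entry. Writing $a_{ij}$ for the entries of ${\bf A}$ (notation introduced here only for this check), the trace of ${\bf A}{\bf A}^{\rm T}$ is the sum of the squares of all entries, so

$$
{\rm tr}({\bf A}{\bf A}^{\rm T}) = \sum_{i}\sum_{j} a_{ij}^{2}
\quad\Longrightarrow\quad
\frac{\partial\,{\rm tr}({\bf A}{\bf A}^{\rm T})}{\partial a_{kl}} = 2a_{kl}
$$

which is the $(k,l)$ entry of $2{\bf A}$, in agreement with the matrix derivation.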
(3)
$$
\frac{\partial {\rm tr}( {\bf AB} {\bf A}^{\rm T}{\bf C} )}{\partial {\bf A}}= {\bf CAB} + {\bf C}^{\rm T} {\bf A}{\bf B}^{\rm T}
$$
Proof: As in the product rule, the subscript "fixed" marks the occurrence of ${\bf A}$ that is held constant while differentiating with respect to the other occurrence.
$$
\begin{aligned}
\frac{\partial {\rm tr}( {\bf AB} {\bf A}^{\rm T}{\bf C} )}{\partial {\bf A}}
&= \frac{\partial {\rm tr}( {\bf A}_{\rm fixed} {\bf B} {\bf A}^{\rm T}{\bf C} )}{\partial {\bf A}}+\frac{\partial {\rm tr}( {\bf AB} {\bf A}^{\rm T}_{\rm fixed}{\bf C} )}{\partial {\bf A}} \\
&= \frac{\partial {\rm tr}({\bf A}^{\rm T} {\bf C} {\bf A}_{\rm fixed} {\bf B})}{\partial {\bf A}}+\frac{\partial {\rm tr}( {\bf AB} {\bf A}^{\rm T}_{\rm fixed}{\bf C} )}{\partial {\bf A}}
&& \text{(property (3) of Section 2)} \\
&= \frac{\partial {\rm tr}({\bf B}^{\rm T} {\bf A}_{\rm fixed}^{\rm T} {\bf C}^{\rm T} {\bf A})}{\partial {\bf A}}+\frac{\partial {\rm tr}( {\bf AB} {\bf A}^{\rm T}_{\rm fixed}{\bf C} )}{\partial {\bf A}}
&& \text{(property (1) of Section 2)} \\
&= {({\bf B}^{\rm T} {\bf A}^{\rm T} {\bf C}^{\rm T})}^{\rm T}+{({\bf B} {\bf A}^{\rm T}{\bf C})}^{\rm T}
&& \text{(result (1) of Section 3)} \\
&= {\bf CAB} + {\bf C}^{\rm T} {\bf A}{\bf B}^{\rm T}
\end{aligned}
$$
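As a sanity check on result (3), the sketch below compares the closed form with a finite-difference gradient. The helper `numeric_grad`, the function name `objective`, and the sample matrices are assumptions of this example; the matrices are deliberately non-symmetric and ${\bf A}$ is non-square (2 x 3). Since the objective is quadratic in the entries of ${\bf A}$, a central difference with integer step 1 is exact.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def numeric_grad(f, X):
    """Central differences with step 1; exact for quadratics on integer input."""
    grad = [[0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += 1
            minus[i][j] -= 1
            grad[i][j] = (f(plus) - f(minus)) // 2
    return grad

A = [[1, 2, 0], [-1, 3, 2]]               # 2 x 3
B = [[1, 2, 0], [0, 1, 1], [3, 0, 2]]     # 3 x 3
C = [[2, 1], [0, 3]]                      # 2 x 2

def objective(X):
    # tr(X B X^T C)
    return trace(matmul(matmul(matmul(X, B), transpose(X)), C))

numeric = numeric_grad(objective, A)

# Closed form from result (3): C A B + C^T A B^T
closed = add(matmul(matmul(C, A), B),
             matmul(matmul(transpose(C), A), transpose(B)))

assert numeric == closed
print(closed)
```

Agreement on one example does not prove the identity, but it does catch sign or transpose mistakes in the derivation.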