[Math][Matrices] The Trace and Its Properties

Date: 2024-11-21 08:20:18

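I have trouble retaining many mathematical properties and end up re-deriving them every time I need them. To stop wasting time this way, I have decided to write each one up properly whenever I use it, both for my own reference and for the benefit of readers.
Everything in this post has been carefully checked and derived, but given the limits of my knowledge, errors may remain. If you spot a problem, please point it out. Thank you!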

1. Definition of the trace

In linear algebra, the sum of the entries on the main diagonal of a square matrix ${\bf A}$ of order $n$ (i.e., an $n\times n$ matrix) is called the trace of ${\bf A}$, written ${\rm tr}({\bf A})$.
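Written out entrywise, the definition reads as below, followed by a small worked example. The symbol $a_{ii}$ for the $i$-th diagonal entry of ${\bf A}$ is notation assumed here for illustration; it does not appear in the text above.

```latex
{\rm tr}({\bf A}) = \sum_{i=1}^{n} a_{ii}, \qquad
{\rm tr}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = 1 + 4 = 5
```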

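Note that the trace is defined only for square matrices; a non-square matrix has no trace. In MATLAB, the trace of a square matrix A can be obtained directly with the trace function (code: trace(A)), but calling trace on a non-square matrix raises the error "Matrix must be square."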
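For readers without MATLAB, the same behaviour can be sketched in plain Python. This is only an illustration: the function name `trace`, the list-of-rows matrix representation, and the choice of `ValueError` are assumptions made for this sketch, not part of MATLAB or any library.

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix.

    `matrix` is a list of rows. A non-square input raises ValueError,
    loosely mirroring the MATLAB behaviour described above.
    """
    n_rows = len(matrix)
    if any(len(row) != n_rows for row in matrix):
        raise ValueError("Matrix must be square.")
    return sum(matrix[i][i] for i in range(n_rows))


square = [[1, 2], [3, 4]]
print(trace(square))  # 1 + 4 = 5

try:
    trace([[1, 2, 3], [4, 5, 6]])  # 2 x 3, so no trace exists
except ValueError as err:
    print("error:", err)
```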

2. Basic properties of the trace

(1) Transposition does not change the trace: ${\rm tr}({\bf A}^{\rm T}) = {\rm tr}({\bf A})$
(2) The trace is a linear operation: ${\rm tr}(a{\bf A}+b{\bf B}) = a\cdot{\rm tr}({\bf A}) + b\cdot{\rm tr}({\bf B})$
(3) Cyclically permuting the factors of a matrix product does not change the trace, provided each product is defined and square:
${\rm tr}({\bf AB})={\rm tr}({\bf BA})$
${\rm tr}({\bf ABC})={\rm tr}({\bf CAB})={\rm tr}({\bf BCA})$
Only cyclic shifts are covered; in general ${\rm tr}({\bf ABC})\neq{\rm tr}({\bf ACB})$.
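The three properties can be spot-checked numerically. The sketch below uses plain Python lists and small integer matrices so that every comparison is exact; the helper names (`matmul`, `transpose`, `trace`, `scale_add`) and the matrices themselves are arbitrary choices for illustration. A numerical check on one example is of course not a proof.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def scale_add(a, x, b, y):
    """Return a*x + b*y for scalars a, b and same-shaped matrices x, y."""
    return [[a * p + b * q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


A = [[1, 2], [3, 5]]
B = [[0, 1], [1, 1]]
C = [[1, 0], [2, 3]]

# (1) The transpose has the same trace.
assert trace(transpose(A)) == trace(A)

# (2) Linearity, here with a = 2 and b = -3.
assert trace(scale_add(2, A, -3, B)) == 2 * trace(A) - 3 * trace(B)

# (3) Cyclic shifts of a product keep the trace ...
assert trace(matmul(A, B)) == trace(matmul(B, A))
t_abc = trace(matmul(matmul(A, B), C))
assert t_abc == trace(matmul(matmul(C, A), B)) == trace(matmul(matmul(B, C), A))

# ... but a non-cyclic reordering generally does not.
t_acb = trace(matmul(matmul(A, C), B))
print(t_abc, t_acb)  # 32 34
```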

3. Common combinations of the trace and matrix derivatives

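(1) $\frac{\partial{\rm tr}({\bf AB})}{\partial {\bf A}}= {\bf B}^{\rm T}$, $\frac{\partial{\rm tr}({\bf AB})}{\partial {\bf B}}= {\bf A}^{\rm T}$. Here the derivative of a scalar with respect to a matrix is the matrix of entrywise partial derivatives, and both results follow from ${\rm tr}({\bf AB})=\sum_{i}\sum_{j} a_{ij}b_{ji}$.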
(2) $\frac{\partial {\rm tr}({\bf A}{\bf A}^{\rm T})}{\partial {\bf A}}= 2{\bf A}$
Proof: $\frac{\partial {\rm tr}({\bf A}{\bf A}^{\rm T})}{\partial {\bf A}}$
$=\frac{\partial {\rm tr}({\bf A}_{\rm const}{\bf A}^{\rm T})}{\partial {\bf A}}+\frac{\partial {\rm tr}({\bf A}{\bf A}^{\rm T}_{\rm const})}{\partial {\bf A}}$ (product rule; the subscript "const" marks the factor held fixed while differentiating)
$=2\frac{\partial {\rm tr}({\bf A}{\bf A}^{\rm T}_{\rm const})}{\partial {\bf A}}$ (by property (1) of Section 2, ${\rm tr}({\bf A}_{\rm const}{\bf A}^{\rm T})={\rm tr}({\bf A}{\bf A}^{\rm T}_{\rm const})$, so the two terms are equal)
$=2{\bf A}$ (by result (1) of Section 3)
(3) $\frac{\partial {\rm tr}({\bf AB}{\bf A}^{\rm T}{\bf C})}{\partial {\bf A}}= {\bf CAB} + {\bf C}^{\rm T}{\bf A}{\bf B}^{\rm T}$
Proof: $\frac{\partial {\rm tr}({\bf AB}{\bf A}^{\rm T}{\bf C})}{\partial {\bf A}}$
$=\frac{\partial {\rm tr}({\bf A}_{\rm const}{\bf B}{\bf A}^{\rm T}{\bf C})}{\partial {\bf A}}+\frac{\partial {\rm tr}({\bf AB}{\bf A}^{\rm T}_{\rm const}{\bf C})}{\partial {\bf A}}$ (product rule, holding the factor marked "const" fixed)
$=\frac{\partial {\rm tr}({\bf A}^{\rm T}{\bf C}{\bf A}_{\rm const}{\bf B})}{\partial {\bf A}}+\frac{\partial {\rm tr}({\bf AB}{\bf A}^{\rm T}_{\rm const}{\bf C})}{\partial {\bf A}}$ (by property (3) of Section 2, a cyclic shift)
$=\frac{\partial {\rm tr}({\bf B}^{\rm T}{\bf A}_{\rm const}^{\rm T}{\bf C}^{\rm T}{\bf A})}{\partial {\bf A}}+\frac{\partial {\rm tr}({\bf AB}{\bf A}^{\rm T}_{\rm const}{\bf C})}{\partial {\bf A}}$ (by property (1) of Section 2, transposing inside the trace)
$={({\bf B}^{\rm T}{\bf A}^{\rm T}{\bf C}^{\rm T})}^{\rm T}+{({\bf B}{\bf A}^{\rm T}{\bf C})}^{\rm T}$ (by result (1) of Section 3)
$={\bf CAB} + {\bf C}^{\rm T}{\bf A}{\bf B}^{\rm T}$
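As a sanity check on the three derivative results, each closed form can be compared against a central finite-difference estimate. The sketch below is plain Python under stated assumptions: all helper names (`numerical_gradient`, `max_abs_diff`, and so on), the step size `h`, and the test matrices are arbitrary choices made here, and agreement on one example does not replace the proofs above.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def numerical_gradient(f, a, h=1e-4):
    """Central-difference estimate of the matrix with entries df/da_ij."""
    grad = [[0.0] * len(a[0]) for _ in a]
    for i in range(len(a)):
        for j in range(len(a[0])):
            plus = [row[:] for row in a]
            minus = [row[:] for row in a]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))


# Deliberately non-symmetric B and C, so that B and B^T are distinguishable.
A = [[1.0, 2.0], [3.0, 5.0]]
B = [[0.0, 1.0], [2.0, 1.0]]
C = [[1.0, 0.0], [2.0, 3.0]]

# Result (1): d tr(AB) / dA = B^T
g1 = numerical_gradient(lambda a: trace(matmul(a, B)), A)
err1 = max_abs_diff(g1, transpose(B))

# Result (2): d tr(A A^T) / dA = 2A
g2 = numerical_gradient(lambda a: trace(matmul(a, transpose(a))), A)
err2 = max_abs_diff(g2, [[2 * v for v in row] for row in A])

# Result (3): d tr(A B A^T C) / dA = CAB + C^T A B^T
def f3(a):
    return trace(matmul(matmul(matmul(a, B), transpose(a)), C))

g3 = numerical_gradient(f3, A)
closed_form3 = add(matmul(matmul(C, A), B),
                   matmul(matmul(transpose(C), A), transpose(B)))
err3 = max_abs_diff(g3, closed_form3)

# All three discrepancies should be at floating-point noise level.
print(err1, err2, err3)
```

Because the three functions are linear or quadratic in the entries of ${\bf A}$, the central difference has no truncation error here, so any remaining discrepancy is rounding noise.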