【自学笔记】神经网络(1)

时间:2024-11-08 08:13:49
z j ( i ) = f j ( i ) ( x ( i ) ) = w j ( i ) ⋅ x ( i ) + b j ( i ) z^{(i)}_{j} = f^{(i)}_{j}(x^{(i)}) = w^{(i)}_{j} \cdot x^{(i)} + b^{(i)}_{j} zj(i)=fj(i)(x(i))=wj(i)x(i)+bj(i)
x j ( i + 1 ) = g ( z j ( i ) ) x^{(i+1)}_{j} = g(z^{(i)}_{j}) xj(i+1)=g(zj(i)),其中假设每个神经元用的都是g为 s i g m o i d sigmoid sigmoid函数,不作区分

应用链式法则:
δ L δ w j ( i ) = δ L δ x j ( i + 1 ) ∗ δ x j ( i + 1 ) δ z j ( i ) ∗ δ z j ( i ) δ w j ( i ) \frac{\delta L}{\delta w^{(i)}_{j}}=\frac{\delta L}{\delta x^{(i+1)}_{j}}*\frac{\delta x^{(i+1)}_{j}}{\delta z^{(i)}_{j}}*\frac{\delta z^{(i)}_{j}}{\delta w^{(i)}_{j}} δwj(i)δL=δxj(i+1)δLδzj(i)δxj(i+1)δwj(i)δzj(i)
          = δ L δ x j ( i + 1 ) ∗ x j ( i + 1 ) ∗ ( 1 − x j ( i + 1 ) ) ∗ x j ( i ) \ \ \ \ \ \ \ \ \ =\frac{\delta L}{\delta x^{(i+1)}_{j}}*x^{(i+1)}_{j}*(1-x^{(i+1)}_{j})*x^{(i)}_{j}          =δxj(i+1)δLxj(i+1)(1xj(i+1))xj(i)
δ L δ b j ( i ) = δ L δ x j ( i + 1 ) ∗ δ x j ( i + 1 ) δ z j ( i ) ∗ δ z j ( i ) δ b j ( i ) \frac{\delta L}{\delta b^{(i)}_{j}}=\frac{\delta L}{\delta x^{(i+1)}_{j}}*\frac{\delta x^{(i+1)}_{j}}{\delta z^{(i)}_{j}}*\frac{\delta z^{(i)}_{j}}{\delta b^{(i)}_{j}} δbj(i)δL=δxj(i+1)δLδzj(i)δxj(i+1)δbj(i)δzj(i)
          = δ L δ x j ( i + 1 ) ∗ x j ( i + 1 ) ∗ ( 1 − x j ( i + 1 ) ) \ \ \ \ \ \ \ \ \ =\frac{\delta L}{\delta x^{(i+1)}_{j}}*x^{(i+1)}_{j}*(1-x^{(i+1)}_{j})          =δxj(i+1)δLxj(i+1)(1xj(i+1))

计算 δ L δ x j ( i ) \frac{\delta L}{\delta x^{(i)}_{j}} δxj(i)δL:
δ L δ x j ( i ) = δ L δ x j ( i + 1 ) ∗ δ x j ( i + 1 ) δ x j ( i ) \frac{\delta L}{\delta x^{(i)}_{j}} = \frac{\delta L}{\delta x^{(i+1)}_{j}} * \frac{\delta x^{(i+1)}_{j}}{\delta x^{(i)}_{j}} δxj(i)δL=δx