$$
\begin{gathered}
\{(x_{i},y_{i})\}_{i=1}^{N},\quad x_{i}\in \mathbb{R}^{p},\ y_{i} \in \{0,1\}
\end{gathered}
$$
Logistic regression models $p(y|x)$ directly. Gaussian discriminant analysis (GDA), by contrast, is a probabilistic generative model: it introduces a prior over the classes, uses Bayes' rule to obtain the joint distribution $p(x,y)=p(x|y)p(y)$, and then estimates the parameters by maximizing the log-likelihood of this joint distribution.
Bayes' rule states
$$p(y|x)=\frac{p(x|y)p(y)}{p(x)}$$
Since we only care about which of $p(y=1|x)=\frac{p(x|y=1)p(y=1)}{p(x)}$ and $p(y=0|x)=\frac{p(x|y=0)p(y=0)}{p(x)}$ is larger, and both share the same denominator $p(x)$, the denominator can be ignored:
$$
\begin{aligned}
\hat{y}&=\mathop{\mathrm{argmax}}\limits_{y \in \{0,1\}}p(y|x)\\
&=\mathop{\mathrm{argmax}}\limits_{y}\,p(y)\cdot p(x|y) && \text{since } p(y|x)\propto p(x|y)p(y)
\end{aligned}
$$
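As a minimal sketch of this decision rule, consider the one-dimensional case with univariate Gaussian class conditionals sharing a variance. All names here (`gaussian_pdf`, `predict`, the parameter values) are illustrative, not from the original notes:

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density N(x | mu, sigma2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def predict(x, phi, mu1, mu2, sigma2):
    """Generative decision rule: argmax_y p(y) * p(x | y)."""
    score1 = phi * gaussian_pdf(x, mu1, sigma2)        # y = 1: phi * N(mu1, sigma2)
    score0 = (1 - phi) * gaussian_pdf(x, mu2, sigma2)  # y = 0: (1 - phi) * N(mu2, sigma2)
    return 1 if score1 > score0 else 0
```

With equal priors ($\phi = 0.5$) this reduces to assigning $x$ to the nearer class mean, since the denominator $p(x)$ plays no role in the comparison.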
In GDA we make two assumptions about the data: the class prior is a Bernoulli distribution, and the class-conditional likelihood of each class is Gaussian with a shared covariance matrix, i.e.
$$
\begin{aligned}
y & \sim B(1,\phi)\Rightarrow p(y)=\begin{cases}\phi & y=1\\ 1-\phi & y=0\end{cases}\\
&\Rightarrow p(y)=\phi^{y}(1-\phi)^{1-y}\\
x|y=1 &\sim N(\mu_{1},\Sigma)\\
x|y=0 & \sim N(\mu_{2},\Sigma) \\
&\Rightarrow p(x|y)=N(\mu_{1},\Sigma)^{y}\cdot N(\mu_{2},\Sigma)^{1-y}
\end{aligned}
$$
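These generative assumptions can be read as a sampling procedure: first draw the label from the Bernoulli prior, then draw $x$ from that class's Gaussian. A 1-D sketch (names and parameters are illustrative):

```python
import math
import random

def sample(phi, mu1, mu2, sigma2):
    """Draw one (x, y) pair from the generative model:
    y ~ Bernoulli(phi), then x | y ~ N(mu_y, sigma2)."""
    y = 1 if random.random() < phi else 0
    mu = mu1 if y == 1 else mu2
    x = random.gauss(mu, math.sqrt(sigma2))
    return x, y
```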
Therefore, the joint log-likelihood to be maximized is
$$
\begin{aligned}
L(\mu_{1},\mu_{2},\Sigma,\phi)&=\log \prod\limits_{i=1}^{N}[p(x_{i}|y_{i})p(y_{i})]\\
&=\sum\limits_{i=1}^{N}[\log p(x_{i}|y_{i})+\log p(y_{i})]\\
&=\sum\limits_{i=1}^{N}[\log N(\mu_{1},\Sigma)^{y_{i}}+\log N(\mu_{2},\Sigma)^{1-y_{i}}+\log \phi^{y_{i}}(1-\phi)^{1-y_{i}}]
\end{aligned}
$$
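Setting the derivatives of this log-likelihood to zero yields the standard closed-form estimates: $\phi$ is the class frequency, $\mu_{1},\mu_{2}$ are the class means, and $\Sigma$ is the pooled covariance. A 1-D sketch of these estimators (the name `fit_gda` and the scalar-variance simplification are illustrative assumptions):

```python
def fit_gda(xs, ys):
    """Closed-form maximum-likelihood estimates for 1-D GDA:
    class frequency for phi, per-class means for mu1/mu2,
    and the pooled (shared) variance for sigma2."""
    n = len(xs)
    n1 = sum(ys)                                         # number of y = 1 samples
    phi = n1 / n                                         # P(y = 1)
    mu1 = sum(x for x, y in zip(xs, ys) if y == 1) / n1
    mu2 = sum(x for x, y in zip(xs, ys) if y == 0) / (n - n1)
    # pooled variance: each point deviates from its own class mean
    sigma2 = sum((x - (mu1 if y == 1 else mu2)) ** 2
                 for x, y in zip(xs, ys)) / n
    return phi, mu1, mu2, sigma2
```

Note that the two classes share one variance estimate, which is exactly what the shared-$\Sigma$ assumption in the model encodes.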
Author: 张文翔. Link: Andrew Ng Stanford机器学习公开课 总结(5) - 张文翔的博客 | BY ZhangWenxiang (demmon-tju.github.io)