1 Cross-entropy loss function
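\[\begin{aligned} L_{\mathrm{CE}}(\hat{y},y)& =-\log p(y|x)~=~-[y\log\hat{y}+(1-y)\log(1-\hat{y})] \\ &=-[y\log\sigma(w\cdot x+b)+(1-y)\log{(1-\sigma(w\cdot x+b))}] \end{aligned} \]
Here \(\hat{y}=\sigma(w\cdot x+b)\) is the predicted probability and \(y\in\{0,1\}\) is the true label.
2 Gradient with respect to the weight \(w_j\)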
Let \(z = w \cdot x + b\). Then
\[\begin{aligned} \frac{\partial L_{\mathrm{CE}}(\hat{y},y)}{\partial w_j}& =-\left(\frac{y}{\sigma(z)}-\frac{1-y}{1-\sigma(z)}\right)\frac{\partial\sigma(z)}{\partial w_j} \\ &=-\left(\frac{y}{\sigma(z)}-\frac{1-y}{1-\sigma(z)}\right)\sigma^{\prime}(z)x_j \\ &=\frac{\sigma^{\prime}(z)x_j}{\sigma(z)(1-\sigma(z))}(\sigma(z)-y) \\ &=x_j(\sigma(z)-y) \end{aligned} \]
where
\[\sigma^{\prime}(z)=\sigma(z)(1-\sigma(z)) \]
The proof is as follows:
\[\begin{aligned} \sigma^{\prime}(z)& =\left(\frac{1}{1+e^{-z}}\right)^{\prime} \\ &=(-1)(1+e^{-z})^{-2}\cdot(e^{-z})^{\prime} \\ &=\frac{1}{\left(1+e^{-z}\right)^{2}}\cdot e^{-z} \\ &=\frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}} \\ &=\frac{1}{1+e^{-z}}\cdot\left(1-\frac{1}{1+e^{-z}}\right) \\ &=\sigma(z)(1-\sigma(z)) \end{aligned} \]
I have exams at the moment and will fill in more details later. If you spot any mistakes, corrections are very welcome. Thank you!
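As a quick sanity check, the closed-form gradient \(x_j(\sigma(z)-y)\) can be compared against a central finite-difference approximation of the loss. This is only a minimal sketch: the function names (`sigmoid`, `cross_entropy`, `analytic_grad`, `numeric_grad`) and the sample values of `w`, `b`, `x`, `y` are illustrative assumptions, not part of the original post.

```python
import math


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(w, b, x):
    """z = w . x + b for plain Python lists."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def cross_entropy(w, b, x, y):
    """Binary cross-entropy L_CE for one example with label y in {0, 1}."""
    y_hat = sigmoid(linear(w, b, x))
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


def analytic_grad(w, b, x, y):
    """Closed-form result of the derivation: dL/dw_j = x_j * (sigma(z) - y)."""
    err = sigmoid(linear(w, b, x)) - y
    return [xj * err for xj in x]


def numeric_grad(w, b, x, y, eps=1e-6):
    """Central finite-difference approximation of dL/dw_j."""
    grads = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        diff = cross_entropy(w_plus, b, x, y) - cross_entropy(w_minus, b, x, y)
        grads.append(diff / (2 * eps))
    return grads


# Arbitrary illustrative values (assumptions, not taken from the post).
w, b, x, y = [0.5, -1.2, 0.3], 0.1, [1.0, 2.0, -0.5], 1

max_diff = max(
    abs(a - n)
    for a, n in zip(analytic_grad(w, b, x, y), numeric_grad(w, b, x, y))
)
print("largest |analytic - numeric| gap:", max_diff)
```

If the derivation is right, the printed gap should be tiny, limited only by floating-point error. The same finite-difference idea can be applied to `sigmoid` alone to check \(\sigma^{\prime}(z)=\sigma(z)(1-\sigma(z))\).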
From: https://www.cnblogs.com/W-ayang/p/17971791