
Linear Discriminant Analysis (Fisher)

Date: 2022-09-29 20:26:54

Linear discriminant analysis

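Linear discriminant analysis involves dimensionality reduction: every data point is projected onto the same line, and a threshold on that line splits it into two rays, one per class. Some information is lost in the projection, but if what is lost is mostly noise, losing it is no bad thing.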

The result of linear discriminant analysis is a vector. Could it be anything else?

Guiding idea (objective): small within-class scatter, large between-class separation.

Derivation

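What we obtain is a vector. To simplify the computation, assume without loss of generality that \(||\pmb w||=1\), and treat each sample \(\pmb X_i\) as a row vector. Then \(\pmb X_i\pmb w\) is the projection of each sample onto the direction \(\pmb w\) (written \(\pmb\theta\) in the formulas below). One of the planes perpendicular to \(\pmb w\) is the decision plane.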
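The projection step can be sketched in a few lines. This is only an illustration: the array `X` and the direction `w` are made-up values, and samples are assumed to be stored as rows, as in the rest of the derivation.

```python
import numpy as np

# Hypothetical toy data: three 2-D samples stored as rows
X = np.array([[1.0, 1.0],
              [3.0, 3.0],
              [2.0, 0.0]])

# An arbitrary direction, normalised so that ||w|| = 1
w = np.array([[1.0], [1.0]])
w = w / np.linalg.norm(w)

# X_i w: the scalar coordinate of each sample along w
proj = X @ w
print(proj.ravel())
```

Note that the first and third samples land on the same point of the line: the projection discards everything perpendicular to `w`.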

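Name the two classes \(C_1\) and \(C_2\). Let \(\pmb\mu\), \(\pmb\mu_{C_1}\), \(\pmb\mu_{C_2}\) denote the means of all the data, of the \(C_1\) data, and of the \(C_2\) data, and let \(\pmb\Sigma\), \(\pmb\Sigma_{C_1}\), \(\pmb\Sigma_{C_2}\) denote the corresponding covariance matrices.
\(\tilde{\mu}\) and \(\tilde{\sigma}^2\) denote the mean and variance of the projections.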

\[{\LARGE \begin{array}{ccl} \pmb\mu &=& \frac{1}{N} \sum_1^{N}\pmb X_i \\ \pmb\mu_{C_1} &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}\pmb X_{C_1i} \\ \pmb\mu_{C_2} &=& \frac{1}{N_{C_2}} \sum_1^{N_{C_2}}\pmb X_{C_2i} \\ \pmb\Sigma &=& \frac{1}{N} \sum_1^{N}(\pmb X_i -\pmb\mu)^T(\pmb X_i -\pmb\mu) \\ \pmb\Sigma_{C_1} &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}(\pmb X_{C_1i} -\pmb\mu_{C_1} )^T(\pmb X_{C_1i} -\pmb\mu_{C_1} )\\ \pmb\Sigma_{C_2} &=& \frac{1}{N_{C_2}} \sum_1^{N_{C_2}}(\pmb X_{C_2i} -\pmb\mu_{C_2} )^T(\pmb X_{C_2i} -\pmb\mu_{C_2} )\\ \end{array} } \]

\[{\LARGE \begin{array}{ccl} \tilde\mu &=& \frac{1}{N} \sum_1^{N}\pmb X_i\pmb \theta \\ \tilde\mu_{C_1} &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}\pmb X_{C_1i}\pmb \theta \\ \tilde\mu_{C_2} &=& \frac{1}{N_{C_2}} \sum_1^{N_{C_2}}\pmb X_{C_2i}\pmb \theta \\ \tilde\sigma^2 &=& \frac{1}{N} \sum_1^{N}(\pmb X_i\pmb \theta -\tilde\mu)^2 \\ \tilde\sigma_{C_1}^2 &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}(\pmb X_{C_1i}\pmb \theta -\tilde\mu_{C_1} )^2\\ \tilde\sigma_{C_2}^2 &=& \frac{1}{N_{C_2}} \sum_1^{N_{C_2}}(\pmb X_{C_2i}\pmb \theta -\tilde\mu_{C_2} )^2\\ \end{array} } \]

Between-class: \((\tilde\mu_{C_1}-\tilde\mu_{C_2})^2\)

Within-class: \(\tilde\sigma^2_{C_1}+\tilde\sigma^2_{C_2}\)

Objective function: \(J(\pmb \theta) = \frac{(\tilde\mu_{C_1}-\tilde\mu_{C_2})^2}{\tilde\sigma^2_{C_1}+\tilde\sigma^2_{C_2}}\)
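To get a feel for the objective, it can be evaluated directly from the projected means and variances. The sketch below is illustrative only: `fisher_criterion` is a hypothetical helper name, and the two Gaussian clouds are synthetic data with a fixed seed.

```python
import numpy as np

def fisher_criterion(X1, X2, theta):
    """J(theta) computed from the projected class means and variances."""
    p1 = X1 @ theta                  # projections of class C1
    p2 = X2 @ theta                  # projections of class C2
    between = (p1.mean() - p2.mean()) ** 2
    within = p1.var() + p2.var()     # np.var divides by N, as in the formulas
    return between / within

rng = np.random.default_rng(0)
X1 = rng.multivariate_normal((1, 1), [[0.64, 0], [0, 0.64]], size=50)
X2 = rng.multivariate_normal((3, 3), [[0.64, 0], [0, 0.64]], size=50)

along = np.array([1.0, 1.0])     # roughly along the line joining the class means
across = np.array([1.0, -1.0])   # perpendicular to that line
J_along = fisher_criterion(X1, X2, along)
J_across = fisher_criterion(X1, X2, across)
print(J_along, J_across)
```

Projecting along the line that joins the class means gives a much larger value than projecting across it. Rescaling `theta` leaves the value unchanged, which is why only the direction matters.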

\[{\LARGE \begin{array}{ccl} J(\pmb \theta) &=& \frac{(\tilde\mu_{C_1}-\tilde\mu_{C_2})^2}{\tilde\sigma^2_{C_1}+\tilde\sigma^2_{C_2}}\\ \text{numerator}&=&(\tilde\mu_{C_1}-\tilde\mu_{C_2})^2\\ &=&(\frac{1}{N_{C_1}} \sum_1^{N_{C_1}}\pmb X_{C_1i}\pmb \theta- \frac{1}{N_{C_2}} \sum_1^{N_{C_2}}\pmb X_{C_2i}\pmb \theta)^2\\ &=&((\pmb\mu_{C_1}-\pmb\mu_{C_2})\pmb \theta)^2\\ &=&\pmb \theta^T(\pmb\mu_{C_1}-\pmb\mu_{C_2})^T(\pmb\mu_{C_1}-\pmb\mu_{C_2})\pmb \theta\\ \tilde\sigma^2_{C_1} &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}(\pmb X_{C_1i}\pmb \theta -\tilde\mu_{C_1} )^2\\ &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}(\pmb X_{C_1i}\pmb \theta -\frac{1}{N_{C_1}} \sum_1^{N_{C_1}}\pmb X_{C_1i}\pmb \theta )^2\\ &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}((\pmb X_{C_1i} -\frac{1}{N_{C_1}} \sum_1^{N_{C_1}}\pmb X_{C_1i})\pmb \theta )^2\\ &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}((\pmb X_{C_1i} -\pmb\mu_{C_1})\pmb \theta )^2\\ &=& \frac{1}{N_{C_1}} \sum_1^{N_{C_1}}\pmb \theta^T(\pmb X_{C_1i} -\pmb\mu_{C_1})^T(\pmb X_{C_1i} -\pmb\mu_{C_1})\pmb \theta \\ &=& \pmb\theta^T(\frac{1}{N_{C_1}} \sum_1^{N_{C_1}} (\pmb X_{C_1i} -\pmb\mu_{C_1})^T(\pmb X_{C_1i} -\pmb\mu_{C_1}))\pmb \theta \\ &=& \pmb\theta^T\pmb\Sigma_{C_1}\pmb \theta \\ \tilde\sigma^2_{C_2} &=& \pmb\theta^T\pmb\Sigma_{C_2}\pmb \theta\\ \text{denominator}&=&\pmb\theta^T\pmb\Sigma_{C_1}\pmb \theta+\pmb\theta^T\pmb\Sigma_{C_2}\pmb \theta\\ &=&\pmb\theta^T(\pmb\Sigma_{C_1}+\pmb\Sigma_{C_2})\pmb \theta\\ \end{array} } \]

\({\LARGE \therefore}\)

\( {\LARGE \begin{array}{ccl} J(\pmb \theta) &=& \frac{\pmb \theta^T(\pmb\mu_{C_1}-\pmb\mu_{C_2})^T(\pmb\mu_{C_1}-\pmb\mu_{C_2})\pmb \theta}{\pmb\theta^T(\pmb\Sigma_{C_1}+\pmb\Sigma_{C_2})\pmb \theta}\\ \end{array} } \)

Let \(S_b = (\pmb\mu_{C_1}-\pmb\mu_{C_2})^T(\pmb\mu_{C_1}-\pmb\mu_{C_2})\) and \(S_w = \pmb\Sigma_{C_1}+\pmb\Sigma_{C_2}\).

\(S_b\) is the between-class scatter.

\(S_w\) is the within-class scatter.

The objective then becomes \({\LARGE J(\pmb \theta) = \frac{\pmb \theta^T S_b \pmb \theta}{\pmb\theta^TS_w\pmb \theta}}\)
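As a sanity check, the matrix form of the objective can be compared numerically with the definition in terms of projected statistics. This is a sketch on synthetic data. The seed, the cluster parameters and the random direction `theta` are arbitrary choices, and `np.cov(..., bias=True)` is used so that the covariance divides by N as in the formulas.

```python
import numpy as np

rng = np.random.default_rng(1)
X1 = rng.normal(loc=(1, 1), scale=0.8, size=(50, 2))
X2 = rng.normal(loc=(3, 3), scale=0.8, size=(50, 2))
theta = rng.normal(size=(2, 1))  # arbitrary direction

# Row-vector convention: class means are 1 x d rows
mu1 = X1.mean(axis=0, keepdims=True)
mu2 = X2.mean(axis=0, keepdims=True)

S_b = (mu1 - mu2).T @ (mu1 - mu2)                         # between-class scatter
S_w = np.cov(X1.T, bias=True) + np.cov(X2.T, bias=True)   # within-class scatter

# J via the quadratic forms
J_matrix = (theta.T @ S_b @ theta).item() / (theta.T @ S_w @ theta).item()

# J via the projected means and variances
p1, p2 = X1 @ theta, X2 @ theta
J_projected = (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())
print(J_matrix, J_projected)
```

The two numbers agree up to floating-point rounding, which is what the derivation claims.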

Differentiate

\[{\LARGE \begin{array}{rcl} \frac{\partial J(\pmb \theta)}{\partial \pmb\theta } &=& \frac{\partial\frac{\pmb \theta^T \pmb S_b \pmb \theta}{\pmb\theta^T\pmb S_w\pmb \theta}}{\partial\pmb\theta}\\ &=& \frac{\partial(\pmb \theta^T \pmb S_b \pmb \theta(\pmb\theta^T\pmb S_w\pmb \theta)^{-1})}{\partial\pmb\theta}\\ &=& \frac{\partial(\pmb \theta^T \pmb S_b \pmb \theta )}{\partial\pmb\theta}(\pmb\theta^T\pmb S_w\pmb \theta)^{-1}+\pmb \theta^T \pmb S_b \pmb \theta \frac{\partial((\pmb\theta^T\pmb S_w\pmb \theta)^{-1})}{\partial\pmb\theta}\\ &=&2\pmb S_b\pmb\theta(\pmb\theta^T\pmb S_w\pmb \theta)^{-1}+ \pmb \theta^T \pmb S_b \pmb \theta (- \frac{1}{(\pmb \theta^T \pmb S_w \pmb \theta )^2}) (2\pmb S_w\pmb\theta) \end{array} } \]
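The derivative can be checked against central finite differences. The sketch below assumes symmetric \(\pmb S_b\) and \(\pmb S_w\) (true for scatter matrices) and writes the gradient as a column vector. The matrices and the test point are made-up values, and `J` and `grad_J` are hypothetical helper names.

```python
import numpy as np

def J(theta, S_b, S_w):
    return (theta.T @ S_b @ theta).item() / (theta.T @ S_w @ theta).item()

def grad_J(theta, S_b, S_w):
    """Analytic gradient of J, as a column vector (S_b, S_w symmetric)."""
    num = (theta.T @ S_b @ theta).item()
    den = (theta.T @ S_w @ theta).item()
    return 2 * S_b @ theta / den - num / den**2 * 2 * S_w @ theta

# Made-up symmetric matrices and an arbitrary evaluation point
d = np.array([[2.0], [-1.0]])
S_b = d @ d.T
S_w = np.array([[1.5, 0.3], [0.3, 0.8]])
theta = np.array([[0.7], [0.2]])

# Central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.zeros_like(theta)
for k in range(theta.shape[0]):
    step = np.zeros_like(theta)
    step[k, 0] = eps
    numeric[k, 0] = (J(theta + step, S_b, S_w) - J(theta - step, S_b, S_w)) / (2 * eps)

analytic = grad_J(theta, S_b, S_w)
print(numeric.ravel(), analytic.ravel())
```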

Set the derivative to zero

\[{\LARGE \begin{array}{rcl} \pmb 0 &=& 2\pmb S_b\pmb\theta(\pmb\theta^T\pmb S_w\pmb \theta)^{-1}+ \pmb \theta^T \pmb S_b \pmb \theta (- \frac{1}{(\pmb \theta^T \pmb S_w \pmb \theta )^2}) (2\pmb S_w\pmb\theta) \\ 2\pmb S_b\pmb\theta(\pmb\theta^T\pmb S_w\pmb \theta)^{-1}&=& \pmb \theta^T \pmb S_b \pmb \theta ( \frac{1}{(\pmb \theta^T \pmb S_w \pmb \theta )^2}) (2\pmb S_w\pmb\theta)\\ \pmb S_b\pmb\theta(\pmb\theta^T\pmb S_w\pmb \theta) &=& (\pmb \theta^T \pmb S_b \pmb \theta )\pmb S_w\pmb\theta\\ (\pmb \theta^T \pmb S_b \pmb \theta )\pmb S_w\pmb\theta &=& \pmb S_b\pmb\theta(\pmb\theta^T\pmb S_w\pmb \theta) \\ \pmb\theta&=& \pmb S_w^{-1}\frac{\pmb \theta^T \pmb S_w \pmb \theta } {\pmb\theta^T\pmb S_b\pmb \theta}\pmb S_b\pmb\theta\\ \pmb\theta &=& \pmb S_w^{-1}\frac{\pmb \theta^T \pmb S_w \pmb \theta } {\pmb\theta^T\pmb S_b\pmb \theta}(\pmb\mu_{C_1}-\pmb\mu_{C_2})^T(\pmb\mu_{C_1}-\pmb\mu_{C_2})\pmb\theta\\ \end{array} } \]

\(\because\)
\(\frac{\pmb \theta^T \pmb S_w \pmb \theta }{\pmb\theta^T\pmb S_b\pmb \theta}\) and \((\pmb\mu_{C_1}-\pmb\mu_{C_2})\pmb\theta\) are both scalars, so they do not affect the direction of \(\pmb \theta\)

\({\LARGE \therefore}\)
\( {\LARGE \pmb\theta \propto \pmb S_w^{-1}(\pmb\mu_{C_1}-\pmb\mu_{C_2})^T } \)
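The closed-form direction can be tested empirically: no randomly drawn direction should score higher on \(J\). This is a sketch on synthetic, correlated data. The seed, the covariance values and the number of random candidates are arbitrary assumptions, and `np.linalg.solve` is used in place of an explicit inverse.

```python
import numpy as np

rng = np.random.default_rng(2)
cov = [[1.0, 0.6], [0.6, 0.5]]   # shared, correlated class covariance
X1 = rng.multivariate_normal((1, 1), cov, size=200)
X2 = rng.multivariate_normal((3, 2), cov, size=200)

mu1 = X1.mean(axis=0, keepdims=True)
mu2 = X2.mean(axis=0, keepdims=True)
S_b = (mu1 - mu2).T @ (mu1 - mu2)
S_w = np.cov(X1.T, bias=True) + np.cov(X2.T, bias=True)

def J(theta):
    return (theta.T @ S_b @ theta).item() / (theta.T @ S_w @ theta).item()

# Closed form: solve S_w theta = (mu1 - mu2)^T instead of forming the inverse
theta_star = np.linalg.solve(S_w, (mu1 - mu2).T)

# Compare against many random directions
candidates = rng.normal(size=(1000, 2, 1))
best_random = max(J(c) for c in candidates)
print(J(theta_star), best_random)
```

With correlated features the optimal direction is generally not parallel to the difference of the means. The two coincide only in the special case \(\pmb S_w \propto \pmb I\).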

\({\LARGE \mathcal{{\color{Blue} {if}} } } \pmb S_w \propto \pmb I\)

\({\LARGE \pmb \theta \propto (\pmb\mu_{C_1}-\pmb\mu_{C_2})^T }\)

Projection of an arbitrary point

\( {\Large proj_{\pmb \theta}(x) = x^T\pmb\theta } \)

Threshold

\( {\Large \begin{array}{rcl} threshold &=& \frac{N_{C_1}\tilde\mu_{C_1}+N_{C_2}\tilde\mu_{C_2}}{N_{C_1}+N_{C_2}}\\ &=& \frac{N_{C_1}\pmb\mu_{C_1}\pmb\theta+N_{C_2}\pmb\mu_{C_2}\pmb\theta}{N_{C_1}+N_{C_2}} \end{array} } \)
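A size-weighted average of the projected class means is the same thing as the projected mean of all the data, which the following sketch confirms numerically. The unequal class sizes and the direction `theta` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
X1 = rng.multivariate_normal((1, 1), [[0.64, 0], [0, 0.64]], size=60)
X2 = rng.multivariate_normal((3, 3), [[0.64, 0], [0, 0.64]], size=40)
theta = np.array([[-1.0], [-1.0]])  # hypothetical direction pointing from C2 towards C1

n1, n2 = len(X1), len(X2)
mu1_t = (X1 @ theta).mean()   # projected mean of C1
mu2_t = (X2 @ theta).mean()   # projected mean of C2

# Threshold: size-weighted average of the projected class means
threshold = (n1 * mu1_t + n2 * mu2_t) / (n1 + n2)

# Same value obtained by projecting the pooled data
overall = (np.vstack([X1, X2]) @ theta).mean()
print(threshold, overall)
```

Because of the weighting, the threshold sits closer to the projected mean of the larger class rather than at the midpoint between the two.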

Dependencies

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Synthetic dataset

n = 100
# Class +1: 50 points drawn around (1, 1)
X = np.random.multivariate_normal((1, 1), [[0.64, 0], [0, 0.64]], size = int(n/2))
# Class -1: 50 points drawn around (3, 3), appended after the first 50 rows
X = np.insert(X, 50, np.random.multivariate_normal((3, 3), [[0.64, 0], [0,0.64]], size = int(n/2)),0)
#X = np.insert(X, 0, 1, 1)
m = X.shape[1]  # number of features
y = np.array([1]*50+[-1]*50).reshape(-1,1)
plt.scatter(X[:50, -2], X[:50, -1])
plt.scatter(X[50:, -2], X[50:, -1], c = "#ff4400")

[Figure: scatter plot of the two synthetic classes]

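X1 = X[(y==1).reshape(-1)]   # samples labelled +1
X0 = X[(y==-1).reshape(-1)]  # samples labelled -1
n1 = np.array([[X1.shape[0]]])
n0 = np.array([[X0.shape[0]]])
mu1 = X1.mean(axis = 0).reshape(-1,1)
mu0 = X0.mean(axis = 0).reshape(-1,1)
Sigma1 = np.cov(X1.T, bias = True)  # bias=True divides by N, matching the derivation
Sigma0 = np.cov(X0.T, bias = True)
# theta is proportional to S_w^{-1} (mu1 - mu0); solve the linear system rather than inverting S_w
theta = np.linalg.solve(Sigma1 + Sigma0, mu1 - mu0)
# Size-weighted average of the projected class means
threshold = (n1*mu1 + n0*mu0).T@theta/(n1 + n0)
def getForecast(x):
    # Projection of a column vector x onto theta
    return x.T @ theta
threshold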

Prediction

print(f'{ 1 if getForecast(np.array([[1],[1]])) > threshold else -1}')
1
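The single-point prediction extends naturally to a whole array. The block below is a self-contained sketch rather than the post's own code: it regenerates comparable synthetic data with a fixed seed, and `predict` is a hypothetical helper that labels every row at once as 1 or -1.

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.vstack([
    rng.multivariate_normal((1, 1), [[0.64, 0], [0, 0.64]], size=50),
    rng.multivariate_normal((3, 3), [[0.64, 0], [0, 0.64]], size=50),
])
y = np.array([1] * 50 + [-1] * 50)

# Fit: class means, within-class scatter, direction and threshold
X1, X0 = X[y == 1], X[y == -1]
mu1 = X1.mean(axis=0).reshape(-1, 1)
mu0 = X0.mean(axis=0).reshape(-1, 1)
S_w = np.cov(X1.T, bias=True) + np.cov(X0.T, bias=True)
theta = np.linalg.solve(S_w, mu1 - mu0)
threshold = ((len(X1) * mu1 + len(X0) * mu0).T @ theta).item() / len(X)

def predict(points):
    """Label each row of `points` as 1 or -1 by comparing its projection to the threshold."""
    return np.where((points @ theta).ravel() > threshold, 1, -1)

accuracy = (predict(X) == y).mean()
print(accuracy)
```

Since `theta` points from the class -1 mean towards the class +1 mean, projections above the threshold are labelled 1. The accuracy here is measured on the training data, so it is an optimistic estimate.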

Visualizing the decision boundary

plt.scatter(X[:50, -2], X[:50, -1])
plt.scatter(X[50:, -2], X[50:, -1], c = "#ff4400")
# The boundary is the line theta[0]*x + theta[1]*y = threshold.
# Draw it directly instead of scanning a grid (assumes theta[1] != 0).
xs = np.linspace(-1, 5, 200)
ys = (threshold.item() - theta[0, 0] * xs) / theta[1, 0]
plt.plot(xs, ys, c = "#000000")
plt.ylim(-1, 5)

[Figure: the two classes with the decision boundary drawn in black]

From: https://www.cnblogs.com/RanX2018/p/16742892.html
