LASSO Logistic Regression Model

This model is mainly used to classify samples. Suppose there are \(n\) samples and \(p\) variables with \(p\gg n\), and the response \(Y\) is a 0-1 variable marking the two classes: for example, 1 means the sample is diseased and 0 means it is normal.
Data
- response \(Y=(Y_1,Y_2,\cdots,Y_n)^T\)
- feature matrix \(X=(X_1,X_2,\cdots,X_p)=(X_{(1)},X_{(2)},\cdots,X_{(n)})^T\), where \(X_j\) denotes the \(j\)-th column (variable) and \(X_{(i)}\) the \(i\)-th row (sample)
Assume \(Y_i\sim \mathrm{Bernoulli}(\pi_i)\), where \(\pi_i=P(Y_i=1)\).
The logistic regression model is
\(\ln\frac{\pi_i}{1-\pi_i}=X_{(i)}^T\beta,\quad i=1,2,\cdots,n\)
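As a minimal numeric sketch of what the model says (the matrix X and coefficient vector beta below are made-up illustrations, not part of the original derivation), the class probabilities are recovered from the log-odds by the logistic function:

% Hypothetical example: n = 5 samples, p = 3 variables
X = randn(5,3);                    % made-up feature matrix
beta = [1.5; -2; 0];               % made-up coefficient vector
eta = X*beta;                      % log-odds X_(i)'*beta for every sample
pi_hat = exp(eta)./(1+exp(eta));   % P(Y_i = 1): inverting the logit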
1. Solving the LASSO logistic regression model
The parameter \(\beta\) is estimated by maximum likelihood. The likelihood function is:
\(\prod_{i=1}^n \pi_i^{Y_i}(1-\pi_i)^{1-Y_i}\)
Taking the negative log-likelihood, averaged over the \(n\) samples, gives the loss function:
\(l(\beta)=-\frac{1}{n}\sum_{i=1}^n \left[Y_{i}\ln(\pi_{i})+(1-Y_{i})\ln(1-\pi_{i})\right]\)
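The loss translates directly into MATLAB; a minimal sketch with made-up X, Y, and beta, just to make the formula concrete:

% Negative average log-likelihood l(beta) on made-up data
n = 5; p = 3;
X = randn(n,p);
Y = double(rand(n,1) > 0.5);           % made-up 0-1 responses
beta = randn(p,1);
phat = exp(X*beta)./(1+exp(X*beta));   % pi_i for every sample
loss = -mean(Y.*log(phat) + (1-Y).*log(1-phat));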
The Newton-Raphson algorithm turns minimizing the loss \(l(\beta)\) into an iteratively reweighted least squares problem for a linear regression.
Expanding \(l(\beta)\) to second order (Taylor expansion) around the current iterate \(\tilde{\beta}\) gives:
\(l_Q(\beta)=l(\tilde{\beta})+(\beta-\tilde{\beta})^T\nabla l(\tilde{\beta})+\frac{1}{2}(\beta-\tilde{\beta})^{T}H(\beta-\tilde{\beta})\)
where \(\nabla l(\tilde{\beta})=\frac{\partial l(\beta)}{\partial \beta}\Big|_{\beta=\tilde{\beta}}=-\frac{1}{n}X^T(Y-\tilde\Pi)\) is the gradient of \(l(\beta)\) at \(\tilde{\beta}\), and \(H=\frac{\partial^2 l(\beta)}{\partial \beta \partial \beta^T}\Big|_{\beta=\tilde{\beta}}=\frac{1}{n}X^TQX\) is the Hessian of \(l(\beta)\) at \(\tilde{\beta}\), with
\[\tilde\Pi= \begin{bmatrix} \tilde{\pi}_1 \\ \tilde{\pi}_2 \\ \vdots \\ \tilde{\pi}_n \end{bmatrix}= \begin{bmatrix} \frac{\exp\{X_{(1)}^T\tilde{\beta}\}}{1+\exp\{X_{(1)}^T\tilde{\beta}\}} \\ \frac{\exp\{X_{(2)}^T\tilde{\beta}\}}{1+\exp\{X_{(2)}^T\tilde{\beta}\}} \\ \vdots \\ \frac{\exp\{X_{(n)}^T\tilde{\beta}\}}{1+\exp\{X_{(n)}^T\tilde{\beta}\}} \end{bmatrix}\]
\[Q= \begin{bmatrix} q_{11} & 0 & \cdots & 0 \\ 0 & q_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & q_{nn} \end{bmatrix}= \begin{bmatrix} \tilde{\pi}_1(1-\tilde{\pi}_1) & 0 & \cdots & 0 \\ 0 & \tilde{\pi}_2(1-\tilde{\pi}_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde{\pi}_n(1-\tilde{\pi}_n) \end{bmatrix},\qquad \tilde{\pi}_i(1-\tilde{\pi}_i)=\frac{\exp\{X_{(i)}^T\tilde{\beta}\}}{\left(1+\exp\{X_{(i)}^T\tilde{\beta}\}\right)^2}\]
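These quantities are straightforward to form numerically; a minimal sketch at a made-up current iterate beta_t:

% Gradient and Hessian of l(beta) at the current iterate (made-up data)
n = 5; p = 3;
X = randn(n,p);
Y = double(rand(n,1) > 0.5);
beta_t = zeros(p,1);                        % current iterate tilde-beta
Pi_t = exp(X*beta_t)./(1+exp(X*beta_t));    % tilde-Pi
Q = diag(Pi_t.*(1-Pi_t));                   % weight matrix Q
g = -X'*(Y - Pi_t)/n;                       % gradient of l at tilde-beta
H = X'*Q*X/n;                               % Hessian of l at tilde-beta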
Set the gradient of \(l_Q(\beta)\) to zero:
\(\frac{\partial l_Q(\beta)}{\partial\beta}=\nabla l(\tilde{\beta})+H(\beta-\tilde{\beta})=0\)
Solving gives the update formula for \(\beta\):
\(\beta=(X^TQX)^{-1}X^T(Y-\tilde{\Pi})+(X^TQX)^{-1}X^TQX\tilde{\beta}=(X^TQX)^{-1}X^TQ\left[Q^{-1}(Y-\tilde{\Pi})+X\tilde{\beta}\right]\)
Let \(Z=Q^{-1}(Y-\tilde{\Pi})+X\tilde{\beta}\) (the working response); then \(\beta=(X^TQX)^{-1}X^TQZ\), i.e. \(\beta\) is the weighted least squares solution of a linear regression of \(Z\) on the data matrix \(X\) with weights \(Q\). It can therefore be solved by coordinate descent.
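A minimal sketch of one such step (made-up data with n > p; in the p >> n setting of this post \(X^TQX\) is singular, and the plain weighted least squares solve below only becomes usable once the LASSO penalty is added, as in the next part):

% One Newton-Raphson step written as a weighted least squares solve
n = 20; p = 3;
X = randn(n,p);
Y = double(rand(n,1) > 0.5);
beta_t = zeros(p,1);                         % current iterate tilde-beta
Pi_t = exp(X*beta_t)./(1+exp(X*beta_t));
q = Pi_t.*(1-Pi_t);                          % diagonal of Q
Z = (Y - Pi_t)./q + X*beta_t;                % working response Z
beta_new = (X'*diag(q)*X) \ (X'*diag(q)*Z);  % weighted LS solution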
Solving the LASSO-penalized weighted least squares problem by coordinate descent
The objective function is now:
\(L(\beta)=\frac{1}{2n}\sum_{i=1}^n q_{ii}(Z_i-X_{(i)}^T\beta)^2+\lambda\sum_{j=1}^p|\beta_j|\)
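A minimal sketch evaluating this objective on made-up inputs (Z and q standing in for the working response and weights from one IRLS step):

% LASSO-penalized weighted least squares objective (made-up inputs)
n = 20; p = 5; lambda = 0.1;
X = randn(n,p); Z = randn(n,1);
q = 0.25*ones(n,1); beta = randn(p,1);
L = sum(q.*(Z - X*beta).^2)/(2*n) + lambda*sum(abs(beta));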
Set the partial derivative with respect to \(\beta_j\) to zero:
\(\frac{\partial L(\beta)}{\partial \beta_j}=-\frac{1}{n}\sum_{i=1}^n q_{ii}x_{ij}\left(Z_i-\sum_{l\neq j} x_{il}\beta_l\right)+\frac{1}{n}\sum_{i=1}^n q_{ii}x_{ij}^2\beta_j+\lambda\,\mathrm{sign}(\beta_j)=0\)
Treating the sign of \(\beta_j\) case by case gives the coordinate-wise update:
If \(\beta_j>0\): \(\beta_j=\frac{\sum_{i=1}^nq_{ii}x_{ij}(Z_i-\sum_{l\neq j} x_{il}\beta_l)-n\lambda}{\sum_{i=1}^n q_{ii}x_{ij}^2}\)
If \(\beta_j<0\): \(\beta_j=\frac{\sum_{i=1}^nq_{ii}x_{ij}(Z_i-\sum_{l\neq j} x_{il}\beta_l)+n\lambda}{\sum_{i=1}^n q_{ii}x_{ij}^2}\)
At \(\beta_j=0\) the subgradient of \(\lambda|\beta_j|\) is the interval \([-\lambda,\lambda]\), so \(\beta_j=0\) if and only if \(n\lambda\ge\left|\sum_{i=1}^nq_{ii}x_{ij}(Z_i-\sum_{l\neq j} x_{il}\beta_l)\right|\)
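The three cases together are exactly the soft-thresholding operator \(S(u,n\lambda)=\mathrm{sign}(u)\max(|u|-n\lambda,0)\) applied to the numerator. A minimal sketch of one coordinate descent sweep under that reading (Z, q, and X below are made up, standing in for one IRLS step):

% One coordinate descent sweep over the LASSO-penalized weighted LS objective
n = 20; p = 5; lambda = 0.1;
X = randn(n,p);
Z = randn(n,1);                      % working response from the IRLS step (made up here)
q = 0.25*ones(n,1);                  % weights q_ii (made up here)
beta = zeros(p,1);
for j = 1:p
    r = Z - X*beta + X(:,j)*beta(j); % Z_i minus the fit excluding x_j
    u = sum(q.*X(:,j).*r);           % numerator of the update
    beta(j) = sign(u)*max(abs(u) - n*lambda, 0) / sum(q.*X(:,j).^2); % soft-threshold
end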
2. MATLAB code
Solving with the glmnet package:
clear;
clc;
load data;                % load the data: design matrix x and 0-1 response y
x = zscore(x);            % standardize each variable
fold = 10;                % 10-fold cross-validation
m = 10;                   % number of lambda values
options = glmnetSet;
options.alpha = 1;        % alpha = 1 gives the LASSO penalty
options.nlambda = m;
fit = glmnet(x,y+1,'binomial',options);   % y+1 recodes the classes {0,1} as {1,2}
CVerr = cvglmnet(x,y+1,fold,[],'class','binomial',glmnetSet,0);  % CV by misclassification error
beta_path = fit.beta;     % coefficient matrix over the whole lambda path
intercept_path = fit.a0;  % intercepts
lambda_path = fit.lambda; % lambda values
opt_index = find(CVerr.glmnetOptions.lambda == CVerr.lambda_min); % position of the optimal lambda
beta = beta_path(:,opt_index);
selectedgene = find(beta~=0); % positions of the variables selected by the model
From: https://www.cnblogs.com/DYDNyang/p/17181106.html