Overview
In the primal form, if $(x_i,y_i)$ is a misclassified point, the parameters are updated as follows: $$w\leftarrow w+\eta y_ix_i;\quad b\leftarrow b+\eta y_i$$ Suppose the initial values are $w_0=\boldsymbol 0,b_0=0$. If the parameters are updated $n_i$ times by the rule above for the misclassified point $(x_i,y_i)$, the cumulative increments of $w$ and $b$ are $\alpha_iy_ix_i$ and $\alpha_iy_i$ respectively, where $\alpha_i=n_i\eta$.
The learned parameters are therefore $$w=\sum^N_{i=1}\alpha_i y_i x_i;\quad b=\sum^N_{i=1}\alpha_i y_i$$
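This accumulation can be verified with a standalone sketch (the learning rate, point, and update count below are illustrative values chosen for the check, not taken from the text):

```python
# Apply the primal update rule n times from w = 0, b = 0 and confirm the
# accumulated increments equal alpha * y_i * x_i and alpha * y_i,
# with alpha = n * eta. All values here are illustrative.
eta = 0.5
xi, yi = (3.0, 3.0), 1.0
n = 3

w, b = [0.0, 0.0], 0.0
for _ in range(n):
    w = [wk + eta * yi * xk for wk, xk in zip(w, xi)]
    b += eta * yi

alpha = n * eta
assert w == [alpha * yi * xi[0], alpha * yi * xi[1]]
assert b == alpha * yi
```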
Algorithm
Input: training set $$T=\{(x_1,y_1),(x_2,y_2),\cdots,(x_N,y_N)\}$$ where $x_i\in\mathcal X\subseteq\boldsymbol R^n$, $y_i\in\mathcal Y=\{+1,-1\}$; step size $\eta\ (0<\eta\leq1)$
Output: $\alpha,b$; the perceptron model $f(x)=sign(\sum^N_{j=1}\alpha_jy_jx_j\cdot x+b)$, where $\alpha=(\alpha_1,\alpha_2,\cdots,\alpha_N)^T$
- Choose initial values $\alpha^{<0>}=(0,0,\cdots,0)^T,b^{<0>}=0$
- Randomly select a data point $(x_i,y_i)$ from the training set
- If $y_i(\sum^N_{j=1}\alpha_jy_jx_j\cdot x_i+b)\leq0$, update $$\alpha_i\leftarrow\alpha_i+\eta;\quad b\leftarrow b+\eta y_i$$
- Return to step 2 until the training set contains no misclassified points
Gram Matrix
Consider step 3 of the algorithm: if $y_i(\sum^N_{j=1}\alpha_jy_jx_j\cdot x_i+b)\leq0$, update $$\alpha_i\leftarrow\alpha_i+\eta;\quad b\leftarrow b+\eta y_i$$ The misclassification condition expands to
$$\begin{aligned}y_i\left(\sum^N_{j=1}\alpha_jy_jx_j\cdot x_i+b\right)&=y_i[(\alpha_1y_1x_1+\alpha_2y_2x_2+\cdots+\alpha_Ny_Nx_N)\cdot x_i+b]\\&=y_i(\alpha_1y_1x_1\cdot x_i+\alpha_2y_2x_2\cdot x_i+\cdots+\alpha_Ny_Nx_N\cdot x_i+b)\\&\leq0\end{aligned}$$
The training inputs appear only through inner products, so these can be precomputed once and stored as the Gram matrix:
$$G=[x_i\cdot x_j]_{N\times N}=\begin{bmatrix}x_1\cdot x_1&x_1\cdot x_2&\cdots&x_1\cdot x_N\\x_2\cdot x_1&x_2\cdot x_2&\cdots&x_2\cdot x_N\\\vdots&\vdots&&\vdots\\x_N\cdot x_1&x_N\cdot x_2&\cdots&x_N\cdot x_N\end{bmatrix}$$
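A minimal sketch of this precomputation using NumPy (assuming it is available; a pure-Python version appears in the implementation section below):

```python
import numpy as np

# Rows of X are the training inputs; G[i, j] = x_i . x_j.
X = np.array([[3, 3], [4, 3], [1, 1]])  # x1, x2, x3 from the worked example
G = X @ X.T                             # Gram matrix, shape (N, N)
print(G.tolist())  # [[18, 21, 6], [21, 25, 7], [6, 7, 2]]
```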
Worked Example
The iteration process of solving the earlier example with the primal algorithm:
Iteration | Misclassified point | $w$ | $b$ | $w\cdot x+b$ |
---|---|---|---|---|
$0$ | | $(0,0)^T$ | $0$ | $0$ |
$1$ | $x_1$ | $(3,3)^T$ | $1$ | $3x^{(1)}+3x^{(2)}+1$ |
$2$ | $x_3$ | $(2,2)^T$ | $0$ | $2x^{(1)}+2x^{(2)}$ |
$3$ | $x_3$ | $(1,1)^T$ | $-1$ | $x^{(1)}+x^{(2)}-1$ |
$4$ | $x_3$ | $(0,0)^T$ | $-2$ | $-2$ |
$5$ | $x_1$ | $(3,3)^T$ | $-1$ | $3x^{(1)}+3x^{(2)}-1$ |
$6$ | $x_3$ | $(2,2)^T$ | $-2$ | $2x^{(1)}+2x^{(2)}-2$ |
$7$ | $x_3$ | $(1,1)^T$ | $-3$ | $x^{(1)}+x^{(2)}-3$ |
$8$ | | $(1,1)^T$ | $-3$ | $x^{(1)}+x^{(2)}-3$ |
Update counts: $n_1=2,n_2=0,n_3=5$. The learned parameters are $$w=\alpha_1y_1x_1+\alpha_3y_3x_3=(1,1)^T;\quad b=\alpha_1y_1+\alpha_3y_3=-3$$ where $\alpha_1=n_1\eta=2\cdot1=2,\ \alpha_3=n_3\eta=5\cdot1=5$
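This arithmetic can be checked directly in plain Python, using the example's data and the update counts above:

```python
# Recover w and b from alpha_i = n_i * eta (eta = 1 here).
x = [(3, 3), (4, 3), (1, 1)]
y = [1, 1, -1]
alpha = [2, 0, 5]  # n1, n2, n3 times eta

w = [sum(a * yi * xi[k] for a, yi, xi in zip(alpha, y, x)) for k in range(2)]
b = sum(a * yi for a, yi in zip(alpha, y))
print(w, b)  # [1, 1] -3
```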
Computing with the Dual Form
Input: training set $$T=\{(x_1,+1),(x_2,+1),(x_3,-1)\}$$ where $x_1=(3,3)^T,x_2=(4,3)^T,x_3=(1,1)^T$; assume $\eta=1$
Output: $\alpha,b$; the perceptron model $f(x)=sign(\sum^N_{j=1}\alpha_jy_jx_j\cdot x+b)$
- Choose initial values $\alpha^{<0>}=(0,0,0)^T,b^{<0>}=0$
- Compute the Gram matrix $$G=\begin{bmatrix}x_1\cdot x_1&x_1\cdot x_2&x_1\cdot x_3\\x_2\cdot x_1&x_2\cdot x_2&x_2\cdot x_3\\x_3\cdot x_1&x_3\cdot x_2&x_3\cdot x_3\end{bmatrix}=\begin{bmatrix}18&21&6\\21&25&7\\6&7&2\end{bmatrix}$$
- Misclassification condition: $y_i(\sum^N_{j=1}\alpha_jy_jx_j\cdot x_i+b)\leq0$; parameter update: $$\alpha_i\leftarrow\alpha_i+\eta;\quad b\leftarrow b+\eta y_i$$
- For the point $x_1$: $$y_1\left(\sum_{j=1}^{N}\alpha_j^{<0>}y_jx_j\cdot x_1+b^{<0>}\right)=0\leq0$$ Update the parameters: $$\alpha_1^{<1>}=\alpha_1^{<0>}+\eta=1,\quad b^{<1>}=b^{<0>}+\eta y_1=1$$
- For the point $x_1$: $$y_1\left(\sum_{j=1}^{N}\alpha_j^{<1>}y_jx_j\cdot x_1+b^{<1>}\right)=y_1(\alpha_1^{<1>}y_1x_1\cdot x_1+b^{<1>})=19>0$$ For the point $x_2$: $$y_2\left(\sum_{j=1}^{N}\alpha_j^{<1>}y_jx_j\cdot x_2+b^{<1>}\right)=y_2(\alpha_1^{<1>}y_1x_1\cdot x_2+b^{<1>})=22>0$$ For the point $x_3$: $$y_3\left(\sum_{j=1}^{N}\alpha_j^{<1>}y_jx_j\cdot x_3+b^{<1>}\right)=y_3(\alpha_1^{<1>}y_1x_1\cdot x_3+b^{<1>})=-7\leq0$$ Update the parameters: $$\alpha_3^{<2>}=\alpha_3^{<1>}+\eta=1,\quad b^{<2>}=b^{<1>}+\eta y_3=0$$
- Repeat the steps above until there are no misclassified points
Iteration|Misclassified point|$\alpha_1$|$\alpha_2$|$\alpha_3$|$b$
:-:|:-:|:-:|:-:|:-:|:-:
$0$| |$0$|$0$|$0$|$0$
$1$|$x_1$|$1$|$0$|$0$|$1$
$2$|$x_3$|$1$|$0$|$1$|$0$
$3$|$x_3$|$1$|$0$|$2$|$-1$
$4$|$x_3$|$1$|$0$|$3$|$-2$
$5$|$x_1$|$2$|$0$|$3$|$-1$
$6$|$x_3$|$2$|$0$|$4$|$-2$
$7$|$x_3$|$2$|$0$|$5$|$-3$
- Obtain the parameters $$w^{<7>}=2x_1+0x_2-5x_3=(1,1)^T,\quad b^{<7>}=-3$$
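The per-iteration checks above can be replayed with a small helper implementing the dual-form misclassification test (a sketch; it hard-codes the example's labels and Gram matrix):

```python
y = [1, 1, -1]
gram = [[18, 21, 6], [21, 25, 7], [6, 7, 2]]  # Gram matrix of x1, x2, x3

def misclassified(i, alpha, b):
    # Dual-form condition: y_i * (sum_j alpha_j y_j (x_j . x_i) + b) <= 0
    val = sum(a * yj * g for a, yj, g in zip(alpha, y, gram[i]))
    return y[i] * (val + b) <= 0

# Iteration 0 -> 1: with alpha = (0,0,0), b = 0, the sum is 0 and
# 0 <= 0, so x1 triggers the first update.
print(misclassified(0, [0, 0, 0], 0))  # True
# After that update (alpha = (1,0,0), b = 1), x1 is classified
# correctly but x3 is misclassified:
print(misclassified(0, [1, 0, 0], 1), misclassified(2, [1, 0, 0], 1))  # False True
```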
Results:
- Separating hyperplane: $$x^{(1)}+x^{(2)}-3=0$$
- Perceptron model: $$f(x)=sign(x^{(1)}+x^{(2)}-3)$$
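As a final check in plain Python, the learned model classifies all three training points correctly (none of them lies on the hyperplane, so the sign-at-zero convention does not matter here):

```python
def f(x):
    # Learned perceptron model: f(x) = sign(x1 + x2 - 3)
    return 1 if x[0] + x[1] - 3 > 0 else -1

print([f((3, 3)), f((4, 3)), f((1, 1))])  # [1, 1, -1], matching the labels
```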
Implementation
Computing the Gram matrix
```python
def count_gram(x):
    """Compute the Gram matrix.

    :param x: input variables
    :return: the Gram matrix of the input variables
    """
    n_samples = len(x)
    n_features = len(x[0])
    gram = [[0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(i, n_samples):
            # The matrix is symmetric, so compute each inner product once
            gram[i][j] = sum(x[i][k] * x[j][k] for k in range(n_features))
            gram[j][i] = gram[i][j]
    return gram
```
Dual form of the perceptron learning algorithm
```python
def dual_form_perceptron(x, y, eta):
    """Dual form of the perceptron learning algorithm.

    :param x: input variables
    :param y: output variables
    :param eta: learning rate
    :return: the perceptron model's a (alpha) and b
    """
    n_samples = len(x)
    a0, b0 = [0] * n_samples, 0  # initial values: alpha = 0, b = 0
    gram = count_gram(x)  # precompute the Gram matrix
    while True:
        for i in range(n_samples):
            yi = y[i]
            # sum_j alpha_j * y_j * (x_j . x_i)
            val = sum(a0[j] * y[j] * gram[i][j] for j in range(n_samples))
            if yi * (val + b0) <= 0:  # misclassified: update and rescan
                a0[i] += eta
                b0 += eta * yi
                break
        else:
            # a full pass with no misclassified point: converged
            return a0, b0
```
Test
```python
dataset = [[(3, 3), (4, 3), (1, 1)], [1, 1, -1]]
print(dual_form_perceptron(dataset[0], dataset[1], eta=1))
```
Output: `([2, 0, 5], -3)`
From: https://blog.51cto.com/u_15767241/6049741