Cost function
Tags (space-separated): ML
1. Measurement of error
Even when a hypothesis is produced by training the learning model, its accuracy must still be evaluated before practical use.
So can a cost function measure how well our hypothesis fits the data?
\(\implies\)We demonstrate this point in the same way as the 2-D plane example shown before.
2. 2-D plane demo of the cost function
Before demonstrating the cost function, two points are assumed:
\( \begin{cases} 1.\ \text{hypothesis: } h_\Theta(x) = \Theta_1 + \Theta_2 x\\ 2.\ h_\Theta(x)\ \text{is close to our training example } (x,y) \end{cases} \)
The term "error" here leads up to the concept of "projection" in linear algebra.
You may find two collections of vectors here:
- the entries in the dataset form the columns of a matrix \(A\)
- the corresponding points on the straight line fitting these data form the columns of a matrix \(P\)
It's easy to see that the error matrix is \(e = A - P\), so finding the total error comes down to examining \(e\,e^T\).
\(A = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \end{bmatrix}\qquad P = \begin{bmatrix} \hat{x}_1 & \hat{x}_2 & \cdots & \hat{x}_n \\ \hat{y}_1 & \hat{y}_2 & \cdots & \hat{y}_n \end{bmatrix}\)
Note: each column is one example.
So far,
\(e = \begin{bmatrix} x_1-\hat{x}_1 & x_2-\hat{x}_2 & \cdots & x_n-\hat{x}_n \\ y_1-\hat{y}_1 & y_2-\hat{y}_2 & \cdots & y_n-\hat{y}_n \end{bmatrix}\qquad e^T = \begin{bmatrix} x_1-\hat{x}_1 & y_1-\hat{y}_1 \\ \vdots & \vdots \\ x_n-\hat{x}_n & y_n-\hat{y}_n \end{bmatrix}\)
\(e\,e^T = \begin{bmatrix} \sum_{i=1}^n (x_i-\hat{x}_i)^2 & \sum_{i=1}^n (x_i-\hat{x}_i)(y_i-\hat{y}_i) \\ \sum_{i=1}^n (y_i-\hat{y}_i)(x_i-\hat{x}_i) & \sum_{i=1}^n (y_i-\hat{y}_i)^2 \end{bmatrix}\)
The diagonal entries are the sums of squared errors in each coordinate.
There is no doubt that linear algebra facilitates this calculation.
(Review linear algebra.)
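The matrix calculation above can be sketched numerically. This is a minimal NumPy example with made-up toy values for \(A\) and \(P\); the point is only that the diagonal of \(e\,e^T\) collects the sums of squared errors:

```python
import numpy as np

# Hypothetical toy data: each column is one example (x_i, y_i), as in matrix A above.
A = np.array([[1.0, 2.0, 3.0],   # x_1 ... x_n
              [1.5, 2.5, 4.0]])  # y_1 ... y_n

# Predicted points P on some fitted line (values are made up for illustration).
P = np.array([[1.0, 2.0, 3.0],
              [1.2, 2.4, 3.6]])

e = A - P                      # error matrix, one column per example
G = e @ e.T                    # 2x2 matrix; its diagonal holds the sums of squares
total_sq_error = np.trace(G)   # total squared error over both coordinates
print(total_sq_error)
```

Here the \(x\)-coordinates match exactly, so the whole error comes from the \(y\) row.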
According to Andrew Ng's lecture:
\[J(\theta_1,\theta_2) = \frac{1}{2m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)^2 = \frac{1}{2m}\sum_{i=1}^m\left(\theta_1+\theta_2 x^{(i)}-y^{(i)}\right)^2 \]
\(J(\theta_1,\theta_2)\) is the cost function, also called the squared error function.
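The cost function \(J(\theta_1,\theta_2)\) translates directly into code. A minimal sketch with assumed toy data (three points generated by the line \(y = 1 + 2x\)):

```python
import numpy as np

def cost(theta1, theta2, x, y):
    """Squared error cost J(theta1, theta2) = (1/2m) * sum((theta1 + theta2*x - y)^2)."""
    m = len(x)
    residuals = theta1 + theta2 * x - y
    return residuals @ residuals / (2 * m)

# Toy data (assumed for illustration): points lying on y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

print(cost(1.0, 2.0, x, y))  # perfect fit -> 0.0
print(cost(0.0, 2.0, x, y))  # every prediction off by 1 -> 0.5
```

A perfect fit drives every residual to zero, so \(J = 0\) is the smallest possible value.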
3. A good model
Q: What kind of result do we want the model to produce?
A: \(J(\theta_1,\theta_2)\) should be as small as possible.
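"As small as possible" can be illustrated with a crude scan over candidate slopes. This sketch (toy data assumed, same line \(y = 1 + 2x\) as above) fixes \(\theta_1\) and shows that \(J\) is minimized at the true slope:

```python
import numpy as np

# Toy data (assumed): generated by y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

def cost(theta1, theta2):
    r = theta1 + theta2 * x - y
    return r @ r / (2 * len(x))

# Scan theta2 over a grid with theta1 fixed at its true value 1.0.
candidates = np.linspace(0.0, 4.0, 41)
best = min(candidates, key=lambda t2: cost(1.0, t2))
print(best)  # the slope with the smallest J
```

In practice this minimization is done by gradient descent or the normal equation rather than a grid scan, but the goal is the same: the parameters with the smallest \(J\).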
From: https://www.cnblogs.com/UQ-44636346/p/16757436.html