Here are my notes for lecture three.
the linear model
The linear model is a basic and important model in machine learning.
1. input representation
The raw data we get usually needs some preprocessing before learning, and most of that work is on the input data.
In a linear model, if the input is
input = (x1, x2, x3, x4, x5, ..., xn)
then the model is
model = (w1, w2, w3, w4, w5, ..., wn)
That means the learning algorithm has to figure out the values of all these w's, one for each input coordinate.
With raw input, n can be large, so it is clear that an input representation is necessary: we pick out a few features of the input and use them as the input in place of the raw data.
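As a rough illustration of what an input representation can look like, here is a minimal sketch that reduces a small grayscale image to two features. The feature choice (average intensity and left-right symmetry), the function name extract_features, and the toy image are all assumptions made for this example and are not taken from the lecture notes above.

```python
def extract_features(image):
    """Map a grayscale image (list of rows, values in [0, 1]) to two
    hypothetical features: average intensity and left-right symmetry."""
    n_rows = len(image)
    n_cols = len(image[0])
    n_pixels = n_rows * n_cols

    # Average intensity: mean of all pixel values.
    intensity = sum(sum(row) for row in image) / n_pixels

    # Symmetry: negative mean absolute difference between each pixel
    # and its mirror pixel in the same row (0 means perfectly symmetric).
    asymmetry = sum(
        abs(row[j] - row[n_cols - 1 - j]) for row in image for j in range(n_cols)
    )
    symmetry = -asymmetry / n_pixels

    return (intensity, symmetry)


# Toy 3x3 image: a vertical bar in the middle column (made-up data).
image = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]
features = extract_features(image)
print(len(image) * len(image[0]), "raw inputs ->", len(features), "features")
```

With features like these, the model needs only one weight per feature and not one weight per pixel.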
2. linear classification
A linear model can be used for classification: the learning algorithm uses a line (a hyperplane in higher dimensions) to separate the classes.
Given a linear model, we provide the input, and the classification is read off from the sign of the output. E.g. y = f(X); if f(X) > 0 and f(X') < 0,
then X and X' belong to different classes.
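The sign rule can be sketched in a few lines. This is only an illustration: the weights and the two points are made-up values, and f here is assumed to be the plain weighted sum w1*x1 + ... + wn*xn with no separate threshold term.

```python
def f(w, x):
    """Linear score: the weighted sum of the input coordinates."""
    return sum(wi * xi for wi, xi in zip(w, x))


def classify(w, x):
    """Return +1 if the score is positive, otherwise -1."""
    return 1 if f(w, x) > 0 else -1


# Hypothetical weights and two hypothetical inputs.
w = (1.0, -1.0)
x_a = (2.0, 0.5)   # f = 2.0 - 0.5 = 1.5, positive side
x_b = (0.5, 2.0)   # f = 0.5 - 2.0 = -1.5, negative side

label_a = classify(w, x_a)
label_b = classify(w, x_b)
print(label_a, label_b)
```

The two points land on opposite sides of the line x1 = x2, so they get different labels.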
As mentioned above, a linear model has as many parameters as the input has coordinates. So how do we come up with a correct model?
There is a basic learning algorithm for this called the Perceptron Learning Algorithm, or PLA.
PLA starts from an initial model,
and the learning algorithm corrects it each time it checks the model against the data and finds a misclassified point.
Therefore, PLA is an algorithm that reaches its
final hypothesis through a series of such checks and corrections.
So we can get a linear model with PLA.
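A minimal sketch of PLA, assuming the standard perceptron update (add y * x to the weights whenever a point x with label y is misclassified), zero initial weights, and a constant 1 as the first coordinate of every point so that the line need not pass through the origin. The data, the function names and the max_updates cap are made up for this example.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(value):
    return 1 if value > 0 else -1


def pla(points, labels, max_updates=1000):
    """Perceptron Learning Algorithm on points with labels in {+1, -1}."""
    w = [0.0] * len(points[0])          # initial model: all weights zero
    for _ in range(max_updates):
        for x, y in zip(points, labels):
            if sign(dot(w, x)) != y:    # check the model against one point
                # Correct the model using the misclassified point.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                break
        else:
            return w                    # every point is classified correctly
    return w                            # cap reached (data may not be separable)


# Toy, linearly separable data; the first coordinate is the constant 1.
points = [(1, 2, 2), (1, 3, 1), (1, -1, -2), (1, -2, -1)]
labels = [1, 1, -1, -1]

w = pla(points, labels)
print(w)
```

The cap on updates is a practical guard: on data that is not linearly separable, the plain algorithm would never stop.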
3. linear regression
What is linear regression?
In fact, it is really familiar to us.
Regression means a real-valued output: if the target is a real-valued
function, you have a regression problem, and when we use a linear model to deal with it,
it is linear regression.
The model I use is
h(X) = W^T X
where W and X are in vector form. I need to figure out W to complete this model.
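In fact, this problem has a really simple solution. First, let us discuss the error. f(X) is our target function,
and we hope h(X) approximates f(X) as well as possible. However, there will always be some error. We use squared error in the linear model. If E means the error over N data points, then
E(W) = (1/N) ||XW - Y||^2
where W is the weight vector, X is the matrix whose rows are the input vectors, and Y is the vector of outputs.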
Of course, we want to minimize E. So we take the derivative with respect to W and set it equal to 0:
(2/N) X^T (XW - Y) = 0
W = (X^T X)^(-1) X^T Y
As you see, we figure out W with a single matrix operation, provided X^T X is invertible. (X and Y are the input data and output data we already have.) Isn't it a simple method?
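The matrix solution can be tried out with NumPy. This is a sketch on made-up data generated from y = 1 + 2x, with a column of ones added to X so the model can learn an intercept; the variable names are chosen for this example only. It solves the normal equations (X^T X) W = X^T Y directly and compares the result with np.linalg.lstsq, which is usually the numerically safer choice.

```python
import numpy as np

# Toy data from the hypothetical target y = 1 + 2*x (no noise).
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])   # rows are input vectors (1, x)
Y = 1.0 + 2.0 * x

# Normal equations: (X^T X) W = X^T Y, solved without forming the inverse.
W = np.linalg.solve(X.T @ X, X.T @ Y)

# Same least-squares problem solved by NumPy's dedicated routine.
W_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Mean squared error of the fitted model on the data.
E = np.mean((X @ W - Y) ** 2)

print(W, W_lstsq, E)
```

Because the toy data lies exactly on a line, both solutions recover the intercept 1 and slope 2 up to rounding, and the error is essentially zero.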
Finally, linear regression can also be used in linear classification. The weights found by the linear regression method can serve as the initial model
for classification, which PLA then refines into the final model.
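A sketch of that combination, under the same assumptions as any textbook perceptron: labels are +1 and -1, the update adds y * x for a misclassified point, and the first column of X is a constant 1. The data and the function name pla are made up for this example. Regression treats the labels as real-valued targets, and its weights are handed to PLA as the starting point.

```python
import numpy as np


def pla(X, y, w_init, max_updates=1000):
    """Run perceptron updates starting from w_init; return weights and update count."""
    w = w_init.astype(float).copy()
    updates = 0
    for _ in range(max_updates):
        predictions = np.where(X @ w > 0, 1, -1)
        wrong = np.flatnonzero(predictions != y)
        if wrong.size == 0:
            break                       # all points classified correctly
        i = wrong[0]
        w = w + y[i] * X[i]             # correct the model with one bad point
        updates += 1
    return w, updates


# Toy, linearly separable data; the first column is the constant 1.
X = np.array([
    [1.0, 2.0, 2.0],
    [1.0, 3.0, 1.0],
    [1.0, -1.0, -2.0],
    [1.0, -2.0, -1.0],
])
y = np.array([1, 1, -1, -1])

# Step 1: linear regression on the +1/-1 labels gives the initial weights.
w_reg, *_ = np.linalg.lstsq(X, y.astype(float), rcond=None)

# Step 2: PLA starts from those weights and fixes any remaining mistakes.
w_final, updates = pla(X, y, w_reg)

print(w_reg, w_final, updates)
```

On this small data set the regression weights already separate the two classes, so PLA has nothing left to correct. On harder data the regression solution is only a good starting point, and PLA does the remaining work.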