Had some free time, so I wrote a linear regression implementation from scratch

Date: 2023-01-25 18:57:39


Previously I had only written linear regression code in Matlab, for Andrew Ng's Machine Learning course on Coursera. Python has ready-made libraries for this, but for practice I wrote a linear regression implementation myself anyway. Logistic regression works on a similar principle, except that a diag matrix gets multiplied in. The dataset is a .mat file from the course homework. Since this was only for practice, I didn't pay much attention to the format of the input X; for example, the code assumes X has a single feature. My impression is that numpy is still not as convenient as Matlab for matrix operations.
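For reference, these are the (standard) cost function and batch gradient-descent update that the code below implements, with $\alpha$ the learning rate and $X$ already carrying the extra column of ones:

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\big(\theta^{T}x^{(i)} - y^{(i)}\big)^{2} = \frac{1}{2m}(X\theta - y)^{T}(X\theta - y),\qquad \theta := \theta - \frac{\alpha}{m}\,X^{T}(X\theta - y)$$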


# coding=utf-8
# All rights reserved; infringement will not be pursued
# typhoonbxq
# the University of *

import scipy.io as scio
import numpy as np
import matplotlib.pyplot as plt

class LR:
    def __init__(self,X,Y):
        self.m = len(Y)                                              # num of training examples
        self.n = len(X[0]) + 1                                       # num of features, adding "1"
        self.X = np.hstack((np.ones((self.m,1)),X))                  # Add '1' to each training example
        self.Y = Y.reshape((self.m,1))
        self.theta = 0
        self.cost = []


    def fit(self):
        theta = np.zeros((self.n,1)) + 0.01                          # Initialize theta with a small value
        cost_old = 0.0
        M = self.X.dot(theta) - self.Y
        cost_new = M.T.dot(M)[0][0] / 2.0 / self.m
        # print('the initial cost is', cost_new)
        i = 0
        while np.abs(cost_new - cost_old) > 0.001:
            cost_old = cost_new
            theta_grad = self.X.T.dot(self.X.dot(theta) - self.Y) / self.m   # batch gradient of the cost
            # print('in the %d-th iteration' % i, 'theta:', theta)
            # print('theta_grad :', theta_grad)
            theta = theta - theta_grad * 0.001                       # 0.001 is the learning rate; I have no idea how to set it
                                                                     # properly -- practice and the cost curve will tell me everything
            M = self.X.dot(theta) - self.Y                           # M is a temporary variable (the residuals)
            cost_new = M.T.dot(M)[0][0] / 2.0 / self.m               # same cost as above, averaged over the m training examples
            i = i + 1                                                # iteration counter
            self.cost.append(cost_new)
        self.theta = theta
        self.cost = np.array(self.cost)
        l = len(self.cost)
        axis_X = np.arange(l)

        # I wanna show how the cost changes in the iteration process
        # This better tells me that my code is right
        plt.plot(axis_X,self.cost)
        plt.xlabel('Iteration No.')
        plt.ylabel('Total Cost')
        plt.title('Iteration VS Cost')
        plt.show()

    def pred(self,X):
        X = np.hstack( ( np.ones( (len(X),1) ), X ) )                # prepend the '1' column, same order as in __init__
        return np.dot(X,self.theta)

    def plot(self):
        x = np.linspace(0,30,401)
        y = x * self.theta[1] + self.theta[0]
        plt.plot(x,y,'r')
        plt.legend(['Prediction Line'])


# Import the data (scipy.io.loadmat already returns numpy arrays, so the np.array calls below are just a safeguard)
data = scio.loadmat('F:\\data.mat')
X = data['X']
X = np.array(X)
Y = data['Y']
Y = np.array(Y)

# Train the linear regression on the given dataset
NN = LR(X,Y)
NN.fit()
X_test = np.array([[20]])
Y_test = NN.pred(X_test)
print "When the input is %.2f , the output is %.2f" %(X_test,Y_test)

# Scatter the data as well as the calculated prediction line
plt.scatter(X[:,0],Y)
NN.plot()
plt.show()





The main problem was the coefficient on theta_grad (the learning rate): at first I set it too large and the iteration didn't converge. But all of this is pretty basic and easy to sort out.
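To make the learning-rate issue concrete, here is a minimal self-contained sketch (the gd_cost_history helper and the synthetic data are made up for illustration, not the data.mat file used above) that runs the same update rule with a few different values of alpha and plots the cost curves:

import numpy as np
import matplotlib.pyplot as plt

def gd_cost_history(X, Y, alpha, iters=200):
    m = len(Y)
    Xb = np.hstack((np.ones((m, 1)), X))                  # prepend the '1' column, same as in LR.__init__
    theta = np.zeros((Xb.shape[1], 1)) + 0.01
    history = []
    for _ in range(iters):
        theta = theta - alpha * Xb.T.dot(Xb.dot(theta) - Y) / m
        M = Xb.dot(theta) - Y
        history.append(M.T.dot(M)[0][0] / 2.0 / m)
    return history

np.random.seed(0)
X_demo = np.random.rand(100, 1) * 30                      # one feature, values roughly in [0, 30]
Y_demo = 2.5 * X_demo + 1.0 + np.random.randn(100, 1)

for alpha in (0.0005, 0.002, 0.006):                      # with features this large, values much above 0.006 blow up
    plt.plot(gd_cost_history(X_demo, Y_demo, alpha), label='alpha = %g' % alpha)
plt.xlabel('Iteration No.')
plt.ylabel('Total Cost')
plt.legend()
plt.show()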


One more thing I want to add: a diag matrix showed up when I was taking the derivative. What is actually happening is that different vectors are each being multiplied by different data values; if you collect all of that into a single, cleaner expression, it becomes one matrix (containing the vectors just mentioned) multiplied by a diag matrix, which looks much more elegant as a formula.
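To make that concrete, here is how I would write it out (this is just my reading of the note above, using the standard Newton's-method update for logistic regression from the course; the code above does not implement it): with $h = \sigma(X\theta)$ the vector of predictions,

$$\nabla_\theta J = \frac{1}{m} X^{T}(h - y),\qquad H = \frac{1}{m} X^{T}\,\mathrm{diag}\big(h_i(1-h_i)\big)\,X,\qquad \theta := \theta - H^{-1}\nabla_\theta J$$

Each row of $X$ is scaled by its own per-example weight $h_i(1-h_i)$; collecting those scalings into one $\mathrm{diag}(\cdot)$ factor is exactly the "matrix times a diag matrix" form described above.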