Logistic Regression

Date: 2021-08-14 23:39:41

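Basic Concepts

Logistic regression is a probabilistic, nonlinear regression model. Despite the word "regression" in its name, it is actually a classification method. It is typically used to study whether some outcome will occur under given conditions. For example: given measurements of a tumor in a patient's body, decide whether the tumor is benign or malignant.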

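Logistic Regression and Linear Regression

Like linear regression, logistic regression needs a hypothesis function $h_\theta(x)$ and a cost function $J(\theta)$, and the overall framework is largely the same.
In linear regression: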

$$h_\theta(x)=\theta_0+\theta_1x_1+\dots+\theta_nx_n$$

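Logistic regression additionally introduces a sigmoid function, which maps this result into the interval $(0,1)$.
Sigmoid function:
$$\pi(x)=\frac{1}{1+e^{-x}}$$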

The hypothesis function in logistic regression is therefore
$$h_\theta(x)=\pi(\theta^Tx)=\frac{1}{1+e^{-\theta^Tx}}$$

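$\pi(x)$ has domain $(-\infty,+\infty)$ and range $(0,1)$, so the final result is mapped into $(0,1)$ and can be read as a probability:
$$P(y=1\mid x,\theta)=\pi(\theta^Tx),\qquad P(y=0\mid x,\theta)=1-\pi(\theta^Tx)$$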

When $h_\theta(x)\ge 0.5$, predict $y=1$.
When $h_\theta(x)<0.5$, predict $y=0$.
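As a small illustration of the hypothesis and the 0.5 threshold, here is a minimal MATLAB sketch. The parameter vector `theta` and the three sample rows in `X` are made-up values chosen only for the example. Because the sigmoid equals 0.5 exactly at 0, thresholding $h_\theta(x)$ at 0.5 is the same as checking whether $\theta^Tx\ge 0$.

```matlab
% Sigmoid as an anonymous function: pi(z) = 1 / (1 + e^(-z)), element-wise.
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical parameters [theta0; theta1; theta2] (illustrative values only).
theta = [-3; 1; 1];

% Each row is one sample [1, x1, x2]; the leading 1 multiplies theta0.
X = [1 0.5 0.5;
     1 2.0 1.0;
     1 4.0 3.0];

z = X * theta;               % theta' * x for every row: [-2; 0; 4]
h = sigmoid(z);              % P(y = 1 | x, theta), always inside (0, 1)
y_pred = double(h >= 0.5);   % decision rule; equivalent to z >= 0

disp([z h y_pred]);
```

The middle sample sits exactly on the decision boundary ($\theta^Tx=0$, so $h_\theta(x)=0.5$) and is assigned to class 1 by the $\ge$ convention.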

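Cost Function $J(\theta)$

We estimate the parameters $\theta$ by maximum likelihood. Let $p_i=P(y^{(i)}=1\mid x^{(i)};\theta)$ denote the probability that $y^{(i)}=1$ under the given conditions; then $P(y^{(i)}=0\mid x^{(i)};\theta)=1-p_i$. The probability of a single observation is therefore $P(y^{(i)})=p_i^{y^{(i)}}(1-p_i)^{1-y^{(i)}}$, and because the samples are mutually independent, the likelihood function is: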

$$L(\theta)=\prod_{i=1}^{m}\left[h_\theta(x^{(i)})\right]^{y^{(i)}}\left[1-h_\theta(x^{(i)})\right]^{1-y^{(i)}}$$

The goal is to find the parameters $\theta$ that maximize this function. The derivation is as follows:

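$$\ln L(\theta)=\sum_{i=1}^{m}\left[y^{(i)}\ln\left(h_\theta(x^{(i)})\right)+(1-y^{(i)})\ln\left(1-h_\theta(x^{(i)})\right)\right]$$

$$\ln L(\theta)=\sum_{i=1}^{m}\left(y^{(i)}\ln\left(\frac{1}{1+e^{-\theta^Tx^{(i)}}}\right)+(1-y^{(i)})\ln\left(1-\frac{1}{1+e^{-\theta^Tx^{(i)}}}\right)\right)$$

$$\ln L(\theta)=\sum_{i=1}^{m}\left(y^{(i)}\ln\left(\frac{e^{\theta^Tx^{(i)}}}{e^{\theta^Tx^{(i)}}+1}\right)+(1-y^{(i)})\ln\left(\frac{1}{e^{\theta^Tx^{(i)}}+1}\right)\right)$$

$$\ln L(\theta)=\sum_{i=1}^{m}\left(y^{(i)}\ln e^{\theta^Tx^{(i)}}-y^{(i)}\ln\left(1+e^{\theta^Tx^{(i)}}\right)-(1-y^{(i)})\ln\left(1+e^{\theta^Tx^{(i)}}\right)\right)$$

$$\ln L(\theta)=\sum_{i=1}^{m}\left(\theta^Tx^{(i)}y^{(i)}-\ln\left(1+e^{\theta^Tx^{(i)}}\right)\right)$$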
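The simplification of the log-likelihood can be sanity-checked numerically. The following is only a sketch on made-up toy data (`X`, `y`, and `theta` are hypothetical values): it evaluates the first form, written with $h_\theta$, and the final simplified form, written with $\theta^Tx^{(i)}$, and the two should agree up to floating-point rounding.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical toy data: 4 samples, an intercept column plus one feature.
X = [1 -1.0;
     1  0.5;
     1  1.5;
     1  3.0];
y = [0; 0; 1; 1];
theta = [-0.5; 0.8];   % arbitrary parameter values for the check

z = X * theta;         % theta' * x(i) for every sample
h = sigmoid(z);        % h_theta(x(i))

% First line of the derivation: sum of y*ln(h) + (1-y)*ln(1-h).
ll_original = sum(y .* log(h) + (1 - y) .* log(1 - h));

% Last line of the derivation: sum of theta'*x*y - ln(1 + e^(theta'*x)).
ll_simplified = sum(y .* z - log(1 + exp(z)));

fprintf('original: %.6f, simplified: %.6f\n', ll_original, ll_simplified);
```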

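We want the $\theta$ that maximizes $L(\theta)$. This is equivalent to minimizing the negative average log-likelihood, so the cost function can be written as:

$$J(\theta)=-\frac{1}{m}\ln L(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(\theta^Tx^{(i)}y^{(i)}-\ln\left(1+e^{\theta^Tx^{(i)}}\right)\right)$$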

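To find $\theta$ we can use gradient descent.
Taking the partial derivative of $J(\theta)$ with respect to $\theta_j$:
$$\frac{\partial J(\theta)}{\partial\theta_j}=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}-h_\theta(x^{(i)})\right]x_j^{(i)}=\frac{1}{m}\sum_{i=1}^{m}\left[h_\theta(x^{(i)})-y^{(i)}\right]x_j^{(i)}$$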
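The step from the simplified log-likelihood to this derivative is not spelled out, so here is one way to fill it in for a single sample. It uses only the chain rule and the identity $\frac{e^{z}}{1+e^{z}}=\frac{1}{1+e^{-z}}$ with $z=\theta^Tx^{(i)}$:

```latex
\frac{\partial}{\partial\theta_j}
  \left[\theta^T x^{(i)} y^{(i)} - \ln\left(1 + e^{\theta^T x^{(i)}}\right)\right]
  = y^{(i)} x_j^{(i)} - \frac{e^{\theta^T x^{(i)}}}{1 + e^{\theta^T x^{(i)}}}\, x_j^{(i)}
  = \left[y^{(i)} - h_\theta(x^{(i)})\right] x_j^{(i)}
```

Summing over the $m$ samples and multiplying by the factor $-\frac{1}{m}$ from the definition of $J(\theta)$ gives the gradient of the cost.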

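The complete gradient descent update, with learning rate $\alpha$, is therefore (note the minus sign: we step against the gradient in order to minimize $J$):
$$\theta_j:=\theta_j-\alpha\frac{\partial J(\theta)}{\partial\theta_j}=\theta_j-\frac{\alpha}{m}\sum_{i=1}^{m}\left[h_\theta(x^{(i)})-y^{(i)}\right]x_j^{(i)}$$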
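The update rule can be sketched as a short batch gradient descent loop. This is an illustrative sketch rather than a tuned implementation: the toy data, the learning rate `alpha`, and the iteration count `num_iters` are all assumptions made for the example. The labels are deliberately not linearly separable so that the optimal `theta` stays finite.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Cost J(theta) = -(1/m) * sum(y*ln(h) + (1-y)*ln(1-h)).
cost = @(theta, X, y) -mean(y .* log(sigmoid(X * theta)) + ...
                            (1 - y) .* log(1 - sigmoid(X * theta)));

% Hypothetical toy data: intercept column plus one feature.
X = [1 0.5; 1 1.0; 1 1.5; 1 2.0; 1 2.5; 1 3.0];
y = [0; 0; 1; 0; 1; 1];
m = length(y);

alpha = 0.1;          % learning rate (assumed value)
num_iters = 2000;     % number of iterations (assumed value)
theta = zeros(2, 1);  % start from theta = 0

J_start = cost(theta, X, y);   % equals ln(2) because h = 0.5 everywhere
for iter = 1:num_iters
    grad = (1 / m) * (X' * (sigmoid(X * theta) - y));  % all partials at once
    theta = theta - alpha * grad;                      % step against the gradient
end
J_end = cost(theta, X, y);

fprintf('cost before: %.4f, cost after: %.4f\n', J_start, J_end);
```

All components of `theta` are updated simultaneously from the same gradient vector, which matches applying the per-component rule for every $j$ in one step.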

MATLAB code

costFunction

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%

z = X * theta;                   % m x 1 vector of scores theta' * x(i)
hx = 1 ./ (1 + exp(-z));         % sigmoid applied element-wise
J = -1/m * (y' * log(hx) + (1 - y)' * log(1 - hx));
grad = (1/m) * (X' * (hx - y));  % vectorized; same values as looping over j

% =============================================================

end
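
One common way to check a cost/gradient pair like this is to compare the analytic gradient against a central finite difference of the cost. The sketch below is self-contained and therefore redefines the cost and gradient as anonymous functions `J` and `grad` (hypothetical names) using the same formulas as the listing above. The toy data and the step size `epsilon` are assumptions.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Same cost and gradient formulas as costFunction, as anonymous functions.
J    = @(theta, X, y) -mean(y .* log(sigmoid(X * theta)) + ...
                            (1 - y) .* log(1 - sigmoid(X * theta)));
grad = @(theta, X, y) (X' * (sigmoid(X * theta) - y)) / length(y);

% Hypothetical toy data: intercept column plus two features.
X = [1 -1.0  2.0;
     1  0.5 -0.3;
     1  1.5  0.7;
     1  3.0 -1.2];
y = [0; 1; 0; 1];
theta = [0.1; -0.2; 0.3];

epsilon = 1e-5;                  % finite-difference step (assumed value)
num_grad = zeros(size(theta));
for j = 1:numel(theta)
    e = zeros(size(theta));
    e(j) = epsilon;              % perturb only the j-th parameter
    num_grad(j) = (J(theta + e, X, y) - J(theta - e, X, y)) / (2 * epsilon);
end

max_diff = max(abs(num_grad - grad(theta, X, y)));
fprintf('largest difference: %g\n', max_diff);
```

A large `max_diff` would usually point to a sign or scaling mistake in the gradient, such as a missing $\frac{1}{m}$ factor.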

Sigmoid

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
% g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly 
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
% vector or scalar).

g = 1 ./ (1 + exp(-z));   % element-wise, so z may be a scalar, vector, or matrix
% =============================================================

end

predict

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic 
%regression parameters theta
% p = PREDICT(theta, X) computes the predictions for X using a 
% threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
% your learned logistic regression parameters. 
% You should set p to a vector of 0's and 1's
%


p = double(sigmoid(X * theta) >= 0.5);   % 1 where the predicted probability is at least 0.5

% =========================================================================


end
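
A typical use of the predictions is to measure accuracy against known labels. The sketch below is self-contained, so it uses an anonymous function `predict_labels` (a hypothetical name) that applies the same 0.5 threshold as `predict`. The values of `theta`, `X`, and `y` are made up for the example.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Same decision rule as predict: label 1 when sigmoid(theta' * x) >= 0.5.
predict_labels = @(theta, X) double(sigmoid(X * theta) >= 0.5);

theta = [-2; 1];                     % hypothetical learned parameters
X = [1 0.5; 1 1.5; 1 2.5; 1 3.5];    % intercept column plus one feature
y = [0; 1; 1; 1];                    % hypothetical true labels

p = predict_labels(theta, X);            % theta' * x = -1.5, -0.5, 0.5, 1.5
accuracy = mean(double(p == y)) * 100;   % percentage of matching labels

fprintf('Train accuracy: %.1f%%\n', accuracy);
```

Here the second sample is misclassified (its score is negative but its label is 1), so three of the four predictions match.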