Andrew Ng Machine Learning Exercise 2: Regularized Logistic Regression
This section works through regularized logistic regression for classification.
1. Visualizing the raw data
data = load('ex2data2.txt'); % load the data file into data
X = data(:, [1, 2]); y = data(:, 3); % columns 1 and 2 are the inputs X; column 3 is the label y (y = 0 or y = 1)
plotData(X, y); % call plotData to draw the scatter plot
hold on;
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
legend('y = 1', 'y = 0')
hold off;
The plotData function is completed as follows:
function plotData(X, y)
figure; hold on;
pos = find(y==1); % indices of the examples with y == 1 (pos is a vector)
neg = find(y==0); % indices of the examples with y == 0 (neg is a vector)
% plot the y == 1 examples as red +
plot(X(pos,1),X(pos,2),'r+','LineWidth',2,'MarkerSize',7);
% plot the y == 0 examples as blue o
plot(X(neg,1),X(neg,2),'bo','LineWidth',2,'MarkerSize',7);
hold off;
end
(Figure: scatter plot of the two microchip test scores, with y = 1 examples as red + and y = 0 examples as blue o; the two classes are clearly not linearly separable.)
2. Expanding the feature vector
The raw data has only two features, x1 and x2. To fit the data better, the feature vector is expanded to 28 dimensions by mapping x1 and x2 onto all polynomial terms up to degree 6:
X = mapFeature(X(:,1), X(:,2));
function out = mapFeature(X1, X2)
degree = 6;
out = ones(size(X1(:,1))); % start with a column of ones (the bias term)
for i = 1:degree
for j = 0:i
out(:, end+1) = (X1.^(i-j)).*(X2.^j); % append the column X1^(i-j) .* X2^j
end
end
end
The resulting out matrix has one row per example and 28 columns, one for each monomial $x_1^{\,i-j} x_2^{\,j}$ with $0 \le j \le i \le 6$ plus the bias column ($1 + \sum_{i=1}^{6}(i+1) = 28$):
$$\text{mapFeature}(x_1, x_2) = \begin{bmatrix} 1 & x_1 & x_2 & x_1^2 & x_1 x_2 & x_2^2 & x_1^3 & \cdots & x_1 x_2^5 & x_2^6 \end{bmatrix}$$
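As a quick sanity check (a sketch, assuming ex2data2.txt and mapFeature are on the path), the mapped design matrix should come out m-by-28; this dataset has m = 118 examples:
data = load('ex2data2.txt');
X = mapFeature(data(:,1), data(:,2));
size(X) % expected: 118 28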
Initialize the fitting parameters:
% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);
% Set regularization parameter lambda to 1
lambda = 1;
3. Cost function and gradient
The regularized cost function is
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$
To prevent overfitting, the coefficients of the higher-order terms are penalized by adding the $\frac{\lambda}{2m}\theta_j^2$ terms to the cost; theta0, however, must not be penalized (in MATLAB, theta0 is theta(1), since indexing starts at 1).
The gradient splits into two cases, j = 0 and j >= 1:
$$\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j \qquad (j \ge 1)$$
% Compute and display initial cost and gradient for regularized logistic
% regression
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);
Complete the costFunctionReg function:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
m = length(y); % number of training examples
h = sigmoid(X*theta); % hypothesis values for all m examples
% Regularized cost: theta(1) (i.e. theta0) is excluded from the penalty
J = -(y'*log(h) + (1-y)'*log(1-h))/m + ...
    lambda/(2*m)*(theta'*theta - theta(1)^2);
% Vectorized gradient, kept as a column vector to match theta
grad = X'*(h - y)/m + lambda/m*theta;
% theta(1) is not regularized, so recompute grad(1) without the penalty
grad(1) = X(:,1)'*(h - y)/m;
end
The gradient is computed in vectorized matrix form; since grad(1) follows a different formula, it is simply recomputed afterwards, overwriting the regularized value.
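costFunctionReg relies on the sigmoid helper written in the first part of the exercise; for completeness, a minimal version of the standard logistic function:
function g = sigmoid(z)
% Element-wise logistic function g(z) = 1/(1 + e^(-z));
% works for scalars, vectors, and matrices
g = 1 ./ (1 + exp(-z));
end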
To verify the code, print the cost and the first five gradient entries at the all-zeros initial theta; they match the expected values. (With theta = 0, the hypothesis is 0.5 for every example and the penalty vanishes, so the cost should be log(2) ≈ 0.693.)
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');
The output is:
Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros) - first five values only:
0.008475
0.018788
0.000078
0.050345
0.011501
Expected gradients (approx) - first five values only:
0.0085
0.0188
0.0001
0.0503
0.0115
Now set every element of theta to 1 and lambda to 10, and recompute:
% Compute and display cost and gradient
% with all-ones theta and lambda = 10
test_theta = ones(size(X,2),1);
[cost, grad] = costFunctionReg(test_theta, X, y, 10);
fprintf('\nCost at test theta (with lambda = 10): %f\n', cost);
fprintf('Expected cost (approx): 3.16\n');
fprintf('Gradient at test theta - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.3460\n 0.1614\n 0.1948\n 0.2269\n 0.0922\n');
The output again matches the expected values, confirming the code is correct; in particular, grad(1) = 0.3460 shows that theta(1) carries no penalty term (with it, the value would be 0.3460 + 10/118 ≈ 0.431). The output is:
Cost at test theta (with lambda = 10): 3.164509
Expected cost (approx): 3.16
Gradient at test theta - first five values only:
0.346045
0.161352
0.194796
0.226863
0.092186
Expected gradients (approx) - first five values only:
0.3460
0.1614
0.1948
0.2269
0.0922
4. Optimization with fminunc
% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);
% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;
% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Optimize
[theta, J, exit_flag] = ...
fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
% Plot Boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))
% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
legend('y = 1', 'y = 0', 'Decision boundary')
% Compute accuracy on our training set
p = predict(theta, X);
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (with lambda = 1): 83.1 (approx)\n');
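Here plotDecisionBoundary ships with the exercise skeleton: for mapped features it evaluates z = mapFeature(u, v) * theta over a grid of (u, v) points and draws the contour where z = 0. The predict function is student-written; a minimal sketch that thresholds the hypothesis at 0.5:
function p = predict(theta, X)
% Predict y = 1 whenever sigmoid(X*theta) >= 0.5 (equivalently X*theta >= 0)
p = sigmoid(X*theta) >= 0.5;
end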
Rerunning the optimization with several values of lambda gives the following training accuracies (each run reproducible with the loop sketched after this list):
(1) lambda = 0: Train Accuracy: 87.288136 (about 87.3%). (Figure: decision boundary for lambda = 0.)
(2) lambda = 1: Train Accuracy: 83.050847 (about 83.1%). (Figure: decision boundary for lambda = 1.)
(3) lambda = 10: Train Accuracy: 74.576271 (about 74.6%). (Figure: decision boundary for lambda = 10.)
(4) lambda = 50: Train Accuracy: 66.949153 (about 66.9%). (Figure: decision boundary for lambda = 50.)
(5) lambda = 100: Train Accuracy: 61.016949 (about 61.0%). (Figure: decision boundary for lambda = 100.)
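A minimal sketch of this sweep, assuming X, y, and the functions defined above are already in the workspace:
lambdas = [0 1 10 50 100];
options = optimset('GradObj', 'on', 'MaxIter', 400);
for k = 1:numel(lambdas)
    theta = fminunc(@(t)(costFunctionReg(t, X, y, lambdas(k))), ...
        zeros(size(X, 2), 1), options);
    p = predict(theta, X);
    fprintf('lambda = %3g  Train Accuracy: %f\n', ...
        lambdas(k), mean(double(p == y)) * 100);
end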
In summary, lambda = 0 applies no penalty to theta and overfits the training data, while lambda = 10, 50, and 100 underfit it; lambda = 1, with a training accuracy of about 83%, gives the best fit.