cs229 | Programming Assignment, Week 2 (Python)
Exercise 2: Logistic Regression
Contents
1. Included Files
2. Binary Classification with Logistic Regression (without regularization)
3. Binary Classification with Logistic Regression (with regularization)
1. Included Files
| Filename | Description |
| --- | --- |
| ex2.py | Main program for logistic regression (without regularization) |
| ex2_reg.py | Main program for logistic regression (with regularization) |
| ex2data1.txt | Training data for the first experiment |
| ex2data2.txt | Training data for the second experiment |
| mapFeature.py | Generates additional polynomial features from the original ones |
| plotDecisionBoundary.py | Plots the decision boundary |
| plotData.py | Visualizes the data |
| sigmoid.py | Sigmoid function |
| costFunction.py | Cost function (and gradient) for logistic regression (without regularization) |
| predict.py | Prediction function for logistic regression |
| costFunctionReg.py | Cost function (and gradient) for logistic regression (with regularization) |
The files shown in red are the ones you need to complete yourself.
2. Binary Classification with Logistic Regression (without regularization)
- Required packages and initialization:
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as opt
from plotData import *
import costFunction as cf
import plotDecisionBoundary as pdb
import predict as predict
from sigmoid import *
plt.ion()
# Load data
# The first two columns contain the exam scores and the third column contains the label.
data = np.loadtxt('ex2data1.txt', delimiter=',')
X = data[:, 0:2]
y = data[:, 2]
2.1 Data Visualization
- Data visualization (plotData.py):
import matplotlib.pyplot as plt
import numpy as np
def plot_data(X, y):
    plt.figure()
    # ===================== Your Code Here =====================
    # Instructions : Plot the positive and negative examples on a
    #                2D plot, using the marker="+" for the positive
    #                examples and marker="o" for the negative examples
    #
    pos = y == 1  # boolean mask for positive examples
    neg = y == 0  # boolean mask for negative examples
    plt.scatter(X[pos, 0], X[pos, 1], marker='+', color='b')
    plt.scatter(X[neg, 0], X[neg, 1], marker='o', color='r')
- Test code:
# ===================== Part 1: Plotting =====================
print('Plotting Data with + indicating (y = 1) examples and o indicating (y = 0) examples.')
plot_data(X, y)
plt.axis([30, 100, 30, 100])
plt.legend(['Admitted', 'Not admitted'], loc=1)
plt.xlabel('Exam 1 score')
plt.ylabel('Exam 2 score')
input('Program paused. Press ENTER to continue')
- Resulting plot: the training data, with admitted examples drawn as '+' and not-admitted examples as 'o'.
2.2 Logistic Regression
- Logistic regression model: $h_\theta(x) = g(\theta^T x)$, where $g$ is the sigmoid function $g(z) = \frac{1}{1 + e^{-z}}$.
- Implement the sigmoid function (sigmoid.py):
import numpy as np
def sigmoid(z):
g = np.zeros(z.size)
# ===================== Your Code Here =====================
# Instructions : Compute the sigmoid of each value of z (z can be a matrix,
# vector or scalar
#
# Hint : Do not import math
    g = 1 / (1 + np.exp(-z))  # compute the sigmoid element-wise
return g
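- A quick sanity check (a minimal sketch, not part of the assignment files): the sigmoid should return 0.5 at z = 0 and approach 0 and 1 in the tails, and it should work element-wise on arrays.
import numpy as np
from sigmoid import sigmoid

z = np.array([-10.0, 0.0, 10.0])
print(sigmoid(z))  # expected: approximately [0.0000, 0.5, 1.0000]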
- The cost function is: $J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right]$
- The gradient of the cost is: $\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$
- Implement the cost function (and gradient) (costFunction.py):
import numpy as np
from sigmoid import *
def cost_function(theta, X, y):
m = y.size
# You need to return the following values correctly
cost = 0
grad = np.zeros(theta.shape)
# ===================== Your Code Here =====================
# Instructions : Compute the cost of a particular choice of theta
# You should set cost and grad correctly.
#
    # h_out = theta * X  # note: an element-wise product would be wrong here
    h_out = np.dot(X, theta)  # dot product gives a 1-D array of hypothesis inputs
    cost = -np.mean(y * np.log(sigmoid(h_out)) + (1 - y) * np.log(1 - sigmoid(h_out)))  # average over the m examples
    error = sigmoid(h_out) - y      # sigmoid(theta_0 + theta_1*x_1 + ...) minus the label
    error_2d = error.reshape(m, 1)  # make it 2-D so it broadcasts against X
    error_all = error_2d * X        # an m*n array of per-example gradient contributions
    grad = (1 / m) * error_all.sum(axis=0)
# ===========================================================
return cost, grad
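- To gain confidence in the analytic gradient, a finite-difference check can be run against cost_function. This is a minimal sketch, not part of the assignment files; check_gradient is a hypothetical helper:
import numpy as np
import costFunction as cf

def check_gradient(theta, X, y, eps=1e-4):
    # compare the analytic gradient with a central finite difference
    _, grad = cf.cost_function(theta, X, y)
    num_grad = np.zeros(theta.size)
    for j in range(theta.size):
        step = np.zeros(theta.size)
        step[j] = eps
        c_plus, _ = cf.cost_function(theta + step, X, y)
        c_minus, _ = cf.cost_function(theta - step, X, y)
        num_grad[j] = (c_plus - c_minus) / (2 * eps)
    return np.max(np.abs(num_grad - grad))  # should be tiny, e.g. below 1e-7

# usage, once X (with the intercept column) and y are set up as in the test code below:
# print(check_gradient(np.array([-24.0, 0.2, 0.2]), X, y))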
- Test code:
# ===================== Part 2: Compute Cost and Gradient =====================
# In this part of the exercise, you will implement the cost and gradient
# for logistic regression. You need to complete the code in
# costFunction.py
# Setup the data array appropriately, and add ones for the intercept term
(m, n) = X.shape
# Add intercept term
X = np.c_[np.ones(m), X]
# Initialize fitting parameters
initial_theta = np.zeros(n + 1)
# Compute and display initial cost and gradient
cost, grad = cf.cost_function(initial_theta, X, y)
np.set_printoptions(formatter={'float': '{: 0.4f}\n'.format})
print('Cost at initial theta (zeros): {:0.3f}'.format(cost))
print('Expected cost (approx): 0.693')
print('Gradient at initial theta (zeros): \n{}'.format(grad))
print('Expected gradients (approx): \n-0.1000\n-12.0092\n-11.2628')
# Compute and display cost and gradient with non-zero theta
test_theta = np.array([-24, 0.2, 0.2])
cost, grad = cf.cost_function(test_theta, X, y)
print('Cost at test theta: {}'.format(cost))
print('Expected cost (approx): 0.218')
print('Gradient at test theta: \n{}'.format(grad))
print('Expected gradients (approx): \n0.043\n2.566\n2.647')
input('Program paused. Press ENTER to continue')
- Output:
Cost at initial theta (zeros): 0.693
Expected cost (approx): 0.693
Gradient at initial theta (zeros):
[-0.1000
-12.0092
-11.2628
]
Expected gradients (approx):
-0.1000
-12.0092
-11.2628
Cost at test theta: 0.21833019382659774
Expected cost (approx): 0.218
Gradient at test theta:
[ 0.0429
2.5662
2.6468
]
Expected gradients (approx):
0.043
2.566
2.647
2.3 Optimization with opt.fmin_bfgs
- Test code:
# ===================== Part 3: Optimizing using fmin_bfgs =====================
# In this exercise, you will use a built-in function (opt.fmin_bfgs) to find the
# optimal parameters theta
def cost_func(t):
return cf.cost_function(t, X, y)[0]
def grad_func(t):
return cf.cost_function(t, X, y)[1]
# Run fmin_bfgs to obtain the optimal theta
theta, cost, *unused = opt.fmin_bfgs(f=cost_func, fprime=grad_func, x0=initial_theta, maxiter=400, full_output=True, disp=False)
print('Cost at theta found by fmin: {:0.4f}'.format(cost))
print('Expected cost (approx): 0.203')
print('theta: \n{}'.format(theta))
print('Expected Theta (approx): \n-25.161\n0.206\n0.201')
# Plot boundary
pdb.plot_decision_boundary(theta, X, y)
plt.xlabel('Exam 1 score')
plt.ylabel('Exam 2 score')
input('Program paused. Press ENTER to continue')
- Output:
Cost at theta found by fmin: 0.2035
Expected cost (approx): 0.203
theta:
[-25.1613
0.2062
0.2015
]
Expected Theta (approx):
-25.161
0.206
0.201
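- opt.fmin_bfgs belongs to SciPy's legacy optimization interface; the same fit can also be expressed with scipy.optimize.minimize. A minimal sketch for reference, assuming the same X, y, initial_theta, and cost_function as above:
import scipy.optimize as opt
import costFunction as cf

res = opt.minimize(fun=lambda t: cf.cost_function(t, X, y)[0],
                   x0=initial_theta,
                   jac=lambda t: cf.cost_function(t, X, y)[1],
                   method='BFGS',
                   options={'maxiter': 400})
print(res.fun)  # optimized cost, approx. 0.203
print(res.x)    # optimized theta, approx. [-25.161, 0.206, 0.201]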
2.4 Prediction and Accuracy
- Prediction function (predict.py):
import numpy as np
from sigmoid import *
def predict(theta, X):
m = X.shape[0]
# Return the following variable correctly
p = np.zeros(m)
# ===================== Your Code Here =====================
# Instructions : Complete the following code to make predictions using
# your learned logistic regression parameters.
# You should set p to a 1D-array of 0's and 1's
#
    p = np.dot(X, theta)  # dot product keeps the result 1-D
    p = sigmoid(p)        # predicted probabilities
    p[p >= 0.5] = 1
    p[p < 0.5] = 0
# ===========================================================
return p
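- The same thresholding can also be written in a single vectorized line; a minimal alternative sketch (equivalent to the code above, not the assignment's reference solution):
import numpy as np
from sigmoid import sigmoid

def predict_vectorized(theta, X):
    # 1 where the predicted probability is at least 0.5, else 0
    return (sigmoid(X.dot(theta)) >= 0.5).astype(float)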
- Test code:
# ===================== Part 4: Predict and Accuracies =====================
# After learning the parameters, you'll want to use them to predict the outcomes
# on unseen data. In this part, you will use the logistic regression model
# to predict the probability that a student with score 45 on exam 1 and
# score 85 on exam 2 will be admitted
#
# Furthermore, you will compute the training and test set accuracies of our model.
#
# Your task is to complete the code in predict.py
# Predict probability for a student with score 45 on exam 1
# and score 85 on exam 2
prob = sigmoid(np.array([1, 45, 85]).dot(theta))
print('For a student with scores 45 and 85, we predict an admission probability of {:0.4f}'.format(prob))
print('Expected value : 0.775 +/- 0.002')
# Compute the accuracy on our training set
p = predict.predict(theta, X)
print('Train accuracy: {}'.format(np.mean(y == p) * 100))
print('Expected accuracy (approx): 89.0')
input('ex2 Finished. Press ENTER to exit')
- Output:
For a student with scores 45 and 85, we predict an admission probability of 0.7763
Expected value : 0.775 +/- 0.002
Train accuracy: 89.0
Expected accuracy (approx): 89.0
3. Binary Classification with Logistic Regression (with regularization)
- Required packages and initialization:
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as opt
from plotData import *
import costFunctionReg as cfr
import plotDecisionBoundary as pdb
import predict as predict
import mapFeature as mf
plt.ion()
# Load data
# The first two columns contain the exam scores and the third column contains the label.
data = np.loadtxt('ex2data2.txt', delimiter=',')
X = data[:, 0:2]
y = data[:, 2]
plot_data(X, y)
plt.xlabel('Microchip Test 1')
plt.ylabel('Microchip Test 2')
plt.legend(['y = 1', 'y = 0'])
input('Program paused. Press ENTER to continue')
- Data distribution: the two microchip test scores, with y = 1 and y = 0 examples.
3.1 Feature Mapping
- Feature mapping example: the two original features $x_1, x_2$ are mapped to all polynomial terms up to degree 6, i.e. $\text{mapFeature}(x) = [1,\ x_1,\ x_2,\ x_1^2,\ x_1 x_2,\ x_2^2,\ x_1^3,\ \dots,\ x_1 x_2^5,\ x_2^6]$, giving 28 features in total.
- Feature mapping code (mapFeature.py):
import numpy as np

def map_feature(x1, x2):  # generate new polynomial features
    degree = 6
    x1 = x1.reshape((x1.size, 1))
    x2 = x2.reshape((x2.size, 1))
    result = np.ones(x1[:, 0].shape)  # start with a column of ones (the intercept term)
    for i in range(1, degree + 1):
        for j in range(0, i + 1):
            result = np.c_[result, (x1**(i-j)) * (x2**j)]  # append each new polynomial column
    return result
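- A quick shape check (a minimal sketch, not part of the assignment files): for degree 6 the mapping should produce 28 columns, including the leading column of ones.
import numpy as np
import mapFeature as mf

x1 = np.array([0.5, -0.2])
x2 = np.array([0.3, 0.8])
mapped = mf.map_feature(x1, x2)
print(mapped.shape)  # expected: (2, 28)
print(mapped[:, 0])  # expected: [1. 1.] (intercept column)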
3.2 Regularized Cost Function and Gradient
- The regularized cost function is: $J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$
- The gradient is: $\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$ for $j = 0$, and $\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j$ for $j \ge 1$ (the intercept $\theta_0$ is not regularized).
- Implementation (costFunctionReg.py):
import numpy as np
from sigmoid import *
def cost_function_reg(theta, X, y, lmd):
m = y.size
# You need to return the following values correctly
cost = 0
grad = np.zeros(theta.shape)
# ===================== Your Code Here =====================
# Instructions : Compute the cost of a particular choice of theta
# You should set cost and grad correctly.
#
    h_out = np.dot(X, theta)  # dot product gives a 1-D array of hypothesis inputs
    theta_lmd = theta[1:]     # the first theta (intercept) is not regularized
    # unregularized cost plus the regularization term
    cost = -np.mean(y * np.log(sigmoid(h_out)) + (1 - y) * np.log(1 - sigmoid(h_out))) + lmd / (2 * m) * (theta_lmd * theta_lmd).sum()
    error = sigmoid(h_out) - y      # sigmoid(theta_0 + theta_1*x_1 + ...) minus the label
    error_2d = error.reshape(m, 1)  # make it 2-D so it broadcasts against X
    error_all = error_2d * X        # an m*n array of per-example gradient contributions
    # add the regularization term for j >= 1 but not for j = 0
    grad[0] = (1 / m) * error_all[:, 0].sum(axis=0)
    grad[1:] = (1 / m) * error_all[:, 1:].sum(axis=0) + lmd / m * theta_lmd
# ===========================================================
return cost, grad
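- A useful sanity check (a minimal sketch, not part of the assignment files): with lmd = 0 the regularized cost and gradient should agree with the unregularized cost_function from Section 2. It assumes X is the mapped feature matrix and y the labels, as in the test code below.
import numpy as np
import costFunction as cf
import costFunctionReg as cfr

rng = np.random.RandomState(0)        # reproducible random theta
theta = rng.randn(X.shape[1]) * 0.1
cost_a, grad_a = cf.cost_function(theta, X, y)
cost_b, grad_b = cfr.cost_function_reg(theta, X, y, 0)
print(np.isclose(cost_a, cost_b))     # expected: True
print(np.allclose(grad_a, grad_b))    # expected: True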
- Test code:
# ===================== Part 1: Regularized Logistic Regression =====================
# In this part, you are given a dataset with data points that are not
# linearly separable. However, you would still like to use logistic
# regression to classify the data points.
# To do so, you introduce more features to use -- in particular, you add
# polynomial features to our data matrix (similar to polynomial regression)
#
# Add polynomial features
# Note that mapFeature also adds a column of ones for us, so the intercept
# term is handled
X = mf.map_feature(X[:, 0], X[:, 1])
# Initialize fitting parameters
initial_theta = np.zeros(X.shape[1])
# Set regularization parameter lambda to 1
lmd = 1
# Compute and display initial cost and gradient for regularized logistic regression
cost, grad = cfr.cost_function_reg(initial_theta, X, y, lmd)
np.set_printoptions(formatter={'float': '{: 0.4f}\n'.format})
print('Cost at initial theta (zeros): {}'.format(cost))
print('Expected cost (approx): 0.693')
print('Gradient at initial theta (zeros) - first five values only: \n{}'.format(grad[0:5]))
print('Expected gradients (approx) - first five values only: \n 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115')
input('Program paused. Press ENTER to continue')
# Compute and display cost and gradient with non-zero theta
test_theta = np.ones(X.shape[1])
cost, grad = cfr.cost_function_reg(test_theta, X, y, lmd)
print('Cost at test theta: {}'.format(cost))
print('Expected cost (approx): 2.13')
print('Gradient at test theta - first five values only: \n{}'.format(grad[0:5]))
print('Expected gradients (approx) - first five values only: \n 0.3460\n 0.0851\n 0.1185\n 0.1506\n 0.0159')
input('Program paused. Press ENTER to continue')
- Output:
Cost at test theta: 2.134848314665857
Expected cost (approx): 2.13
Gradient at test theta - first five values only:
[ 0.3460
0.0851
0.1185
0.1506
0.0159
]
Expected gradients (approx) - first five values only:
0.3460
0.0851
0.1185
0.1506
0.0159
3.3 Optimization with opt.fmin_bfgs and the Decision Boundary (try different values of lmd)
- Test code:
# ===================== Part 2: Regularization and Accuracies =====================
# Optional Exercise:
# In this part, you will get to try different values of lambda and
# see how regularization affects the decision boundary
#
# Try the following values of lambda (0, 1, 10, 100).
#
# How does the decision boundary change when you vary lambda? How does
# the training set accuracy vary?
#
# Initialize fitting parameters
initial_theta = np.zeros(X.shape[1])
# Set regularization parameter lambda to 1 (you should vary this)
lmd = 1
# Optimize
def cost_func(t):
return cfr.cost_function_reg(t, X, y, lmd)[0]
def grad_func(t):
return cfr.cost_function_reg(t, X, y, lmd)[1]
theta, cost, *unused = opt.fmin_bfgs(f=cost_func, fprime=grad_func, x0=initial_theta, maxiter=400, full_output=True, disp=False)
# Plot boundary
print('Plotting decision boundary ...')
pdb.plot_decision_boundary(theta, X, y)
plt.title('lambda = {}'.format(lmd))
plt.xlabel('Microchip Test 1')
plt.ylabel('Microchip Test 2')
# Compute accuracy on our training set
p = predict.predict(theta, X)
print('Train Accuracy: {:0.4f}'.format(np.mean(y == p) * 100))
print('Expected accuracy (with lambda = 1): 83.1 (approx)')
input('ex2_reg Finished. Press ENTER to exit')
- Output:
Train Accuracy: 83.0508
Expected accuracy (with lambda = 1): 83.1 (approx)
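- To see how regularization changes the fit, the optimization above can be repeated for several values of lambda, as the comment in the test code suggests. A minimal sketch, assuming the mapped X and y from ex2_reg.py; the exact accuracies depend on the optimizer:
import numpy as np
import scipy.optimize as opt
import costFunctionReg as cfr
import predict as predict

for lmd in [0, 1, 10, 100]:
    theta = opt.fmin_bfgs(f=lambda t: cfr.cost_function_reg(t, X, y, lmd)[0],
                          fprime=lambda t: cfr.cost_function_reg(t, X, y, lmd)[1],
                          x0=np.zeros(X.shape[1]), maxiter=400, disp=False)
    p = predict.predict(theta, X)
    # small lambda tends to overfit (high train accuracy, wiggly boundary);
    # large lambda underfits (lower train accuracy, smoother boundary)
    print('lambda = {:>3}: train accuracy = {:0.1f}'.format(lmd, np.mean(y == p) * 100))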
Note: All the code and the accompanying PDF write-up will be uploaded together once everything is complete.