Predicting an Event with Logistic Regression in Python

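My own understanding of the logistic regression model is that "logic" amounts to true or false, so the outcome can only be 0 or 1. I picked this up from an expert's post: https://blog.****.net/zouxy09/article/details/20319673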

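Logistic regression is used for classification. It can show which influencing factors dominate, and the fitted model can then be used to predict whether an event occurs.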
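
As a rough illustration of the 0/1 idea, the sketch below shows how a logistic model turns a weighted sum of features into a probability and then into a class label. The weights, bias, and feature values are made up for the example and do not come from the score sheet.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, weights, bias):
    """Return (probability, label) for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    # Threshold at 0.5: probability of at least 0.5 means class 1.
    return p, int(p >= 0.5)


# Hypothetical weights: the first feature has the larger weight,
# so it influences the prediction more than the second one.
weights = [2.0, 0.1]
bias = -1.0

prob, label = predict([1.5, 0.3], weights, bias)
print(prob, label)
```

The size of each weight (on comparably scaled features) is what tells you which factor dominates.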

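I downloaded a 2017 first mock exam results sheet for the science-track students of a high school, with names and schools blurred. It looks like this: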

[Figure: screenshot of the results sheet]

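The last column records whether the student clears the second-tier undergraduate admission line. I looked up that year's line, which was 480: the value is 1 if sum > 480 and 0 otherwise. There are 10002 rows in total.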
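
A label column like this can be derived from the total score in one line of pandas. This is only a sketch: the rows are invented, and the column name "pass" is a hypothetical choice ("sum" is the name used in the text).

```python
import pandas as pd

# Hypothetical rows; the real sheet has 10002 of them.
scores = pd.DataFrame({"sum": [512, 480, 455, 481]})

CUTOFF = 480  # second-tier line quoted in the text

# Strictly greater than the cutoff -> 1, otherwise 0.
scores["pass"] = (scores["sum"] > CUTOFF).astype(int)
print(scores)
```

Note that a total of exactly 480 is labelled 0, because the rule is a strict inequality.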

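Steps: 1. Read the data.

2. Turn the features (influencing factors) and the result into arrays.

3. Import RandomizedLogisticRegression from sklearn.linear_model and instantiate it.

4. Train the model with fit().

5. Use get_support() to keep only the useful features; this is also a dimensionality-reduction step.

6. Train a simplified model on the selected features.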
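
RandomizedLogisticRegression was deprecated in scikit-learn 0.19 and removed in 0.21, so the steps above cannot be run as written on a current install. The sketch below follows the same select-then-retrain flow but swaps in a different technique: L1-penalised logistic regression wrapped in SelectFromModel. The data is synthetic, and the value C=0.05 is an arbitrary assumption chosen for the example.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)

# Synthetic stand-in for the score sheet: 500 rows, 7 feature columns.
X = rng.normal(size=(500, 7))
# Only columns 0 and 1 determine the label in this made-up data.
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Put all features on the same scale so the penalty treats them alike.
X_scaled = StandardScaler().fit_transform(X)

# L1-penalised logistic regression drives weak coefficients to zero;
# SelectFromModel keeps the columns whose coefficients stay non-zero.
selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
)
selector.fit(X_scaled, y)
kept = selector.get_support(indices=True)
print("selected columns:", kept)

# Retrain a plain logistic regression on the reduced feature matrix.
model = LogisticRegression()
model.fit(X_scaled[:, kept], y)
print("training accuracy:", model.score(X_scaled[:, kept], y))
```

Which columns survive depends on C: a smaller value removes more features, a larger value keeps more.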

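Note: y must be taken with a single integer column index, y=dataf.iloc[:,7], and not with a slice such as y=dataf.iloc[:,7:8]. The former yields a 1D array; the latter yields a 2D array with one column, which triggers DataConversionWarning: A column-vector y was passed when a 1d array was expected.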
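
The difference between indexing one column and slicing a one-column range is easy to check on a small frame. The data here is hypothetical.

```python
import pandas as pd

# Hypothetical two-column frame: one feature and one label.
df = pd.DataFrame({"a": [1, 2, 3], "label": [0, 1, 0]})

y_1d = df.iloc[:, 1].values    # integer index -> Series -> shape (3,)
y_2d = df.iloc[:, 1:2].values  # slice -> DataFrame -> shape (3, 1)

print(y_1d.shape, y_2d.shape)

# A column vector can be flattened back to 1D with ravel().
y_fixed = y_2d.ravel()
print(y_fixed.shape)
```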


The code:

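import pandas as pda
# LogisticRegression (LR) is the final classifier.
from sklearn.linear_model import LogisticRegression as LR
# RandomizedLogisticRegression (RLR) was deprecated in scikit-learn 0.19 and
# removed in 0.21, so this script needs an older scikit-learn release.
from sklearn.linear_model import RandomizedLogisticRegression as RLR

fname = "D:/xx/xx/2017yimo.xls"
dataf = pda.read_excel(fname)

# Slice the DataFrame and convert to NumPy arrays.
# .values returns the underlying array (as_matrix() was removed in pandas 1.0).
x = dataf.iloc[:, 0:7].values  # columns 0-6: features (2D array)
y = dataf.iloc[:, 7].values    # column 7: 0/1 label (1D array)

# Fit the randomized model and keep the features it selects.
r1 = RLR()
r1.fit(x, y)
selected = r1.get_support(indices=True)
print(dataf.columns[selected])

# Retrain a plain logistic regression on the selected features only.
t = dataf[dataf.columns[selected]].values
r2 = LR()
r2.fit(t, y)
print("Training finished")
print("Model accuracy: " + str(r2.score(t, y)))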

Because the dataset is reasonably large, the accuracy is quite high.
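
The score above is computed on the same rows the model was trained on, so it can be optimistic. A common check is to hold out part of the data. The sketch below uses synthetic arrays in place of the selected features t and the labels y; the 20% split and the random seeds are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(42)

# Synthetic stand-in for the selected features t and the labels y.
t = rng.normal(size=(1000, 3))
y = (t.sum(axis=1) > 0).astype(int)

# Hold out 20% of the rows; the model never sees them during fit().
t_train, t_test, y_train, y_test = train_test_split(
    t, y, test_size=0.2, random_state=42, stratify=y
)

model = LogisticRegression()
model.fit(t_train, y_train)

train_acc = model.score(t_train, y_train)
test_acc = model.score(t_test, y_test)
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

If the two numbers are close, the training accuracy is a fair summary; a large gap would suggest overfitting.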
