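sklearn LinearRegression coefficients output a single value

Problem description:

I am using a dataset to look at the relationship between salary and college GPA, with sklearn's linear regression model. I expected the coefficients to be the intercept and the coefficient value of the corresponding feature, but the model gives a single value.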

from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.linear_model import LinearRegression 

# Use only one feature : CollegeGPA 
labour_data_gpa = labour_data[['collegeGPA']] 

# salary as a dependent variable 
labour_data_salary = labour_data[['Salary']] 

# Split the data into training/testing sets 
gpa_train, gpa_test, salary_train, salary_test = train_test_split(labour_data_gpa, labour_data_salary) 

# Create linear regression object 
regression = LinearRegression() 

# Train the model using the training sets (first parameter is x) 
regression.fit(gpa_train, salary_train) 

#coefficients 
regression.coef_ 

The output is : Out[12]: array([[ 3235.66359637]]) 

salary_pred = regression.predict(gpa_test) 
print(salary_pred)
print(salary_test)

I think salary_pred = regression.coef_ * salary_test. Try plotting salary_pred and salary_test with pyplot. The plot should explain everything.

Try:

regression = LinearRegression(fit_intercept=True)
regression.fit(gpa_train, salary_train) 

and the results will be available in

regression.coef_ 
regression.intercept_ 
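As a minimal sketch of why only one number shows up in coef_: sklearn stores the slope in coef_ and the intercept separately in intercept_, and predict() combines the two. The data below is synthetic and made up purely for illustration (it stands in for labour_data, which is not available here), so the printed values will not match the question's output.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the GPA/salary data (assumed values, not the real dataset).
rng = np.random.default_rng(0)
gpa = rng.uniform(2.0, 4.0, size=(200, 1))  # 2-D: (n_samples, 1 feature)
salary = 20000.0 + 3000.0 * gpa + rng.normal(0.0, 500.0, size=(200, 1))  # 2-D target

regression = LinearRegression()  # fit_intercept=True is already the default
regression.fit(gpa, salary)

slope = regression.coef_           # shape (1, 1) because the target was passed as 2-D
intercept = regression.intercept_  # shape (1,), stored separately from coef_

# predict() is equivalent to intercept + slope * feature
manual_pred = gpa @ slope.T + intercept
model_pred = regression.predict(gpa)

print("coef_ shape:", slope.shape, "intercept_ shape:", intercept.shape)
print("manual prediction matches predict():", np.allclose(manual_pred, model_pred))
```

With a single feature, coef_ therefore holds exactly one value; the nested brackets in array([[...]]) come from passing the target as a one-column DataFrame rather than a 1-D Series.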

To get a better understanding of your linear regression, you should perhaps consider another module. The following tutorial helps: http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/ols.html

Thanks for the tutorial link! – MaxU