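How do I make sure I get correct results from a regression generator?

Problem description:

I wrote a simple script that generates randomly sampled data and fits a regression to it: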

import matplotlib.pyplot as plt
import numpy as np
import sklearn.datasets
import sklearn.linear_model as lm

# Generate 100 one-feature samples with a random noise level and bias
n = np.random.randint(1, 10)
b = np.random.randint(50, 200)
X1_, Y1_ = sklearn.datasets.make_regression(n_samples=100, n_features=1, noise=n, bias=b)
X1 = X1_.reshape(len(X1_), 1)
Y1 = Y1_.reshape(len(Y1_), 1)

x = np.array(X1)
y = np.array(Y1)

# Fit a linear model, then predict on a separate grid of inputs 1..100
lr = lm.LinearRegression()
lr.fit(x, y)
td = np.arange(1, 101, 1).reshape(100, 1)
n_y = lr.predict(td)

# Left panel: raw data; right panel: the predictions plotted against x
f, ax = plt.subplots(1, 2, sharey=True)
ax[0].scatter(x, y)
ax[0].set_xlim([-4, 4])
ax[0].set_title("x, y")
ax[1].plot(x, n_y, 'g')
ax[1].set_xlim([-4, 4])
ax[1].set_title("x_tr, y_lr")
f.suptitle("Regression")
plt.ylim(y.min() - 1, y.max() + 1)

# Dump every array for inspection
for name, arr in [("X1", X1), ("Y1", Y1), ("x", x), ("y", y), ("td", td), ("n_y", n_y)]:
    print("Array: {}\nType: {}\nShape: {}\nLength: {}\nData: {}\n".format(name, type(arr), np.shape(arr), len(arr), arr))

plt.show()

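Although it seems to work fine and raises no errors, I am still concerned about accuracy: the regression line always comes out at a random angle and with an odd shape. How can I test this? Is there any error-reporting functionality I should be aware of?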

Answer

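What you are observing happens because your data is random. Regression essentially recovers the distribution that generated the data, so, interestingly, you are trying to recover the distribution of a random generator, which by design tries to hide its distribution.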
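One way to check how well a fit recovers the generating process is to ask `make_regression` for the coefficient it used (`coef=True`) and compare it with the fitted one. The following is only a minimal sketch: the fixed `random_state`, the noise level of 5 and the bias of 100 are assumptions chosen for reproducibility and are not part of the original script. Predictions are evaluated on a sorted grid spanning the range of `X`, so plotting `grid_pred` against `grid` would give a straight line.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Assumed settings: fixed seed, noise=5, bias=100 (the original script draws these at random).
X, y, true_coef = make_regression(
    n_samples=100, n_features=1, noise=5.0, bias=100.0,
    coef=True, random_state=0,
)
true_slope = float(np.ravel(true_coef)[0])  # ground-truth slope used by the generator

model = LinearRegression().fit(X, y)
fit_slope = float(model.coef_[0])
fit_intercept = float(model.intercept_)

# R^2 on the training data: fraction of variance explained by the fitted line.
r2 = model.score(X, y)

# Evaluate the fitted line on a sorted grid covering the observed range of X.
grid = np.linspace(X.min(), X.max(), 50).reshape(-1, 1)
grid_pred = model.predict(grid)

print(f"true slope {true_slope:.2f}, fitted slope {fit_slope:.2f}")
print(f"true bias 100.00, fitted intercept {fit_intercept:.2f}")
print(f"training R^2 {r2:.3f}")
```

If the fitted slope and intercept land close to the generator's values, the regression itself is behaving as expected, and any odd-looking plot has to come from how the predictions are drawn rather than from the fit.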

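If you want to test a regression method, you should use some of the popular ML datasets available on the internet, for example the UCI ML dataset collection (filter for regression tasks).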
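As a rough illustration of testing on a known dataset, the sketch below uses the diabetes dataset bundled with scikit-learn as a stand-in (an assumption; it is not necessarily one of the datasets meant above) and scores the model on a held-out split. The 25% test size and the seed are arbitrary choices.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Bundled dataset used as a stand-in for a public benchmark (assumption).
X, y = load_diabetes(return_X_y=True)

# Hold out a quarter of the samples so the score reflects unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Standard regression metrics computed on the held-out split.
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print(f"held-out R^2: {r2:.3f}, MSE: {mse:.1f}")
```

Scoring on held-out data with `r2_score` or `mean_squared_error` is the closest thing to "error reporting" here: a linear fit does not raise errors for a poor model, so accuracy has to be measured explicitly.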

What are some good examples?
