Machine Learning week6 quiz1
1。第 1 个问题
You train a learning algorithm, and find that it has unacceptably high error on the test set. You plot the learning curve, and obtain the figure below. Is the algorithm suffering from high bias, high variance, or neither?
High bias
Neither
High variance
point
2。第 2 个问题
Suppose you have implemented regularized logistic regression
to classify what object is in an image (i.e., to do object
recognition). However, when you test your hypothesis on a new
set of images, you find that it makes unacceptably large
errors with its predictions on the new images. However, your
hypothesis performs well (has low error) on the
training set. Which of the following are promising steps to
take? Check all that apply.
Try increasing the regularization parameter λ.
Try evaluating the hypothesis on a cross validation set rather than the test set.
Try decreasing the regularization parameter λ.
Try using a smaller set of features.
point
3。第 3 个问题
Suppose you have implemented regularized logistic regression
to predict what items customers will purchase on a web
shopping site. However, when you test your hypothesis on a new
set of customers, you find that it makes unacceptably large
errors in its predictions. Furthermore, the hypothesis
performs poorly on the training set. Which of the
following might be promising steps to take? Check all that
apply.
Try adding polynomial features.
Try decreasing the regularization parameter λ.
Try evaluating the hypothesis on a cross validation set rather than the test set.
Use fewer training examples.
point
4。第 4 个问题
Which of the following statements are true? Check all that apply.
Suppose you are using linear regression to predict housing prices, and your dataset comes sorted in order of increasing sizes of houses. It is then important to randomly shuffle the dataset before splitting it into training, validation and test sets, so that we don’t have all the smallest houses going into the training set, and all the largest houses going into the test set.
Suppose you are training a logistic regression classifier using polynomial features and want to select what degree polynomial (denoted d in the lecture videos) to use. After training the classifier on the entire training set, you decide to use a subset of the training examples as a validation set. This will work just as well as having a validation set that is separate (disjoint) from the training set.
It is okay to use data from the test set to choose the regularization parameter λ, but not the model parameters (θ).
A typical split of a dataset into training, validation and test sets might be 60% training set, 20% validation set, and 20% test set.
point
5。第 5 个问题
Which of the following statements are true? Check all that apply.
If a learning algorithm is suffering from high variance, adding more training examples is likely to improve the test error.
When debugging learning algorithms, it is useful to plot a learning curve to understand if there is a high bias or high variance problem.
We always prefer models with high variance (over those with high bias) as they will able to better fit the training set.
If a learning algorithm is suffering from high bias, only adding more training examples may not improve the test error significantly.