Performance Metrics: Receiver Operating Characteristic (ROC) and Area Under the Curve (AUC)

ROC Curve

The Receiver Operating Characteristic (ROC) curve is defined for binary classification; however, it can be extended to multiclass classification.

In binary classification, when a model outputs probability scores, the simplest model uses 0.5 as the threshold. If the probability of a query point is greater than 0.5, the model classifies it as class 1 (say, positive); otherwise, as class 0 (negative). To measure the performance of the model, we can use accuracy, the confusion matrix, precision, recall, and the F1 score.
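To make this concrete, here is a minimal sketch in Python; the probability scores and labels are made up for illustration:

```python
# Hypothetical probability scores from a binary classifier, and the true labels.
y_score = [0.91, 0.35, 0.62, 0.48, 0.80]
y_true  = [1, 0, 1, 1, 0]

# Simplest model: predict class 1 when the score exceeds 0.5.
y_pred = [1 if s > 0.5 else 0 for s in y_score]

accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
print(y_pred)    # [1, 0, 1, 0, 1]
print(accuracy)  # 0.6
```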

A question arises here: will using 0.5 as the threshold give good results for every model?

NO

Selecting a threshold that produces meaningful results is problem-specific. For example, in cancer detection, a low threshold means more people are predicted as positive by the model, while an extremely high threshold risks missing actual patients.

How should we decide the appropriate threshold value?

The ROC curve is one technique for finding the best threshold value.

Let’s take a simple example.

Example: five query points with their probability scores and actual class labels (reconstructed from the predictions worked out below):

Query point    Probability score    Actual label
x1             0.95                 Positive
x2             0.92                 Positive
x3             0.70                 Positive
x4             0.60                 Negative
x5             0.46                 Negative

STEPS

  1. Take the unique probability scores (in descending order) as thresholds and predict the class labels. If there are k unique probability scores, there will be k thresholds.
  2. For each threshold, we predict the class labels for all the query points.

For each prediction, we calculate the true positive rate (TPR) and false positive rate (FPR). Let’s look at the confusion matrix.

Confusion Matrix

The true positive rate measures the true positives out of the total actual positives (true positives (TP) + false negatives (FN)).

TPR = TP / (TP + FN)

The false positive rate measures the false positives out of the total actual negatives (true negatives (TN) + false positives (FP)).

FPR = FP / (TN + FP)

If we have k unique probability scores, we will have k different predictions and therefore k pairs of (TPR, FPR). In the above example, we have 5 predictions, so there will be 5 (TPR, FPR) pairs.

PREDICTION 1 (threshold = 0.95)

Confusion matrix for prediction 1 (threshold = 0.95)

TPR1 = 1 / (1 + 2) = 1/3
FPR1 = 0 / 2 = 0

PREDICTION 2 (threshold = 0.92)

Confusion matrix for prediction 2 (threshold = 0.92)

TPR2 = 2 / (2 + 1) = 2/3
FPR2 = 0 / 2 = 0

PREDICTION 3 (threshold = 0.7)

Confusion matrix for prediction 3 (threshold = 0.7)

TPR3 = 3 / 3 = 1
FPR3 = 0 / 2 = 0

PREDICTION 4 (threshold = 0.6)

Confusion matrix for prediction 4 (threshold = 0.6)

TPR4 = 3 / 3 = 1
FPR4 = 1 / (1 + 1) = 1/2

PREDICTION 5 (threshold = 0.46)

Confusion matrix for prediction 5 (threshold = 0.46)

TPR5 = 3 / 3 = 1
FPR5 = 2 / 2 = 1
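The five (TPR, FPR) pairs above can be reproduced in a few lines of Python, using the scores and labels of the worked example:

```python
# Probability scores and actual labels from the worked example:
# 3 actual positives, 2 actual negatives.
y_score = [0.95, 0.92, 0.70, 0.60, 0.46]
y_true  = [1, 1, 1, 0, 0]

P = sum(y_true)              # total actual positives (3)
N = len(y_true) - P          # total actual negatives (2)

pairs = []
for t in sorted(set(y_score), reverse=True):   # each unique score is a threshold
    y_pred = [1 if s >= t else 0 for s in y_score]
    tp = sum(p == 1 and a == 1 for p, a in zip(y_pred, y_true))
    fp = sum(p == 1 and a == 0 for p, a in zip(y_pred, y_true))
    pairs.append((tp / P, fp / N))             # (TPR, FPR)

# pairs: (1/3, 0), (2/3, 0), (1, 0), (1, 1/2), (1, 1) -- the five predictions
print(pairs)
```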

3. Plot the TPR vs. FPR curve, with TPR on the y-axis and FPR on the x-axis.

ROC curve with 5 (FPR, TPR) pairs. The number in parentheses denotes the prediction number.

The blue line shows a random model, i.e., one whose output is random and which cannot separate the query points into classes. The appropriate threshold is the point where TPR is maximal and FPR is minimal, because we want true positives and true negatives to outnumber false positives and false negatives.

For our example, the best threshold is that of prediction 3, i.e., 0.70.
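One common way to formalize “maximum TPR, minimum FPR” is Youden’s J statistic, J = TPR − FPR; the threshold maximizing J is chosen. A sketch using the five (threshold, TPR, FPR) triples worked out above:

```python
# (threshold, TPR, FPR) triples from the five predictions above.
points = [
    (0.95, 1/3, 0.0),
    (0.92, 2/3, 0.0),
    (0.70, 1.0, 0.0),
    (0.60, 1.0, 0.5),
    (0.46, 1.0, 1.0),
]

# Youden's J statistic: J = TPR - FPR; the best threshold maximizes it.
best_threshold, best_tpr, best_fpr = max(points, key=lambda p: p[1] - p[2])
print(best_threshold)  # 0.7 -- prediction 3
```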

AUC (Area Under the Curve)

The area under the TPR-FPR curve gives an idea of the model’s effectiveness: the higher the AUC score, the better the model. AUC scores are also used to compare different models.
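A sketch of how this area can be computed with the trapezoidal rule over the (FPR, TPR) points, using the pairs from the worked example (with (0, 0) prepended as the curve’s starting point):

```python
# (FPR, TPR) points from the worked example, sorted by FPR,
# with (0, 0) prepended as the curve's starting point.
fpr = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0]
tpr = [0.0, 1/3, 2/3, 1.0, 1.0, 1.0]

# Trapezoidal rule: area of each vertical strip between consecutive FPR values.
auc = sum((fpr[i + 1] - fpr[i]) * (tpr[i + 1] + tpr[i]) / 2
          for i in range(len(fpr) - 1))
print(auc)  # 1.0 -- a perfect classifier on this toy example
```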

The maximum value of AUC is 1. WHY?

Because TPR and FPR each have a maximum value of 1, the curve lies within the unit square, so the maximum area is 1.

An AUC of 1 means the classifier correctly separates both classes. In the above example, we have a perfect classifier, as the area under the orange curve is 1.

What can we infer if AUC = 0.5?

It means the classifier is unable to separate the two classes; the model’s output is random.

What can we infer if AUC < 0.5?

It means the model is predicting the opposite class, i.e., actual positives are predicted as negative and vice versa.

The image below shows two models. How do we decide which one is better?

ROC curves of 2 models

Model 2’s area under the curve (AUC) is greater than Model 1’s; therefore, Model 2 is the better classifier.
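Equivalently, AUC is the probability that a randomly chosen positive point is scored above a randomly chosen negative point, which gives a direct way to compare models; the two score vectors below are hypothetical:

```python
# Hypothetical probability scores from two models on the same 6 labelled points.
y_true   = [1, 1, 1, 0, 0, 0]
scores_1 = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]   # model 1: some positives ranked below a negative
scores_2 = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]   # model 2: every positive ranked above every negative

def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc(y_true, scores_1), auc(y_true, scores_2))  # model 2 has the higher AUC
```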

A limitation of AUC scores is that they are sensitive to imbalanced data: even a dumb model can achieve a high AUC score.

Conclusion

The Receiver Operating Characteristic (ROC) curve is used to determine an appropriate threshold for models that output probability scores in binary classification. Area under the curve (AUC) scores are used to compare different models, although they can be affected by imbalanced datasets.

Thanks for reading!

Translated from: https://towardsdatascience.com/performance-metrics-receiver-operating-characteristic-roc-area-under-curve-auc-79d6d5b0b977
