Memo 002: What Exactly Is IOU (IU) in Deep Learning Detection Tasks?

Concept

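IOU (Intersection over Union) is translated into Chinese by some as "交并比" (literally "intersection-union ratio") and by others as "重叠度" ("degree of overlap"); I find the freer translation better than the literal one. It can be understood as the degree to which the bounding box predicted by a detection system overlaps the box annotated in the original image. Because an algorithm cannot match the human-annotated data one hundred percent, localization accuracy has to be evaluated somehow. IOU is the parameter chosen for this purpose: it defines the degree of overlap between two bounding boxes, as shown in the figure below.

In essence, it is just this: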

IOU = area(A ∩ B) / area(A ∪ B)

The ratio of the overlapping area (intersection) of rectangles A and B to the area of their union.
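To make the ratio concrete, here is a minimal sketch that treats box coordinates as continuous values. The function name `iou_continuous` and the `(x1, y1, x2, y2)` corner layout are assumptions made for this illustration, and it does not use the inclusive-pixel `+ 1` convention of the reference code later in this memo.

```python
def iou_continuous(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper: coordinates are continuous, so width is x2 - x1.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamped at 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Union = A + B minus the part that was counted twice.
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


A = (0.0, 0.0, 2.0, 2.0)  # 2 x 2 square
B = (1.0, 1.0, 3.0, 3.0)  # same square shifted by (1, 1)
print(iou_continuous(A, B))  # intersection 1, union 7
```

Here the two squares share a 1 x 1 patch, so the intersection is 1 and the union is 4 + 4 - 1 = 7, giving an IOU of 1/7.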

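A detection task has to locate each object with a bounding box, as shown in the figure below (from the Faster R-CNN paper). We must not only detect what objects are in the image but also localize each detected object (the final result is usually presented as a bounding box), and the bounding box should fit the detected object tightly, neither too large nor too small.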

Figures

[Figure: example detections with bounding boxes, from the Faster R-CNN paper]

Calculation

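Putting this together, the IOU metric is computed as the intersection of the Detection Result and the Ground Truth divided by their union, which gives the accuracy of the detection: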

IOU = (DetectionResult ∩ GroundTruth) / (DetectionResult ∪ GroundTruth)

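Take the three-colour box diagram above as an example: DetectionResult is the green box A, i.e. the result produced by the neural network; GroundTruth is the blue box B, i.e. the annotation. IOU describes how much the green box (the detection) and the blue box (the annotation) overlap.

For details, see "Detection evaluation function: intersection-over-union (IOU)".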

Code, and why IOU

Reference Python implementation:

def bb_intersection_over_union(boxA, boxB):
    # boxes are (x1, y1, x2, y2) in inclusive pixel coordinates,
    # which is why every width and height below adds 1

    # determine the (x, y)-coordinates of the intersection rectangle
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])

    # compute the area of the intersection rectangle; clamp each side
    # at 0 so that non-overlapping boxes get zero intersection area
    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)

    # compute the area of both the prediction and ground-truth
    # rectangles
    boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
    boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)

    # compute the intersection over union by taking the intersection
    # area and dividing it by the sum of prediction + ground-truth
    # areas - the intersection area
    iou = interArea / float(boxAArea + boxBArea - interArea)

    # return the intersection over union value
    return iou
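As a quick sanity check, the sketch below runs a condensed version of the same calculation on three cases: identical boxes, a half-shifted box, and a box that does not overlap at all. The name `iou_pixels` and the example coordinates are made up for this illustration. This version clamps the intersection width and height at zero; without that clamp, two negative sides would multiply into a positive "intersection" for boxes that are disjoint on both axes.

```python
def iou_pixels(box_a, box_b):
    """Condensed IoU for (x1, y1, x2, y2) boxes in inclusive pixel coordinates.

    Hypothetical helper written for this example only.
    """
    # Overlap width/height, clamped at 0 for disjoint boxes.
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + 1)
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + 1)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / float(area_a + area_b - inter)


ground_truth = (50, 50, 149, 149)   # 100 x 100 pixels
pred_shifted = (100, 50, 199, 149)  # same size, shifted 50 pixels right
pred_far = (300, 300, 399, 399)     # no overlap with the ground truth

print(iou_pixels(ground_truth, ground_truth))  # perfect match
print(iou_pixels(ground_truth, pred_shifted))  # 5000 / 15000
print(iou_pixels(ground_truth, pred_far))      # disjoint boxes
```

The shifted prediction overlaps half of the ground truth (50 x 100 = 5000 pixels) out of a union of 15000 pixels, so its score is one third; the far-away prediction scores zero.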

A good post on the subject: Intersection over Union (IoU) for object detection

Partial excerpt:

Why do we use Intersection over Union?

If you have performed any previous machine learning in your career, specifically classification, you’ll likely be used to predicting class labels where your model outputs a single label that is either correct or incorrect.

This type of binary classification makes computing accuracy straightforward; however, for object detection it’s not so simple.

In all reality, it’s extremely unlikely that the (x, y)-coordinates of our predicted bounding box are going to exactly match the (x, y)-coordinates of the ground-truth bounding box.

Due to varying parameters of our model (image pyramid scale, sliding window size, feature extraction method, etc.), a complete and total match between predicted and ground-truth bounding boxes is simply unrealistic.

Because of this, we need to define an evaluation metric that rewards predicted bounding boxes for heavily overlapping with the ground-truth:

[Figure: examples of good and bad Intersection over Union scores]

In the above figure I have included examples of good and bad Intersection over Union scores.

As you can see, predicted bounding boxes that heavily overlap with the ground-truth bounding boxes have higher scores than those with less overlap. This makes Intersection over Union an excellent metric for evaluating custom object detectors.

We aren’t concerned with an exact match of (x, y)-coordinates, but we do want to ensure that our predicted bounding boxes match as closely as possible — Intersection over Union is able to take this into account.

Further notes

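This still-growing field is neither very large nor very small, and it does not yet have a Newton-like figure to unify its wording, grammar and translations. For example, IOU is called IU in the paper Fully Convolutional Networks for Semantic Segmentation, which at first glance looks like a different metric.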

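In fact, the IU there is simply IOU. The outline of a detected object does not have to be a rectangular box; it can also follow the object's boundary. Take Kaiming He's Mask R-CNN, which puts a mask over each detected object: you should not fail to recognize a snake just because it swam into the water and came out wearing a turtle shell. In practical detection tasks it all comes down to IOU, or IU.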
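In addition, mean IU is the average of the per-class IU values, for example the mean of IU_dog, IU_cat, IU_horse and IU_car. It is used when several classes are recognized in the same image (several classes, as opposed to several objects of a single class).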
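A minimal sketch of per-class IU and mean IU on pixel labels follows. It assumes a single image whose prediction and annotation are 2-D lists of integer class ids; the function names, the toy label maps and the class ids (0 = background, 1 = dog, 2 = cat) are all invented for this illustration. Real evaluations typically accumulate the counts over a whole dataset rather than one image.

```python
def per_class_iu(pred, truth, num_classes):
    """Return a list of IU values, one per class (None if the class is absent).

    pred and truth are same-shaped 2-D lists of integer class labels.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                # Pixel is in both the predicted and the true region of class p.
                inter[p] += 1
                union[p] += 1
            else:
                # Pixel belongs to the predicted region of p and the true region of t.
                union[p] += 1
                union[t] += 1
    return [inter[c] / union[c] if union[c] else None for c in range(num_classes)]


def mean_iu(ius):
    """Average the IU values of the classes that actually appear."""
    present = [v for v in ius if v is not None]
    return sum(present) / len(present)


truth = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [2, 2, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 0, 1, 0],
        [2, 2, 2, 0]]

ius = per_class_iu(pred, truth, num_classes=3)
print(ius)           # per-class IU: 4/7, 3/5, 2/3
print(mean_iu(ius))  # their average, 193/315
```

Because the regions here are arbitrary sets of pixels rather than rectangles, the same intersection-over-union idea carries over unchanged to masks.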

References

Worth a careful read: Intersection over Union (IoU) for object detection - PyImageSearch https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/