SORT

SORT is a simple, rough-and-ready object tracking algorithm, so this post introduces it in the same simple, rough-and-ready way.

Paper: https://arxiv.org/abs/1602.00763
Code: https://github.com/abewley/sort

1. Introduction

SORT (Simple Online and Realtime Tracking) comes down to the following points:

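Online & Realtime: two different but related properties. Online means the tracker may only use information from the current and earlier frames, never from future frames; Realtime means it has to run fast enough.
FrRCNN detector: the authors show experimentally that a good detector greatly helps the tracking task. Their proposed tracker is built on a strong detector, FrRCNN (Faster R-CNN), and gets good performance as a result. (Side gripe: isn't it obvious that detector quality matters for tracking-by-detection?)
Kalman Filter: used for motion prediction; the paper uses a linear constant-velocity motion model.
Hungarian Algorithm: used to solve the data association problem; the cost matrix in the paper is computed from the IOU distance.
Ignoring Occlusion: SORT deliberately ignores occlusion (both short-term and long-term). The authors' reasoning is that such cases are rare, that handling them explicitly adds complexity, and that re-identification (reID) is expensive.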

PS: the paper only applies the model to pedestrian tracking; naturally, it could be generalized to other object classes.

So the paper is actually quite short. The authors summarize their own contributions as:

We leverage the power of CNN based detection in the context of MOT.

A pragmatic tracking approach based on the Kalman filter and the Hungarian algorithm is presented and evaluated on a recent MOT benchmark.

Code will be open sourced to help establish a baseline method for research experimentation and uptake in collision avoidance applications.

2. Method

（1）Detection

The paper uses FrRCNN as the detector, and the authors give two reasons:

Fast: parameters are shared between the two stages creating an efficient framework for detection. (Note: FrRCNN is a two-stage detector: "feature extraction & region proposal" followed by "classification".)
Swappable architecture: the network architecture itself can be swapped to any design which enables rapid experimentation of different architectures to improve the detection performance. In other words, to improve performance we can replace FrRCNN's backbone; the paper compares ZFNet and VGG-16 and ends up using the latter.

Other details:

The authors use parameters already trained on VOC and keep only the person class.
Only detections with a probability above 50% are passed to the tracking framework.
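
The two filtering rules above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Detection` container, its field names, and the `filter_detections` helper are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    # Hypothetical container; field names are illustrative only.
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    label: str                              # class name predicted by the detector
    score: float                            # detector confidence in [0, 1]


def filter_detections(dets: List[Detection],
                      label: str = "person",
                      min_score: float = 0.5) -> List[Detection]:
    """Keep detections of one class whose score is strictly above min_score."""
    return [d for d in dets if d.label == label and d.score > min_score]


raw = [
    Detection((10, 20, 50, 120), "person", 0.92),
    Detection((60, 30, 90, 110), "person", 0.41),  # dropped: score too low
    Detection((5, 5, 200, 100), "car", 0.88),      # dropped: not a person
]
kept = filter_detections(raw)
```

Only the first detection survives and would be handed to the tracker.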

(2) Estimation Model

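The estimation model (the representation plus the motion model) is what propagates a target's identity into the next frame. The paper uses a linear constant-velocity model to roughly approximate each target's inter-frame displacement.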

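Specifically, the state of each target is modeled as:
$$\mathbf{x} = [u, v, s, r, \dot{u}, \dot{v}, \dot{s}]^T$$
where $\dot{u}$, $\dot{v}$ and $\dot{s}$ are the velocities of $u$, $v$ and $s$, and the remaining symbols are: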

u,v： the horizontal and vertical pixel location of the centre of the target
s：the scale (area) of the target’s bounding box
r：the aspect ratio of the target’s bounding box (assumed constant)

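Technical detail: when a detection is associated with a target, the detected bbox is used to update the target state, and the velocity components are solved optimally by the Kalman filter. If no detection is associated with the target, its state is simply predicted with the linear velocity model, without any correction.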
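
To make the state representation concrete, here is a minimal sketch of converting a bbox to the observed part of the state and applying the constant-velocity prediction. It covers only the prediction of the state mean; the covariance propagation and the Kalman correction step are deliberately left out. The function names are hypothetical, and taking r as width / height is an assumption.

```python
def bbox_to_z(box):
    """Convert (x1, y1, x2, y2) to the observation [u, v, s, r]."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # Assumption: aspect ratio r is taken as width / height.
    return [x1 + w / 2.0, y1 + h / 2.0, w * h, w / float(h)]


def z_to_bbox(z):
    """Convert [u, v, s, r] back to (x1, y1, x2, y2)."""
    u, v, s, r = z
    w = (s * r) ** 0.5  # s = w * h and r = w / h, so w = sqrt(s * r)
    h = s / w
    return (u - w / 2.0, v - h / 2.0, u + w / 2.0, v + h / 2.0)


def predict(x, dt=1.0):
    """Constant-velocity prediction for x = [u, v, s, r, du, dv, ds].

    The aspect ratio r has no velocity term, so it stays fixed.
    """
    u, v, s, r, du, dv, ds = x
    return [u + dt * du, v + dt * dv, s + dt * ds, r, du, dv, ds]


# A 40 x 80 box centred at (30, 60): s = 3200, r = 0.5.
z = bbox_to_z((10, 20, 50, 100))
# Illustrative velocities: 2 px/frame right, 1 px/frame up, no scale change.
x = z + [2.0, -1.0, 0.0]
x_pred = predict(x)
box_pred = z_to_bbox(x_pred[:4])  # (12.0, 19.0, 52.0, 99.0)
```

When a target has no associated detection, this prediction alone carries the state forward, which is why errors grow quickly without a correction.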

(3) Data Association

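First, the bbox of each existing target is predicted in the current frame. Then the IOU distance between every predicted bbox and every detection is computed to form the cost matrix. Finally, the Hungarian algorithm solves the assignment.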

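In addition, an assignment is rejected as unreliable when the overlap between the detection and the target is below a threshold $IOU_{min}$.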
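
The association step can be sketched as below. To keep the example free of dependencies, the optimal assignment is found by brute force over all one-to-one pairings, which is a stand-in for the Hungarian algorithm and only practical for a handful of boxes. The function names and the threshold value 0.3 are assumptions for illustration.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(pred_boxes, det_boxes, iou_min=0.3):
    """Return (matches, unmatched_tracks, unmatched_dets) as index lists."""
    n, m = len(pred_boxes), len(det_boxes)
    iou_mat = [[iou(p, d) for d in det_boxes] for p in pred_boxes]
    # Enumerate every one-to-one pairing of the smaller set into the larger one.
    if n <= m:
        candidates = ([(i, p[i]) for i in range(n)] for p in permutations(range(m), n))
    else:
        candidates = ([(p[j], j) for j in range(m)] for p in permutations(range(n), m))
    best_pairs, best_total = [], -1.0
    for pairs in candidates:
        # Maximizing total IOU is the same as minimizing total (1 - IOU) cost.
        total = sum(iou_mat[i][j] for i, j in pairs)
        if total > best_total:
            best_pairs, best_total = pairs, total
    # Reject assignments whose overlap is below the threshold.
    matches = [(i, j) for i, j in best_pairs if iou_mat[i][j] >= iou_min]
    matched_t = {i for i, _ in matches}
    matched_d = {j for _, j in matches}
    unmatched_tracks = [i for i in range(n) if i not in matched_t]
    unmatched_dets = [j for j in range(m) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets


pred = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches, lost, new = associate(pred, dets)
```

Here track 0 pairs with detection 1, track 1 with detection 0, and detection 2 is left unmatched. In practice a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force loop.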

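PS: the paper also mentions an advantage of the IOU distance that is not easy to grasp: it implicitly handles short-term occlusion caused by passing targets. Here is my personal understanding of it (discussion welcome):
[Figure: the author's illustration of how IOU distance handles short-term occlusion]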

(4) Creation and Deletion of Track Identities

This part covers how targets entering and leaving the scene are handled.

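Entering: the $IOU_{min}$ threshold described above decides whether a detection is a new target; if it is, a new tracker is initialized. A new tracker also goes through a probationary period, during which it has to be associated with enough detections, to avoid creating tracks from false positives (FP).
Leaving: a track that has gone $T_{Lost}$ frames without an associated detection is considered to have left. The paper sets $T_{Lost}$ to 1 for three reasons: the constant-velocity model predicts poorly, the paper is mainly concerned with frame-to-frame tracking and ignores reID, and deleting IDs early is more efficient. This also means that as soon as a false negative (FN) occurs, the track breaks and the target is given a new ID when it is detected again.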
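
The entry and exit rules can be sketched as simple per-track bookkeeping. Everything here is illustrative: the `Track` class, the `update_tracks` helper, and `min_hits=3` for the probationary period are assumptions, and a track is deleted as soon as it has missed `t_lost` frames, which is one reading of the rule above.

```python
class Track:
    """Hypothetical per-target bookkeeping; names are illustrative."""

    def __init__(self, track_id):
        self.id = track_id
        self.hit_streak = 1  # created from a detection, so it starts with one hit
        self.misses = 0      # consecutive frames without an associated detection

    def confirmed(self, min_hits):
        """True once the track has passed its probationary period."""
        return self.hit_streak >= min_hits


def update_tracks(tracks, matched_ids, n_unmatched_dets, next_id, t_lost=1):
    """Apply one frame of bookkeeping; return (surviving_tracks, next_id)."""
    survivors = []
    for t in tracks:
        if t.id in matched_ids:
            t.hit_streak += 1
            t.misses = 0
        else:
            t.hit_streak = 0
            t.misses += 1
        if t.misses < t_lost:  # drop the track once it has missed t_lost frames
            survivors.append(t)
    for _ in range(n_unmatched_dets):  # each unmatched detection starts a new track
        survivors.append(Track(next_id))
        next_id += 1
    return survivors, next_id


# Frame 1: two unmatched detections create tracks 0 and 1.
tracks, next_id = update_tracks([], set(), 2, next_id=0)
# Frame 2: both tracks are matched.
tracks, next_id = update_tracks(tracks, {0, 1}, 0, next_id)
# Frame 3: track 1 has no detection (an FN) and is deleted immediately.
tracks, next_id = update_tracks(tracks, {0}, 0, next_id)
# Frame 4: the same target is detected again and receives the new ID 2.
tracks, next_id = update_tracks(tracks, {0}, 1, next_id)

ids = [t.id for t in tracks]                                 # [0, 2]
reported = [t.id for t in tracks if t.confirmed(min_hits=3)]  # [0]
```

The run shows both effects described above: a single missed detection turns one target into two identities (1, then 2), and the freshly created track 2 is not reported while it is still on probation.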

3. Results

Judging by the results, SORT does well on both speed and accuracy (largely thanks to FrRCNN).

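On the individual metrics, SORT performs respectably on MOTA, MOTP, FAF, ML, FP and FN (propped up by FrRCNN), but it does poorly on the others, ID switches (ID_SW) in particular, because SORT makes no attempt to handle that problem.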

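Finally, the authors hope that SORT, being so simple, can serve as a baseline on top of which new methods are added to address problems such as reID (occlusion).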

As our experiments highlight the importance of detection quality in tracking, future work will investigate a tightly coupled detection and tracking framework.