Action Recognition: Multi-Modal Domain Adaptation for Fine-Grained Action Recognition (CVPR 2020 oral)
Abstract
Fine-grained action recognition datasets exhibit environmental bias, where multiple video sequences are captured from a limited number of environments. Exploiting the multi-modal nature of video, the proposed method has two components: multi-modal self-supervision and adversarial training per modality.
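A minimal PyTorch-style sketch of the per-modality adversarial alignment idea (not the authors' released code): each modality gets its own domain discriminator trained through a gradient reversal layer, so the per-modality feature extractors are pushed toward domain-confusing features. Feature dimensions, hidden sizes, and module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass, negated
    (and scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.lambd, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class DomainDiscriminator(nn.Module):
    """Binary classifier predicting source vs. target domain for one modality."""
    def __init__(self, feat_dim=1024, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, feat, lambd=1.0):
        return self.net(grad_reverse(feat, lambd))

# One discriminator per modality (RGB and optical flow)
rgb_disc, flow_disc = DomainDiscriminator(), DomainDiscriminator()

# Hypothetical usage: rgb_feat / flow_feat come from per-modality backbones,
# half the batch from the source domain (0), half from the target domain (1)
rgb_feat = torch.randn(8, 1024)
flow_feat = torch.randn(8, 1024)
domain_labels = torch.cat([torch.zeros(4), torch.ones(4)]).long()
adv_loss = (nn.CrossEntropyLoss()(rgb_disc(rgb_feat), domain_labels)
            + nn.CrossEntropyLoss()(flow_disc(flow_feat), domain_labels))
```

Because the gradient is reversed before reaching the backbones, minimizing this loss trains the discriminators to tell domains apart while training the feature extractors to fool them, separately for each modality.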
Introduction
Target task: fine-grained action recognition.
Few works have attempted deep UDA for video data (Temporal Attentive Alignment for Large-Scale Video Domain Adaptation, ICCV 2019; Deep Domain Adaptation in Action Space, BMVC 2018).
Conclusion
Here "modality" refers to the two streams of information (optical flow and RGB); future work includes audio.
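A hedged sketch of the multi-modal self-supervision component: a small head predicts whether an RGB clip and a flow clip are temporally aligned (i.e., come from the same clip). This task needs no action labels, so it can be applied on both source and target data. Feature dimensions, the head architecture, and the way mismatched pairs are built below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CorrespondenceHead(nn.Module):
    """Self-supervision head: given RGB and flow features, predict whether
    the two modalities come from the same clip (aligned) or not."""
    def __init__(self, feat_dim=1024, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, rgb_feat, flow_feat):
        return self.net(torch.cat([rgb_feat, flow_feat], dim=-1))

# Placeholder features from per-modality backbones (batch of 8 clips)
rgb_feat = torch.randn(8, 1024)
flow_feat = torch.randn(8, 1024)

# First half: aligned pairs; second half: flow rolled by one clip so the
# RGB/flow pair is mismatched
flow_pairs = torch.cat([flow_feat[:4], flow_feat[4:].roll(1, dims=0)], dim=0)
labels = torch.cat([torch.ones(4), torch.zeros(4)]).long()  # 1=aligned, 0=mismatched

head = CorrespondenceHead()
ss_loss = nn.CrossEntropyLoss()(head(rgb_feat, flow_pairs), labels)
```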
Key points: the motivation is strong; the newly proposed dataset.