A Simple Framework for Contrastive Learning of Visual Representations

Contents

- Overview
- Main content
- Code

Chen T., Kornblith S., Norouzi M., Hinton G. A Simple Framework for Contrastive Learning of Visual Representations. arXiv: Learning, 2020.

@article{chen2020a,
title={A Simple Framework for Contrastive Learning of Visual Representations},
author={Chen, Ting and Kornblith, Simon and Norouzi, Mohammad and Hinton, Geoffrey E},
journal={arXiv: Learning},
year={2020}}

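Overview

SimCLR mainly relies on augmentation to generate positive and negative sample pairs. Its architecture has no fancy components, but careful tricks make it more effective than earlier methods.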

Main content


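Pipeline

The pipeline is simple. Given a batch of samples $x$, randomly draw two augmentations $t, t'$ from the augmentation family $\mathcal{T}$ to obtain two batches of views $\tilde{x}_i = t(x)$ and $\tilde{x}_j = t'(x)$. An encoder first maps them to representations $h_i, h_j$, and a nonlinear transformation $g$ then maps these to $z_i, z_j$ (this step is what differs from earlier methods). Positive and negative pairs are formed from the $z$ vectors: two views derived from the same sample form a positive pair, and any other combination forms a negative pair.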
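The pairing rule can be sketched in a few lines of plain Python. This is only an illustration: `make_views`, `positive_index`, and the noise-based `jitter` augmentation are hypothetical names chosen here, and the layout in which views `2k` and `2k+1` come from the same sample is an assumption made for convenience.

```python
import random

def make_views(batch, augment, rng):
    """Return 2N views; views[2k] and views[2k+1] both come from batch[k]."""
    views = []
    for x in batch:
        views.append(augment(x, rng))  # plays the role of t(x)
        views.append(augment(x, rng))  # plays the role of t'(x)
    return views

def positive_index(i):
    """Index of the view that shares a source sample with view i."""
    return i ^ 1  # flips the last bit: 0<->1, 2<->3, ...

def jitter(x, rng):
    """Hypothetical stand-in augmentation: add small Gaussian noise."""
    return [v + rng.gauss(0.0, 0.1) for v in x]

rng = random.Random(0)
batch = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
views = make_views(batch, jitter, rng)
pairs = [(i, positive_index(i)) for i in range(len(views))]
```

Every view other than `views[i]` and its partner `views[positive_index(i)]` acts as a negative for view `i`.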


The sections below first cover the more important and distinctive tricks, then the rest.

projection head g

Most methods use only an encoder $f(\cdot)$. SimCLR adds a projection head $g(\cdot)$, which filters the features extracted by the encoder once more:
$$
z_i = g(h_i) = W^{(2)} \sigma(W^{(1)} h_i),
$$
where $\sigma$ is ReLU.

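According to the authors, the purpose is to filter out the extra separability introduced by augmentation: making the features $z$ harder to tell apart pushes the encoder to learn better features $h$.
Note: the features used for downstream tasks are $h$, not $z$!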
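As a minimal sketch of the formula $z = W^{(2)} \sigma(W^{(1)} h)$, the projection head is just two matrix-vector products with a ReLU in between. The helper names, the dimensions (4, 8, 3), and the random weights below are illustrative assumptions, not values from the paper.

```python
import random

def relu(v):
    """Elementwise ReLU."""
    return [max(0.0, a) for a in v]

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def projection_head(h, W1, W2):
    """z = W2 · ReLU(W1 · h), with no bias terms, matching the formula."""
    return matvec(W2, relu(matvec(W1, h)))

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
W1 = random_matrix(8, 4, rng)  # hidden width 8 is an arbitrary choice
W2 = random_matrix(3, 8, rng)  # output width 3 is an arbitrary choice
h = [0.5, -1.0, 2.0, 0.0]      # encoder output, kept for downstream tasks
z = projection_head(h, W1, W2) # used only inside the contrastive loss
```

In this sketch `h` is what would be handed to a downstream task, while `z` exists only to feed the loss.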

A table in the paper feeds either $h$ or $z$ into a binary classification task that decides whether the input went through a particular augmentation. Classification is more accurate with $h$, which means that $h$ retains more information about the augmentation than $z$ does.

contrastive loss

$$
\ell_{ij} = -\log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k \neq i} \exp(\mathrm{sim}(z_i, z_k)/\tau)}, \tag{1}
$$
where $\mathrm{sim}(u, v) = u^T v / (\|u\| \|v\|)$.

Experiments show that this loss outperforms the alternatives it was compared against.
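Equation (1) can be sketched directly in plain Python. Two things below are assumptions not stated in these notes: views `2k` and `2k+1` are taken to be a positive pair, and the batch loss averages $\ell$ over both orderings of every positive pair. The temperature `tau=0.5` is just an illustrative value.

```python
import math

def cosine_sim(u, v):
    """sim(u, v) = u^T v / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pair_loss(z, i, j, tau):
    """ell_{ij} from Eq. (1); the denominator sums over every k != i."""
    numerator = math.exp(cosine_sim(z[i], z[j]) / tau)
    denominator = sum(math.exp(cosine_sim(z[i], z[k]) / tau)
                      for k in range(len(z)) if k != i)
    return -math.log(numerator / denominator)

def batch_loss(z, tau=0.5):
    """Average ell over both orderings of each assumed pair (2k, 2k+1)."""
    total = 0.0
    for k in range(0, len(z), 2):
        total += pair_loss(z, k, k + 1, tau) + pair_loss(z, k + 1, k, tau)
    return total / len(z)

# Views 0/1 point roughly along one axis, views 2/3 along the other.
z = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
aligned = batch_loss(z)
mismatched = batch_loss([z[0], z[2], z[1], z[3]])  # wrong partners
```

The loss is lower when each view sits close to its true partner (`aligned`) than when the partners are swapped (`mismatched`), which is the behaviour the contrastive objective rewards.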

augmentation


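In SimCLR, augmentation is a key component for building positive and negative pairs. Ablation experiments show that the most effective augmentations are cropping and color distortion.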
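To make the crop plus color distortion combination concrete, here is a toy sketch on an image stored as nested lists of `(r, g, b)` tuples with values in `[0, 1]`. It is a simplified stand-in, not the paper's augmentation recipe: the crop is not resized back to the original size, the color distortion is reduced to a random per-channel scaling, and the `size` and `strength` defaults are arbitrary.

```python
import random

def random_crop(img, size, rng):
    """Cut a size x size window at a random position."""
    top = rng.randint(0, len(img) - size)
    left = rng.randint(0, len(img[0]) - size)
    return [row[left:left + size] for row in img[top:top + size]]

def color_distort(img, strength, rng):
    """Scale each channel by a random factor in [1 - strength, 1 + strength],
    then clamp the result to [0, 1]."""
    factors = [rng.uniform(1.0 - strength, 1.0 + strength) for _ in range(3)]
    return [[tuple(min(1.0, max(0.0, c * f)) for c, f in zip(px, factors))
             for px in row] for row in img]

def augment(img, rng, size=4, strength=0.4):
    """Compose the two operations: crop first, then distort colors."""
    return color_distort(random_crop(img, size, rng), strength, rng)

rng = random.Random(0)
img = [[(x / 7, y / 7, 0.5) for x in range(8)] for y in range(8)]
view_a = augment(img, rng)  # two independently augmented views
view_b = augment(img, rng)  # of the same source image
```

`view_a` and `view_b` would form one positive pair, since both are derived from the same `img`.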


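The experiments also show that supervised learning depends on augmentation far less than contrastive learning does; one could almost say it does not depend on it at all.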

other

A larger model works better as the encoder;
A larger batch size and more training epochs help learn better feature representations.

Code

Code from the original paper
