Boosting Adversarial Training with Hypersphere Embedding

Contents

- Overview
- Main content
- Code

Pang T., Yang X., Dong Y., Xu K., Su H., Zhu J. Boosting Adversarial Training with Hypersphere Embedding. arXiv preprint arXiv:2002.08619.

Overview

Normalizing both the weights and the features in the last layer helps strengthen adversarial training.


Main content

A typical neural network classifier can be written as

$$
f(x) = \mathbb{S}(W^T z + b),
$$

where $z = z(x; \omega)$ is the feature extracted by the encoder, $W = (W_1, W_2, \ldots, W_L)$ and $b$ are the weights and bias of the last layer, and $\mathbb{S}$ denotes the softmax function.
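
It may help to recall the standard identity that splits each logit into norms and an angle. The symbol $\theta_l$ (the angle between $W_l$ and $z$) and the per-class bias $b_l$ are notation assumed here for illustration:

$$
W_l^T z + b_l = \|W_l\| \, \|z\| \cos\theta_l + b_l .
$$

Only the $\cos\theta_l$ factor depends on the direction of $z$; the two norms and the bias scale or shift the logit without changing that direction.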

Hypersphere embedding (HE) normalizes both the last-layer weights and the features:

$$
\begin{aligned}
\widetilde{W}_l &= \frac{W_l}{\|W_l\|}, \quad \widetilde{z} = \frac{z}{\|z\|}, \\
\widetilde{f}(x) &= \mathbb{S}(\widetilde{W}^T \widetilde{z}) = \mathbb{S}(\cos\theta),
\end{aligned}
$$

where $\cos\theta = (\cos\theta_1, \ldots, \cos\theta_L)$ and $\theta_l$ is the angle between $W_l$ and $z$.
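
The normalization step can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: the helper names (`l2_normalize`, `he_cosines`) and the toy values of `W` and `z` are assumptions made for this example.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def he_cosines(W, z):
    """Return cos(theta_l) between the feature z and each class weight W_l."""
    z_unit = l2_normalize(z)
    return [
        sum(w * c for w, c in zip(l2_normalize(W_l), z_unit))
        for W_l in W
    ]

# Toy values (assumed): 3 classes, 2-dimensional features.
W = [[3.0, 0.0], [0.0, 0.5], [-2.0, -2.0]]
z = [4.0, 3.0]

cosines = he_cosines(W, z)   # logits under HE
probs = softmax(cosines)     # f~(x) = S(cos(theta))

# Rescaling z or every W_l leaves the cosines (and hence f~) unchanged.
W_scaled = [[10.0 * c for c in W_l] for W_l in W]
z_scaled = [0.1 * c for c in z]
cosines_scaled = he_cosines(W_scaled, z_scaled)
```

Because the output depends only on angles, changing the norm of `z` or of any `W_l` has no effect on the prediction.
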
A margin is then added to the cross-entropy loss:

$$
\mathcal{L}_{ce}^m(\widetilde{f}(x), y) = -\mathbb{1}_y^T \log \mathbb{S}\big(s \cdot (\cos\theta - m \cdot \mathbb{1}_y)\big),
$$

where $\mathbb{1}_y$ is the one-hot encoding of the label $y$, $s > 0$ is a scale factor, and $m$ is the margin.
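
A small sketch of this loss for a single example follows. The function name `he_margin_ce`, the default values `s=15.0` and `m=0.2`, and the toy cosines are illustrative assumptions, not values taken from this note.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_sum for v in logits]

def he_margin_ce(cosines, y, s=15.0, m=0.2):
    """Compute -log S(s * (cos(theta) - m * onehot(y)))[y].

    cosines: list of cos(theta_l), one per class
    y:       index of the true class
    s, m:    scale and margin (assumed example values)
    """
    logits = [
        s * (c - (m if label == y else 0.0))
        for label, c in enumerate(cosines)
    ]
    return -log_softmax(logits)[y]

# Toy cosines (assumed) for 3 classes; the true class is 0.
cosines = [0.8, 0.6, -0.9]
loss_no_margin = he_margin_ce(cosines, y=0, m=0.0)
loss_margin = he_margin_ce(cosines, y=0, m=0.2)
```

Subtracting `m` from the true-class cosine makes the loss larger for the same features, so the model has to separate the true class by a wider angular gap to reach the same loss.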

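Why do this? The authors argue that the most effective way to craft an adversarial example is to rotate the angle between the feature and the weights (the blue line in the figure). If neither $z$ nor $W$ is constrained, the gradient also spends effort on changing the norms, which is inefficient.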
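
The argument about norms can be checked numerically. The sketch below compares the gradient of an unnormalized logit $W_l^T z$ with the gradient of $\cos\theta_l$, both with respect to $z$. The helper names and toy vectors are assumptions for illustration; the closed-form cosine gradient is the standard one, $(W_l/\|W_l\| - \cos\theta_l \cdot z/\|z\|)/\|z\|$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def grad_inner_product(w, z):
    """Gradient of w^T z with respect to z, which is simply w."""
    return list(w)

def grad_cosine(w, z):
    """Gradient of cos(theta) = w^T z / (|w| |z|) with respect to z."""
    cos = dot(w, z) / (norm(w) * norm(z))
    return [
        (wi / norm(w) - cos * zi / norm(z)) / norm(z)
        for wi, zi in zip(w, z)
    ]

def radial_component(g, z):
    """Length of the projection of g onto the direction of z."""
    return dot(g, z) / norm(z)

# Toy vectors (assumed).
w = [3.0, 0.0]
z = [4.0, 3.0]

# Unnormalized logit: part of the gradient only changes the norm of z.
radial_plain = radial_component(grad_inner_product(w, z), z)

# Cosine logit: the gradient is orthogonal to z, so it only rotates z.
radial_he = radial_component(grad_cosine(w, z), z)
```

Here `radial_plain` is nonzero while `radial_he` vanishes up to floating-point error, which matches the claim that under HE the gradient acts purely on the angle.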

Code

Original code
