AttGAN: Facial Attribute Editing by Only Changing What You Want(TIP19)

III. ATTRIBUTE GAN (ATTGAN)

Premise: all attributes are assumed to be binary.

A. Testing Formulation

Let the input image be $\mathbf{x^a}$, carrying $n$ binary attributes $\mathbf{a}=\left[a_1, \cdots, a_n\right]$.

The encoder network $G_{enc}$ encodes $\mathbf{x^a}$ into a latent representation $\mathbf{z}$:
$$\mathbf{z} = G_{enc}(\mathbf{x^a}) \qquad(3)$$

The target attributes are denoted $\mathbf{b}=\left[b_1, \cdots, b_n\right]$.

The decoder network $G_{dec}$ takes $\mathbf{z}$ and $\mathbf{b}$ as input and generates the edited image $\mathbf{x^{\hat{b}}}$:
$$\mathbf{x^{\hat{b}}} = G_{dec}(\mathbf{z}, \mathbf{b}) \qquad(4)$$

Combining Eqs. (3) and (4):
$$\mathbf{x^{\hat{b}}} = G_{dec}(G_{enc}(\mathbf{x^a}), \mathbf{b}) \qquad(5)$$
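The encode-then-decode pipeline of Eq. (5) can be sketched with toy stand-ins. The real $G_{enc}$ and $G_{dec}$ are convolutional networks; the shapes and functions below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def G_enc(x):
    # Toy stand-in for the convolutional encoder: flatten the image and
    # keep the first 8 values as the latent representation z (Eq. 3).
    return x.reshape(-1)[:8]

def G_dec(z, b):
    # Toy stand-in for the decoder: the attribute vector b is concatenated
    # with z before decoding (Eq. 4); here we just return the concatenation
    # instead of an actual decoded image.
    return np.concatenate([z, b])

x_a = rng.random((4, 4))        # input image x^a
b = np.array([1.0, 0.0, 1.0])   # binary target attributes b
x_b_hat = G_dec(G_enc(x_a), b)  # Eq. (5)
```

The key point the sketch captures is the interface: editing is driven entirely by swapping the attribute vector fed to the decoder, while the latent $\mathbf{z}$ stays fixed.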

B. Training Formulation

Training is unsupervised with respect to editing, because the ground-truth edited image $\mathbf{x^b}$ is unknown.

Reconstruction Loss

We want to change only what the edited attributes require while keeping all other attributes intact, so reconstruction learning is introduced (the paper gives two justifications for this, which feel somewhat forced).

b=a\mathbf{b}=\mathbf{a},得到生成图像xa^\mathbf{x^{\hat{a}}}
xa^=Gdec(z,a)(6) \mathbf{x^{\hat{a}}} = G_{dec}(\mathbf{z}, \mathbf{a}) \qquad(6)
那么xa^\mathbf{x^{\hat{a}}}xa\mathbf{x^a}应该比较近似,于是关于生成器GGReconstruction Loss定义如下
minGenc,Gdec Lrec=Exapdataxaxa^1(11) \underset{G_{enc},G_{dec}}{\min}\ \mathcal{L}_{rec}=\mathbb{E}_{\mathbf{x^a}\sim p_{data}} \left \| \mathbf{x^a}-\mathbf{x^{\hat{a}}} \right \|_1 \qquad(11)
使用1\ell_1 loss相较于2\ell_2 loss不容易模糊
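A minimal NumPy version of Eq. (11) (using the per-pixel mean, a scaled form of the summed $\ell_1$ norm):

```python
import numpy as np

def l1_reconstruction(x_a, x_a_hat):
    # Eq. (11): mean absolute difference between the input image and its
    # reconstruction. Unlike the squared l2 penalty, l1 does not smooth
    # away small residuals everywhere, which tends to reduce blurring.
    return float(np.abs(x_a - x_a_hat).mean())

x = np.array([[0.2, 0.8], [0.5, 0.1]])
assert l1_reconstruction(x, x) == 0.0  # perfect reconstruction costs nothing
```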

Attribute Classification Constraint

The generated image $\mathbf{x^{\hat{b}}}$ should actually carry the attributes $\mathbf{b}$, so an attribute classifier $C$ is introduced.

The attribute classification constraint on the generator $G$ is then defined as
$$\underset{G_{enc}, G_{dec}}{\min}\ \mathcal{L}_{cls_g}=\mathbb{E}_{\mathbf{x^a}\sim p_{data},\, \mathbf{b}\sim p_{attr}}\left[ \ell_g\left( \mathbf{x^a}, \mathbf{b} \right) \right] \qquad(7)$$
$$\ell_g(\mathbf{x^a}, \mathbf{b})=\sum_{i=1}^{n}-b_i\log C_i\left( \mathbf{x^{\hat{b}}} \right)-(1-b_i)\log\left( 1-C_i\left( \mathbf{x^{\hat{b}}} \right) \right) \qquad(8)$$

The attribute classifier $C$ itself is trained on real images with their ground-truth attributes:
$$\underset{C}{\min}\ \mathcal{L}_{cls_c}=\mathbb{E}_{\mathbf{x^a}\sim p_{data}}\left[ \ell_r(\mathbf{x^a}, \mathbf{a}) \right] \qquad(9)$$
$$\ell_r(\mathbf{x^a}, \mathbf{a})=\sum_{i=1}^{n}-a_i\log C_i\left( \mathbf{x^a} \right)-(1-a_i)\log\left( 1-C_i\left( \mathbf{x^a} \right) \right) \qquad(10)$$
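Eqs. (8) and (10) are the same multi-label binary cross-entropy, applied to edited and real images respectively. A minimal sketch, assuming $C$ outputs per-attribute sigmoid probabilities:

```python
import numpy as np

def multilabel_bce(probs, labels):
    # Eqs. (8)/(10): summed binary cross-entropy over the n attributes.
    # probs  -- per-attribute sigmoid outputs C_i(x)
    # labels -- binary attribute vector (b in Eq. 8, a in Eq. 10)
    eps = 1e-12  # clip to keep log() finite at 0 and 1
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-(labels * np.log(probs)
                   + (1.0 - labels) * np.log(1.0 - probs)).sum())
```

With two attributes and uninformative predictions of 0.5 each, the loss is $2\ln 2$; it approaches 0 as the predictions approach the labels.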

Adversarial Loss

The adversarial loss follows WGAN-GP; the objectives for the discriminator $D$ and the generator $G$ are
$$\underset{\left\| D \right\|_L\leqslant 1}{\min}\ \mathcal{L}_{adv_d}=-\mathbb{E}_{\mathbf{x^a}\sim p_{data}}D(\mathbf{x^a})+\mathbb{E}_{\mathbf{x^a}\sim p_{data},\,\mathbf{b}\sim p_{attr}}D\left( \mathbf{x^{\hat{b}}} \right) \qquad(12)$$
$$\underset{G_{enc},G_{dec}}{\min}\ \mathcal{L}_{adv_g}=-\mathbb{E}_{\mathbf{x^a}\sim p_{data},\,\mathbf{b}\sim p_{attr}}D\left( \mathbf{x^{\hat{b}}} \right) \qquad(13)$$
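The two expectations reduce to simple means over critic scores; a sketch under the assumption that `d_real`/`d_fake` are the critic's scalar outputs on a batch:

```python
import numpy as np

def critic_loss(d_real, d_fake):
    # Eq. (12): push scores on real images up and on edited images down.
    # The Lipschitz constraint ||D||_L <= 1 is what WGAN-GP enforces with
    # a gradient penalty term, which this autograd-free sketch omits.
    return float(-d_real.mean() + d_fake.mean())

def generator_adv_loss(d_fake):
    # Eq. (13): the generator raises the critic's score on edited images.
    return float(-d_fake.mean())

d_real = np.array([0.9, 1.1])   # illustrative critic scores on real images
d_fake = np.array([-0.5, 0.5])  # illustrative critic scores on edited images
```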

Overall Objective

The generator $G$ is trained with
$$\underset{G_{enc},G_{dec}}{\min}\ \mathcal{L}_{enc,dec}=\lambda_1\mathcal{L}_{rec}+\lambda_2\mathcal{L}_{cls_g}+\mathcal{L}_{adv_g} \qquad(14)$$

The discriminator $D$ and the attribute classifier $C$ are trained with
$$\underset{D,C}{\min}\ \mathcal{L}_{dis,cls}=\lambda_3\mathcal{L}_{cls_c}+\mathcal{L}_{adv_d} \qquad(15)$$
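The two overall objectives are plain weighted sums; the default weights below are placeholders, not the hyperparameters reported in the paper:

```python
def generator_objective(l_rec, l_cls_g, l_adv_g, lam1=1.0, lam2=1.0):
    # Eq. (14): weighted sum of reconstruction, attribute-classification,
    # and adversarial terms for G_enc/G_dec.
    return lam1 * l_rec + lam2 * l_cls_g + l_adv_g

def discriminator_objective(l_cls_c, l_adv_d, lam3=1.0):
    # Eq. (15): D and C are optimized jointly, so their losses combine
    # into a single objective.
    return lam3 * l_cls_c + l_adv_d
```

In practice $\lambda_1,\lambda_2,\lambda_3$ trade off identity preservation against editing strength: a larger $\lambda_1$ keeps the output closer to the input at the cost of weaker attribute changes.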

C. Why are attribute-excluding details preserved?

AttGAN performs multi-task learning: a face reconstruction task and an attribute editing task.

The authors argue that the two tasks are highly similar and the transferability gap between them is very small, so the detail-preservation ability learned from the face reconstruction task transfers easily to the attribute editing task.

D. Extension for Attribute Style Manipulation

Following [28] and [26], a set of style controllers $\theta=\left[\theta_1, \cdots, \theta_i, \cdots, \theta_n\right]$ is introduced, and the mutual information between the controllers and the output images is maximized to make them highly correlated.
Specifically, as shown in Figure 3, an extra style predictor $Q$ is introduced, and the decoder network $G_{dec}$ additionally takes $\theta$ as input, generating an image $\mathbf{x^{\hat{\theta}\hat{b}}}$ that carries both the target attributes $\mathbf{b}$ and the styles $\theta$:
$$\mathbf{x^{\hat{\theta}\hat{b}}}=G_{dec}\left( G_{enc}(\mathbf{x^a}), \theta, \mathbf{b} \right) \qquad(16)$$

The mutual information between the style controllers $\theta$ and the generated image $x^*$ is lower-bounded variationally through $Q$:
$$I\left( \theta;x^* \right)=\underset{Q}{\max}\ \mathbb{E}_{\theta\sim p(\theta),\, x^*\sim p(x^*|\theta)}\left[ \log Q(\theta|x^*) \right] + const \qquad(17)$$
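In practice the bound in Eq. (17) becomes a predictor loss: $Q$ tries to recover $\theta$ from the generated image. If $Q(\theta|x^*)$ is modeled as a fixed-variance Gaussian (a common InfoGAN-style choice, assumed here rather than stated in the note), maximizing $\mathbb{E}[\log Q(\theta|x^*)]$ reduces to minimizing a mean-squared error:

```python
import numpy as np

def info_loss(theta, theta_pred):
    # Variational surrogate for -E[log Q(theta|x*)] under a fixed-variance
    # Gaussian Q: the MSE between the sampled controllers theta and the
    # predictor's estimate Q(x*). Driving it to zero tightens Eq. (17).
    return float(((theta - theta_pred) ** 2).mean())

theta = np.array([0.3, -0.7])
assert info_loss(theta, theta) == 0.0  # perfect recovery maximizes the bound
```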

The generator $G$ therefore gains an additional objective:
$$\underset{G_{enc}, G_{dec}}{\max}\ I\left( \theta;x^* \right) \qquad(18)$$