Learning Continuous Face Age Progression: A Pyramid of GANs (extension of the CVPR 2018 paper)

1 INTRODUCTION

This paper is an extension of the CVPR 2018 version.

3 METHOD

3.1 Overview

The loss consists of the traditional squared Euclidean loss, the GAN loss, and the identity loss.

Structurally, the discriminator is pyramid-structured.

3.2 Generator

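The generator is an Encoder-Decoder. It takes an input young face $x$ and a target age label (or age range) $c$ as input. Convolutional layers first map the input to a latent space, followed by 4 residual blocks and then 3 deconvolution layers, which produce the age progression result $y$, i.e. $G(x,c)\rightarrow y$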

Each convolutional layer is followed by Instance Normalization and ReLU, and the last layer is a total variation regularization layer.

3.3 Adversarial Learning

3.3.1 Aging Modeling

Let the distribution of young faces be $x\sim P_{young}$, the distribution of generated faces be $G(x, c)\sim P_G$, and the distribution of target-age faces be $P_{old}$. The goal is $P_G=P_{old}$

In the original GAN objective, the discriminator loss is
$\begin{aligned} \mathcal{L}_{GAN\_D}=&-\mathbb{E}_{x\sim P_{young},c}\log\left [ 1-D\left ( G\left ( x,c \right ) \right ) \right ] \qquad(2)\\ &-\mathbb{E}_{x\sim P_{old}}\log\left [ D(x) \right ] \end{aligned}$

Because the JS divergence is locally saturated, the gradient vanishes when optimizing $G$ if $D$ is trained too well. The paper therefore uses the Least Squares GAN formulation.
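The saturation argument can be checked numerically. The sketch below is not from the paper: it assumes a sigmoid-output discriminator with raw score `a` on a generated face, and compares the gradient of the log term $\log(1-\sigma(a))$ with the gradient of a least squares term $(a-1)^2$ applied directly to the score. All function names are illustrative.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_log_term(a):
    """d/da of log(1 - sigmoid(a)), which simplifies to -sigmoid(a)."""
    return -sigmoid(a)

def grad_least_squares(a, target=1.0):
    """d/da of (a - target)^2, with the raw score a used directly."""
    return 2.0 * (a - target)

# A very negative score means D rejects the generated face confidently.
for a in (0.0, -5.0, -10.0):
    print(a, grad_log_term(a), grad_least_squares(a))
```

As the score becomes more negative, the log-term gradient shrinks toward zero while the least squares gradient keeps growing in magnitude, which is the motivation given for switching losses.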

For $D$, actual young faces and generated age-progressed faces are treated as negative samples, and true elderly faces of age range $c$ are treated as positive samples. The objectives of $G$ and $D$ are then
$\mathcal{L}_{GAN\_G}=\mathbb{E}_{x\sim P_{young},c}\ H\left ( 1,D\left ( \phi_{age}\left ( G\left ( x,c \right ) \right ) \right ) \right ) \qquad(3)$
$\begin{aligned} \mathcal{L}_{GAN\_D}=&\mathbb{E}_{x\sim P_{young},y\sim P_{old},c}\ H([1,0,0],\\ &\left [ D\left ( \phi_{age}(y) \right ), D\left ( \phi_{age}\left ( G\left ( x,c \right ) \right ) \right ), D\left ( \phi_{age}(x) \right ) \right ]) \qquad(4) \end{aligned}$
where $H$ denotes the least squares distance and $\phi_{age}$ is the network that extracts age-related features. It is obtained by pre-training a VGG16 for age classification and then removing the FC layers.
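A minimal sketch of Eqs. (3) and (4), written without any deep learning framework. It assumes that $H$ is the mean squared difference between a constant target and the discriminator scores, and that the scores $D(\phi_{age}(\cdot))$ have already been computed and flattened into lists. The function names are illustrative, not the authors' code.

```python
def least_squares(target, scores):
    """Mean squared difference between a constant target and a list of scores."""
    return sum((s - target) ** 2 for s in scores) / len(scores)

def generator_gan_loss(d_fake):
    # Eq. (3): G wants D to score generated faces as 1.
    return least_squares(1.0, d_fake)

def discriminator_gan_loss(d_old, d_fake, d_young):
    # Eq. (4): real elderly faces -> 1, generated faces -> 0, young faces -> 0.
    return (least_squares(1.0, d_old)
            + least_squares(0.0, d_fake)
            + least_squares(0.0, d_young))

# Toy scores for a batch of two samples.
d_old, d_fake, d_young = [0.9, 1.0], [0.2, 0.0], [0.0, 0.0]
print(generator_gan_loss(d_fake))
print(discriminator_gan_loss(d_old, d_fake, d_young))
```

Treating young inputs as a second negative class is what pushes $G$ away from simply copying its input.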

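The features extracted by $\phi_{age}$ are the feature maps of layers 2, 4, 7, and 10. Correspondingly, $D$ has 4 branches. Each branch outputs a 3x3 map, and the four maps are concatenated into a 12x3 output.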
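The shape bookkeeping of the pyramid output can be illustrated as follows. This sketch assumes the four 3x3 maps are stacked row-wise, which is one reading consistent with the 12x3 shape stated above; the branch values are placeholders.

```python
def concat_branches(branch_outputs):
    """Stack per-branch score maps row-wise: four 3x3 maps become one 12x3 map."""
    stacked = []
    for out in branch_outputs:
        stacked.extend(out)
    return stacked

# Placeholder outputs: branch k is a 3x3 map filled with the value k.
branches = [[[float(k)] * 3 for _ in range(3)] for k in range(4)]
pyramid = concat_branches(branches)
print(len(pyramid), len(pyramid[0]))
```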

$D$ is built from Conv-BN-LeakyReLU blocks.

3.3.2 Progressive Aging Modeling
The original framework is shown in Figure 4(a).

A common approach is to attach an auxiliary classifier to $D$ and add the age classification loss $L_{age}$, as shown in Figure 4(b).

3.4 Identity Preservation

The network of the deep face descriptor is used, denoted $\phi_{id}$

The identity loss is defined as
$\mathcal{L}_{identity}=\mathbb{E}_{x\sim P_{young}, c}\ d\left ( \phi_{id}(x), \phi_{id}\left ( G(x,c) \right ) \right ) \qquad(9)$
where $d$ denotes the squared Euclidean distance.
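As an illustration of Eq. (9), the sketch below assumes the identity features $\phi_{id}(x)$ and $\phi_{id}(G(x,c))$ are already available as plain lists of floats, and approximates the expectation with a batch mean. The names are illustrative.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def identity_loss(feats_input, feats_generated):
    """Batch mean of d(phi_id(x), phi_id(G(x, c)))."""
    pairs = list(zip(feats_input, feats_generated))
    return sum(squared_euclidean(u, v) for u, v in pairs) / len(pairs)

# Two samples: the first keeps its identity features, the second drifts.
loss = identity_loss([[1.0, 2.0], [0.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]])
print(loss)  # 12.5
```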

3.5 Objective

To narrow the gap between the generated image and the input image (keeping their colors close), a pixel-wise L2 loss is added
$\mathcal{L}_{pixel}=\mathbb{E}_{x\sim P_{young}, c}\ \frac{1}{W\times H\times C}\left \| G(x,c)-x\right \|_2^2 \qquad(10)$
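Eq. (10) for a single image can be sketched as below, assuming both images are flattened into lists of $W\times H\times C$ values. The function name is illustrative.

```python
def pixel_loss(generated, original):
    """Squared L2 distance between two flattened images, divided by W*H*C."""
    assert len(generated) == len(original)
    n = len(original)  # W * H * C
    return sum((g - x) ** 2 for g, x in zip(generated, original)) / n

# A 2x2 single-channel toy image, flattened.
print(pixel_loss([0.0, 0.5, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0]))  # 0.3125
```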

Following the total variation regularizer loss in [38], a term $\mathcal{L}_{tv}$ is added to encourage spatial smoothness.
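These notes do not give the exact form of $\mathcal{L}_{tv}$. The sketch below assumes one common variant, the sum of squared differences between vertically and horizontally adjacent pixels of a single-channel image; the formulation in [38] may differ.

```python
def tv_loss(img):
    """Sum of squared differences between adjacent pixels (assumed TV variant)."""
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:  # vertical neighbor
                total += (img[i + 1][j] - img[i][j]) ** 2
            if j + 1 < w:  # horizontal neighbor
                total += (img[i][j + 1] - img[i][j]) ** 2
    return total

flat = [[1.0, 1.0], [1.0, 1.0]]
checker = [[0.0, 1.0], [1.0, 0.0]]
print(tv_loss(flat), tv_loss(checker))  # 0.0 4.0
```

A constant image incurs no penalty, while a checkerboard is penalized on every neighboring pair, which is the smoothness pressure the term is meant to provide.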

The final training objective of the whole framework is
$\mathcal{L}_G=\lambda_a\sum_{i}\mathcal{L}_{GAN\_G_i}+\lambda_p\mathcal{L}_{pixel}+\lambda_i\mathcal{L}_{identity}+\lambda_{t}\mathcal{L}_{tv} \qquad(11)$
$\mathcal{L}_{D_i}=\mathcal{L}_{GAN\_D_i} \qquad(12)$
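Eq. (11) is a weighted sum, which the following sketch spells out. The individual loss values and the weights $\lambda_a,\lambda_p,\lambda_i,\lambda_t$ used in the example are hypothetical placeholders; these notes do not record the values used in the paper.

```python
def generator_objective(gan_g_losses, l_pixel, l_identity, l_tv,
                        lam_a, lam_p, lam_i, lam_t):
    """Eq. (11): gan_g_losses holds one L_GAN_G_i per discriminator D_i."""
    return (lam_a * sum(gan_g_losses) + lam_p * l_pixel
            + lam_i * l_identity + lam_t * l_tv)

# Hypothetical loss values and weights, chosen only to make the sum easy to follow.
total = generator_objective([1.0, 2.0], 4.0, 8.0, 16.0,
                            lam_a=1.0, lam_p=0.5, lam_i=0.25, lam_t=0.125)
print(total)  # 9.0
```

Per Eq. (12), each discriminator $D_i$ is trained only on its own GAN loss, so no weighting is needed on that side.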