Deep Learning Specialization 4: Convolutional Neural Networks - Week4

This week's material is quite interesting: it shows how the magical "Van Gogh-ify a photo" effect is actually done. For novel applications, the key is still how to formalize the problem; once the problem is clearly defined, you are more than halfway to a solution.

1. Face Recognition

1.1 定义

(Face) Verification

  • Input image, name/ID
  • Output whether the input image is that of the claimed person

(Face) Recognition

  • Has a database of K persons
  • Get an input image
  • Output ID if the image is any of the K persons (or “not recognized”)

The hard part: One-Shot Learning.
Learning from one example to recognize the person again, i.e. only a single picture per person is available for all subsequent recognition.

1.2 Siamese Network

Learning a similarity function

if $d(\text{image1}, \text{image2}) \le \tau$: "same"
else, i.e. $d(\text{image1}, \text{image2}) > \tau$: "different"

1.2.1 Encoding of image


d(x^{(1)}, x^{(2)}) = \left \| f(x^{(1)})-f(x^{(2)}) \right \|^2_2
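As a minimal sketch (NumPy; the function names and the threshold value are illustrative, not from the course), the distance and the verification rule above can be written as:

```python
import numpy as np

def d(f1, f2):
    """Squared L2 distance between two embedding vectors f(x1), f(x2)."""
    return float(np.sum((f1 - f2) ** 2))

def verify(f1, f2, tau=0.7):
    """Predict 'same' if the embeddings are within the threshold tau."""
    return "same" if d(f1, f2) <= tau else "different"
```

In practice `f1` and `f2` would be the outputs of the same (Siamese) encoder network applied to the two images.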

1.2.2 Loss

  • Triplet Loss
    APN - Anchor/Positive/Negative
    \left \| f(A) - f(P) \right \|^2 + \alpha \le \left \| f(A) - f(N) \right \|^2
    $\alpha$ is the margin; it makes the gap between positive and negative pairs more pronounced.

\mathcal{L}(A, P, N) = \max \left( \left \| f(A) - f(P) \right \|^2 - \left \| f(A) - f(N) \right \|^2 + \alpha,\ 0 \right)

J = \sum^m_{i=1} \mathcal{L}(A^{(i)}, P^{(i)}, N^{(i)})
Choose triplets that are “hard” to train on.
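The triplet loss above can be sketched in NumPy (a per-triplet version; function and variable names are illustrative):

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha, 0)."""
    d_pos = np.sum((f_a - f_p) ** 2)  # anchor-positive distance
    d_neg = np.sum((f_a - f_n) ** 2)  # anchor-negative distance
    return max(d_pos - d_neg + alpha, 0.0)
```

"Hard" triplets are those where `d_pos` is close to (or larger than) `d_neg`, so the loss is nonzero and actually produces a gradient.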

  • Binary Classification
    In other words, turn the problem into a binary classification task: does a pair of images show the same person?


\hat{y} = \sigma \left( \sum_{k=1}^{128} w_k \left| f(x^{(i)})_k - f(x^{(j)})_k \right | + b \right)
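A sketch of this classification head in NumPy (assuming 128-dimensional embeddings as in the formula; names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def same_person_prob(f_i, f_j, w, b):
    """sigma(sum_k w_k * |f_i[k] - f_j[k]| + b): probability the pair matches."""
    return sigmoid(np.dot(w, np.abs(f_i - f_j)) + b)
```

The weights `w` and bias `b` are learned on labeled same/different pairs; the elementwise `|f_i - f_j|` features are what make the pair symmetric.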

2. Neural Style Transfer

Definitions: Content / Style / Generated image; generate G from C in the style of S.


Cost Function
J(G) = \alpha J_\text{content}(C, G) + \beta J_\text{style}(S, G)
The cost function has two parts.
One part is a content-similarity score, computed from the activations of a hidden layer:
J_\text{content} = \frac{1}{2} \left \| a^{[l](C)} - a^{[l](G)} \right \|^2
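This content cost can be sketched in NumPy (a minimal version over one layer's activations; names are illustrative):

```python
import numpy as np

def content_cost(a_c, a_g):
    """J_content = 1/2 * ||a[l](C) - a[l](G)||^2 over layer-l activations."""
    return 0.5 * float(np.sum((a_c - a_g) ** 2))
```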
The other part is a style score:

Definition of style: correlation between activations across channels.
G_{kk'}^{[l]} = \sum_{i=1}^{n_H^{[l]}} \sum_{j=1}^{n_W^{[l]}} a_{ijk}^{[l]}a_{ijk'}^{[l]}
$G^{[l]}$ is an $n_c^{[l]} \times n_c^{[l]}$ Gram matrix; $k$ and $k'$ both index channels, and $l$ means the activations come from layer $l$. The matrix is then computed separately on S and on G:
J^{[l]}_\text{style}(S, G) = \frac {1} {(2n_H^{[l]}n_W^{[l]}n_C^{[l]})^2} \left \| G^{[l](S)} - G^{[l](G)} \right \|_F^2
Finally, this is computed on several hidden layers and summed.
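The Gram matrix and the per-layer style cost can be sketched in NumPy (assuming channels-last activations of shape `(n_H, n_W, n_C)`; names are illustrative):

```python
import numpy as np

def gram_matrix(a):
    """Activations (n_H, n_W, n_C) -> (n_C, n_C) Gram matrix of channel correlations."""
    n_h, n_w, n_c = a.shape
    flat = a.reshape(n_h * n_w, n_c)  # each column is one channel, unrolled over positions
    return flat.T @ flat              # G[k, k'] = sum_ij a_ijk * a_ijk'

def style_cost_layer(a_s, a_g):
    """Squared Frobenius distance of Gram matrices with the 1/(2 n_H n_W n_C)^2 scaling."""
    n_h, n_w, n_c = a_s.shape
    g_s, g_g = gram_matrix(a_s), gram_matrix(a_g)
    return float(np.sum((g_s - g_g) ** 2)) / (2 * n_h * n_w * n_c) ** 2
```

The total style cost would then be a (weighted) sum of `style_cost_layer` over the chosen layers.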

3. What are deep ConvNets learning?

Shallow layers learn basic features: each filter in the first layer extracts a different kind of edge.

Deeper layers learn more composite features: local parts such as ears and noses.