deeplearning.ai - Convolutional Neural Networks

Convolutional Neural Networks
Andrew Ng

Computer Vision Problems

  • Image Classification
  • Object Detection
  • Neural Style Transfer

Vertical edge detection

  • filter (usually of odd size)

    a 3×3 vertical edge detection filter:

      1  0  -1
      1  0  -1
      1  0  -1

  • convolution operation, denoted *

  • a vertical edge appears where bright pixels on the left meet dark pixels on the right (see the sketch after this list)
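  • A minimal numpy/scipy sketch of the example above, using scipy's correlate2d (which slides the filter without flipping it, as ML-style convolution does); the 6×6 image values are illustrative:

    import numpy as np
    from scipy.signal import correlate2d

    # 6x6 image: bright pixels (10) on the left half, dark pixels (0) on the right half
    image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)

    # 3x3 vertical edge detection filter
    vertical_filter = np.array([[1, 0, -1],
                                [1, 0, -1],
                                [1, 0, -1]], dtype=float)

    output = correlate2d(image, vertical_filter, mode="valid")
    print(output)  # 4x4 map with large values (30) down the middle, where the vertical edge is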

Padding

  • (n×n) * (f×f) = (n−f+1)×(n−f+1)

    • output will shrink
    • pixels in the corners are used only once, so we lose information near the edges of the image
  • the fix for both of these problems: padding

    • with an additional border of one pixel all around the edges
    • pad with zeros by convention
    • so the output will be (n+2p−f+1)×(n+2p−f+1)
  • Valid Convolution: No paddings (p = 0)

  • Same Convolution: Pad so that output size is the same as the input size

  • f is usually odd (the output-size arithmetic is sketched after this list)

    • same convolution then uses symmetric padding, since p = (f−1)/2 is an integer
    • the filter has a central pixel
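  • A small sketch of the output-size arithmetic, using an illustrative helper conv_output_size:

    def conv_output_size(n, f, p=0):
        # valid output size: n + 2p - f + 1
        return n + 2 * p - f + 1

    n, f = 6, 3
    print(conv_output_size(n, f))          # valid convolution (p = 0): 4
    p_same = (f - 1) // 2                  # same convolution needs p = (f - 1) / 2, so odd f is convenient
    print(conv_output_size(n, f, p_same))  # same convolution: 6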

Strided Convolution

  • stride: the number of positions the filter moves at each step
  • output: (⌊(n+2p−f)/s⌋+1)×(⌊(n+2p−f)/s⌋+1)
  • the filter must lie entirely within the image (plus the padding region), which is why the floor is taken (sketched below)
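  • Extending the same illustrative helper to a stride s; the floor drops positions where the filter would stick out of the padded image:

    import math

    def conv_output_size(n, f, p=0, s=1):
        # floor((n + 2p - f) / s) + 1
        return math.floor((n + 2 * p - f) / s) + 1

    print(conv_output_size(n=7, f=3, p=0, s=2))  # 3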

Cross-correlation VS. Convolution

  • in mathematics, convolution flips the filter before the computation (mirroring it both vertically and horizontally, a 180° rotation)
  • in ML we usually skip the flipping, so the operation is really cross-correlation, but by convention we still call it convolution (sketched below)
  • true convolution satisfies associativity
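  • A sketch of the difference, assuming scipy is available: convolve2d flips the kernel, correlate2d does not, so correlating with a 180°-rotated kernel matches true convolution:

    import numpy as np
    from scipy.signal import convolve2d, correlate2d

    img = np.arange(16, dtype=float).reshape(4, 4)
    k = np.array([[1., 2.], [3., 4.]])

    conv = convolve2d(img, k, mode="valid")            # true convolution (kernel flipped)
    corr = correlate2d(img, np.flip(k), mode="valid")  # cross-correlation with the kernel pre-rotated 180 degrees
    print(np.allclose(conv, corr))  # True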

Convolution on RGB images

  • height, width, channels (depth)

  • the image and the filter must have the same number of channels

  • n×n×n_c * f×f×n_c → (n−f+1)×(n−f+1)×n_c′

    n_c: number of channels; n_c′: number of filters

  • detect n_c′ features
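  • A naive numpy sketch of convolution over channels (function and variable names are illustrative): each of the n_c′ filters spans all n_c input channels and produces one output channel:

    import numpy as np
    from scipy.signal import correlate2d

    def conv_channels(image, filters):
        # image: (n, n, n_c); filters: (n_c_prime, f, f, n_c) -> (n-f+1, n-f+1, n_c_prime)
        n_c_prime, f = filters.shape[0], filters.shape[1]
        n, _, n_c = image.shape
        out = np.zeros((n - f + 1, n - f + 1, n_c_prime))
        for k in range(n_c_prime):      # one output channel per filter
            for c in range(n_c):        # sum the per-channel correlations
                out[:, :, k] += correlate2d(image[:, :, c], filters[k, :, :, c], mode="valid")
        return out

    img = np.random.randn(6, 6, 3)
    filts = np.random.randn(2, 3, 3, 3)     # 2 filters, each 3x3x3
    print(conv_channels(img, filts).shape)  # (4, 4, 2)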

Example of a layer

  • add a bias to the output, and then apply a non-linearity (activation function)
  • less prone to overfitting
  • the output of the previous layer becomes the input of this layer (a naive sketch of one layer follows this list)
  • notation
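  • A naive numpy sketch of one such layer (stride 1, no padding; the shapes and names are assumptions, not the course's exact notation): convolve, add one bias per filter, then apply a ReLU non-linearity:

    import numpy as np

    def conv_layer_forward(a_prev, W, b):
        # a_prev: (n, n, n_c) activations from the previous layer
        # W:      (f, f, n_c, n_c_prime) filters
        # b:      (n_c_prime,) one bias per filter
        n = a_prev.shape[0]
        f, _, _, n_c_prime = W.shape
        n_out = n - f + 1
        z = np.zeros((n_out, n_out, n_c_prime))
        for i in range(n_out):
            for j in range(n_out):
                patch = a_prev[i:i + f, j:j + f, :]
                for k in range(n_c_prime):
                    z[i, j, k] = np.sum(patch * W[:, :, :, k]) + b[k]
        return np.maximum(z, 0)  # ReLU

    a = np.random.randn(6, 6, 3)
    W = np.random.randn(3, 3, 3, 2)
    b = np.zeros(2)
    print(conv_layer_forward(a, W, b).shape)  # (4, 4, 2)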

A simple convolution network example

  • Convolutional Layer (Conv)

    Pooling Layer (Pool)

    Fully Connected Layer (FC)
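  • A minimal Keras sketch of a Conv -> Pool -> FC stack (the layer sizes are illustrative assumptions, not the lecture's exact network):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, kernel_size=3, activation="relu"),  # convolutional layer
        layers.MaxPooling2D(pool_size=2, strides=2),          # pooling layer
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),               # fully connected layer
    ])
    model.summary()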

Pooling layer

  • reduce the size of the representation to speed up computation and make some of the detected features a bit more robust
  • no parameters to learn: it is a fixed function with no weights
  • finally, the pooled output is flattened into a column vector

Max pooling

  • break the input into regions
  • output the maximum of each region
  • hyper-parameters of max pooling: f=2, s=2 (often)
  • usually does not use any padding
  • a feature detected in a region is preserved in the output
  • if the feature is detected anywhere inside the filter region, a high number is kept
  • max pooling is computed independently for each channel (see the sketch after this list)
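  • A numpy sketch of max pooling with f=2, s=2 (the helper name is illustrative); it is applied to each channel independently and has no weights to learn:

    import numpy as np

    def max_pool(a, f=2, s=2):
        # a: (n, n, n_c) -> (floor((n - f) / s) + 1, floor((n - f) / s) + 1, n_c)
        n, _, n_c = a.shape
        n_out = (n - f) // s + 1
        out = np.zeros((n_out, n_out, n_c))
        for i in range(n_out):
            for j in range(n_out):
                region = a[i * s:i * s + f, j * s:j * s + f, :]
                out[i, j, :] = region.max(axis=(0, 1))  # largest value per region, per channel
        return out

    a = np.random.randn(4, 4, 3)
    print(max_pool(a).shape)  # (2, 2, 3)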

Average pooling

  • output the average of each region
  • max pooling is used more often than average pooling

Neural network example

digit recognition

  • f=2, s=2 halves the height and width of the input

  • two conventions for counting layers

    treat a convolutional layer and a pooling layer together as one layer, or count them as two separate layers

  • when counting the number of layers in a network, usually only the layers with weights are counted

  • the flattened pooling output is densely connected to every unit of the fully connected layer

  • don't invent your own hyper-parameter settings; look at what has worked in the literature

  • as the network gets deeper, the height and width decrease while the number of channels increases (see the sketch after this list)

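  • A LeNet-5-style Keras sketch for digit recognition (the exact sizes are assumptions in the spirit of the lecture's example); the comments show height and width shrinking while the number of channels grows:

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(32, 32, 1)),
        layers.Conv2D(6, kernel_size=5, activation="relu"),   # 28x28x6
        layers.MaxPooling2D(pool_size=2, strides=2),          # 14x14x6
        layers.Conv2D(16, kernel_size=5, activation="relu"),  # 10x10x16
        layers.MaxPooling2D(pool_size=2, strides=2),          # 5x5x16
        layers.Flatten(),                                     # 400
        layers.Dense(120, activation="relu"),                 # FC
        layers.Dense(84, activation="relu"),                  # FC
        layers.Dense(10, activation="softmax"),               # digits 0-9
    ])
    model.summary()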

Why convolution

  • parameter sharing and sparsity of connections (a parameter-count comparison follows this list)
  • a feature detector that is useful in one part of the image is useful across the entire image
  • each output value depends on only a small part of the input
  • good at capturing translation invariance
  • even if the image shifts by a few pixels, it still has features similar to the original
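  • A worked parameter-count comparison (sizes assumed for illustration: a 32×32×3 input, six 5×5 filters, a 28×28×6 output), showing what parameter sharing and sparse connections buy:

    conv_params = (5 * 5 * 3 + 1) * 6          # 75 weights + 1 bias per filter, 6 filters -> 456
    fc_params = (32 * 32 * 3) * (28 * 28 * 6)  # fully connecting the same input and output -> 14,450,688
    print(conv_params, fc_params)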