Task05: Convolutional Neural Network Basics; LeNet; Advanced Convolutional Neural Networks

Course link:

https://www.boyuai.com/elites/course/cZu18YmweLv10OeV/video/Rosi4tliobRSKaSVcsRx_

Task05: Convolutional Neural Network Basics; LeNet; Advanced Convolutional Neural Networks (1 day)

Computing the convolution operation

For the formula that gives the output feature map shape of a 2D convolution, see: https://pytorch.org/docs/stable/nn.html?highlight=nn%20conv2d#conv2d

Parameters

in_channels (python:int) – Number of channels in the input image
out_channels (python:int) – Number of channels produced by the convolution
kernel_size (python:int or tuple) – Size of the convolving kernel
stride (python:int or tuple, optional) – Stride of the convolution. Default: 1
padding (python:int or tuple, optional) – Zero-padding added to both sides of the input. Default: 0
padding_mode (string, optional) – Padding mode. Default: 'zeros'
dilation (python:int or tuple, optional) – Spacing between kernel elements. Default: 1
groups (python:int, optional) – Number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
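
The shape formula on that page can be tried out in plain Python. The following is a minimal sketch for one spatial dimension, assuming the floor-based formula given in the PyTorch Conv2d documentation; the helper name conv_output_size is made up for illustration and is not a PyTorch function.

```python
def conv_output_size(n_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Assumed formula (from the PyTorch Conv2d docs):
    floor((n_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    """
    # A dilated kernel spans dilation*(kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (n_in + 2 * padding - effective_kernel) // stride + 1

# A 3x3 kernel with stride 1 and padding 1 keeps the size unchanged.
print(conv_output_size(32, kernel_size=3, stride=1, padding=1))  # 32
# A 5x5 kernel with no padding shrinks a 28-wide input by 4.
print(conv_output_size(28, kernel_size=5))  # 24
```

The same function applies to height and width separately when kernel_size, stride, or padding are given as tuples.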


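Receptive field arithmetic for convolutional neural networks:

Let r be the size of the current receptive field, j the distance between adjacent features (the jump), and start the center coordinate of the top-left (first) feature, where a feature's center coordinate is defined as the center of its receptive field (as described for the fixed-size CNN feature map above). Given kernel size k, padding p, and stride s, the attributes of the output layer are computed as follows: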
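
In symbols, a reconstruction consistent with the four descriptions below (the in/out subscripts and the use of n for the number of features are notation assumed here):

```latex
\begin{aligned}
n_{out} &= \left\lfloor \frac{n_{in} + 2p - k}{s} \right\rfloor + 1 \\
j_{out} &= j_{in} \cdot s \\
r_{out} &= r_{in} + (k - 1) \cdot j_{in} \\
start_{out} &= start_{in} + \left( \frac{k - 1}{2} - p \right) \cdot j_{in}
\end{aligned}
```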


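Formula 1 computes the number of output features from the number of input features and the convolution attributes.
Formula 2 computes the jump of the output feature map: the jump of the input map multiplied by the number of input features skipped per convolution step (that is, the stride).
Formula 3 computes the receptive field size of the output feature map: the region covered by k input features, (k-1)*jin, plus the extra region rin covered by the receptive field of the input feature on the border.
Formula 4 computes the center position of the receptive field of the first output feature: the center position of the first input feature, plus the distance from the first input feature to the center of the first kernel position, (k-1)/2*jin, minus the padding p*jin. Note: both terms are multiplied by the jump of the input feature map to obtain the actual distance.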
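
The four formulas can be chained layer by layer. Below is a minimal sketch, assuming the input image is treated as a feature map with j=1, r=1, and start=0.5; the function name receptive_field_step and the two-layer stack are hypothetical and chosen only for illustration.

```python
def receptive_field_step(layer_in, k, s, p):
    """Apply one conv/pool layer to (n, j, r, start) and return the output tuple."""
    n_in, j_in, r_in, start_in = layer_in
    n_out = (n_in + 2 * p - k) // s + 1               # formula 1: number of features
    j_out = j_in * s                                  # formula 2: jump
    r_out = r_in + (k - 1) * j_in                     # formula 3: receptive field size
    start_out = start_in + ((k - 1) / 2 - p) * j_in   # formula 4: first center
    return n_out, j_out, r_out, start_out

# Input image: every pixel is its own feature, so j=1, r=1, first center at 0.5.
state = (227, 1, 1, 0.5)
# Hypothetical stack: an 11x11 stride-4 conv followed by a 3x3 stride-2 pool.
for k, s, p in [(11, 4, 0), (3, 2, 0)]:
    state = receptive_field_step(state, k, s, p)
    print(state)
# (55, 4, 11, 5.5)
# (27, 8, 19, 9.5)
```

After the second layer each feature sees a 19-pixel-wide window of the input, and neighbouring features are 8 input pixels apart.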

Classic convolutional neural networks

https://blog.****.net/yaoxunji/article/details/88351396

LeNet(1998)-->AlexNet(2012)-->VGG(2014)-->GoogLeNet(2014)-->ResNet(2015)

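LeNet: conv1 (6) -> pool1 -> conv2 (16) -> pool2 -> fc3 (120) -> fc4 (84) -> fc5 (10) -> softmax, where the numbers in parentheses are channel counts. This is a very small five-layer network (counting only convolutional and fully connected layers).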
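
To see how small this network is, the sketch below traces the feature map size and counts learnable parameters. The channel and unit counts come from the layer list above; the 32x32 single-channel input, the 5x5 unpadded convolutions, and the 2x2 stride-2 pooling are assumptions taken from the usual LeNet-5 description, and lenet_summary is a hypothetical helper name.

```python
def conv_out(n, k, s=1, p=0):
    """Spatial size after a convolution or pooling window of size k."""
    return (n + 2 * p - k) // s + 1

def lenet_summary(input_size=32, in_channels=1):
    """Return (flattened feature count before fc3, total learnable parameters)."""
    size, channels, params = input_size, in_channels, 0
    for out_channels in (6, 16):                       # conv1, conv2
        # weights (out * in * 5 * 5) plus one bias per output channel
        params += out_channels * channels * 5 * 5 + out_channels
        size = conv_out(size, k=5)                     # assumed 5x5 conv, no padding
        size = conv_out(size, k=2, s=2)                # assumed 2x2 pooling, stride 2
        channels = out_channels
    flat = channels * size * size
    features = flat
    for out_features in (120, 84, 10):                 # fc3, fc4, fc5
        params += features * out_features + out_features
        features = out_features
    return flat, params

print(lenet_summary())  # (400, 61706)
```

Under these assumptions the whole network has roughly 62 thousand parameters, most of them in fc3.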

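The basic AlexNet architecture is: conv1 (96) -> pool1 -> conv2 (256) -> pool2 -> conv3 (384) -> conv4 (384) -> conv5 (256) -> pool5 -> fc6 (4096) -> fc7 (4096) -> fc8 (1000) -> softmax. AlexNet has a structure similar to LeNet's, but it is deeper and has more parameters. conv1 uses 11×11 filters with stride 4, which shrinks the spatial size quickly (227×227 -> 55×55).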
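
The 227 -> 55 reduction, and the sizes of the later layers, can be checked with a short trace. Only conv1's 11×11 kernel and stride 4 are stated in the text; the other kernel, stride, and padding values below are assumptions taken from the usual AlexNet description, and the names are made up for this sketch.

```python
def out_size(n, k, s=1, p=0):
    """Spatial size after a convolution or pooling window of size k."""
    return (n + 2 * p - k) // s + 1

# (name, kernel, stride, padding). Values other than conv1's are assumed.
ALEXNET_SPATIAL_LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]

def trace_sizes(input_size=227):
    """Map each layer name to the side length of its output feature map."""
    sizes = {}
    size = input_size
    for name, k, s, p in ALEXNET_SPATIAL_LAYERS:
        size = out_size(size, k, s, p)
        sizes[name] = size
    return sizes

sizes = trace_sizes()
print(sizes["conv1"])                                   # 55
print("fc6 input features:", 256 * sizes["pool5"] ** 2)  # fc6 input features: 9216
```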

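Key features of AlexNet:

First use of the ReLU activation function, which has better gradient properties and trains faster.
Uses dropout with p=0.5, which helps prevent overfitting.
Makes heavy use of data augmentation.
Uses SGD with momentum 0.9.
Learning rate 1e-2 (0.01), manually reduced by a factor of 10 when validation accuracy plateaus.
L2 weight decay 5e-4.
Batch size 128.
Uses Norm layers (no longer in common use).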
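
The optimizer settings in this list can be illustrated on a single scalar weight. This is a minimal sketch of one common formulation of SGD with momentum and L2 weight decay, not the original training code; the function names and the toy loss w**2 are assumptions made for illustration.

```python
def sgd_momentum_step(w, grad, velocity, lr=1e-2, momentum=0.9, weight_decay=5e-4):
    """One SGD update for a scalar weight, with momentum and L2 weight decay."""
    # Weight decay adds weight_decay * w to the gradient before the step.
    velocity = momentum * velocity - lr * (grad + weight_decay * w)
    return w + velocity, velocity

def reduce_lr(lr, factor=10.0):
    """Manual learning-rate drop, applied when validation accuracy plateaus."""
    return lr / factor

w, v, lr = 1.0, 0.0, 1e-2
for _ in range(3):
    grad = 2.0 * w                  # gradient of the toy loss w**2
    w, v = sgd_momentum_step(w, grad, v, lr=lr)
lr = reduce_lr(lr)                  # e.g. after validation accuracy stops improving
print(w, lr)
```

In a real training loop the plateau check would compare validation accuracy across epochs; here the drop is triggered by hand, as the list describes.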

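The basic VGG16 architecture is conv1^2 (64) -> pool1 -> conv2^2 (128) -> pool2 -> conv3^3 (256) -> pool3 -> conv4^3 (512) -> pool4 -> conv5^3 (512) -> pool5 -> fc6 (4096) -> fc7 (4096) -> fc8 (1000) -> softmax, where ^3 means the layer is repeated 3 times. Most of VGG16's memory is consumed by the first two convolutional layers, while most of its parameters sit in the first fully connected layer.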
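
The memory and parameter claims can be checked by counting per layer. The sketch below assumes a 224x224 RGB input, 3x3 convolutions with padding 1, 2x2 stride-2 pooling, and biases in every layer; it uses the number of output values per layer as a rough proxy for activation memory. The helper name vgg16_layer_stats is hypothetical.

```python
# (output channels, number of 3x3 convs) for the five conv stages listed above
VGG16_STAGES = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]
FC_SIZES = [4096, 4096, 1000]  # fc6, fc7, fc8

def vgg16_layer_stats(input_size=224, in_channels=3):
    """Return a list of (layer name, parameter count, output value count)."""
    stats = []
    size, channels = input_size, in_channels
    for stage, (out_channels, repeats) in enumerate(VGG16_STAGES, start=1):
        for i in range(1, repeats + 1):
            params = 3 * 3 * channels * out_channels + out_channels
            activations = size * size * out_channels  # padding 1 keeps the size
            stats.append((f"conv{stage}_{i}", params, activations))
            channels = out_channels
        size //= 2                                    # 2x2 pooling halves the size
    features = size * size * channels
    for idx, out_features in enumerate(FC_SIZES, start=6):
        params = features * out_features + out_features
        stats.append((f"fc{idx}", params, out_features))
        features = out_features
    return stats

stats = vgg16_layer_stats()
total_params = sum(p for _, p, _ in stats)
most_params = max(stats, key=lambda s: s[1])
largest_act = max(a for _, _, a in stats)
print(total_params)                                          # 138357544
print(most_params[0], most_params[1])                        # fc6 102764544
print([n for n, _, a in stats if a == largest_act])          # ['conv1_1', 'conv1_2']
```

Under these assumptions fc6 alone holds about 74% of the parameters, while the two conv1 layers produce the largest activation maps.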

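Key features of VGGNet:

Simple structure: only 3x3 convolutions with stride 1 and pad 1, and 2x2 max pooling with stride 2; each pooling step halves the feature map size.
Large parameter count (see the figure above for the parameter and memory breakdown).
Suitable network initialization.
Uses batch normalization (in later variants such as VGG16-BN; the original 2014 network predates batch normalization).
Features extracted by FC7 are useful for other tasks. The name FC7 comes from AlexNet and denotes a particular fully connected layer whose features are used for classification.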

GoogLeNet: