ResNet v1 and v2 structures implemented with and without the bottleneck.
The previous article covered the conv method used in ResNet, which mainly relies on the fixed-padding technique.
This article actually covers four ResNet structures, although the one most people use is probably the bottleneck variant, i.e. the 1-3-1 style ResNet_v2, so they are presented in the following order:
bottleneck block v2
block v2
bottleneck block v1
block v1
V1 paper: Deep Residual Learning for Image Recognition
V2 paper: Identity Mappings in Deep Residual Networks
The difference between v1 and v2
In the figure below, v1 is on the left and v2 on the right. The difference is the ordering of BN, ReLU and the convolution; the v2 arrangement is called full pre-activation in the v2 paper.
The paper analyzes in detail why full pre-activation works best, which is not repeated here. Below we look at how the v1 and v2 blocks differ at the implementation level.
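As a minimal sketch of the ordering difference only, written with plain TF 1.x ops (these two helper functions are illustrative, not code from this post):
import tensorflow as tf

def v1_step(x, filters, training):
    # v1: conv -> BN -> ReLU
    x = tf.layers.conv2d(x, filters, 3, padding='same', use_bias=False)
    x = tf.layers.batch_normalization(x, training=training)
    return tf.nn.relu(x)

def v2_step(x, filters, training):
    # v2 (full pre-activation): BN -> ReLU -> conv
    x = tf.layers.batch_normalization(x, training=training)
    x = tf.nn.relu(x)
    return tf.layers.conv2d(x, filters, 3, padding='same', use_bias=False)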
The difference between bottleneck and block
A bottleneck block differs from a basic block as follows:
The basic block uses two 3x3 convolutions, while the bottleneck uses a 1x1, 3x3, 1x1 sequence.
In the bottleneck, when downsampling is needed it happens on the second of the three convolutions, i.e. the 3x3 conv; in the basic block it happens on the first of the two convolutions. As a result, with the same four blocks of {3, 4, 6, 3} units each, the basic-block structure builds resnet-34 while the bottleneck structure builds resnet-50 (a quick check of the layer counts is sketched below).
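A small back-of-the-envelope check of where the 34 and 50 come from (not part of the original code):
units = [3, 4, 6, 3]
# conv1 + 2 convs per basic-block unit + final fc layer
print(1 + 2 * sum(units) + 1)   # 34
# conv1 + 3 convs per bottleneck unit + final fc layer
print(1 + 3 * sum(units) + 1)   # 50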
Where the downsampling happens
A ResNet generally has four blocks, denoted block{1,2,3,4}, corresponding to conv{2,3,4,5}_x, but not every block downsamples. As shown in the figure below, conv1 uses a large kernel with strides=2 followed by a strides=2 pooling, so the input entering conv2 is already input_size/4. The first convolution of each block, i.e. conv{2,3,4,5}_1, then uses strides of [1, 2, 2, 2] respectively, so the outputs of block{1,2,3,4} have sizes input_size/4, input_size/8, input_size/16 and input_size/32.
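For a concrete 224x224 input (an assumed size, just for illustration), the sizes work out as follows:
input_size = 224
size = input_size // 2 // 2                      # conv1 (stride 2) then pooling (stride 2) -> 56, i.e. input_size/4
for i, s in enumerate([1, 2, 2, 2], start=1):    # strides of conv{2,3,4,5}_1
    size //= s
    print('block%d output: %dx%d' % (i, size, size))
# block1: 56x56, block2: 28x28, block3: 14x14, block4: 7x7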
The code below shows how the two block types are used to construct the ResNet.
def resnet(self, inputs):
    """Build convolutional network layers attached to the given input tensor"""
    # training = (mode == learn.ModeKeys.TRAIN) and not FLAGS.bn_freeze
    training = self.is_train
    use_bn = self.use_bn
    blocks = cfgs.backbones_dict[cfgs.backbone]
    if self.res_n < 50:
        res_block = resnet_utils.res_block2
        filters = [[64, 64], [128, 128], [256, 256], [512, 512]]
    else:
        res_block = resnet_utils.res_block3
        filters = [[64, 64, 256], [128, 128, 512], [256, 256, 1024], [512, 512, 2048]]
    with tf.variable_scope("resnet_model"):
        C1 = resnet_utils.conv_layer(inputs, 64, kernel_size=7, strides=2)
        ################## I/2
        C1 = resnet_utils.norm_layer(C1, training, use_bn)
        C1 = resnet_utils.pool_layer(C1, (3, 3), stride=(2, 2))
        ################## C1 I/4, I/4, 64|64
        C2 = res_block(C1, filters[0], training, use_bn, self.use_se_block, strides=1, downsample=True)
        ################## C2 I/4, I/4, 64|256
        for i in range(blocks[0] - 1):
            C2 = res_block(C2, filters[0], training, use_bn, self.use_se_block)
        C3 = res_block(C2, filters[1], training, use_bn, self.use_se_block, strides=2, downsample=True)
        ################## C3 I/8, I/8, 128|512
        for i in range(blocks[1] - 1):
            C3 = res_block(C3, filters[1], training, use_bn, self.use_se_block)
        C4 = res_block(C3, filters[2], training, use_bn, self.use_se_block, strides=2, downsample=True)
        ################## C4 I/16, I/16, 256|1024
        for i in range(blocks[2] - 1):
            C4 = res_block(C4, filters[2], training, use_bn, self.use_se_block)
        C5 = res_block(C4, filters[3], training, use_bn, self.use_se_block, strides=2, downsample=True)
        ################## C5 I/32, I/32, 512|2048
        for i in range(blocks[3] - 1):
            C5 = res_block(C5, filters[3], training, use_bn, self.use_se_block)
        return [C1, C2, C3, C4, C5]
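The code above relies on helpers from resnet_utils that are not shown in this post. The sketch below is a guess at what they do, based only on how they are called (the real conv_layer reportedly also applies the fixed padding mentioned earlier); it is not the author's implementation:
import tensorflow as tf

def conv_layer(inputs, filters, kernel_size, strides=1):
    # plain 'same'-padded conv here; the original helper uses fixed padding for strides > 1
    return tf.layers.conv2d(inputs, filters, kernel_size, strides=strides,
                            padding='same', use_bias=False)

def norm_layer(inputs, training, use_bn):
    # batch norm when use_bn is set, identity otherwise
    return tf.layers.batch_normalization(inputs, training=training) if use_bn else inputs

def pool_layer(inputs, pool_size, stride):
    return tf.layers.max_pooling2d(inputs, pool_size, stride, padding='same')

def relu(inputs):
    return tf.nn.relu(inputs)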
def res_block2(bottom, filters, training, use_bn, use_se_block, strides=1, downsample=False):
    path_2 = bottom
    # conv 3x3
    path_1 = conv_layer(bottom, filters[0], kernel_size=3, strides=strides)
    path_1 = norm_layer(path_1, training, use_bn)
    path_1 = relu(path_1)
    # conv 3x3
    path_1 = conv_layer(path_1, filters[1], kernel_size=3)
    path_1 = norm_layer(path_1, training, use_bn)
    # path_1 = relu(path_1)
    if use_se_block:
        path_1 = se_block(path_1)
    if downsample:
        path_2 = conv_layer(path_2, filters[1], kernel_size=1, strides=strides)
        path_2 = norm_layer(path_2, training, use_bn)
    top = path_1 + path_2
    top = relu(top)
    return top
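res_block2 above is the v1 (post-activation) basic block. For reference, a full pre-activation (v2) version might look like the sketch below; res_block2_v2 is a hypothetical name reusing the same helpers, not code from this post, and the SE branch is left out to keep it short:
def res_block2_v2(bottom, filters, training, use_bn, strides=1, downsample=False):
    # v2: BN -> ReLU comes before every convolution (full pre-activation)
    preact = relu(norm_layer(bottom, training, use_bn))
    # the projection shortcut, when needed, branches off the pre-activated input
    path_2 = conv_layer(preact, filters[1], kernel_size=1, strides=strides) if downsample else bottom
    # conv 3x3 (downsampling, if any, happens on the first conv of the basic block)
    path_1 = conv_layer(preact, filters[0], kernel_size=3, strides=strides)
    # conv 3x3
    path_1 = relu(norm_layer(path_1, training, use_bn))
    path_1 = conv_layer(path_1, filters[1], kernel_size=3)
    # v2: plain addition, no BN/ReLU after the add
    return path_1 + path_2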
def res_block3(bottom, filters, training, use_bn, use_se_block, strides=1, downsample=False):
    path_2 = bottom
    # conv 1x1
    path_1 = conv_layer(bottom, filters[0], kernel_size=1)
    path_1 = norm_layer(path_1, training, use_bn)
    path_1 = relu(path_1)  # activation?
    # conv 3x3
    path_1 = conv_layer(path_1, filters[1], kernel_size=3, strides=strides)
    path_1 = norm_layer(path_1, training, use_bn)
    path_1 = relu(path_1)
    # conv 1x1
    path_1 = conv_layer(path_1, filters[2], kernel_size=1)
    path_1 = norm_layer(path_1, training, use_bn)
    if use_se_block:
        path_1 = se_block(path_1)
    if downsample:
        # shortcut projection
        path_2 = conv_layer(path_2, filters[2], kernel_size=1, strides=strides)
        path_2 = norm_layer(path_2, training, use_bn)
    top = path_1 + path_2
    top = relu(top)
    return top
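Likewise, a full pre-activation (v2) bottleneck could be sketched as follows (res_block3_v2 is a hypothetical name; again the SE branch is omitted):
def res_block3_v2(bottom, filters, training, use_bn, strides=1, downsample=False):
    # v2: BN -> ReLU before every convolution (full pre-activation)
    preact = relu(norm_layer(bottom, training, use_bn))
    # projection shortcut off the pre-activated input when downsampling
    path_2 = conv_layer(preact, filters[2], kernel_size=1, strides=strides) if downsample else bottom
    # conv 1x1
    path_1 = conv_layer(preact, filters[0], kernel_size=1)
    # conv 3x3 (downsampling, if any, happens here)
    path_1 = relu(norm_layer(path_1, training, use_bn))
    path_1 = conv_layer(path_1, filters[1], kernel_size=3, strides=strides)
    # conv 1x1
    path_1 = relu(norm_layer(path_1, training, use_bn))
    path_1 = conv_layer(path_1, filters[2], kernel_size=1)
    # v2: no BN/ReLU after the addition
    return path_1 + path_2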
How the original implementation and tf.slim differ
First, note that both the official implementation and the authors' original one use strides of the [1,2,2,2] pattern, and the stride-2 convolution sits on the second conv layer of the first unit of each block, i.e. block{2,3,4}/unit1/conv2.
The slim implementation, on the other hand, uses strides of the [2,2,2,1] pattern, so in slim the outputs of block{1,2,3,4} have sizes input_size/8, input_size/16, input_size/32 and input_size/32, and the downsampling is done on the second conv layer of the last unit of each block, i.e. block{1,2,3}/unit{3,4,6}/conv2.
So be careful when using slim's pre-trained models, especially when building an FPN on top of the blocks: with slim it is more appropriate to take the last conv layer of the last unit of block4, block2 and block1.
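For comparison, with the resnet() shown earlier (which follows the original [1,2,2,2] convention), an FPN can take C2..C5 directly, since their strides are 4, 8, 16 and 32. A minimal sketch, assuming a model object exposing that resnet() method and an inputs tensor (fpn_lateral is an illustrative helper, not from this post):
import tensorflow as tf

def fpn_lateral(feature, out_channels=256):
    # 1x1 conv that brings every level to the same channel depth
    return tf.layers.conv2d(feature, out_channels, 1, padding='same')

C1, C2, C3, C4, C5 = model.resnet(inputs)   # model and inputs assumed to exist
P5 = fpn_lateral(C5)
P4 = fpn_lateral(C4) + tf.image.resize_nearest_neighbor(P5, tf.shape(C4)[1:3])
P3 = fpn_lateral(C3) + tf.image.resize_nearest_neighbor(P4, tf.shape(C3)[1:3])
P2 = fpn_lateral(C2) + tf.image.resize_nearest_neighbor(P3, tf.shape(C2)[1:3])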
Taking the bottleneck structure, i.e. the 50-layer model in the figure below, as the example,
the output feature-map sizes of the individual conv layers inside block{1,2,3,4} are listed below for the original and the slim implementations.
Bottleneck, original version
conv1     [B, I/4, I/4, 64]
conv2_1_1 [B, I/4, I/4, 64]
conv2_1_2 [B, I/4, I/4, 64]
conv2_1_3 [B, I/4, I/4, 256]
conv2_3_3 [B, I/4, I/4, 256]    # block[1]
conv3_1_1 [B, I/4, I/4, 128]
conv3_1_2 [B, I/8, I/8, 128]    # downsampling here
conv3_1_3 [B, I/8, I/8, 512]
conv3_4_3 [B, I/8, I/8, 512]    # block[2]
conv4_1_1 [B, I/8, I/8, 256]
conv4_1_2 [B, I/16, I/16, 256]  # downsampling here
conv4_1_3 [B, I/16, I/16, 1024]
conv4_6_3 [B, I/16, I/16, 1024] # block[3]
conv5_1_1 [B, I/16, I/16, 512]
conv5_1_2 [B, I/32, I/32, 512]  # downsampling here
conv5_1_3 [B, I/32, I/32, 2048]
conv5_3_3 [B, I/32, I/32, 2048] # block[4]
Bottleneck, slim version (downsampling on the last unit of block{1,2,3}, consistent with the description above)
conv1     [B, I/4, I/4, 64]
conv2_1_1 [B, I/4, I/4, 64]
conv2_1_2 [B, I/4, I/4, 64]
conv2_1_3 [B, I/4, I/4, 256]
conv2_3_2 [B, I/8, I/8, 64]     # downsampling here
conv2_3_3 [B, I/8, I/8, 256]    # block[1]
conv3_1_1 [B, I/8, I/8, 128]
conv3_1_2 [B, I/8, I/8, 128]
conv3_1_3 [B, I/8, I/8, 512]
conv3_4_2 [B, I/16, I/16, 128]  # downsampling here
conv3_4_3 [B, I/16, I/16, 512]  # block[2]
conv4_1_1 [B, I/16, I/16, 256]
conv4_1_2 [B, I/16, I/16, 256]
conv4_1_3 [B, I/16, I/16, 1024]
conv4_6_2 [B, I/32, I/32, 256]  # downsampling here
conv4_6_3 [B, I/32, I/32, 1024] # block[3]
conv5_1_1 [B, I/32, I/32, 512]
conv5_1_2 [B, I/32, I/32, 512]
conv5_1_3 [B, I/32, I/32, 2048]
conv5_3_3 [B, I/32, I/32, 2048] # block[4]
Basic block, original version
conv1     [B, I/4, I/4, 64]
conv2_1_1 [B, I/4, I/4, 64]
conv2_1_2 [B, I/4, I/4, 64]
conv2_3_2 [B, I/4, I/4, 64]     # block[1]
conv3_1_1 [B, I/8, I/8, 128]    # downsampling here
conv3_1_2 [B, I/8, I/8, 128]
conv3_4_2 [B, I/8, I/8, 128]    # block[2]
conv4_1_1 [B, I/16, I/16, 256]  # downsampling here
conv4_1_2 [B, I/16, I/16, 256]
conv4_6_2 [B, I/16, I/16, 256]  # block[3]
conv5_1_1 [B, I/32, I/32, 512]  # downsampling here
conv5_1_2 [B, I/32, I/32, 512]
conv5_3_2 [B, I/32, I/32, 512]  # block[4]