Convolutional Layer
In image processing, an image is usually treated as a grid of pixels; for example, a 1000*1000 image becomes a vector of 1,000,000 values. If the hidden layer also has 1,000,000 neurons, a fully connected layer would need 1,000,000,000,000 weight parameters. That is far too many to train, so the number of weights has to be reduced. There are generally two ways to do this:
The first is the local receptive field. Human perception is generally thought to proceed from local to global, and the spatial structure of an image is likewise mostly local: nearby pixels are the most strongly correlated. Each neuron therefore does not need to see the whole image; it only needs to perceive a local region, and higher layers combine the local information into global information. If each neuron is connected to only a 10*10 patch of pixels, the number of weights becomes 1,000,000*100, one ten-thousandth of the original. The 10*10 parameters attached to that 10*10 patch are exactly the kernel of the convolution operation.
The second is parameter sharing. Local connectivity already reduces the number of weights, but there are still too many. With weight sharing, all 1,000,000 neurons use the same 100 parameters, so the number of weight parameters drops to 100. The arithmetic behind these counts is spelled out in the sketch below.
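A quick back-of-the-envelope check of the three counts above, in plain Python (the sizes are the ones assumed in the text):
pixels = 1000 * 1000      # flattened 1000*1000 image
hidden = 1000 * 1000      # hidden layer with 1,000,000 neurons

fully_connected = pixels * hidden    # 1,000,000,000,000 weights
local_10x10 = hidden * (10 * 10)     # 100,000,000 weights with a 10*10 receptive field
shared_kernel = 10 * 10              # 100 weights once the kernel is shared

print(fully_connected, local_10x10, shared_kernel)   # 1000000000000 100000000 100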
An example:
import tensorflow as tf
import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
weight = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
bias = 0
input = tf.constant(image, dtype=tf.float32)
filter = tf.constant(weight, dtype=tf.float32)
input = tf.reshape(input, [1, 5, 5, 1])     # NHWC: batch, height, width, channels
filter = tf.reshape(filter, [3, 3, 1, 1])   # height, width, in_channels, out_channels
result = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding="VALID")
with tf.Session() as sess:
    idata = sess.run(input)
    filt = sess.run(filter)
    res = sess.run(result)
    print(res)
The output is a 1*3*3*1 matrix:
4 3 4
2 4 3
2 3 4
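To see where these numbers come from, the same VALID convolution can be reproduced by hand with NumPy: slide the 3*3 kernel over the 5*5 image and sum the element-wise products at each position (a plain NumPy sketch, independent of TensorFlow):
import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

out = np.zeros((3, 3), dtype=int)
for i in range(3):
    for j in range(3):
        # 3*3 window starting at (i, j), multiplied element-wise with the kernel
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)
print(out)   # [[4 3 4] [2 4 3] [2 3 4]]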
Here the convolution uses padding="VALID". The general output-size formula is ((width - filter + 2*padding) / stride) + 1; with VALID there is no padding, and any fractional part is rounded down, i.e. the output size is floor((width - filter) / stride) + 1. In this example that is (5 - 3 + 0) / 1 + 1 = 3. If the input width is 6 and the stride is 2, the formula gives (6 - 3) / 2 + 1 = 2.5, and the actual output size is 2. Example:
import tensorflow as tf
import numpy as np

image = np.array([[1, 1, 1, 0, 0, 1],
                  [0, 1, 1, 1, 0, 1],
                  [0, 0, 1, 1, 1, 1],
                  [0, 0, 1, 1, 0, 1],
                  [0, 1, 1, 0, 0, 1],
                  [1, 1, 1, 1, 1, 1]])
weight = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
bias = 0
input = tf.constant(image, dtype=tf.float32)
filter = tf.constant(weight, dtype=tf.float32)
input = tf.reshape(input, [1, 6, 6, 1])
filter = tf.reshape(filter, [3, 3, 1, 1])
result = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding="VALID")
with tf.Session() as sess:
    idata = sess.run(input)
    filt = sess.run(filter)
    res = sess.run(result)
    print(res)
The output is a 1*2*2*1 matrix:
4 4
2 4
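The rounding behaviour can be checked with a small helper (a hypothetical function written directly from the formula above, not part of TensorFlow's API):
def valid_output_size(width, filter_size, stride):
    # VALID: no padding; windows that do not fit completely are dropped
    return (width - filter_size) // stride + 1

print(valid_output_size(5, 3, 1))   # 3, the 1*3*3*1 output above
print(valid_output_size(6, 3, 2))   # 2, the 2.5 from the formula rounded down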
If padding is set to SAME, the output size is width / stride rounded up. For example:
import tensorflow as tf
import numpy as np

# convolution padding="SAME"
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
weight = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
bias = 0
input = tf.constant(image, dtype=tf.float32)
filter = tf.constant(weight, dtype=tf.float32)
input = tf.reshape(input, [1, 5, 5, 1])
filter = tf.reshape(filter, [3, 3, 1, 1])
result = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding="SAME")
with tf.Session() as sess:
    idata = sess.run(input)
    filt = sess.run(filter)
    res = sess.run(result)
    print(res)
The output is a 1*5*5*1 matrix:
2 2 3 1 1
1 4 3 4 1
1 2 4 3 1
1 2 3 4 1
0 2 2 1 1
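With SAME padding and stride 1, TensorFlow pads the 5*5 image with a ring of zeros to 7*7 so that the output keeps the input size. Padding explicitly and reusing the sliding-window loop from above gives the same matrix (a NumPy sketch of this equivalence):
import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

padded = np.pad(image, 1)           # zero border: 5*5 -> 7*7
out = np.zeros((5, 5), dtype=int)
for i in range(5):
    for j in range(5):
        out[i, j] = np.sum(padded[i:i+3, j:j+3] * kernel)
print(out)   # matches the 5*5 output above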
With stride=2:
import tensorflow as tf
import numpy as np

# convolution padding="SAME"
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
weight = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
bias = 0
input = tf.constant(image, dtype=tf.float32)
filter = tf.constant(weight, dtype=tf.float32)
input = tf.reshape(input, [1, 5, 5, 1])
filter = tf.reshape(filter, [3, 3, 1, 1])
result = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding="SAME")
with tf.Session() as sess:
    idata = sess.run(input)
    filt = sess.run(filter)
    res = sess.run(result)
    print(res)
The output is a 1*3*3*1 matrix:
2 3 1
1 4 3
0 2 1
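For SAME padding the output size is ceil(width / stride), independent of the filter size; a small helper (hypothetical, written from this rule rather than taken from TensorFlow's API) confirms the sizes of the last two examples:
import math

def same_output_size(width, stride):
    # SAME: zeros are padded so that every stride position produces an output
    return math.ceil(width / stride)

print(same_output_size(5, 1))   # 5, the 1*5*5*1 output above
print(same_output_size(5, 2))   # 3, the 1*3*3*1 output above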