Deep Learning Notes 8: Building a Neural Network with TensorFlow

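About the author:

Lu Wei (鲁伟): the study diary of a data science practitioner. Data mining and machine learning, R and Python, theory alongside practice.

Personal WeChat public account: 数据科学家养成记 (WeChat ID: louwill12)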

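In Note 7 we got started with the basic syntax of TensorFlow, worked through a few practical examples, and finally left behind the days of building everything by hand with numpy. We now continue and look at how to build a neural network model with TensorFlow.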

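TensorFlow can feel unfamiliar to beginners because of the number of steps involved, but building a model with it comes down to two stages: creating the computation graph and executing the computation graph. In the deeplearning.ai course, Andrew Ng and his team provide the Signs Dataset of hand gestures. The training set contains 1080 images of 64x64 pixels, labeled with one of 6 classes, and the test set contains 120 images of 64x64 pixels. Our task is to build a neural network model on the training set and then make predictions on the test set.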
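
To make the "build first, run later" idea concrete, here is a toy sketch in plain Python. It is not TensorFlow and does not use any TensorFlow API; the names `Node`, `placeholder`, `add` and `mul` are made up for this illustration. It only mimics the two stages: describing a computation, then executing it with concrete inputs.

```python
# Toy illustration only: this is NOT TensorFlow, just the same two-phase idea.
class Node:
    """A deferred computation: stores how to compute a value, not the value."""

    def __init__(self, fn, inputs=()):
        self.fn = fn
        self.inputs = inputs

    def run(self, feed):
        # Evaluate the input nodes first, then apply this node's function.
        args = [node.run(feed) for node in self.inputs]
        return self.fn(feed, *args)


def placeholder(name):
    # The value is looked up in the feed dictionary at run time.
    return Node(lambda feed: feed[name])


def add(a, b):
    return Node(lambda feed, x, y: x + y, (a, b))


def mul(a, b):
    return Node(lambda feed, x, y: x * y, (a, b))


# Stage 1: build the graph. Nothing is computed yet.
x = placeholder("x")
y = placeholder("y")
z = add(mul(x, x), y)

# Stage 2: execute the graph with concrete inputs.
result = z.run({"x": 3.0, "y": 1.0})
print(result)  # 10.0
```

In the TensorFlow code of this note, the placeholders and the session's `feed_dict` play roughly the roles of `placeholder` and `feed` in this toy version.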

First, a quick look at the dataset:

    import numpy as np
    import matplotlib.pyplot as plt
    import tensorflow as tf  # the code in this note uses the TensorFlow 1.x API
    from tensorflow.python.framework import ops
    # Helper functions shipped with the course assignment (module name assumed)
    from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot

    # Loading the dataset
    X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

    # Flatten the training and test images
    X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T
    X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T

    # Normalize image vectors
    X_train = X_train_flatten / 255.
    X_test = X_test_flatten / 255.

    # Convert training and test labels to one hot matrices
    Y_train = convert_to_one_hot(Y_train_orig, 6)
    Y_test = convert_to_one_hot(Y_test_orig, 6)

    print("number of training examples = " + str(X_train.shape[1]))
    print("number of test examples = " + str(X_test.shape[1]))
    print("X_train shape: " + str(X_train.shape))
    print("Y_train shape: " + str(Y_train.shape))
    print("X_test shape: " + str(X_test.shape))
    print("Y_test shape: " + str(Y_test.shape))
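
The preprocessing above can be tried without the real dataset. The sketch below uses random numbers in place of the images, and `one_hot` is our guess at what the course helper `convert_to_one_hot` does (we assume the raw labels have shape (1, m)); it is not the helper's actual source.

```python
import numpy as np

# Synthetic stand-in for the real images: 5 RGB pictures of 64x64 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 64, 64, 3))
labels = np.array([[0, 3, 5, 1, 3]])  # shape (1, m), assumed layout of Y_train_orig

# Flatten each image into one column, then scale pixel values to [0, 1].
X = images.reshape(images.shape[0], -1).T / 255.0


def one_hot(Y, C):
    """Assumed behavior of convert_to_one_hot: returns a (C, m) matrix."""
    return np.eye(C)[Y.reshape(-1)].T


Y = one_hot(labels, 6)

print(X.shape)  # (12288, 5)
print(Y.shape)  # (6, 5)
print(Y[:, 1])  # [0. 0. 0. 1. 0. 0.]
```

Each column is one example, so 64 x 64 x 3 = 12288 features per image, which matches the second dimension of W1 used later.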

Next, we use TensorFlow to build a neural network model on this dataset. We choose a network with 2 hidden layers, structured roughly as follows:

LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX

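Just as when we built networks by hand with numpy, the main steps for building a neural network are:

- Define the network structure

- Initialize the model parameters

- Run forward propagation / compute the current loss / run backpropagation / update the weights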

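Creating placeholders

Following TensorFlow's syntax, we first create placeholder variables for the input X and the output Y. Pay attention to how the shape argument is set: the second dimension is None so that any number of examples can be fed in.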

    def create_placeholders(n_x, n_y):
        X = tf.placeholder(tf.float32, shape=(n_x, None), name='X')
        Y = tf.placeholder(tf.float32, shape=(n_y, None), name='Y')
        return X, Y
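
Why leave the number of examples open? The numpy sketch below (not TensorFlow, with made-up random weights) shows that a linear layer with columns as examples works for any number of columns, so the same graph can be fed a mini-batch of 32 examples or a whole set at once.

```python
import numpy as np

n_x, n_h = 12288, 25
rng = np.random.default_rng(1)
W = rng.standard_normal((n_h, n_x)) * 0.01
b = np.zeros((n_h, 1))


def linear(X):
    # b has shape (n_h, 1) and is broadcast across all columns (examples).
    return W @ X + b


test_batch = rng.random((n_x, 120))  # as many columns as the test set has images
mini_batch = rng.random((n_x, 32))   # one mini-batch

print(linear(test_batch).shape)  # (25, 120)
print(linear(mini_batch).shape)  # (25, 32)
```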

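Initializing model parameters

Next we initialize the model parameters. The three-layer network has six parameters (a weight matrix and a bias vector per layer). We use Xavier initialization for the weights and zeros for the biases: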

    def initialize_parameters():
        tf.set_random_seed(1)

        W1 = tf.get_variable("W1", [25, 12288], initializer=tf.contrib.layers.xavier_initializer(seed=1))
        b1 = tf.get_variable("b1", [25, 1], initializer=tf.zeros_initializer())
        W2 = tf.get_variable("W2", [12, 25], initializer=tf.contrib.layers.xavier_initializer(seed=1))
        b2 = tf.get_variable("b2", [12, 1], initializer=tf.zeros_initializer())
        W3 = tf.get_variable("W3", [6, 12], initializer=tf.contrib.layers.xavier_initializer(seed=1))
        b3 = tf.get_variable("b3", [6, 1], initializer=tf.zeros_initializer())

        parameters = {"W1": W1,
                      "b1": b1,
                      "W2": W2,
                      "b2": b2,
                      "W3": W3,
                      "b3": b3}

        return parameters
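
For intuition about what Xavier initialization does, here is a minimal numpy sketch of the uniform variant, which draws weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). As far as we understand, this corresponds to the default mode of `tf.contrib.layers.xavier_initializer`, but the sketch is only an illustration and will not reproduce TensorFlow's random numbers. The function name `xavier_uniform` is ours.

```python
import numpy as np


def xavier_uniform(shape, rng):
    """Xavier/Glorot uniform init for a weight matrix of shape (fan_out, fan_in)."""
    fan_out, fan_in = shape
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)


rng = np.random.default_rng(1)
layer_shapes = {"W1": (25, 12288), "W2": (12, 25), "W3": (6, 12)}
weights = {name: xavier_uniform(shape, rng) for name, shape in layer_shapes.items()}

for name, W in weights.items():
    print(name, W.shape, "std = %.4f" % W.std())
```

The wider a layer is, the smaller its initial weights, which keeps the scale of the activations roughly stable from layer to layer.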

Forward propagation

    def forward_propagation(X, parameters):
        """
        Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX
        """
        W1 = parameters['W1']
        b1 = parameters['b1']
        W2 = parameters['W2']
        b2 = parameters['b2']
        W3 = parameters['W3']
        b3 = parameters['b3']

        Z1 = tf.add(tf.matmul(W1, X), b1)
        A1 = tf.nn.relu(Z1)
        Z2 = tf.add(tf.matmul(W2, A1), b2)
        A2 = tf.nn.relu(Z2)
        Z3 = tf.add(tf.matmul(W3, A2), b3)

        return Z3
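
The same forward pass can be written in numpy to check the shapes. This is a sketch with random placeholder weights, not the trained model, and `forward_numpy` is a name we made up. Like the TensorFlow version, it stops at the last linear output Z3, because the softmax is applied inside the loss function.

```python
import numpy as np


def relu(Z):
    return np.maximum(0, Z)


def forward_numpy(X, params):
    """LINEAR -> RELU -> LINEAR -> RELU -> LINEAR, returning the logits Z3."""
    A1 = relu(params["W1"] @ X + params["b1"])
    A2 = relu(params["W2"] @ A1 + params["b2"])
    return params["W3"] @ A2 + params["b3"]


rng = np.random.default_rng(0)
sizes = [12288, 25, 12, 6]  # input, hidden 1, hidden 2, output
params = {}
for l in range(1, len(sizes)):
    params["W%d" % l] = rng.standard_normal((sizes[l], sizes[l - 1])) * 0.01
    params["b%d" % l] = np.zeros((sizes[l], 1))

X = rng.random((12288, 4))  # four fake flattened images
Z3 = forward_numpy(X, params)
print(Z3.shape)  # (6, 4)
```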

Computing the loss function

Computing the loss is much more convenient in TensorFlow than in a hand-built model. A single line of code does it:

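    def compute_cost(Z3, Y):
        # The TensorFlow function expects shape (number of examples, number of classes),
        # so we transpose our (classes, examples) tensors first.
        logits = tf.transpose(Z3)
        labels = tf.transpose(Y)

        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))

        return cost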
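
To see what that one line computes, here is a hand-written numpy sketch of the mean softmax cross-entropy. It is our own illustration, not TensorFlow's implementation. With all-zero logits every class gets probability 1/6, so the cost should equal ln(6).

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy; logits and labels have shape (m, num_classes)."""
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(np.mean(-(labels * log_probs).sum(axis=1)))


Z3 = np.zeros((6, 4))          # (classes, examples): all-zero logits
Y = np.eye(6)[[0, 3, 5, 1]].T  # one-hot labels, shape (6, 4)

cost = softmax_cross_entropy(Z3.T, Y.T)  # transpose to (examples, classes)
print(round(cost, 4))  # 1.7918
```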

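Putting the code together: backpropagation and weight updates

As with the loss, gradient-based optimization through backpropagation is very simple in TensorFlow and takes only two lines of code. The complete neural network model is defined as follows: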

    def model(X_train, Y_train, X_test, Y_test, learning_rate=0.0001,
              num_epochs=1500, minibatch_size=32, print_cost=True):
        ops.reset_default_graph()  # start from an empty graph so the model can be rerun
        tf.set_random_seed(1)
        seed = 3
        (n_x, m) = X_train.shape
        n_y = Y_train.shape[0]
        costs = []

        # Create placeholders of shape (n_x, None) and (n_y, None)
        X, Y = create_placeholders(n_x, n_y)

        # Initialize parameters
        parameters = initialize_parameters()

        # Forward propagation: build the forward propagation in the tensorflow graph
        Z3 = forward_propagation(X, parameters)

        # Cost function: add cost function to tensorflow graph
        cost = compute_cost(Z3, Y)

        # Backpropagation: define the tensorflow optimizer. Use an AdamOptimizer.
        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

        # Initialize all the variables
        init = tf.global_variables_initializer()

        # Start the session to compute the tensorflow graph
        with tf.Session() as sess:
            # Run the initialization
            sess.run(init)

            # Do the training loop
            for epoch in range(num_epochs):
                epoch_cost = 0.
                num_minibatches = int(m / minibatch_size)
                seed = seed + 1  # reshuffle differently in every epoch
                minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)

                for minibatch in minibatches:
                    # Select a minibatch
                    (minibatch_X, minibatch_Y) = minibatch
                    # One optimization step on this minibatch
                    _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                    epoch_cost += minibatch_cost / num_minibatches

                # Print the cost every 100 epochs, record it every 5 epochs
                if print_cost and epoch % 100 == 0:
                    print("Cost after epoch %i: %f" % (epoch, epoch_cost))
                if print_cost and epoch % 5 == 0:
                    costs.append(epoch_cost)

            # Plot the cost
            plt.plot(np.squeeze(costs))
            plt.ylabel('cost')
            plt.xlabel('epochs (per fives)')
            plt.title("Learning rate = " + str(learning_rate))
            plt.show()

            # Save the trained parameters in a variable
            parameters = sess.run(parameters)
            print("Parameters have been trained!")

            # Calculate the correct predictions
            correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))

            # Calculate accuracy on the training and test sets
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

            print("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
            print("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))

            return parameters
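
Two pieces of the model are easy to check in plain numpy: splitting the data into shuffled mini-batches and computing accuracy from the logits. The sketch below is our own illustration. `mini_batches_sketch` is a guess at what the course helper `random_mini_batches` does, not its actual source, and `accuracy` mirrors the argmax comparison used in the model.

```python
import numpy as np


def mini_batches_sketch(X, Y, batch_size=32, seed=0):
    """Shuffle the columns of X and Y together, then cut them into batches."""
    m = X.shape[1]
    perm = np.random.default_rng(seed).permutation(m)
    X_shuf, Y_shuf = X[:, perm], Y[:, perm]
    return [(X_shuf[:, k:k + batch_size], Y_shuf[:, k:k + batch_size])
            for k in range(0, m, batch_size)]


def accuracy(Z3, Y):
    """Fraction of columns whose largest logit matches the one-hot label."""
    return float(np.mean(np.argmax(Z3, axis=0) == np.argmax(Y, axis=0)))


rng = np.random.default_rng(0)
X = rng.random((10, 70))                          # 70 fake examples, 10 features
Y = np.eye(6)[rng.integers(0, 6, size=70)].T      # one-hot labels, shape (6, 70)

batches = mini_batches_sketch(X, Y, batch_size=32, seed=3)
print([bx.shape[1] for bx, _ in batches])  # [32, 32, 6]
print(accuracy(Y * 2.0, Y))                # 1.0
```

Changing the seed in every epoch, as the model does with `seed = seed + 1`, gives a different shuffle each time, so the mini-batches are not identical from one epoch to the next.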