The Keras functional API
Setup
!pip3 install tensorflow==2.0.0a0
%matplotlib inline
import tensorflow as tf
from tensorflow import keras
Introduction
From the overview you are already familiar with using tf.keras.Sequential to create models. tf.keras also provides a more flexible functional API for building models; it works well for models with multiple inputs and outputs, shared layers, and non-sequential topologies.
The functional API builds on the idea that a deep learning model is usually a directed acyclic graph (DAG) of layers, so tf.keras provides a set of APIs for constructing such a graph layer by layer.
First, specify the model's input:
inputs = keras.Input(shape=(32, 32, 3), name='image')
Here you specify the shape of a single sample, not of a batch (for an image, only three dimensions are needed).
This gives you an input object that describes the data the model expects (shape, name, data type).
You can then add a layer to the graph by calling the layer object on the input, which returns that layer's output:
dense = keras.layers.Dense(64, activation='relu')
x = dense(inputs)
Calling the layer like this is equivalent to drawing a directed edge in the graph from the input to the current layer.
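This edge-drawing behavior can be illustrated with a minimal toy sketch. The `ToyTensor` and `ToyLayer` classes below are hypothetical stand-ins, not part of Keras; they only show how each call records where its input came from:

```python
class ToyTensor:
    """Stands in for a Keras symbolic tensor: remembers the layer that produced it."""
    def __init__(self, producer=None):
        self.producer = producer

class ToyLayer:
    """Hypothetical layer: calling it records an edge from the input's producer to itself."""
    def __init__(self, name):
        self.name = name
        self.inbound = []  # names of nodes with an edge pointing into this layer

    def __call__(self, tensor):
        # record the edge, then return a new symbolic tensor produced by this layer
        self.inbound.append(tensor.producer.name if tensor.producer else 'input')
        return ToyTensor(producer=self)

inputs = ToyTensor()          # like keras.Input(...)
dense = ToyLayer('dense')
x = dense(inputs)             # draws the edge: input -> dense
out = ToyLayer('dense_1')(x)  # draws the edge: dense -> dense_1
print(dense.inbound)          # ['input']
```

Keras records the same bookkeeping internally (as "nodes" in the graph), which is what later lets the model be plotted and inspected.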
Let's add a few more layers:
x = keras.layers.Dense(64, activation='relu')(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)
By connecting the inputs and outputs, we can create a complete model:
model = keras.Model(inputs=inputs, outputs=outputs, name='my_model')
Call the summary method to inspect the model:
model.summary()
Model: "my_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
image (InputLayer) [(None, 32, 32, 3)] 0
_________________________________________________________________
dense (Dense) (None, 32, 32, 64) 256
_________________________________________________________________
dense_1 (Dense) (None, 32, 32, 64) 4160
_________________________________________________________________
dense_2 (Dense) (None, 32, 32, 10) 650
=================================================================
Total params: 5,066
Trainable params: 5,066
Non-trainable params: 0
_________________________________________________________________
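The parameter counts in the summary follow directly from how Dense works: it applies a kernel of shape (input_dim, units) plus one bias per unit to the last axis of its input. A quick sanity check in plain Python (the helper name is ours):

```python
def dense_params(input_dim, units):
    # weight matrix of shape (input_dim, units) plus one bias per unit
    return input_dim * units + units

# The three Dense layers in the summary above, acting on the last axis:
print(dense_params(3, 64))   # 256
print(dense_params(64, 64))  # 4160
print(dense_params(64, 10))  # 650
```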
We can also plot the model's structure to an image (this may require additional dependencies; install them as needed):
keras.utils.plot_model(model, 'model.png', show_shapes=True)  # show_shapes displays each layer's input/output shapes in the plot
Multiple models from one graph of layers
With the functional API, a model is created by specifying its inputs and outputs, so a single graph of layers can be used to build several models. In the example below, the same stack of layers is used to build two models: an encoder, which turns an image into a 16-dimensional vector, and an end-to-end autoencoder model.
# Encoder
encoder_input = keras.Input(shape=(28, 28, 1), name='img')
x = keras.layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.MaxPooling2D(3)(x)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = keras.layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(inputs=encoder_input, outputs=encoder_output, name='encoder')
encoder.summary()
keras.utils.plot_model(encoder, 'encoder.png', show_shapes=True)
# Autoencoder: decoder layers stacked on the encoder output
x = keras.layers.Reshape((4, 4, 1))(encoder_output)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = keras.layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = keras.layers.UpSampling2D(3)(x)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = keras.layers.Conv2DTranspose(1, 3, activation='relu')(x)
autoencoder = keras.Model(inputs=encoder_input, outputs=decoder_output, name='autoencoder')
autoencoder.summary()
keras.utils.plot_model(autoencoder, 'autoencoder.png', show_shapes=True)
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_1 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16) 0
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_1 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16) 0
_________________________________________________________________
reshape (Reshape) (None, 4, 4, 1) 0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 6, 6, 16) 160
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 8, 8, 32) 4640
_________________________________________________________________
up_sampling2d (UpSampling2D) (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 26, 26, 16) 4624
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 28, 28, 1) 145
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________
As you can see, the encoder and the decoder are exactly symmetric, so the autoencoder's output shape is also (28, 28, 1): Conv2DTranspose is the inverse of Conv2D, and UpSampling2D is the inverse of MaxPooling2D.
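The spatial sizes in the two summaries can be verified with the standard output-size formulas for 'valid' convolutions and for pooling. A small sketch (helper names are ours):

```python
def conv(size, k=3):      # 'valid' Conv2D: output shrinks by k - 1
    return size - k + 1

def conv_t(size, k=3):    # Conv2DTranspose undoes it: output grows by k - 1
    return size + k - 1

def pool(size, p=3):      # MaxPooling2D with pool size p
    return size // p

def upsample(size, p=3):  # UpSampling2D reverses the pooling
    return size * p

# Encoder spatial sizes: 28 -> 26 -> 24 -> 8 -> 6 -> 4
s = 28
for f in (conv, conv, pool, conv, conv):
    s = f(s)
print(s)  # 4

# Decoder mirrors it exactly: 4 -> 6 -> 8 -> 24 -> 26 -> 28
for f in (conv_t, conv_t, upsample, conv_t, conv_t):
    s = f(s)
print(s)  # 28
```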
Layers and models are both callable
You can call a model just like a layer, passing it an Input or the output of another layer. Note that calling a model reuses not only its architecture but also its weights.
Below is another autoencoder implementation. This time we create two models, encoder and decoder, and then chain them together to form the autoencoder:
# Encoder
encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
x = keras.layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.MaxPooling2D(3)(x)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = keras.layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(inputs=encoder_input, outputs=encoder_output, name='encoder')
encoder.summary()
# Decoder
decoder_input = keras.Input(shape=(16, ), name='encoder_img')
x = keras.layers.Reshape((4, 4, 1))(decoder_input)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = keras.layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = keras.layers.UpSampling2D(3)(x)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = keras.layers.Conv2DTranspose(1, 3, activation='relu')(x)
decoder = keras.Model(inputs=decoder_input, outputs=decoder_output, name='decoder')
decoder.summary()
# Autoencoder: chain the two models together
autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(inputs=autoencoder_input, outputs=decoded_img, name='autoencoder')
autoencoder.summary()
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
original_img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_9 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_11 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 16) 0
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
encoder_img (InputLayer) [(None, 16)] 0
_________________________________________________________________
reshape_2 (Reshape) (None, 4, 4, 1) 0
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 6, 6, 16) 160
_________________________________________________________________
conv2d_transpose_9 (Conv2DTr (None, 8, 8, 32) 4640
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_transpose_10 (Conv2DT (None, 26, 26, 16) 4624
_________________________________________________________________
conv2d_transpose_11 (Conv2DT (None, 28, 28, 1) 145
=================================================================
Total params: 9,569
Trainable params: 9,569
Non-trainable params: 0
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
encoder (Model) (None, 16) 18672
_________________________________________________________________
decoder (Model) (None, 28, 28, 1) 9569
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________
As you can see, models can be nested: a model can contain sub-models, just as it contains layers.
A common use of model nesting is ensembling, where the outputs of several models are averaged. For example:
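The averaging step itself is just an element-wise mean of the model outputs. The same operation in plain NumPy, with three hypothetical prediction arrays standing in for the three models' sigmoid outputs:

```python
import numpy as np

# Hypothetical per-sample predictions in [0, 1], one column each (shape (2, 1))
y1 = np.array([[0.2], [0.9]])
y2 = np.array([[0.4], [0.7]])
y3 = np.array([[0.6], [0.8]])

# keras.layers.average computes the element-wise mean of its inputs
ensemble = (y1 + y2 + y3) / 3
print(ensemble.ravel())  # [0.4 0.8]
```

Because the mean of sigmoid outputs stays in [0, 1], the ensemble output can be read as a probability just like each member's.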
def get_model():
    inputs = keras.Input(shape=(128, ))
    outputs = keras.layers.Dense(1, activation='sigmoid')(inputs)
    return keras.Model(inputs=inputs, outputs=outputs)
model1 = get_model()
model2 = get_model()
model3 = get_model()
inputs = keras.Input(shape=(128, ))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = keras.layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
ensemble_model.summary()
Model: "model_3"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) [(None, 128)] 0
__________________________________________________________________________________________________
model (Model) (None, 1) 129 input_4[0][0]
__________________________________________________________________________________________________
model_1 (Model) (None, 1) 129 input_4[0][0]
__________________________________________________________________________________________________
model_2 (Model) (None, 1) 129 input_4[0][0]
__________________________________________________________________________________________________
average (Average) (None, 1) 0 model[1][0]
model_1[1][0]
model_2[1][0]
==================================================================================================
Total params: 387
Trainable params: 387
Non-trainable params: 0
__________________________________________________________________________________________________
Manipulating complex graph topologies
Models with multiple inputs and outputs
The functional API makes it easy to build graphs with multiple inputs and outputs, something the Sequential API cannot express.
Here is an example:
Suppose you are building a model that ranks customer issue tickets by priority and routes them to the right department. The model has three inputs:
- the title of the ticket (text input)
- the body of the ticket (text input)
- any tags added by the user (from a fixed set of tags)
and two outputs:
- the priority score (between 0 and 1, a sigmoid output)
- the department that should handle the ticket (a softmax output over the departments)
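The two output activations map the final features into the required ranges: sigmoid squashes a single score into (0, 1), while softmax turns a vector of department scores into a probability distribution. A minimal NumPy sketch of both (helper names are ours):

```python
import numpy as np

def sigmoid(z):
    # squashes any real score into (0, 1) -- the priority output
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # turns a score vector into probabilities that sum to 1 -- the department output
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

print(sigmoid(0.0))                              # 0.5
probs = softmax(np.array([2.0, 1.0, 0.5, 0.1]))  # hypothetical scores for 4 departments
print(probs.argmax())                            # 0 -> the first department wins
```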
Let's build this model with the functional API:
num_tags = 12  # number of unique issue tags
num_words = 10000  # vocabulary size used when preprocessing the text
num_departments = 4  # number of departments that can handle a ticket
title_input = keras.Input(shape=(None, ), name='title')
body_input = keras.Input(shape=(None, ), name='body')
tags_input = keras.Input(shape=(num_tags, ), name='tags')
title_features = keras.layers.Embedding(num_words, 64)(title_input)
body_features = keras.layers.Embedding(num_words, 64)(body_input)
title_features = keras.layers.LSTM(128)(title_features)
body_features = keras.layers.LSTM(32)(body_features)
x = keras.layers.concatenate([title_features, body_features, tags_input])
priority_pred = keras.layers.Dense(1, activation='sigmoid', name='priority')(x)
department_pred = keras.layers.Dense(num_departments, activation='softmax', name='department')(x)
model = keras.Model(inputs=[title_input, body_input, tags_input], outputs=[priority_pred, department_pred])
model.summary()
keras.utils.plot_model(model, 'model.png', show_shapes=True)
Model: "model_4"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
title (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
body (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
embedding (Embedding) (None, None, 64) 640000 title[0][0]
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, None, 64) 640000 body[0][0]
__________________________________________________________________________________________________
unified_lstm (UnifiedLSTM) (None, 128) 98816 embedding[0][0]
__________________________________________________________________________________________________
unified_lstm_1 (UnifiedLSTM) (None, 32) 12416 embedding_1[0][0]
__________________________________________________________________________________________________
tags (InputLayer) [(None, 12)] 0
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 172) 0 unified_lstm[0][0]
unified_lstm_1[0][0]
tags[0][0]
__________________________________________________________________________________________________
priority (Dense) (None, 1) 173 concatenate[0][0]
__________________________________________________________________________________________________
department (Dense) (None, 4) 692 concatenate[0][0]
==================================================================================================
Total params: 1,392,097
Trainable params: 1,392,097
Non-trainable params: 0
__________________________________________________________________________________________________
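The larger counts in this summary can be verified the same way: an Embedding stores vocab × dim weights, and an LSTM has four gates, each with an input kernel, a recurrent kernel, and a bias. A sketch (helper names are ours):

```python
def embedding_params(vocab, dim):
    # one dim-sized vector per vocabulary entry
    return vocab * dim

def lstm_params(input_dim, units):
    # four gates, each with an input kernel, a recurrent kernel, and a bias
    return 4 * (input_dim * units + units * units + units)

print(embedding_params(10000, 64))  # 640000 for each Embedding
print(lstm_params(64, 128))         # 98816 for the title LSTM
print(lstm_params(64, 32))          # 12416 for the body LSTM
```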
Once the model is built, we can assign a different loss to each output, and even give each loss a different weight:
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=['binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1., 0.2])
Since we gave each output a name, we can also specify the losses by name:
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': 'binary_crossentropy',
                    'department': 'categorical_crossentropy'},
              loss_weights=[1., 0.2])
For training, we can pass NumPy arrays for the inputs and outputs:
import numpy as np
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')
priority_data = np.random.random(size=(1280, 1))
department_data = np.random.randint(2, size=(1280, num_departments))
model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
          {'priority': priority_data, 'department': department_data},
          epochs=2, batch_size=32)
Epoch 1/2
1280/1280 [==============================] - 14s 11ms/sample - loss: 1.2485 - priority_loss: 0.6968 - department_loss: 2.7587
Epoch 2/2
1280/1280 [==============================] - 12s 9ms/sample - loss: 1.2022 - priority_loss: 0.6566 - department_loss: 2.7280
<tensorflow.python.keras.callbacks.History at 0x12940e828>
Note that if you feed the data through a tf.data.Dataset, each Dataset element must be either a tuple of lists, ([title_data, body_data, tags_data], [priority_data, department_data]), or a tuple of dicts, ({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_data, 'department': department_data}).
A toy ResNet model
Besides models with multiple inputs and outputs, the functional API also makes it easy to build models with skip connections, where layers are not connected strictly in sequence. This, too, is impossible with the Sequential API.
Here is a simple ResNet-style example:
inputs = keras.Input(shape=(32, 32, 3), name='img')
x = keras.layers.Conv2D(32, 3, activation='relu')(inputs)
x = keras.layers.Conv2D(64, 3, activation='relu')(x)
block_1_output = keras.layers.MaxPooling2D(3)(x)
x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(block_1_output)
x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_2_output = keras.layers.add([x, block_1_output])
x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(block_2_output)
x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_3_output = keras.layers.add([x, block_2_output])
x = keras.layers.Conv2D(64, 3, activation='relu')(block_3_output)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dense(256, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='toy_resnet')
model.summary()
keras.utils.plot_model(model, 'model.png', show_shapes=True)
Model: "toy_resnet"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
img (InputLayer) [(None, 32, 32, 3)] 0
__________________________________________________________________________________________________
conv2d_14 (Conv2D) (None, 30, 30, 32) 896 img[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D) (None, 28, 28, 64) 18496 conv2d_14[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 9, 9, 64) 0 conv2d_15[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D) (None, 9, 9, 64) 36928 max_pooling2d_4[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D) (None, 9, 9, 64) 36928 conv2d_16[0][0]
__________________________________________________________________________________________________
add (Add) (None, 9, 9, 64) 0 conv2d_17[0][0]
max_pooling2d_4[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D) (None, 9, 9, 64) 36928 add[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D) (None, 9, 9, 64) 36928 conv2d_18[0][0]
__________________________________________________________________________________________________
add_1 (Add) (None, 9, 9, 64) 0 conv2d_19[0][0]
add[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D) (None, 7, 7, 64) 36928 add_1[0][0]
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 64) 0 conv2d_20[0][0]
__________________________________________________________________________________________________
dense_6 (Dense) (None, 256) 16640 global_average_pooling2d[0][0]
__________________________________________________________________________________________________
dropout (Dropout) (None, 256) 0 dense_6[0][0]
__________________________________________________________________________________________________
dense_7 (Dense) (None, 10) 2570 dropout[0][0]
==================================================================================================
Total params: 223,242
Trainable params: 223,242
Non-trainable params: 0
__________________________________________________________________________________________________
Now let's train the model:
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss='categorical_crossentropy', metrics=['acc'])
model.fit(x_train, y_train, epochs=1, batch_size=64, validation_split=0.2)  # hold out 20% of the data for validation
Train on 40000 samples, validate on 10000 samples
40000/40000 [==============================] - 151s 4ms/sample - loss: 1.8996 - acc: 0.2804 - val_loss: 1.5229 - val_acc: 0.4388
<tensorflow.python.keras.callbacks.History at 0x1291aa278>
Shared layers
Another good use of the functional API is models with shared layers. A shared layer is a layer instance that is reused several times within the same model; it learns features corresponding to multiple paths through the graph.
Shared layers are often used to encode inputs from similar spaces (for example, two pieces of text mapped into the same vocabulary). Because they share information across the different inputs, they make it possible to train the model on less data: if a word appears in one input, the other inputs benefit through the shared layer.
To share a layer in the functional API, simply call the same layer instance multiple times. Here, a single embedding layer is shared across two text inputs:
share_embedding = keras.layers.Embedding(1000, 128)
text_input_a = keras.Input(shape=(None, ), dtype='int32')
text_input_b = keras.Input(shape=(None, ), dtype='int32')
encode_input_a = share_embedding(text_input_a)
encode_input_b = share_embedding(text_input_b)
Extracting and reusing layers
Because a graph of layers built with the functional API is a static data structure, it can be accessed and inspected. This is why we can plot the model's structure to an image.
It also means we can access the activations of intermediate layers (the nodes of the graph) and reuse them elsewhere, for example for transfer learning. This is very useful for feature extraction.
Here is a VGG19 model pretrained on ImageNet:
vgg19 = keras.applications.VGG19()
The intermediate activations of the model are obtained by querying the graph's data structure.
Using these features, we can create a new feature-extraction model in just three lines of code; it returns the activation output of every layer in vgg19:
feature_list = [layer.output for layer in vgg19.layers]
feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=feature_list)
img = np.random.random(size=(1, 224, 224, 3))
extracted_feature = feat_extraction_model(img)
Extending the API with custom layers
tf.keras ships with a wide range of built-in layers, for example:
- Convolution layers: Conv1D, Conv2D, Conv3D, Conv2DTranspose, etc.
- Pooling layers: MaxPooling1D, MaxPooling2D, MaxPooling3D, AveragePooling1D, etc.
- RNN layers: GRU, LSTM, ConvLSTM2D, etc.
- BatchNormalization, Dropout, Embedding, etc.
If the layer you need is not there, it is easy to extend the API by writing your own.
All layers subclass the Layer class and implement:
- a call method, which defines the layer's computation
- a build method, which creates the layer's weights (though you can also create them in __init__)
Here is a simple reimplementation of Dense:
class CustomDense(keras.layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units, ),
                                 initializer='random_normal', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomDense, self).get_config()
        config['units'] = self.units
        return config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

inputs = keras.Input(shape=(4, ))
outputs = CustomDense(10)(inputs)
model = keras.Model(inputs=inputs, outputs=outputs)
To make the layer serializable, implement the get_config method, which returns a dict of the layer's constructor arguments.
Likewise, to support re-creating the layer from its config, implement the from_config class method.
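The get_config/from_config pair is a generic round-trip contract: the config is a plain dict of constructor arguments, so `cls(**config)` rebuilds an equivalent object. A framework-free sketch of the same contract (the `Scale` class is hypothetical, no Keras needed):

```python
class Scale:
    """Hypothetical serializable object following the get_config/from_config contract."""
    def __init__(self, factor=1.0):
        self.factor = factor

    def get_config(self):
        # return everything needed to rebuild this object
        return {'factor': self.factor}

    @classmethod
    def from_config(cls, config):
        # config is exactly the dict of constructor arguments
        return cls(**config)

original = Scale(factor=2.5)
restored = Scale.from_config(original.get_config())
print(restored.factor)  # 2.5
```

This is why, in the real Keras version, extending the parent's config dict with your own constructor arguments is enough for model serialization to work.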
When to use the functional API
How do you decide whether to use the functional API or to subclass the Model class directly?
In general, the functional API is higher-level, easier and safer to use, and has a number of features that subclassed models do not support.
Model subclassing, however, gives you more flexibility and can express models that the functional API cannot (for example, you could not implement a Tree-RNN with the functional API).
Strengths of the functional API:
- It is less verbose; much of the functionality is already provided, so there is less to write yourself.
- It validates your model as you define it.
- The model can be plotted and inspected.
- The model can be serialized and cloned.
Weaknesses of the functional API:
- It does not support dynamic architectures.
- Sometimes you just need to build everything from scratch…