二值网络--Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation
Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation
https://arxiv.org/abs/1811.10413
https://github.com/wonnado/binary-nets Pytorch
Binary Ensemble Neural Network 的思路就是将 若干个独立的二值网络组合起来近似 实数值网络,本文提出的 Structured Binary Neural Networks 更进一步,将实数值网络分解为各个模块,然后使用若干二值网络来近似这些模块,对于网络的分解,通过学习完成动态分解。
3.1 Problem definition
Binarization of weights: 权重二值化
Binarization of activations: 响应二值化
3.2 Structured Binary Network Decomposition
propose to decompose a network into binary structures while preserving its representability
如何使用二值网络结构来近似实数值网络,反过来就是将实数值网络分解为二值网络结构,这里提出两个分解思路:layer-wise decomposition and group-wise decomposition
3.2.1 Layer-wise binary decomposition
layer-wise binary decomposition 的思路就是对每个实数值卷积层使用 K 个二值卷积层来近似表示,如上图(c)所示
approximate each layer with multiple branches of binary layers
这么分解可能有一个问题就是误差会累积,通过反向传播导致梯度偏差过大。
each branch will introduce a certain amount of error and the error may accumulate due to the aggregation of multi-branches. As a result, this strategy may incur severe quantization errors and bring large deviation for gradients during backpropagation
为此作者提出了 下面的分解方式
3.2.2 Group-wise binary decomposition
所谓的 Group-wise binary decomposition 就是对于一个给定的实数值网络模块,这里我们使用 K 个二值网络模块来近似这个实数值网络模型
3.2.3 Learning for dynamic decomposition
如何分解网络了?
这里引入了一个 fusion gate as the soft connection between blocks
3.3 Extension to semantic segmentation
这里介绍了一下如何将上面提出的思路应用语义分割中的 atrous convolutional layer ,Binary Parallel Atrous Convolution (BPAC)
Multi-dilations decompose
4 Discussions
Complexity analysis
LBD: It implements the layer-wise binary decomposition strategy
Group-Net: It implements the full model with learnt soft connections ReLU(·) as nonlinearity
Group-Net**: apply shortcut bypassing every binary convolution to improve the convergence Tanh(·) as nonlinearity.
11