Resource

Paper: https://arxiv.org/abs/1904.00420

Code: https://github.com/megvii-model/SinglePathOneShot (official code, incomplete functionality)

https://github.com/CanyonWind/Single-Path-One-Shot-NAS-MXNet (third-party implementation, full-featured)

Introduction

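The authors start by pointing out two problems in current AutoML:

1. The network weights are deeply coupled, and it is unclear why inherited weights work at all.

2. Jointly optimizing the network architecture and the weights deepens this coupling.

Existing One-Shot models address problem 2, but problem 1 remains unsolved. The authors therefore propose that supernet training should be stochastic, so that the weights of all architectures are optimized simultaneously ("Thus, we propose that the supernet training should be stochastic, in that all architectures can have their weights optimized simultaneously"). On this basis they propose the Single Path supernet. As the name suggests, only one path through the supernet is active at a time, as shown in Figure 1. During supernet training, uniform sampling is used to select one subnetwork to train at each step.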

[Figure 1: image not preserved; alt text: "[Paper Analysis] Single Path One-Shot Neural Architecture Search with Uniform Sampling"]

Method

1. Method Overview

The supernet weights are optimized simultaneously for all architectures in the search space ("This gives rise to the principle that the supernet weights W_A should be optimized in a way that all architectures in the search space are optimized simultaneously"). At each optimization step, one subnetwork is randomly sampled and only its weights are updated ("In each step of optimization, an architecture a is randomly sampled. Only weights W(a) are activated and updated"). The overall formula is given below, where Γ(A) is the prior distribution over architectures; uniform sampling is used here.

$$W_{\mathcal{A}} = \arg\min_{W} \; \mathbb{E}_{a \sim \Gamma(\mathcal{A})} \left[ \mathcal{L}_{\text{train}}\big(\mathcal{N}(a, W(a))\big) \right]$$
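The training rule above can be illustrated with a toy loop. This is only a sketch under stated assumptions: the constants 20 and 4 come from the supernet described in the next section, while `sample_architecture` and `train_supernet` are hypothetical names, and the real gradient step is replaced by a counter that records which weights would have been updated.

```python
import random

NUM_BLOCKS = 20   # choice blocks in the supernet (from the text)
NUM_CHOICES = 4   # candidate operations per choice block (from the text)


def sample_architecture(rng):
    """Uniformly pick one candidate per choice block, i.e. one single path."""
    return [rng.randrange(NUM_CHOICES) for _ in range(NUM_BLOCKS)]


def train_supernet(num_steps, seed=0):
    """Toy loop: count updates per (block, choice) instead of running real SGD."""
    rng = random.Random(seed)
    update_counts = [[0] * NUM_CHOICES for _ in range(NUM_BLOCKS)]
    for _ in range(num_steps):
        arch = sample_architecture(rng)
        for block, choice in enumerate(arch):
            # Only the weights W(a) on the sampled path would be updated here.
            update_counts[block][choice] += 1
    return update_counts


counts = train_supernet(num_steps=4000)
```

With uniform sampling, every candidate in every block should receive roughly the same number of updates, which is the sense in which all architectures are trained "simultaneously".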

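2. Supernet Architecture

The supernet architecture is shown in Table 2. It contains 20 choice blocks (CB in the table, 4+4+8+4=20), and each choice block has four candidates (four ShuffleNet-style blocks, as shown in the figure). To keep the model's computational cost in check, the authors impose FLOPs and parameter-count constraints during the search. The search space has 4^20 architectures.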

[Figure: image not preserved]
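To make the size of the search space and the idea of a complexity constraint concrete, here is a minimal sketch. The per-choice costs in `CHOICE_COST` and the budget are made-up numbers, and rejection sampling is just one simple way to respect a budget; it is not claimed to be how the paper enforces its constraints.

```python
import random

NUM_BLOCKS = 20
NUM_CHOICES = 4
SEARCH_SPACE_SIZE = NUM_CHOICES ** NUM_BLOCKS  # 4^20 = 1,099,511,627,776

# Hypothetical cost of each candidate in MFLOPs; real costs vary by block and stage.
CHOICE_COST = [10.0, 14.0, 18.0, 22.0]


def architecture_cost(arch):
    """Total (hypothetical) cost of a single-path architecture."""
    return sum(CHOICE_COST[c] for c in arch)


def sample_under_budget(rng, max_cost, max_tries=10000):
    """Rejection sampling: redraw until the architecture fits the budget."""
    for _ in range(max_tries):
        arch = [rng.randrange(NUM_CHOICES) for _ in range(NUM_BLOCKS)]
        if architecture_cost(arch) <= max_cost:
            return arch
    raise RuntimeError("no architecture found within the budget")


arch = sample_under_budget(random.Random(0), max_cost=330.0)
```

Any returned architecture satisfies the budget by construction, so the constraint never has to be handled with an extra loss term.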

3. Trick

To increase block diversity, the authors introduce two tricks (the paper does not call them tricks; that is my term): Channel Number Search and Mixed-Precision Quantization Search.

Channel Number Search

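The schematic of Channel Number Search is shown below. A convolution kernel has many channels; the idea here is to crop the kernel to obtain a smaller one, with the evolutionary algorithm deciding how much to crop. Cropping to arbitrary widths would be cumbersome, so a fixed set of scales is used: [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]. You may wonder how a scale can be greater than 1. This is because the input of a ShuffleNet block goes through a channel split that divides the channels into two parts; see the MXNet implementation for a detailed explanation and walkthrough.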

[Figure: Channel Number Search schematic, image not preserved]
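A small sketch may help show why scales above 1 are meaningful. It assumes that the scales are applied to a base width of half the block's output channels (because of the channel split), which is my reading of the explanation above and not a confirmed detail of either implementation. The helper names are hypothetical, and the kernel is modelled as a plain list of filters.

```python
import random

CHANNEL_SCALES = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]  # from the text


def candidate_mid_channels(out_channels):
    """Assumption: scales multiply a base width of out_channels // 2."""
    base = out_channels // 2
    return [max(1, int(round(base * s))) for s in CHANNEL_SCALES]


def slice_kernel(weight, num_channels):
    """Keep the first num_channels filters of a preallocated full-width weight."""
    return weight[:num_channels]


out_channels = 64
widths = candidate_mid_channels(out_channels)
# Preallocate the widest kernel once; each filter is a dummy 3x3 list here.
full_weight = [[0.0] * 9 for _ in range(max(widths))]
chosen = random.Random(0).choice(widths)
sub_weight = slice_kernel(full_weight, chosen)
```

Under this assumption a scale of 2.0 simply recovers the full output width, so nothing wider than the block itself is ever allocated.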

Mixed-Precision Quantization Search

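Mixed-Precision Quantization Search searches over mixed precision, as shown in the figure. I did not fully follow this part, but mixed precision is presumably meant to reduce computation. In the search space, choices of weight and feature bit widths include {(1, 2), (2, 2), (1, 4), (2, 4), (3, 4), (4, 4)}.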

[Figure: Mixed-Precision Quantization Search, image not preserved]
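As a rough illustration of what such a choice means, the sketch below samples one (weight bits, feature bits) pair per block and shows a generic k-bit uniform quantizer for values in [0, 1]. The quantizer is a textbook example and not necessarily the scheme used in the paper; the block count of 20 and the function names are assumptions for illustration only.

```python
import random

# (weight bits, feature bits) pairs listed in the text
BIT_WIDTH_CHOICES = [(1, 2), (2, 2), (1, 4), (2, 4), (3, 4), (4, 4)]


def sample_bit_widths(rng, num_blocks):
    """Uniformly pick one (weight bits, feature bits) pair per block."""
    return [rng.choice(BIT_WIDTH_CHOICES) for _ in range(num_blocks)]


def quantize_unit(x, bits):
    """Generic uniform quantization of x in [0, 1] onto 2**bits evenly spaced levels."""
    levels = (1 << bits) - 1
    return round(x * levels) / levels


plan = sample_bit_widths(random.Random(0), num_blocks=20)
```

Fewer bits mean fewer representable levels, which is where the savings in computation and memory would come from.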

Discussion

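SPOS proposes an extremely simple supernet with only a single path. That said, I think ShuffleNet is already very strong: in the experiments, a random combination of blocks reaches 73.8% top-1 on ImageNet, and NAS raises this to 74.3%, which does not feel like a large gain. The Single Path + Uniform Sampling combination is still worth borrowing because it is simple and saves a lot of time; Xiaomi's FairNAS later supported the validity of uniform sampling. Looking back at the problems the authors raised at the start, it seems that only the deep coupling of weights has been addressed, while why inherited weights work is still unexplained.

My understanding is limited; questions, discussion, and corrections are welcome. Thanks.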
