SQUEEZENET

Compelling Advantages

  • Smaller CNNs require less communication across servers during distributed training.
  • Smaller CNNs require less bandwidth to export a new model from the cloud to an autonomous car.
  • Smaller CNNs are more feasible to deploy on FPGAs and other hardware with limited memory.
  • SqueezeNet achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters.

Architectural design strategies

  • Replace 3x3 filters with 1x1 filters
  • Decrease the number of input channels to 3x3 filters
  • Downsample late in the network so that convolution layers have large activation maps
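The first two strategies can be made concrete with a parameter count. Below is a minimal sketch comparing a plain 3x3 convolution against a Fire-module-style squeeze/expand pair (a 1x1 squeeze layer feeding a mix of 1x1 and 3x3 expand filters). The specific channel sizes (128 in, 256 out, 16 squeeze filters) are illustrative, not taken from the paper's exact configuration.

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a k x k convolution layer (biases omitted)."""
    return in_ch * out_ch * k * k

def fire_params(in_ch, s1, e1, e3):
    """Weight count of a Fire-style module: a squeeze layer of s1 1x1
    filters feeding an expand layer of e1 1x1 and e3 3x3 filters.
    The expand outputs are concatenated, giving e1 + e3 output channels."""
    squeeze = conv_params(in_ch, s1, 1)
    expand = conv_params(s1, e1, 1) + conv_params(s1, e3, 3)
    return squeeze + expand

# Illustrative sizes: 128 input channels, 256 output channels either way.
plain = conv_params(128, 256, 3)        # 294,912 weights
fire = fire_params(128, 16, 128, 128)   # 22,528 weights (~13x fewer)
```

Both paths produce 256 output channels, but the squeeze layer cuts the number of input channels seen by the 3x3 filters (strategy 2), and most filters are 1x1 rather than 3x3 (strategy 1).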

Methods


Architecture


Other SqueezeNet details


Experiments


Others

  • If early layers in the network have large strides, then most layers will have small activation maps; delaying downsampling leaves most layers with large activation maps.
  • Applying delayed downsampling to four different CNN architectures led to higher classification accuracy in each case.