IEEE Trans Pattern Anal Mach Intell. 2020 Aug;42(8):2011-2023. doi: 10.1109/TPAMI.2019.2913372. Epub 2019 Apr 29.
The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251 percent, surpassing the winning entry of 2016 by a relative improvement of ∼25 percent. Models and code are available at https://github.com/hujie-frank/SENet.
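As a concrete illustration of the mechanism the abstract describes, below is a minimal sketch of an SE block: a squeeze step (global average pooling collapses each feature map to a per-channel descriptor) followed by an excitation step (a two-layer bottleneck with a sigmoid gate produces per-channel weights that rescale the feature maps). This is an illustrative re-implementation in PyTorch, not the authors' released code (the official repository linked above is Caffe-based); the class name SEBlock is a hypothetical choice, and the reduction ratio of 16 is the paper's reported default.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Minimal sketch of a Squeeze-and-Excitation block (illustrative, not official)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Squeeze: aggregate spatial information into one descriptor per channel.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a bottleneck of two FC layers explicitly models
        # channel interdependencies; the sigmoid yields gates in [0, 1].
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        s = self.pool(x).view(b, c)        # squeeze: (B, C, H, W) -> (B, C)
        w = self.fc(s).view(b, c, 1, 1)    # excitation: per-channel weights
        return x * w                       # recalibrate the feature maps
```

In typical use, the block is inserted after a convolutional stage so that its output is recalibrated channel-wise, e.g. y = se(conv_block(x)); this matches the abstract's point that the unit can be stacked throughout existing architectures at slight additional cost.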