Li Xiangtai, Zhang Li, Cheng Guangliang, Yang Kuiyuan, Tong Yunhai, Zhu Xiatian, Xiang Tao
IEEE Trans Image Process. 2021;30:6829-6842. doi: 10.1109/TIP.2021.3099366.
Modelling long-range contextual relationships is critical for pixel-wise prediction tasks such as semantic segmentation. However, convolutional neural networks (CNNs) are inherently limited in modelling such dependencies because of the simple structure of their building blocks (e.g., the local convolution kernel). While recent global aggregation methods help model long-range structural information, they tend to oversmooth and introduce noise in regions that contain fine details (e.g., boundaries and small objects), which matter greatly in semantic segmentation. To alleviate this problem, we propose exploiting local context so that the aggregated long-range relationships are distributed more accurately within local regions. In particular, we design a novel local distribution module that adaptively models, for each pixel, the affinity map between global and local relationships. Combined with existing global aggregation modules, our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks, giving rise to the GALD networks. Despite its simplicity and versatility, our approach sets a new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff. Code and trained models are released at https://github.com/lxtGH/GALD-DGCNet to foster further research.
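The "global aggregation, then local distribution" idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (the released code at the URL above is the reference). Several things here are assumptions made only for illustration: the function names `global_aggregate`, `local_mask` and `gald_block`, the use of a global mean as a stand-in for a learned global aggregation module, the fixed window average in place of learned local filters, and the residual form of the final combination. The sketch works on a single-channel feature map stored as a list of lists.

```python
import math


def global_aggregate(x):
    """Stand-in global aggregation (assumption): blend each pixel with the global mean."""
    h, w = len(x), len(x[0])
    mean = sum(sum(row) for row in x) / (h * w)
    return [[0.5 * v + 0.5 * mean for v in row] for row in x]


def local_mask(g, radius=1):
    """Per-pixel gate in (0, 1): sigmoid of the average of g over a local window.

    A real module would learn this mapping; the fixed average is an assumption.
    """
    h, w = len(g), len(g[0])
    mask = []
    for i in range(h):
        row = []
        for j in range(w):
            # Collect the neighbourhood of (i, j), clipped at the borders.
            window = [
                g[a][b]
                for a in range(max(0, i - radius), min(h, i + radius + 1))
                for b in range(max(0, j - radius), min(w, j + radius + 1))
            ]
            local = sum(window) / len(window)
            row.append(1.0 / (1.0 + math.exp(-local)))
        mask.append(row)
    return mask


def gald_block(x):
    """Aggregate globally, then re-weight the result per pixel using local context."""
    g = global_aggregate(x)
    m = local_mask(g)
    # Residual combination (assumption): gated global feature plus the global feature.
    return [[gv * mv + gv for gv, mv in zip(g_row, m_row)]
            for g_row, m_row in zip(g, m)]


# Example usage on a hypothetical 3x3 feature map.
features = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
refined = gald_block(features)
```

The point of the sketch is the ordering: the mask is computed from the already aggregated features, so each pixel decides locally how much of the global context to keep, which is the role the abstract assigns to the local distribution module.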