TITLE:
New Fusion Approach of Spatial and Channel Attention for Semantic Segmentation of Very High Spatial Resolution Remote Sensing Images
AUTHORS:
Armand Kodjo Atiampo, Gokou Hervé Fabrice Diédié
KEYWORDS:
Spatial-Channel Attention, Super-Token Segmentation, Self-Attention, Vision Transformer
JOURNAL NAME:
Open Journal of Applied Sciences, Vol.14 No.2, February 9, 2024
ABSTRACT: The semantic segmentation of very high spatial resolution remote
sensing images is difficult due to the complexity of interpreting the
interactions between the objects in the scene. Indeed, effective segmentation
requires accounting for both local spatial context and long-range dependencies.
To address this problem, the proposed approach draws on the MAC-UNet network,
an extension of U-Net with dense connections combined with channel attention.
The contributions of this solution are as follows: 1) the model introduces a
new attention mechanism, called propagate attention, to build an
attention-based encoder; 2) multi-scale information is fused through a weighted
linear combination of the attention maps, whose coefficients are learned during
the training phase; 3) the decoder introduces the Spatial-Channel-Global-Local
block, an attention layer that uniquely combines channel attention and spatial
attention both locally and globally. The model is evaluated on two datasets,
WHDLD and DLRSD, and improves the mean intersection over union (mIoU) index by
1.54% to 10.47% on DLRSD and by 1.04% to 4.37% on WHDLD compared with the most
efficient attention-based algorithms, such as MAU-Net, and transformer-based
models, such as TMNet.
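The multi-scale fusion described in contribution 2 can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the abstract only states that the fusion is a weighted linear combination with coefficients learned during training, so the softmax normalization of the coefficients and all function names here are assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_attention_maps(maps, coeffs):
    """Fuse multi-scale attention maps by a weighted linear combination.

    maps   : list of arrays of identical shape (H, W), one per scale
    coeffs : raw per-scale coefficients (learned by backpropagation in the
             real model; fixed constants here for illustration)

    The softmax normalization (weights sum to 1) is an assumption made for
    this sketch; the paper may learn unconstrained coefficients.
    """
    weights = softmax(np.asarray(coeffs, dtype=float))
    fused = sum(w * m for w, m in zip(weights, maps))
    return fused, weights

# Toy example: three 2x2 "attention maps" from three scales.
maps = [np.full((2, 2), v) for v in (1.0, 2.0, 4.0)]
fused, weights = fuse_attention_maps(maps, coeffs=[0.0, 0.0, 0.0])
# Equal raw coefficients give uniform weights, so the fused map is the
# per-pixel mean of the three inputs: (1 + 2 + 4) / 3.
print(weights)      # [0.333... 0.333... 0.333...]
print(fused[0, 0])  # 2.333...
```

With learned (unequal) coefficients, the network can emphasize the scale whose attention map is most informative for the current scene.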