Strengthening Layer Interaction via Dynamic Layer Attention

Kaishen Wang, Xun Xia, Jian Liu, Zhang Yi, Tao He

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5073-5081. https://doi.org/10.24963/ijcai.2024/561

In recent years, employing layer attention to enhance interaction among hierarchical layers has proven a significant advancement in building network structures. In this paper, we delve into the distinction between layer attention and the general attention mechanism, noting that existing layer attention methods achieve layer interaction on fixed feature maps in a static manner. These static layer attention methods limit the ability to extract context features among layers. To restore the dynamic context representation capability of the attention mechanism, we propose a Dynamic Layer Attention (DLA) architecture. DLA comprises dual paths: the forward path uses an improved recurrent neural network block, named the Dynamic Sharing Unit (DSU), for context feature extraction, while the backward path updates features using these shared context representations. Finally, the attention mechanism is applied to the dynamically refreshed feature maps across layers. Experimental results demonstrate the effectiveness of the proposed DLA architecture, which outperforms other state-of-the-art methods on image recognition and object detection tasks. Additionally, the DSU block proves to be an efficient plugin within the proposed DLA architecture. The code is available at https://github.com/tunantu/Dynamic-Layer-attention.
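To make the dual-path idea concrete, the following is a minimal PyTorch sketch of the abstract's description: a GRU-style gate stands in for the DSU on the forward path, the backward path rewrites each layer's features from the shared context, and attention is then applied over the refreshed layer stack. The class names, shapes, and the pooled-token attention step are illustrative assumptions, not the authors' implementation; consult the linked repository for the official code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicSharingUnit(nn.Module):
    """GRU-style gated unit: fuses one layer's feature map into a shared
    context tensor (the forward path of the dual-path design). This gating
    scheme is an assumption for illustration, not the paper's exact DSU."""

    def __init__(self, channels: int):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)
        self.cand = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # update gate z and reset gate r, computed from (layer feature, context)
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_new = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_new  # dynamically refreshed context


class DynamicLayerAttention(nn.Module):
    """Dual-path layer attention: a forward pass over the layer stack builds
    a shared context via the DSU, a backward pass rewrites every layer's
    features from that context, and attention then mixes the refreshed maps."""

    def __init__(self, channels: int):
        super().__init__()
        self.dsu = DynamicSharingUnit(channels)
        self.update = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.qkv = nn.Linear(channels, 3 * channels)

    def forward(self, feats):  # feats: list of (B, C, H, W) tensors
        h = torch.zeros_like(feats[0])
        for x in feats:  # forward path: accumulate the shared context
            h = self.dsu(x, h)
        # backward path: update each layer from the shared context
        refreshed = [self.update(torch.cat([x, h], dim=1)) for x in feats]

        # attention across layers; each refreshed map is pooled to one token
        # here, a simplification of attending over the full feature maps
        tokens = torch.stack([f.mean(dim=(2, 3)) for f in refreshed], dim=1)
        q, k, v = self.qkv(tokens).chunk(3, dim=-1)  # each (B, L, C)
        attn = F.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
        mixed = attn @ v  # (B, L, C): cross-layer mixture
        return [f + mixed[:, i, :, None, None] for i, f in enumerate(refreshed)]
```

As a quick check, `DynamicLayerAttention(channels=64)` applied to a list of four `(2, 64, 8, 8)` tensors returns four refreshed maps of the same shape, each carrying cross-layer context rather than attending over fixed, static features.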
Keywords:
Machine Learning: ML: Attention models
Computer Vision: CV: Recognition (object detection, categorization)
Computer Vision: CV: Representation learning
Machine Learning: ML: Theory of deep learning