Core-Structures-Guided Multi-Modal Classification Neural Architecture Search

Pinhan Fu, Xinyan Liang, Tingjin Luo, Qian Guo, Yayu Zhang, Yuhua Qian

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 3980-3988. https://doi.org/10.24963/ijcai.2024/440

Multi-modal classification methods based on neural architecture search (NAS-MMC) can automatically learn a satisfactory classifier from a given multi-modal search space. However, as the number of multi-modal features and fusion operators increases, the complexity of the search space grows dramatically, and rapidly identifying a satisfactory fusion model in this vast space becomes very challenging. In this paper, we propose an efficient NAS-MMC method based on the idea of shrinking and then expanding the search space, called core-structure-guided neural architecture search (CSG-NAS). Specifically, an evolutionary algorithm is first used to find core structures in a shrunk space (also called the core structure search space) determined by high-quality features and fusion operators. A local search algorithm is then used to find the optimal MMC model in the expanded space determined by the discovered core structures together with the remaining features and fusion operators. Moreover, a knowledge transfer strategy is introduced to further improve the overall performance and efficiency of the entire search process. Finally, extensive experimental results demonstrate the effectiveness of CSG-NAS, which outperforms state-of-the-art competitors in classification performance, training efficiency, and model complexity on several public benchmark multi-modal tasks. The source code is available at https://github.com/fupinhan123/CSG-NAS.
Keywords:
Machine Learning: ML: Multi-view learning
Machine Learning: ML: Classification
Machine Learning: ML: Evolutionary learning
Machine Learning: ML: Multi-modal learning
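
The two-stage search described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation (see the repository linked above for that). Everything in it is an assumption made for illustration: the feature names, the quality scores, the operator bonuses, the `fitness` function standing in for validation accuracy, and the simple elitist evolutionary loop and greedy local search. The knowledge transfer strategy is omitted because the abstract does not describe how it works.

```python
import random

# Hypothetical toy data: per-feature quality and per-operator bonus (not from the paper).
FEATURE_QUALITY = {"f0": 0.9, "f1": 0.8, "f2": 0.3, "f3": 0.2, "f4": 0.6}
OPERATOR_BONUS = {"concat": 0.1, "sum": 0.05, "max": 0.0}


def fitness(arch):
    """Toy stand-in for the validation accuracy of a fusion model (features, operator)."""
    features, op = arch
    if not features:
        return 0.0
    # Total quality minus a fixed cost per feature, plus the operator bonus.
    quality = sum(FEATURE_QUALITY[f] for f in sorted(features))
    return quality - 0.4 * len(features) + OPERATOR_BONUS[op]


def mutate(arch, features, ops, rng):
    """Toggle one feature or resample the fusion operator."""
    feats, op = set(arch[0]), arch[1]
    if rng.random() < 0.5:
        feats ^= {rng.choice(features)}
    else:
        op = rng.choice(ops)
    return (frozenset(feats), op)


def evolve(features, ops, rng, pop_size=6, generations=20):
    """Stage 1: elitist evolutionary search restricted to the shrunk (core) space."""
    pop = [
        (frozenset(rng.sample(features, rng.randint(1, len(features)))), rng.choice(ops))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = [mutate(p, features, ops, rng) for p in pop]
        # Parents compete with children, so the best fitness never decreases.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]


def local_search(core, extra_features, all_ops):
    """Stage 2: greedy hill climbing from the core structure over the expanded space."""
    best = core
    improved = True
    while improved:
        improved = False
        feats, op = best
        # Neighbours: toggle one non-core feature, or switch to another operator.
        neighbours = [(feats ^ {f}, op) for f in extra_features]
        neighbours += [(feats, o) for o in all_ops if o != op]
        candidate = max(neighbours, key=fitness)
        if fitness(candidate) > fitness(best):
            best, improved = candidate, True
    return best


def shrink_and_expand(seed=0, n_core_features=2, n_core_ops=2):
    """Run both stages and return (core structure, final architecture)."""
    rng = random.Random(seed)
    feats = sorted(FEATURE_QUALITY, key=FEATURE_QUALITY.get, reverse=True)
    ops = sorted(OPERATOR_BONUS, key=OPERATOR_BONUS.get, reverse=True)
    core_feats, rest_feats = feats[:n_core_features], feats[n_core_features:]
    core = evolve(core_feats, ops[:n_core_ops], rng)
    final = local_search(core, rest_feats, ops)
    return core, final
```

Calling `shrink_and_expand()` returns the core structure found in the shrunk space and the architecture reached after expansion. In this toy setting the second stage only explores moves that involve the remaining features or the full operator set, which is one simple way to let the core structure guide the search.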