Multimodal emotion analysis integrates information from multiple modalities to better understand human emotions. In this paper, we propose the Cross-modal Emotion Recognition based on multi-layer semantic fusion (CM-MSF) model, which aims to leverage the complementarity of important information between modalities and to extract advanced features adaptively.