Self-adaptive attention fusion for multimodal aspect-based sentiment analysis
Multimodal aspect term extraction (MATE) and multimodal aspect-oriented sentiment classification (MASC) are two crucial subtasks in multimodal sentiment analysis. Pretrained generative models have attracted increasing attention in aspect-based sentiment analysis (ABSA). However, the inherent semantic gap between textual and visual modalities makes it challenging to transfer text-based generative pretraining models to image-text multimodal sentiment analysis tasks.
To tackle this issue, this paper proposes a self-adaptive cross-modal attention fusion architecture for joint multimodal aspect-based sentiment analysis (JMASA). The model is generative and built on an image-text selective fusion mechanism that bridges the semantic gap between text and image representations, adaptively transferring a text-based pretraining model to the JMASA task. We conducted extensive experiments on two benchmark datasets, and the results show that our model outperforms other state-of-the-art approaches by a significant margin.
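
To make the selective fusion idea concrete, here is a minimal PyTorch sketch of a gated cross-modal attention block: text tokens attend over projected image-region features, and a learned sigmoid gate decides, per token, how much visual context to mix back into the textual representation. The module name, gating form, and dimensions are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class SelectiveCrossModalFusion(nn.Module):
    """Illustrative cross-modal attention fusion with a self-adaptive gate.

    Text features attend over image-region features; a learned gate then
    controls how much visual context each token absorbs. This is a sketch
    of the general technique, not the paper's exact architecture.
    """

    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Gate conditioned on each text token and its attended visual context.
        self.gate = nn.Sequential(nn.Linear(2 * d_model, d_model), nn.Sigmoid())
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # text:  (batch, seq_len, d_model)   -- token embeddings
        # image: (batch, n_regions, d_model) -- projected visual features
        visual_ctx, _ = self.cross_attn(query=text, key=image, value=image)
        g = self.gate(torch.cat([text, visual_ctx], dim=-1))  # per-token gate in (0, 1)
        # Selective fusion: the gate suppresses irrelevant visual signal.
        return self.norm(text + g * visual_ctx)


if __name__ == "__main__":
    fusion = SelectiveCrossModalFusion()
    text = torch.randn(2, 16, 768)   # e.g., generative encoder states
    image = torch.randn(2, 49, 768)  # e.g., 7x7 grid features projected to d_model
    print(fusion(text, image).shape)  # torch.Size([2, 16, 768])
```

A gate of this kind lets the model fall back to a nearly pure textual representation when the image is uninformative, which matches the intuition behind adaptively transferring a text-only pretrained backbone to the multimodal task.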