Title: A new double attention decoding model based on cascade RCNN and word embedding fusion for Chinese-English multimodal translation
Authors: Haiying Liu
Addresses: School of Foreign Languages, Zhengzhou University of Science and Technology, Zhengzhou, 450064, China
Abstract: Traditional multimodal machine translation (MMT) optimises the translation process from the source language to the target language with the help of salient feature information in images. However, information in an image does not necessarily appear in the text, which can interfere with translation: compared with the reference translation, mistranslations may appear in the output. To address these problems, we propose a double attention decoding method based on cascade RCNN to optimise existing multimodal neural machine translation models. The cascade RCNN is applied to the source language and the source image respectively, and word embedding is used to fuse the initialisation with the semantic information of the dual encoder. During attention computation, the method reduces the focus on information that has already been attended to repeatedly. Finally, experiments on Chinese-English test sets verify the effectiveness of the proposed method. Compared with other state-of-the-art methods, the proposed method obtains better translation results.
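To make the idea concrete, the sketch below illustrates one way a double-attention decoding step could combine source-text encoder states and image-region features (e.g., regions from a cascade R-CNN detector) while penalising attention over already-covered source positions. This is a minimal illustration under assumed shapes and module names (DoubleAttentionDecoderStep, hidden_dim, coverage, etc. are hypothetical), not the authors' implementation.

```python
# Minimal, illustrative sketch of double-attention decoding with a
# coverage-style penalty on previously attended positions.
# All module names and dimensions are hypothetical assumptions.
import torch
import torch.nn as nn


class DoubleAttentionDecoderStep(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        # Separate attention over the text encoder and over image-region features.
        self.text_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.image_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, dec_state, text_enc, image_feats, coverage):
        # dec_state:   (B, 1, H) current decoder hidden state
        # text_enc:    (B, T, H) source-text encoder outputs
        # image_feats: (B, R, H) image-region features (e.g., from a detector)
        # coverage:    (B, 1, T) accumulated past attention over source tokens
        _, attn_text = self.text_attn(dec_state, text_enc, text_enc)
        ctx_img, _ = self.image_attn(dec_state, image_feats, image_feats)

        # Down-weight source positions that have already received much attention,
        # reducing repeated focus on the same information.
        penalized = attn_text * torch.exp(-coverage)
        penalized = penalized / penalized.sum(dim=-1, keepdim=True)
        ctx_text = torch.bmm(penalized, text_enc)

        # Fuse the two contexts into a single decoding representation.
        fused = torch.tanh(self.fuse(torch.cat([ctx_text, ctx_img], dim=-1)))
        new_coverage = coverage + penalized
        return fused, new_coverage
```

A usage step would call this module once per target token, feeding back `new_coverage` so that repeated attention to the same source positions is progressively discouraged.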
Keywords: multimodal machine translation; MMT; double attention decoding; cascade RCNN; word embedding fusion.
DOI: 10.1504/IJRIS.2024.137429
International Journal of Reasoning-based Intelligent Systems, 2024 Vol.16 No.1, pp.26 - 36
Received: 06 Nov 2022
Accepted: 22 Nov 2022
Published online: 19 Mar 2024