Title: A new double attention decoding model based on cascade RCNN and word embedding fusion for Chinese-English multimodal translation
Authors: Haiying Liu
Addresses: School of Foreign Languages, Zhengzhou University of Science and Technology, Zhengzhou, 450064, China
Abstract: Traditional multimodal machine translation (MMT) optimises the translation process from the source language to the target language with the help of salient feature information in images. However, information in an image does not necessarily appear in the text, which can interfere with translation: compared with the reference translation, mistranslations may appear in the output. To address these problems, we propose a double attention decoding method based on cascade RCNN to optimise existing multimodal neural machine translation models. The cascade RCNN is applied to the source language and the source image respectively, and word embedding is used to fuse the initialisation with the semantic information of the dual encoder. During attention computation, the method reduces the focus on information that has already been attended to repeatedly. Finally, experiments on Chinese-English test sets verify the effectiveness of the proposed method. Compared with other state-of-the-art methods, the proposed method obtains better translation results.
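To make the idea concrete, the sketch below illustrates one way a double-attention decoding step could combine source-text encoder states and image-region features (e.g., regions from a cascade R-CNN detector) while penalising attention over already-covered source positions. This is a minimal illustration under assumed shapes and module names (DoubleAttentionDecoderStep, hidden_dim, coverage, etc. are hypothetical), not the authors' implementation.

```python
# Minimal, illustrative sketch of double-attention decoding with a
# coverage-style penalty on previously attended positions.
# All module names and dimensions are hypothetical assumptions.
import torch
import torch.nn as nn


class DoubleAttentionDecoderStep(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        # Separate attention over the text encoder and over image-region features.
        self.text_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.image_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, dec_state, text_enc, image_feats, coverage):
        # dec_state:   (B, 1, H) current decoder hidden state
        # text_enc:    (B, T, H) source-text encoder outputs
        # image_feats: (B, R, H) image-region features (e.g., from a detector)
        # coverage:    (B, 1, T) accumulated past attention over source tokens
        _, attn_text = self.text_attn(dec_state, text_enc, text_enc)
        ctx_img, _ = self.image_attn(dec_state, image_feats, image_feats)

        # Down-weight source positions that have already received much attention,
        # reducing repeated focus on the same information.
        penalized = attn_text * torch.exp(-coverage)
        penalized = penalized / penalized.sum(dim=-1, keepdim=True)
        ctx_text = torch.bmm(penalized, text_enc)

        # Fuse the two contexts into a single decoding representation.
        fused = torch.tanh(self.fuse(torch.cat([ctx_text, ctx_img], dim=-1)))
        new_coverage = coverage + penalized
        return fused, new_coverage
```

A usage step would call this module once per target token, feeding back `new_coverage` so that repeated attention to the same source positions is progressively discouraged.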
Keywords: multimodal machine translation; MMT; double attention decoding; cascade RCNN; word embedding fusion.
DOI: 10.1504/IJRIS.2024.137429
International Journal of Reasoning-based Intelligent Systems, 2024 Vol.16 No.1, pp.26 - 36
Received: 06 Nov 2022
Accepted: 22 Nov 2022
Published online: 19 Mar 2024