Open Access Article

Title: 3D image reconstruction using an improved BEV model and global convolutional attention fusion

Authors: HuaShun Yan; XiaoJie Li; ZeLin Mou

Addresses: Computer Science and Technology, Chengdu University of Information Technology, Chengdu, Sichuan, 610225, China; Computer Science and Technology, Chengdu University of Information Technology, Chengdu, Sichuan, 610225, China; Computer Science and Technology, Chengdu University of Information Technology, Chengdu, Sichuan, 610225, China

Abstract: In autonomous driving and computer vision, 3D object detection plays a critical role but faces challenges in effectively extracting and integrating multi-view features. The existing BEVFormer model, which uses CNNs to convert images into a bird's-eye view (BEV), shows potential but struggles to capture fine-grained details and multi-scale information, especially in high-resolution, complex scenes. To address these limitations, we propose the MultiCAN-DEBEV model, which integrates the MSF-DySample, global convolutional attention fusion (GCAF), and MSDE modules. These modules improve the handling of multi-scale features, enhance feature expressiveness, and strengthen detail representation. Experiments on the nuScenes dataset show significant performance improvements, and the modular design allows the modules to be adapted to other 3D detection models.

Keywords: computer vision; 3D object detection; BEV object detection; autonomous driving.

DOI: 10.1504/IJICT.2025.145403

International Journal of Information and Communication Technology, 2025, Vol. 26, No. 6, pp. 98-116

Received: 03 Dec 2024
Accepted: 08 Jan 2025

Published online: 31 Mar 2025