Title: VGG16 and Bi-LSTM fused with an attention mechanism for human action recognition in infrared images

Authors: Gao Cheng; Tang Chao; Tong Anyang; Wang Wenjian

Addresses: School of Artificial Intelligence and Big Data, Hefei University, Hefei, 230601, China ' School of Artificial Intelligence and Big Data, Hefei University, Hefei, 230601, China ' School of Artificial Intelligence and Big Data, Hefei University, Hefei, 230601, China ' School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi, 030006, China

Abstract: Action recognition has long been a popular research topic in computer vision because of its broad application prospects. Infrared video is suitable for monitoring in all weather conditions and can protect the privacy of the data. We propose a method for human action recognition in infrared videos that fuses a Visual Geometry Group 16 (VGG16) network with a bi-directional long short-term memory (Bi-LSTM) network that incorporates an attention mechanism. First, we extract infrared images from an infrared video and pre-process them. Second, we use the VGG16 model to extract the spatial features of the images through convolution and pooling, and we apply the Bi-LSTM with the attention mechanism to extract their temporal features. Finally, the classification result is obtained by fusing the scores of the two networks at the decision level. The method is tested on various infrared datasets, and the results show that it is effective.
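
The final step, score fusion at the decision level, can be illustrated with a minimal sketch. The abstract does not give the fusion rule, so this sketch assumes a weighted average of each network's softmax probabilities. The function names and the `spatial_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Fuse two streams' class scores by a weighted average of probabilities.

    Assumption: the weighted-average rule and spatial_weight are illustrative;
    the abstract only states that scores are fused at the decision level.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")

    p_spatial = softmax(spatial_logits)    # e.g. scores from the VGG16 stream
    p_temporal = softmax(temporal_logits)  # e.g. scores from the Bi-LSTM stream

    fused = [
        spatial_weight * s + (1.0 - spatial_weight) * t
        for s, t in zip(p_spatial, p_temporal)
    ]
    # The predicted class is the index with the highest fused probability.
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


if __name__ == "__main__":
    # Made-up scores for three action classes, for illustration only.
    fused, predicted = fuse_scores([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
    print(fused, predicted)
```

Because each stream's probabilities sum to one, the fused scores also form a probability distribution. Other decision-level rules, such as a product or maximum of the two score vectors, would fit the same interface.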

Keywords: human action recognition; deep learning; fusion model; infrared video; attention mechanism.

DOI: 10.1504/IJCSM.2024.139925

International Journal of Computing Science and Mathematics, 2024 Vol.20 No.1, pp.86 - 97

Received: 04 May 2023
Accepted: 05 Mar 2024

Published online: 11 Jul 2024
