Title: Real-time sign language recognition based on video stream
Authors: Kai Zhao; Kejun Zhang; Yu Zhai; Daotong Wang; Jianbo Su
Addresses: Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Lingzhi High-Tech Corporation, Shanghai, 200240, China; Shanghai Lingzhi High-Tech Corporation, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
Abstract: In this paper, a real-time Chinese sign language recognition system is investigated. The system recognises Chinese sign language used by deaf-mute signers and outputs the recognition results in real time as text. A Chinese sign language dataset is first created with an ordinary RGB camera; the entire dataset contains 500,000 video samples. To improve the recognition accuracy of the system for real-time applications, a three-dimensional convolutional neural network (3D-CNN) is investigated, combined with optical flow processing based on total variation regularisation with a robust L1 norm (TV-L1). A two-step frame down-sampling procedure is employed to extract an equal number of key frames from each video stream, which are then fed into the 3D-CNN to extract feature vectors. Comparative studies are conducted against the hidden Markov model (HMM) and the recurrent neural network (RNN), with 92.6% recognition accuracy achieved on a dataset containing 1,000 vocabulary items. A complete real-time sign language recognition system is finally developed and reported, which is composed of a human interaction interface, a motion detection module, a hand and head detection module, and a video acquisition mechanism. Experimental results verify the generalisation performance of the system in real time.
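The abstract describes a pipeline of TV-L1 optical flow, fixed-length key-frame sampling, and 3D-CNN feature extraction. The paper does not publish code, so the sketch below is purely illustrative: function and class names (extract_tvl1_flow, sample_keyframes, Small3DCNN) and the network layout are assumptions, not the authors' implementation. It assumes opencv-contrib-python (for the TV-L1 estimator) and PyTorch.

# Illustrative sketch only; names, layer sizes, and frame counts are assumed.
import cv2
import numpy as np
import torch
import torch.nn as nn


def extract_tvl1_flow(frames):
    """Compute TV-L1 optical flow between consecutive grayscale frames."""
    tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()  # requires opencv-contrib
    flows = []
    for prev, nxt in zip(frames[:-1], frames[1:]):
        flows.append(tvl1.calc(prev, nxt, None))      # H x W x 2 (dx, dy)
    return np.stack(flows)


def sample_keyframes(frames, num_keyframes=16):
    """Pick an equal number of key frames from every video, regardless of
    its original length (stand-in for the paper's two-step down-sampling)."""
    idx = np.linspace(0, len(frames) - 1, num_keyframes).astype(int)
    return [frames[i] for i in idx]


class Small3DCNN(nn.Module):
    """Minimal 3D-CNN feature extractor over (channels, time, H, W) clips."""
    def __init__(self, in_channels=2, feat_dim=256, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc_feat = nn.Linear(64, feat_dim)       # clip-level feature vector
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):                 # x: (N, C, T, H, W)
        h = self.features(x).flatten(1)   # (N, 64)
        feat = self.fc_feat(h)
        return feat, self.classifier(feat)


if __name__ == "__main__":
    # Dummy 2-channel flow clip of 16 key frames at 112 x 112.
    clip = torch.randn(1, 2, 16, 112, 112)
    feats, logits = Small3DCNN()(clip)
    print(feats.shape, logits.shape)      # (1, 256) and (1, 1000)

In this sketch the two flow channels (dx, dy) from TV-L1 form the 3D-CNN input, and the 1,000-way classifier head mirrors the 1,000-vocabulary evaluation reported in the abstract; the actual architecture used by the authors may differ.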
Keywords: sign language recognition; three-dimensional convolutional neural network; 3D-CNN; TV-L1 optical flow; motion detection; hand and head detection.
DOI: 10.1504/IJSCC.2021.114616
International Journal of Systems, Control and Communications, 2021 Vol.12 No.2, pp.158 - 174
Received: 21 Jul 2020
Accepted: 24 Nov 2020
Published online: 28 Apr 2021