Title: Evolutionary structure of hidden Markov models for audio-visual Arabic speech recognition
Authors: Amina Makhlouf; Lilia Lazli; Bachir Bensaker
Addresses: LRI (Laboratory of Computer Research), Department of Computer Science, University of Badji Mokhtar, BP. 12, Annaba, Algeria; LRI (Laboratory of Computer Research), Department of Computer Science, University of Badji Mokhtar, BP. 12, Annaba, Algeria; Department of Electronics, University of Badji Mokhtar, BP. 12, Annaba, Algeria
Abstract: In this paper, we present an audio-visual automatic speech recognition system that combines acoustic and visual data. To model the multimodal data, we propose a hidden Markov model (HMM) hybridised with a genetic algorithm (GA), which determines the optimal structure of the HMM. The GA is combined with the Baum-Welch algorithm, which provides an effective re-estimation of the HMM probabilities. Our experiments show that the most promising audio-visual system, based on the combined GA/HMM model, performs better than the traditional HMM.
Keywords: automatic speech recognition; computer vision; HMM; hidden Markov models; GAs; genetic algorithms; hybrid models; signal processing; audio-visual fusion; AV fusion; Arabic; multimodal data modelling.
DOI: 10.1504/IJSISE.2016.074651
International Journal of Signal and Imaging Systems Engineering, 2016 Vol.9 No.1, pp.55-66
Received: 29 Apr 2013
Accepted: 23 Feb 2014
Published online: 12 Feb 2016