Title: Acoustic model training for speech recognition over mobile networks
Authors: Juraj Vojtko; Juraj Kačur; Gregor Rozinaj; Ján Kőrösi
Addresses: Institute of Telecommunications, Faculty of Electrical Engineering and Information Technology, Slovak University of Technology, Ilkovičova 3, Bratislava 812 19, Slovakia (all four authors)
Abstract: This article describes the SphinxTrain training procedure and the modifications to it that can yield accurate and robust speech recognition in a mobile GSM environment. Some modifications rely on effective preprocessing of the input data, combined with an optimal setting of the number of states per model and adjustment of the number of tied states or the number of Gaussian mixtures. A further gain in recognition rate comes from an 'optimal' setting of the speech decoder. Because this is a non-linear, mathematically poorly tractable task involving both real-valued and integer parameters, evolution strategies can be applied successfully (an 18.6% improvement in WER was observed compared with the original setting). All experiments were carried out on the Slovak speech database Mobildat, which contains recordings of 1100 speakers. The Sphinx4 recognition system was used to evaluate the trained model.
Keywords: Sphinx 4; SphinxTrain; ATK; ASR; automatic speech recognition; HMM; hidden Markov model; Mobildat; evolution strategies; acoustic models; model training; speech recognition; mobile networks; GSM environment; modelling.
DOI: 10.1504/IJSISE.2013.053431
International Journal of Signal and Imaging Systems Engineering, 2013 Vol.6 No.2, pp.65 - 74
Published online: 21 Apr 2013