Weighted finite-state transducer-based dysarthric speech recognition error correction using context-dependent pronunciation variation modelling
Online publication date: Tue, 17-Jun-2014
by Woo Kyeong Seong; Ji Hun Park
International Journal of Engineering Systems Modelling and Simulation (IJESMS), Vol. 6, No. 1/2, 2014
Abstract: In this paper, an error correction method is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, context-dependent pronunciation variations are modelled using a weighted Kullback-Leibler (KL) distance between the acoustic models of the ASR system. The context-dependent pronunciation variation model is then converted into a weighted finite-state transducer (WFST) and composed with a lexicon and a language model. ASR experiments show that the average word error rate (WER) of a WFST-based ASR system with the proposed error correction method is relatively reduced by 19.73% compared with an ASR system without error correction. Moreover, the error correction method using a weighted KL distance relatively reduces the average WER by 3.81% compared with the method using an unweighted KL distance.
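The KL distance mentioned in the abstract has a closed form when acoustic-model states are represented by diagonal-covariance Gaussians, as is common in HMM-based ASR. The sketch below illustrates that closed form; the weight `w` is a hypothetical placeholder, since the paper's actual context-dependent weighting scheme is not given in the abstract.

```python
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Closed-form KL divergence KL(p || q) between two
    diagonal-covariance Gaussians p and q."""
    return 0.5 * np.sum(
        np.log(var_q / var_p)
        + (var_p + (mu_p - mu_q) ** 2) / var_q
        - 1.0
    )

def weighted_kl(mu_p, var_p, mu_q, var_q, w=1.0):
    """Weighted KL distance between two acoustic-model states.
    The scalar weight w stands in for the paper's context-dependent
    weighting, whose exact form is not specified in the abstract."""
    return w * gaussian_kl(mu_p, var_p, mu_q, var_q)

# Example: identical states have zero distance; shifting the mean
# of the second state increases it.
d0 = weighted_kl(np.zeros(3), np.ones(3), np.zeros(3), np.ones(3))
d1 = weighted_kl(np.zeros(3), np.ones(3), np.ones(3), np.ones(3))
```

Pairwise distances of this kind can then be thresholded to decide which phone models are confusable in a given context, which is the information the pronunciation variation WFST encodes.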