Title: Group delay-based minimum variance distortion-less response cepstral features for speaker identification in whispered speech
Authors: Vijay M. Sardar; Manisha L. Jadhav; Saurabh H. Deshmukh; Makarand M. Jadhav
Addresses: Jayawantrao Sawant College of Engineering, Pune, Maharashtra, India; MET's Institute of Engineering, Nasik, Maharashtra, India; Maharashtra Institute of Technology, Aurangabad, Maharashtra, India; NBN Sinhgad Technical Institutes Campus, Pune, Maharashtra, India
Abstract: Whispered speech differs widely in its characteristics from neutral speech, which makes identifying a person from a whispered utterance difficult. The Group Delay Function (GDF), in its spectral form, uses the phase information in the short-time Fourier transform (FT) phase function, which is otherwise ignored in traditional front-end processing. This paper proposes Minimum Variance Distortion-less Response (MVDR) based smoothing of the denominator of the group delay, which enhances speech quality and intelligibility. The experiment uses MVDR spectral coefficient features with a multi-class Support Vector Machine (SVM) for classification. The proposed method reported an improvement of 2.41% over the baseline system using the CHAINs database and SVM classifier. Five-fold cross-validation is used for accuracy and Speaker Error Rate (SER) to verify the consistency of the results. The proposed system is also evaluated for False Positives (FP) and Precision, and showed an improvement over the baseline system.
Keywords: whispered speech; group delay function; MVDR; support vector machine; MFCC.
DOI: 10.1504/IJCAT.2023.134753
International Journal of Computer Applications in Technology, 2023 Vol.73 No.2, pp.104 - 112
Received: 07 May 2022
Received in revised form: 25 Dec 2022
Accepted: 08 Jan 2023
Published online: 09 Nov 2023
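
Illustrative note on the group delay function mentioned in the abstract: the sketch below is not taken from the paper. It shows only the textbook two-FFT identity for the group delay of a single frame, tau(w) = (X_R Y_R + X_I Y_I) / |X(w)|^2, where Y is the transform of n*x[n]. The function name `group_delay`, the FFT size, and the `eps` floor on the denominator are assumptions made for illustration. The paper's MVDR-based smoothing of the denominator is not reproduced here; the simple floor only stands in for it so that the division is safe.

```python
import numpy as np

def group_delay(frame, n_fft=512, eps=1e-10):
    """Group delay of one frame via the standard two-FFT identity.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))              # sample indices 0..N-1
    X = np.fft.rfft(frame, n_fft)          # spectrum of x[n]
    Y = np.fft.rfft(n * frame, n_fft)      # spectrum of n * x[n]
    power = X.real**2 + X.imag**2          # |X(w)|^2, the denominator
    # Assumption: a plain floor replaces the MVDR-smoothed denominator.
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, eps)

# Sanity check: an impulse delayed by 5 samples has a flat group delay of 5.
delay = 5
impulse = np.zeros(64)
impulse[delay] = 1.0
tau = group_delay(impulse)
```

For real speech frames the raw denominator |X(w)|^2 approaches zero near spectral nulls, which makes the unsmoothed function spiky; that is the part of the computation the abstract says the MVDR-based smoothing addresses.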