Title: Exploring the mel scale features using supervised learning classifiers for emotion classification
Authors: Kalpana Rangra; Monit Kapoor
Addresses: Department of Cybernetics, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, 248007, India (both authors)
Abstract: Human emotions are inherently ambiguous and impure, yet they are important when considering human speech. The role of speech is intensified by the emotion it conveys. Several characteristics of speech differentiate one utterance from another. Prosodic features such as pitch, timbre, loudness and vocal tone categorise speech into several emotions and other domains. A speech sample changes when it is subjected to different emotional environments. Research supports various experimental analyses of phonetic and prosodic parameters that quantify the quality of speech. The emotional states of an actor (speaker) can also be identified on the basis of the mel scale. The mel-frequency cepstral coefficient (MFCC) is one such feature for studying the emotional aspects of a speaker's utterances. This paper implements a model that identifies several emotional states from MFCC features for two datasets and compares the results for both. The classification model is based on dataset minimisation, done by taking the mean of the features, to improve classification accuracy across different machine learning algorithms.
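The mean-based dataset minimisation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the mean is taken over the frames of each utterance, so that every recording is reduced to one fixed-length vector; the abstract does not spell out this detail, and the function name `mean_pool` and the toy values are hypothetical, not the authors' implementation.

```python
from statistics import fmean


def mean_pool(mfcc_frames):
    """Collapse per-frame coefficient vectors into a single mean vector.

    Assumption: mfcc_frames is a list of frames, each a list of MFCC values
    of equal length. This layout is illustrative, not taken from the paper.
    """
    if not mfcc_frames:
        raise ValueError("need at least one frame")
    n_coeffs = len(mfcc_frames[0])
    if any(len(frame) != n_coeffs for frame in mfcc_frames):
        raise ValueError("all frames must have the same number of coefficients")
    # zip(*...) regroups the values by coefficient; average each across frames.
    return [fmean(column) for column in zip(*mfcc_frames)]


# Hypothetical toy data: 3 frames x 2 coefficients.
toy_frames = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
utterance_vector = mean_pool(toy_frames)
print(utterance_vector)
```

Under this assumption, each utterance contributes one row to the training table regardless of its duration, which is what makes the reduced data directly usable by standard supervised classifiers.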
Keywords: speech recognition; emotion recognition; mel-frequency cepstral coefficient; MFCC; machine learning; supervised learning; ANN.
DOI: 10.1504/IJAPR.2021.117204
International Journal of Applied Pattern Recognition, 2021 Vol.6 No.3, pp.232 - 253
Received: 19 Sep 2020
Accepted: 14 Jan 2021
Published online: 23 Aug 2021