Article: Improved harmonic spectral envelope extraction for singer classification with hybridised model
Journal: International Journal of Bio-Inspired Computation (IJBIC) 2024 Vol.24 No.3 pp.150 - 163
Publisher: Inderscience Publishers

Title: Improved harmonic spectral envelope extraction for singer classification with hybridised model

Authors: Balachandra Kumaraswamy

Addresses: B.M.S. College of Engineering, Bangalore, India

Abstract: The singing voice affects listeners through the addition of expression, lyrics, and instruments. Human beings can easily distinguish a singing voice in a given audio clip owing to their perceptual tools and auditory physiology. On the other hand, without human intervention, it is not simple to identify non-vocal portions, vocal portions, feelings, and singers from the signal owing to its intrinsic complications. This work proposes a new singer classification mechanism with four stages: 'pre-processing, vocal segmentation, feature extraction, and classification'. First, an 'improved convolutional neural network (CNN)' is deployed for the segmentation of the vocal part. Next, features such as 'zero crossing rate (ZCR), Mel-frequency cepstral coefficients (MFCCs), vibration estimation and improved harmonic spectral envelope' are derived and passed to a 'bidirectional gated recurrent unit (BI-GRU) and long short-term memory (LSTM)'. The median of the LSTM and BI-GRU results is taken as the final result.
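
Two of the steps named in the abstract, the zero crossing rate feature and the median-based combination of the LSTM and BI-GRU results, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`zero_crossing_rate`, `fuse_scores`, `predict_singer`), the per-singer score vectors, and the choice to pick the highest fused score are all assumptions made for illustration. With only two models, the median of each pair of scores is simply their average.

```python
from statistics import median


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def fuse_scores(lstm_scores, bigru_scores):
    """Element-wise median of two per-singer score vectors.

    Assumption: each model emits one score per candidate singer.
    For two values the median equals their mean.
    """
    if len(lstm_scores) != len(bigru_scores):
        raise ValueError("score vectors must have the same length")
    return [median(pair) for pair in zip(lstm_scores, bigru_scores)]


def predict_singer(lstm_scores, bigru_scores):
    """Index of the singer with the highest fused score (assumed decision rule)."""
    fused = fuse_scores(lstm_scores, bigru_scores)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Toy audio frame and toy model outputs, invented for this example.
    frame = [0.3, -0.2, 0.1, 0.4, -0.5]
    print(zero_crossing_rate(frame))
    print(fuse_scores([0.25, 0.75], [0.5, 0.5]))
    print(predict_singer([0.25, 0.75], [0.5, 0.5]))
```

In practice the ZCR would be computed per frame of the segmented vocal signal alongside the other features, and the score vectors would come from the trained networks rather than from hand-written lists.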

Keywords: singer classification; zero crossing rate; ZCR; convolutional neural network; improved CNN; bidirectional gated recurrent unit; BI-GRU; long short-term memory; LSTM; Mel-frequency cepstral coefficients; MFCCs.

DOI: 10.1504/IJBIC.2024.141676

International Journal of Bio-Inspired Computation, 2024 Vol.24 No.3, pp.150 - 163

Received: 03 Jan 2023
Accepted: 15 Jul 2023
Published online: 30 Sep 2024
