Title: Method to integrate speaker identification, speech recognition, and information retrieval algorithms for speaker-based information retrieval
Authors: Muhammad Muneeb
Addresses: Department of Electrical Engineering and Computer Science, Khalifa University of Science, Technology and Research (KUSTAR), Abu Dhabi, UAE
Abstract: This article proposes a speaker voice-based information (audio and video) retrieval system that combines speaker identification, speech recognition, and information retrieval algorithms. An information retrieval system comprises a system structure and a way to query it. This article illustrates both, including how the system is deployed on top of existing systems. The input to the system is a speaker voice sample and a text query. The speaker's voice is used to reduce the size of the corpus, and the text query is used to retrieve and rank documents. For speaker identification, we used linear predictive coding (LPC) coefficients; for speech recognition, we used a Python speech recognition library; and for ranking, we used cosine similarity and TF-IDF. Any intermediate module can be replaced by another algorithm depending on the application, such as crime investigation, news analysis, or lecture retrieval. We demonstrated the proposed method on simulated data generated from online websites.
Keywords: audio retrieval; information retrieval; speaker identification; TF-IDF; voice recognition.
DOI: 10.1504/IJKEDM.2022.126069
International Journal of Knowledge Engineering and Data Mining, 2022 Vol.7 No.3/4, pp.234 - 251
Received: 15 Dec 2021
Accepted: 04 Feb 2022
Published online: 10 Oct 2022
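
The retrieval flow described in the abstract (reduce the corpus by speaker, then rank the remaining documents against the text query with TF-IDF and cosine similarity) can be sketched roughly as follows. This is an illustrative sketch only, not the article's implementation: it assumes the speaker has already been identified (for example by an LPC-based step) and that each recording has already been transcribed, and the names `retrieve`, `tfidf_vectors`, and the `speaker`/`doc_id`/`text` record fields are hypothetical. The smoothed IDF formula is one common variant and may differ from the one used in the article.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; a real system would do more).
    return text.lower().split()


def tfidf_vectors(docs_tokens):
    # Build sparse TF-IDF vectors (dicts) using a smoothed IDF variant.
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        vectors.append({t: c * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def retrieve(corpus, speaker_id, query):
    # Step 1: shrink the corpus to documents spoken by the identified speaker.
    subset = [d for d in corpus if d["speaker"] == speaker_id]
    if not subset:
        return []
    # Step 2: rank the remaining transcripts against the text query.
    vectors, idf = tfidf_vectors([tokenize(d["text"]) for d in subset])
    q_tf = Counter(tokenize(query))
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scored = [(d["doc_id"], cosine(q_vec, v)) for d, v in zip(subset, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical transcripts, made up for illustration only.
corpus = [
    {"speaker": "alice", "doc_id": "a1", "text": "the budget vote is tomorrow"},
    {"speaker": "alice", "doc_id": "a2", "text": "weather looks fine today"},
    {"speaker": "bob", "doc_id": "b1", "text": "budget budget budget"},
]
ranking = retrieve(corpus, "alice", "budget vote")
```

Filtering by speaker before building the TF-IDF model means document frequencies are computed only over that speaker's documents, which is one possible reading of "the size of the corpus is reduced"; computing IDF over the full corpus would be an equally plausible design.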