Title: Multi-task deep learning approach for sound event recognition and tracking
Authors: Tzung-Shi Chen; Ming-Ju Chen; Tzung-Cheng Chen
Addresses: Department of Computer Science and Information Engineering, National University of Tainan, Tainan 700301, Taiwan; Department of Computer Science and Information Engineering, National University of Tainan, Tainan 700301, Taiwan; Department of Aerospace and Systems Engineering, Feng Chia University, Taichung 407802, Taiwan
Abstract: In smart cities, it is important to detect abnormal activities through cameras. However, cameras have limitations such as blind spots and blocked areas that can result in detection failures. Sound, on the other hand, is less likely to be obstructed. This paper proposes using microphone arrays to identify sound events, predict their locations, and track their trajectories using multi-task deep learning approaches. Experimental results show high predictive accuracy. Finally, the proposed models are also converted to quantised versions and deployed on embedded devices in vehicles to analyse memory footprint and execution time.
Keywords: deep learning; microphone arrays; sound event classification; sound tracking; localisation.
DOI: 10.1504/IJAHUC.2024.138747
International Journal of Ad Hoc and Ubiquitous Computing, 2024 Vol.46 No.2, pp.104-121
Received: 05 Dec 2023
Accepted: 20 Feb 2024
Published online: 29 May 2024