Multiple Sound Sources Localization Using Sub-Band Spatial Features and Attention Mechanism

11 Pages Posted: 31 Oct 2023

DongZhe Zhang

affiliation not provided to SSRN

Jianfeng Chen

affiliation not provided to SSRN

Jisheng Bai

affiliation not provided to SSRN

Muhammad Saad Ayub

affiliation not provided to SSRN

Mou Wang

affiliation not provided to SSRN

Qingli Yan

affiliation not provided to SSRN

Abstract

Deep learning-based sound source localization is a growing research topic for wireless acoustic sensor networks. However, current methods either simply combine the direction-of-arrival (DOA) estimates provided by each microphone array node or use an end-to-end architecture with multichannel features of the arrays. These methods suffer from performance degradation in environments with high noise and reverberation. In this paper, we propose a deep learning-based method that uses spatial spectrum features and an attention mechanism to estimate the locations of sound sources. We first propose a new set of features that represent the spatial information in multiple frequency bands. With these sub-band spatial representations, the model can make full use of the geometric properties and the spatial spectrum of the array nodes. We then use a CNN-Transformer-based network that identifies the correct peaks and suppresses spurious peaks by modeling both local and global information in the spatial spectrum features. To evaluate the proposed method, we conduct experiments on a simulated dataset with different levels of noise and reverberation. Experimental results show that the proposed method achieves the lowest RMSE and the highest F1 score among the compared methods, including the baselines. Further analysis demonstrates that the proposed method localizes multiple sound sources robustly even under strong reverberation.
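The abstract does not spell out how the sub-band spatial features are computed. As a rough illustration of the general idea only, the sketch below builds a per-band spatial spectrum for a single array node using steered response power with PHAT weighting, which is an assumed stand-in and not necessarily the authors' feature. The function name `subband_spatial_spectrum`, the four-microphone geometry, the band count, and the simulated far-field source are all hypothetical choices made for this example.

```python
import numpy as np

def subband_spatial_spectrum(frames, mic_xy, fs, angles_deg, n_bands=4, c=343.0):
    """Hypothetical sub-band SRP-PHAT feature for one array node.

    frames: (n_mics, n_samples) time-domain frame, mic_xy: (n_mics, 2) positions in metres.
    Returns an array of shape (n_bands, n_angles).
    """
    n_samples = frames.shape[1]
    spec = np.fft.rfft(frames, axis=1)                    # (n_mics, n_bins)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)        # (n_bins,)
    spec = spec / np.maximum(np.abs(spec), 1e-12)         # PHAT: keep phase only
    angles = np.deg2rad(angles_deg)
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # Far-field model: arrival delay of each mic relative to the array origin.
    delays = -(directions @ mic_xy.T) / c                 # (n_angles, n_mics)
    # Steering phases that undo those delays for every candidate angle.
    steer = np.exp(2j * np.pi * freqs[None, None, :] * delays[:, :, None])
    power = np.abs((steer * spec[None]).sum(axis=1)) ** 2 # (n_angles, n_bins)
    # Split the usable bins (DC and the top bin are skipped) into contiguous bands.
    bands = np.array_split(np.arange(1, len(freqs) - 1), n_bands)
    return np.stack([power[:, idx].mean(axis=1) for idx in bands], axis=0)

# Assumed toy setup: one far-field white-noise source at 60 degrees, no noise or reverberation.
rng = np.random.default_rng(0)
fs, n = 16000, 512
mic_xy = 0.05 * np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
true_deg = 60.0
u = np.array([np.cos(np.deg2rad(true_deg)), np.sin(np.deg2rad(true_deg))])
tau = -(mic_xy @ u) / 343.0                               # per-mic delays for the true source
S = np.fft.rfft(rng.standard_normal(n))
f = np.fft.rfftfreq(n, d=1.0 / fs)
frames = np.fft.irfft(S[None, :] * np.exp(-2j * np.pi * f[None, :] * tau[:, None]), n=n, axis=1)

angles_deg = np.arange(0, 360, 5)
features = subband_spatial_spectrum(frames, mic_xy, fs, angles_deg)
peak_angles = angles_deg[features.argmax(axis=1)]         # strongest direction in each band
```

In this idealized setting every band peaks at the simulated source direction. Under noise and reverberation the per-band spectra would also contain spurious peaks, which is the situation the CNN-Transformer network described in the abstract is meant to handle.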

Keywords: Wireless Acoustic Sensor Network, Source Localization, Deep Learning, Microphone Array

Suggested Citation

Zhang, DongZhe and Chen, Jianfeng and Bai, Jisheng and Ayub, Muhammad Saad and Wang, Mou and Yan, Qingli, Multiple Sound Sources Localization Using Sub-Band Spatial Features and Attention Mechanism. Available at SSRN: https://ssrn.com/abstract=4618444 or http://dx.doi.org/10.2139/ssrn.4618444

Jianfeng Chen (Contact Author)
