Learning an interpretable end-to-end network for real-time acoustic beamforming
(2024) In Journal of Sound and Vibration 591.
- Abstract
Recently, many forms of audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ for such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these are often difficult to generalize, nor can they generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network enables an excellent interpretability and the ability of being able to process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.
- author
- Liang, Hao; Zhou, Guanxing; Tu, Xiaotong LU; Jakobsson, Andreas LU; Ding, Xinghao and Huang, Yue
- organization
- publishing date
- 2024-11
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Acoustic beamforming, Acoustic imaging, Array signal processing, Interpretable network, Model-based deep learning, Source localization
- in
- Journal of Sound and Vibration
- volume
- 591
- article number
- 118620
- publisher
- Academic Press
- external identifiers
- scopus:85198609264
- ISSN
- 0022-460X
- DOI
- 10.1016/j.jsv.2024.118620
- language
- English
- LU publication?
- yes
- id
- 7a5a6121-3616-420b-bf3d-6a0cc6c8d189
- date added to LUP
- 2024-08-26 15:47:59
- date last changed
- 2025-04-04 14:19:15
@article{7a5a6121-3616-420b-bf3d-6a0cc6c8d189,
  abstract     = {{Recently, many forms of audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ for such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these are often difficult to generalize, nor can they generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network enables an excellent interpretability and the ability of being able to process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.}},
  author       = {{Liang, Hao and Zhou, Guanxing and Tu, Xiaotong and Jakobsson, Andreas and Ding, Xinghao and Huang, Yue}},
  issn         = {{0022-460X}},
  keywords     = {{Acoustic beamforming; Acoustic imaging; Array signal processing; Interpretable network; Model-based deep learning; Source localization}},
  language     = {{eng}},
  publisher    = {{Academic Press}},
  journal      = {{Journal of Sound and Vibration}},
  title        = {{Learning an interpretable end-to-end network for real-time acoustic beamforming}},
  url          = {{http://dx.doi.org/10.1016/j.jsv.2024.118620}},
  doi          = {{10.1016/j.jsv.2024.118620}},
  volume       = {{591}},
  year         = {{2024}},
}