Learning an interpretable end-to-end network for real-time acoustic beamforming
(2024) In Journal of Sound and Vibration 591.
- Abstract
Recently, many forms of audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ for such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these are often difficult to generalize, nor can they generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network enables an excellent interpretability and the ability of being able to process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.
- author
- Liang, Hao; Zhou, Guanxing; Tu, Xiaotong (LU); Jakobsson, Andreas (LU); Ding, Xinghao and Huang, Yue
- organization
- LU Profile Area: Natural and Artificial Cognition
- LTH Profile Area: AI and Digitalization
- LTH Profile Area: Engineering Health
- ELLIIT: the Linköping-Lund initiative on IT and mobile communication
- eSSENCE: The e-Science Collaboration
- Mathematical Statistics
- Biomedical Modelling and Computation (research group)
- Statistical Signal Processing Group (research group)
- publishing date
- 2024-11
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Acoustic beamforming, Acoustic imaging, Array signal processing, Interpretable network, Model-based deep learning, Source localization
- in
- Journal of Sound and Vibration
- volume
- 591
- article number
- 118620
- publisher
- Academic Press
- external identifiers
- scopus:85198609264
- ISSN
- 0022-460X
- DOI
- 10.1016/j.jsv.2024.118620
- language
- English
- LU publication?
- yes
- id
- 7a5a6121-3616-420b-bf3d-6a0cc6c8d189
- date added to LUP
- 2024-08-26 15:47:59
- date last changed
- 2025-10-14 13:29:34
@article{7a5a6121-3616-420b-bf3d-6a0cc6c8d189,
abstract = {{Recently, many forms of audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ for such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these are often difficult to generalize, nor can they generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network enables an excellent interpretability and the ability of being able to process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.}},
author = {{Liang, Hao and Zhou, Guanxing and Tu, Xiaotong and Jakobsson, Andreas and Ding, Xinghao and Huang, Yue}},
issn = {{0022-460X}},
keywords = {{Acoustic beamforming; Acoustic imaging; Array signal processing; Interpretable network; Model-based deep learning; Source localization}},
language = {{eng}},
publisher = {{Academic Press}},
journal = {{Journal of Sound and Vibration}},
title = {{Learning an interpretable end-to-end network for real-time acoustic beamforming}},
url = {{http://dx.doi.org/10.1016/j.jsv.2024.118620}},
doi = {{10.1016/j.jsv.2024.118620}},
volume = {{591}},
year = {{2024}},
}