Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning
(2025) In Journal of Neural Engineering 22(3).
- Abstract
Objective. This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise.
Approach. Three different models were implemented for comparison: a baseline linear model (LM), a non-LM without contrastive learning (NLM), and a non-LM with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of CLIP loss, to embed the data. The speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated in 34 listeners with hearing impairment.
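SigLIP, mentioned above, replaces CLIP's softmax-based contrastive loss with an independent sigmoid loss per embedding pair. The following is a minimal illustrative sketch of such a pairwise sigmoid loss over paired EEG and envelope embeddings; the function name, embedding sizes, and the temperature `t` and bias `b` values are assumptions for demonstration, not the settings used in the paper.

```python
import numpy as np

def siglip_loss(eeg_emb, env_emb, t=10.0, b=-10.0):
    """Pairwise sigmoid (SigLIP-style) loss over a batch of paired
    EEG and speech-envelope embeddings. Matched (diagonal) pairs get
    label +1, all other pairs -1."""
    # L2-normalise each embedding row
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    env = env_emb / np.linalg.norm(env_emb, axis=1, keepdims=True)
    logits = t * (eeg @ env.T) + b        # (n, n) similarity logits
    labels = 2.0 * np.eye(len(eeg)) - 1.0 # +1 on diagonal, -1 elsewhere
    # -log sigmoid(label * logit), averaged over all n*n pairs
    return float(np.mean(np.log1p(np.exp(-labels * logits))))
```

Because each pair contributes an independent binary term, the loss needs no batch-wide softmax normalisation, which is the main practical difference from the original CLIP objective.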
Results. The reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively.
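The attention-classification step behind these accuracies is a standard correlation comparison: the envelope reconstructed from EEG is correlated with both candidate speech envelopes, and the stream with the higher correlation is labelled as attended. A minimal sketch of this idea (function names and the toy signal parameters are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two 1-D signals."""
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def classify_attention(reconstructed, attended_env, ignored_env):
    """Label a window as 'attended' if the reconstructed envelope
    correlates more strongly with the attended stream."""
    r_att = pearson_r(reconstructed, attended_env)
    r_ign = pearson_r(reconstructed, ignored_env)
    label = "attended" if r_att > r_ign else "ignored"
    return label, r_att, r_ign
```

Classification accuracy over many such windows then gives figures like the 68.0% reported for the 3-second window.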
Significance. These findings demonstrate the promise of contrastive learning in improving AAD and highlight the potential of EEG-based tools for clinical applications, and progress in hearing technology, particularly in the design of new neuro-steered signal processing algorithms.
- author
- Sridhar, Gautam (LU) ; Boselli, Sofía ; Skoglund, Martin A ; Bernhardsson, Bo (LU) and Alickovic, Emina
- organization
- publishing date
- 2025-06-18
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Humans, Attention/physiology, Male, Electroencephalography/methods, Female, Noise, Adult, Middle Aged, Speech Perception/physiology, Hearing Loss/physiopathology, Acoustic Stimulation/methods, Aged, Young Adult, Auditory Perception/physiology, Learning/physiology
- in
- Journal of Neural Engineering
- volume
- 22
- issue
- 3
- publisher
- IOP Publishing
- external identifiers
- scopus:105008524464
- pmid:40489989
- ISSN
- 1741-2560
- DOI
- 10.1088/1741-2552/ade28a
- project
- Real-Time Brain-Computer Interfaces
- language
- English
- LU publication?
- yes
- additional info
- Creative Commons Attribution license.
- id
- db29c812-5486-4d8e-827b-7170700a0bd1
- date added to LUP
- 2025-10-13 17:42:00
- date last changed
- 2025-11-11 06:50:04
@article{db29c812-5486-4d8e-827b-7170700a0bd1,
abstract = {{Objective. This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise.
Approach. Three different models were implemented for comparison: a baseline linear model (LM), a non-LM without contrastive learning (NLM), and a non-LM with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of CLIP loss, to embed the data. The speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated in 34 listeners with hearing impairment.
Results. The reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively.
Significance. These findings demonstrate the promise of contrastive learning in improving AAD and highlight the potential of EEG-based tools for clinical applications, and progress in hearing technology, particularly in the design of new neuro-steered signal processing algorithms.}},
author = {{Sridhar, Gautam and Boselli, Sofía and Skoglund, Martin A and Bernhardsson, Bo and Alickovic, Emina}},
issn = {{1741-2560}},
keywords = {{Humans; Attention/physiology; Male; Electroencephalography/methods; Female; Noise; Adult; Middle Aged; Speech Perception/physiology; Hearing Loss/physiopathology; Acoustic Stimulation/methods; Aged; Young Adult; Auditory Perception/physiology; Learning/physiology}},
language = {{eng}},
month = {{06}},
number = {{3}},
publisher = {{IOP Publishing}},
journal = {{Journal of Neural Engineering}},
title = {{Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning}},
url = {{http://dx.doi.org/10.1088/1741-2552/ade28a}},
doi = {{10.1088/1741-2552/ade28a}},
volume = {{22}},
year = {{2025}},
}