Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning
(2025) In Journal of Neural Engineering 22(3).
- Abstract
Objective. This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise.
Approach. Three different models were implemented for comparison: a baseline linear model (LM), a non-LM without contrastive learning (NLM), and a non-LM with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of CLIP loss, to embed the data. The speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated in 34 listeners with hearing impairment.
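SigLIP, mentioned above, replaces CLIP's softmax-based contrastive loss with an independent sigmoid loss per embedding pair. The following is a minimal illustrative sketch of such a pairwise sigmoid loss over paired EEG and envelope embeddings; the function name, embedding sizes, and the temperature `t` and bias `b` values are assumptions for demonstration, not the settings used in the paper.

```python
import numpy as np

def siglip_loss(eeg_emb, env_emb, t=10.0, b=-10.0):
    """Pairwise sigmoid (SigLIP-style) loss over a batch of paired
    EEG and speech-envelope embeddings. Matched (diagonal) pairs get
    label +1, all other pairs -1."""
    # L2-normalise each embedding row
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    env = env_emb / np.linalg.norm(env_emb, axis=1, keepdims=True)
    logits = t * (eeg @ env.T) + b        # (n, n) similarity logits
    labels = 2.0 * np.eye(len(eeg)) - 1.0 # +1 on diagonal, -1 elsewhere
    # -log sigmoid(label * logit), averaged over all n*n pairs
    return float(np.mean(np.log1p(np.exp(-labels * logits))))
```

Because each pair contributes an independent binary term, the loss needs no batch-wide softmax normalisation, which is the main practical difference from the original CLIP objective.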
Results. The reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively.
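The attention-classification step behind these accuracies is a standard correlation comparison: the envelope reconstructed from EEG is correlated with both candidate speech envelopes, and the stream with the higher correlation is labelled as attended. A minimal sketch of this idea (function names and the toy signal parameters are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two 1-D signals."""
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def classify_attention(reconstructed, attended_env, ignored_env):
    """Label a window as 'attended' if the reconstructed envelope
    correlates more strongly with the attended stream."""
    r_att = pearson_r(reconstructed, attended_env)
    r_ign = pearson_r(reconstructed, ignored_env)
    label = "attended" if r_att > r_ign else "ignored"
    return label, r_att, r_ign
```

Classification accuracy over many such windows then gives figures like the 68.0% reported for the 3-second window.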
Significance. These findings demonstrate the promise of contrastive learning in improving AAD and highlight the potential of EEG-based tools for clinical applications, and progress in hearing technology, particularly in the design of new neuro-steered signal processing algorithms.
- author
- Sridhar, Gautam (LU) ; Boselli, Sofía ; Skoglund, Martin A ; Bernhardsson, Bo (LU) and Alickovic, Emina
- organization
- publishing date
- 2025-06-18
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Humans, Attention/physiology, Male, Electroencephalography/methods, Female, Noise, Adult, Middle Aged, Speech Perception/physiology, Hearing Loss/physiopathology, Acoustic Stimulation/methods, Aged, Young Adult, Auditory Perception/physiology, Learning/physiology
- in
- Journal of Neural Engineering
- volume
- 22
- issue
- 3
- publisher
- IOP Publishing
- external identifiers
- scopus:105008524464
- pmid:40489989
- ISSN
- 1741-2560
- DOI
- 10.1088/1741-2552/ade28a
- project
- Real-Time Brain-Computer Interfaces
- language
- English
- LU publication?
- yes
- additional info
- Creative Commons Attribution license.
- id
- db29c812-5486-4d8e-827b-7170700a0bd1
- date added to LUP
- 2025-10-13 17:42:00
- date last changed
- 2025-11-11 06:50:04
@article{db29c812-5486-4d8e-827b-7170700a0bd1,
abstract = {{Objective. This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise.
Approach. Three different models were implemented for comparison: a baseline linear model (LM), a non-LM without contrastive learning (NLM), and a non-LM with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of CLIP loss, to embed the data. The speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated in 34 listeners with hearing impairment.
Results. The reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively.
Significance. These findings demonstrate the promise of contrastive learning in improving AAD and highlight the potential of EEG-based tools for clinical applications, and progress in hearing technology, particularly in the design of new neuro-steered signal processing algorithms.}},
author = {{Sridhar, Gautam and Boselli, Sofía and Skoglund, Martin A and Bernhardsson, Bo and Alickovic, Emina}},
issn = {{1741-2560}},
keywords = {{Humans; Attention/physiology; Male; Electroencephalography/methods; Female; Noise; Adult; Middle Aged; Speech Perception/physiology; Hearing Loss/physiopathology; Acoustic Stimulation/methods; Aged; Young Adult; Auditory Perception/physiology; Learning/physiology}},
language = {{eng}},
month = {{06}},
number = {{3}},
publisher = {{IOP Publishing}},
journal = {{Journal of Neural Engineering}},
title = {{Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning}},
url = {{http://dx.doi.org/10.1088/1741-2552/ade28a}},
doi = {{10.1088/1741-2552/ade28a}},
volume = {{22}},
year = {{2025}},
}