Auditory Attention Classification with Contrastive Learning
(2024) Department of Automatic Control
- Abstract
- Auditory attention detection is crucial for understanding speech in noisy environments, a challenge known as the "cocktail party problem." This project investigates the use of electroencephalography (EEG) to identify which speaker a listener attends to. EEG’s portability and real-time recording capabilities make it a promising tool for practical applications.
We propose a novel neural network model for auditory attention detection using EEG data. The model reconstructs the attended speech envelope while simultaneously classifying attended vs. unattended speech. It incorporates a contrastive learning loss function (SigLIP), which, to our knowledge, has not been previously applied to EEG-based auditory attention detection. The model architecture combines convolutional, fully connected, and attention layers.
Evaluated on an EEG dataset with 31 subjects, the model achieves a mean accuracy of 68% and a mean correlation of 0.105 between the reconstructed and attended envelopes. This surpasses the baseline performance of linear methods (63% accuracy, 0.084 correlation). These results suggest the potential of contrastive learning for improving auditory attention detection accuracy, warranting further investigation.
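For readers unfamiliar with the SigLIP loss named in the abstract, the following is a minimal NumPy sketch of a pairwise sigmoid contrastive loss applied to a batch of paired embeddings. It is an illustration only, not the thesis implementation: the function name `sigmoid_contrastive_loss`, the argument names `eeg_emb` and `audio_emb`, and the fixed `temperature` and `bias` values are assumptions made here (in SigLIP the temperature and bias are learned parameters).

```python
import numpy as np


def sigmoid_contrastive_loss(eeg_emb, audio_emb, temperature=10.0, bias=-10.0):
    """Pairwise sigmoid (SigLIP-style) loss for a batch of paired embeddings.

    Row i of eeg_emb and row i of audio_emb are assumed to form a matching
    pair; every other combination in the batch is treated as a negative.
    All names and default values are illustrative assumptions.
    """
    # L2-normalise so that the dot product is a cosine similarity.
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    audio = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)

    # Scaled and shifted similarity for every (EEG, audio) combination.
    logits = temperature * (eeg @ audio.T) + bias

    # Labels: +1 on the diagonal (matching pairs), -1 everywhere else.
    n = logits.shape[0]
    labels = 2.0 * np.eye(n) - 1.0

    # -log(sigmoid(z)) equals log(1 + exp(-z)); logaddexp keeps it stable.
    return np.sum(np.logaddexp(0.0, -labels * logits)) / n
```

Under these assumptions, a batch whose EEG and audio embeddings are correctly aligned should give a lower loss than the same batch with the audio rows shifted so that no pair matches.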
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9173516
- author
- Sridhar, Gautam and Boselli, Sofía
- supervisor
- organization
- year
- 2024
- type
- H3 - Professional qualifications (4 Years - )
- subject
- report number
- TFRT-6235
- other publication id
- 0280-5316
- language
- English
- id
- 9173516
- date added to LUP
- 2024-09-09 09:19:37
- date last changed
- 2024-09-09 09:19:37
@misc{9173516,
  abstract     = {{Auditory attention detection is crucial for understanding speech in noisy environments, a challenge known as the "cocktail party problem." This project investigates the use of electroencephalography (EEG) to identify which speaker a listener attends to. EEG’s portability and real-time recording capabilities make it a promising tool for practical applications. We propose a novel neural network model for auditory attention detection using EEG data. The model reconstructs the attended speech envelope while simultaneously classifying attended vs. unattended speech. It incorporates a contrastive learning loss function (SigLIP), which, to our knowledge, has not been previously applied to EEG-based auditory attention detection. The model architecture combines convolutional, fully connected, and attention layers. Evaluated on an EEG dataset with 31 subjects, the model achieves a mean accuracy of 68% and a mean correlation of 0.105 between the reconstructed and attended envelopes. This surpasses the baseline performance of linear methods (63% accuracy, 0.084 correlation). These results suggest the potential of contrastive learning for improving auditory attention detection accuracy, warranting further investigation.}},
  author       = {{Sridhar, Gautam and Boselli, Sofía}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Auditory Attention Classification with Contrastive Learning}},
  year         = {{2024}},
}