Improving Classification of Auditory Attention using Optimized Multitapers and Machine Learning

Forsgren, Ante

Forsgren, Ante (LU) (2020) In Master's Theses in Mathematical Sciences MASM01 20201
Mathematical Statistics

Abstract: The cocktail-party problem relates to the challenge of separating a single sound source from a noisy and crowded background based on the listener’s attention. Earlier work has used the relation between the sounds heard and the subsequent brain response to develop a model that can accurately classify which sound an individual listener is attending to. The model takes a reconstruction-based approach, in which the brain response is used to reconstruct the speech stimuli. Pearson’s correlation coefficient is then used to decide which speech is attended to. This model has previously been evaluated on a dataset with little to no noise; in this thesis, background noise at different levels has been added, further complicating the problem. The objective of this thesis is to improve this classification model further and to evaluate whether the background noise has an effect. The first proposed area of improvement is to use different methods of estimating the cepstra, for improved feature extraction. In particular, it is investigated whether using optimally weighted multitapers notably improves results.
The second proposed area of improvement is to use more sophisticated reconstruction algorithms based on machine learning techniques, in particular support vector regression and neural networks.
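
The decision step described in the abstract (correlate the reconstructed stimulus with each candidate speech signal and pick the larger Pearson coefficient) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `pearson` and `classify_attention`, the sinusoidal "speech" features, and the noisy "reconstruction" are all hypothetical stand-ins chosen so that the example runs on its own.

```python
import math
import random


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def classify_attention(reconstruction, candidates):
    """Return (index of the best-correlated candidate, list of all scores)."""
    scores = [pearson(reconstruction, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Synthetic stand-ins for two speech feature sequences (assumption, not thesis data).
rng = random.Random(0)
speech_a = [math.sin(0.07 * t) for t in range(500)]
speech_b = [math.sin(0.19 * t + 1.0) for t in range(500)]

# Pretend the decoder output resembles speaker A plus measurement noise.
reconstruction = [a + rng.gauss(0.0, 0.5) for a in speech_a]

attended, scores = classify_attention(reconstruction, [speech_a, speech_b])
print("attended speaker index:", attended)
print("correlation scores:", scores)
```

In the thesis setting the reconstruction would come from a model trained on the brain response, and the candidates would be features (such as cepstra) of the two speech streams; only the comparison of correlation coefficients is shown here.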
Popular Abstract (translated from Swedish): Communication is such a central part of our everyday lives that we often forget how complicated both speaking and listening really are. These subjects have been studied for a long time, yet there are still things we do not fully understand. One of them is the so-called cocktail-party problem, which concerns how people can so easily distinguish voices even when there is a great deal of background noise. This is a very easy task for people with normally functioning hearing. It becomes harder, however, for people with hearing impairments, since ordinary hearing aids tend to amplify background noise as well. Newer hearing aids have begun to introduce directional microphones, which could potentially be used to address this problem. The long-term goal is to steer these microphones using the electrical signals that the brain sends, and thereby point the microphones toward the sound source one is listening to. It is thus of great interest to try to understand which signals are sent in the brain when one tries to listen to a specific sound source.
This work further develops and evaluates methods that try to determine which speaker a listener is actually listening to, using measurements from the brain. The data come from an experiment in which a listener has two speakers in front of them while a recording of a larger crowd is played in the background. The listener tried to listen to what a predetermined speaker was saying while measurements from the brain were collected. Different volume levels of the background crowd were used in different trials of the experiment, and the effect of these volume changes on the developed methods has also been evaluated.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9022689

author

Forsgren, Ante (LU)

supervisor

Maria Sandsten (LU)

organization

Mathematical Statistics

alternative title

Förbättrandet av auditiv uppmärksamhetsklassifiering med hjälp av optimerade multifönster och maskininlärning

course

MASM01 20201

year

2020

type

H2 - Master's Degree (Two Years)

subject

Mathematics and Statistics

publication/series

Master's Theses in Mathematical Sciences

report number

LUNFMS-3092-2020

ISSN

1404-6342

2020:E56

language

English

id

9022689

date added to LUP

2020-10-05 13:29:25

date last changed

2021-06-07 09:04:02

@misc{9022689,
  abstract     = {{The cocktail-party problem relates to the challenge of separating a single sound source from a noisy and crowded background based on the listener’s attention. Earlier work has used the relation between the sounds heard and the subsequent brain response to develop a model that can accurately classify which sound an individual listener is attending to. The model takes a reconstruction-based approach, in which the brain response is used to reconstruct the speech stimuli. Pearson’s correlation coefficient is then used to decide which speech is attended to. This model has previously been evaluated on a dataset with little to no noise; in this thesis, background noise at different levels has been added, further complicating the problem. The objective of this thesis is to improve this classification model further and to evaluate whether the background noise has an effect. The first proposed area of improvement is to use different methods of estimating the cepstra, for improved feature extraction. In particular, it is investigated whether using optimally weighted multitapers notably improves results. The second proposed area of improvement is to use more sophisticated reconstruction algorithms based on machine learning techniques, in particular support vector regression and neural networks.}},
  author       = {{Forsgren, Ante}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Improving Classification of Auditory Attention using Optimized Multitapers and Machine Learning}},
  year         = {{2020}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
