Complex-valued independent component analysis for online blind speech extraction

Sällberg, Benny; Grbic, Nedelko and Claesson, Ingvar (2008) In IEEE Transactions on Audio, Speech, and Language Processing 16(8). p.1624-1632
Abstract
This paper presents a theoretical analysis of a
certain criterion for complex-valued independent component
analysis (ICA) with a focus on blind speech extraction (BSE) of a
spatio–temporally nonstationary speech source. In the paper, the
proposed criterion, denoted KSICA, is related to the well-known FastICA
method with the Kurtosis contrast function. The proposed
method is shown to share the important fixed-point feature with
the FastICA method, although an improvement with the proposed
method is that it does not exhibit the divergent behavior for a
mixture of Gaussian-only sources that the FastICA method tends
to do, and it shows better performance in online implementations.
Compared to the FastICA, the KSICA method provides a 10 dB
higher source extraction performance and a 10 dB lower standard
deviation in a data batch approach when the data batch size is
less than 100 samples. For larger batch sizes, the KSICA method
performs equally well. In an online application with spatially
stationary sources the KSICA method provides around 10 dB
higher interference suppression, and 1 MOS-unit lower speech
distortion compared to the FastICA for 0.15 s time constant in
the algorithm update parameter. Thus, the FastICA performance
matches the KSICA performance for a time constant above 1 s.
Finally, in an online application with a moving speech source, the
KSICA method provides 10 dB higher interference suppression,
compared to the FastICA for the same algorithm settings. All in
all, the proposed KSICA method is shown to be a viab
type
Contribution to journal
publication status
published
in
IEEE Transactions on Audio, Speech, and Language Processing
volume
16
issue
8
pages
9 pages
publisher
IEEE--Institute of Electrical and Electronics Engineers Inc.
external identifiers
  • scopus:70350573623
ISSN
1558-7916
DOI
10.1109/TASL.2008.2002058
language
English
LU publication?
yes
id
92d6040f-9836-4cc1-858b-f736131fe66c
date added to LUP
2016-06-23 14:09:05
date last changed
2017-05-28 04:50:37
@article{92d6040f-9836-4cc1-858b-f736131fe66c,
  abstract     = {This paper presents a theoretical analysis of a certain criterion for complex-valued independent component analysis (ICA) with a focus on blind speech extraction (BSE) of a spatio–temporally nonstationary speech source. In the paper, the proposed criterion, denoted KSICA, is related to the well-known FastICA method with the Kurtosis contrast function. The proposed method is shown to share the important fixed-point feature with the FastICA method, although an improvement with the proposed method is that it does not exhibit the divergent behavior for a mixture of Gaussian-only sources that the FastICA method tends to do, and it shows better performance in online implementations. Compared to the FastICA, the KSICA method provides a 10 dB higher source extraction performance and a 10 dB lower standard deviation in a data batch approach when the data batch size is less than 100 samples. For larger batch sizes, the KSICA method performs equally well. In an online application with spatially stationary sources the KSICA method provides around 10 dB higher interference suppression, and 1 MOS-unit lower speech distortion compared to the FastICA for 0.15 s time constant in the algorithm update parameter. Thus, the FastICA performance matches the KSICA performance for a time constant above 1 s. Finally, in an online application with a moving speech source, the KSICA method provides 10 dB higher interference suppression, compared to the FastICA for the same algorithm settings. All in all, the proposed KSICA method is shown to be a viab},
  author       = {Sällberg, Benny and Grbic, Nedelko and Claesson, Ingvar},
  issn         = {1558-7916},
  language     = {eng},
  number       = {8},
  pages        = {1624--1632},
  publisher    = {IEEE--Institute of Electrical and Electronics Engineers Inc.},
  journal      = {IEEE Transactions on Audio, Speech, and Language Processing},
  title        = {Complex-valued independent component analysis for online blind speech extraction},
  url          = {http://dx.doi.org/10.1109/TASL.2008.2002058},
  volume       = {16},
  year         = {2008},
}