Lund University Publications

Gaussian mixture model based mutual information estimation between frequency bands in speech

Nilsson, M; Gustafsson, H; Andersen, Sören Vang (LU) and Kleijn, WB (2002). IEEE International Conference on Acoustics, Speech and Signal Processing, 2002, p. 525-528
Abstract
In this paper, we investigate the dependency between the spectral envelopes of speech in disjoint frequency bands, one covering the telephone bandwidth from 0.3 kHz to 3.4 kHz and one covering the frequencies from 3.7 kHz to 8 kHz. The spectral envelopes are jointly modeled with a Gaussian mixture model based on mel-frequency cepstral coefficients and the log-energy-ratio of the disjoint frequency bands. Using this model, we quantify the dependency between bands through their mutual information and the perceived entropy of the high frequency band. Our results indicate that the mutual information is only a small fraction of the perceived entropy of the high band. This suggests that speech bandwidth extension should not rely only on mutual information between narrow- and high-band spectra. Rather, such methods need to make use of perceptual properties to ensure that the extended signal sounds pleasant.
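To make the idea of reading mutual information off a joint Gaussian mixture model concrete, here is a minimal sketch. It is not the authors' procedure: it assumes one scalar feature per band (the paper uses mel-frequency cepstral coefficient vectors plus a log-energy ratio), hand-picked mixture parameters in place of a model trained on speech, and a plain Monte Carlo average of log p(x,y) - log p(x) - log p(y). All function names and the component layout are hypothetical.

```python
import math
import random

# Hypothetical component layout (an assumption for this sketch):
# (weight, mean_x, mean_y, std_x, std_y, rho), where x stands for a
# narrow-band feature and y for a high-band feature.

def _norm_pdf(v, mean, std):
    """Density of a univariate Gaussian."""
    z = (v - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def _binorm_pdf(x, y, mx, my, sx, sy, rho):
    """Density of a bivariate Gaussian with correlation rho."""
    zx = (x - mx) / sx
    zy = (y - my) / sy
    one_m = 1.0 - rho * rho
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_m
    return math.exp(-0.5 * q) / (2.0 * math.pi * sx * sy * math.sqrt(one_m))

def sample_joint(components, rng):
    """Draw one (x, y) pair from the joint mixture."""
    weights = [c[0] for c in components]
    _, mx, my, sx, sy, rho = rng.choices(components, weights=weights, k=1)[0]
    u = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    x = mx + sx * u
    y = my + sy * (rho * u + math.sqrt(1.0 - rho * rho) * v)
    return x, y

def gmm_mutual_information(components, n_samples=20000, seed=0):
    """Monte Carlo estimate of I(X;Y) in bits under the joint mixture.

    The marginals of a Gaussian mixture are mixtures with the same
    weights, so p(x) and p(y) come directly from the joint parameters.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x, y = sample_joint(components, rng)
        p_xy = sum(w * _binorm_pdf(x, y, mx, my, sx, sy, r)
                   for w, mx, my, sx, sy, r in components)
        p_x = sum(w * _norm_pdf(x, mx, sx) for w, mx, _, sx, _, _ in components)
        p_y = sum(w * _norm_pdf(y, my, sy) for w, _, my, _, sy, _ in components)
        total += math.log(p_xy) - math.log(p_x) - math.log(p_y)
    return total / n_samples / math.log(2.0)

if __name__ == "__main__":
    toy_model = [(0.5, -2.0, -2.0, 1.0, 1.0, 0.0),
                 (0.5, 2.0, 2.0, 1.0, 1.0, 0.0)]
    print(gmm_mutual_information(toy_model))
```

For a single component with correlation rho, the estimate should approach the closed form -0.5 * log2(1 - rho^2), which gives a quick sanity check on the estimator before applying it to a trained model.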
author
Nilsson, M; Gustafsson, H; Andersen, Sören Vang and Kleijn, WB
publishing date
2002
type
Chapter in Book/Report/Conference proceeding
publication status
published
host publication
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS
pages
525 - 528
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
conference name
IEEE International Conference on Acoustics, Speech and Signal Processing, 2002
conference location
Orlando, FL, United States
conference dates
2002-05-13 - 2002-05-17
external identifiers
  • wos:000177510400132
  • scopus:0036293935
ISSN
1520-6149
language
English
LU publication?
no
id
3acda647-3acf-428d-a6ac-aef990812f82 (old id 4092571)
date added to LUP
2016-04-01 16:25:52
date last changed
2022-04-22 21:57:17
@inproceedings{3acda647-3acf-428d-a6ac-aef990812f82,
  abstract     = {{In this paper, we investigate the dependency between the spectral envelopes of speech in disjoint frequency bands, one covering the telephone bandwidth from 0.3 kHz to 3.4 kHz and one covering the frequencies from 3.7 kHz to 8 kHz. The spectral envelopes are jointly modeled with a Gaussian mixture model based on mel-frequency cepstral coefficients and the log-energy-ratio of the disjoint frequency bands. Using this model, we quantify the dependency between bands through their mutual information and the perceived entropy of the high frequency band. Our results indicate that the mutual information is only a small fraction of the perceived entropy of the high band. This suggests that speech bandwidth extension should not rely only on mutual information between narrow- and high-band spectra. Rather, such methods need to make use of perceptual properties to ensure that the extended signal sounds pleasant.}},
  author       = {{Nilsson, M and Gustafsson, H and Andersen, Sören Vang and Kleijn, WB}},
  booktitle    = {{2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS}},
  issn         = {{1520-6149}},
  language     = {{eng}},
  pages        = {{525--528}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Gaussian mixture model based mutual information estimation between frequency bands in speech}},
  year         = {{2002}},
}