Nonlinear vocal phenomena and speech intelligibility
(2025) In Philosophical Transactions of the Royal Society B: Biological Sciences
- Abstract
- At some point in our evolutionary history, humans lost vocal membranes and air sacs, representing an unexpected simplification of the vocal apparatus relative to other great apes. One hypothesis is that these simplifications represent anatomical adaptations for speech because a simpler larynx provides a suitably stable and tonal vocal source with fewer nonlinear vocal phenomena (NLP). The key assumption that NLP reduce speech intelligibility is indirectly supported by studies of dysphonia, but it has not been experimentally tested. Here, we manipulate NLP in vocal stimuli ranging from single vowels to sentences, showing that the vocal source needs to be stable, but not necessarily tonal, for speech to be readily understood. When the task is to discriminate synthesized monophthong and diphthong vowels, continuous NLP (subharmonics, amplitude modulation and even deterministic chaos) actually improve vowel perception in high-pitched voices, likely because the resulting dense spectrum reveals formant transitions. Rough-sounding voices also remain highly intelligible when continuous NLP are added to recorded words and sentences. In contrast, voicing interruptions and pitch jumps dramatically reduce speech intelligibility, likely by interfering with voicing contrasts and normal intonation. We argue that NLP were not eliminated from the human vocal repertoire as we evolved for speech, but only brought under better control.
This article is part of the theme issue ‘Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions’.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/21be370a-bbd5-4df6-9ee9-072a2466062d
- author
- Anikin, Andrey (LU); Reby, David and Pisanski, Katarzyna
- organization
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- speech, evolution of language, nonlinear vocal phenomena, voice, formant frequencies, vocal membranes
- in
- Philosophical Transactions of the Royal Society B: Biological Sciences
- publisher
- Royal Society Publishing
- external identifiers
- pmid:40176514
- scopus:105001863545
- ISSN
- 1471-2970
- DOI
- 10.1098/rstb.2024.0254
- language
- English
- LU publication?
- yes
- id
- 21be370a-bbd5-4df6-9ee9-072a2466062d
- date added to LUP
- 2025-04-03 18:42:14
- date last changed
- 2025-06-18 04:00:56
@article{21be370a-bbd5-4df6-9ee9-072a2466062d,
  abstract  = {{At some point in our evolutionary history, humans lost vocal membranes and air sacs, representing an unexpected simplification of the vocal apparatus relative to other great apes. One hypothesis is that these simplifications represent anatomical adaptations for speech because a simpler larynx provides a suitably stable and tonal vocal source with fewer nonlinear vocal phenomena (NLP). The key assumption that NLP reduce speech intelligibility is indirectly supported by studies of dysphonia, but it has not been experimentally tested. Here, we manipulate NLP in vocal stimuli ranging from single vowels to sentences, showing that the vocal source needs to be stable, but not necessarily tonal, for speech to be readily understood. When the task is to discriminate synthesized monophthong and diphthong vowels, continuous NLP (subharmonics, amplitude modulation and even deterministic chaos) actually improve vowel perception in high-pitched voices, likely because the resulting dense spectrum reveals formant transitions. Rough-sounding voices also remain highly intelligible when continuous NLP are added to recorded words and sentences. In contrast, voicing interruptions and pitch jumps dramatically reduce speech intelligibility, likely by interfering with voicing contrasts and normal intonation. We argue that NLP were not eliminated from the human vocal repertoire as we evolved for speech, but only brought under better control. This article is part of the theme issue ‘Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions’.}},
  author    = {{Anikin, Andrey and Reby, David and Pisanski, Katarzyna}},
  issn      = {{1471-2970}},
  keywords  = {{speech; evolution of language; nonlinear vocal phenomena; voice; formant frequencies; vocal membranes}},
  language  = {{eng}},
  publisher = {{Royal Society Publishing}},
  journal   = {{Philosophical Transactions of the Royal Society B: Biological Sciences}},
  title     = {{Nonlinear vocal phenomena and speech intelligibility}},
  url       = {{http://dx.doi.org/10.1098/rstb.2024.0254}},
  doi       = {{10.1098/rstb.2024.0254}},
  year      = {{2025}},
}