Soundgen : An open-source tool for synthesizing nonverbal vocalizations

Anikin, Andrey

(2019) In Behavior Research Methods 51(2). p.778-792

Abstract: Voice synthesis is a useful method for investigating the communicative role of different acoustic features. Although many text-to-speech systems are available, researchers of human nonverbal vocalizations and bioacousticians may profit from a dedicated simple tool for synthesizing and manipulating natural-sounding vocalizations. Soundgen (https://CRAN.R-project.org/package=soundgen) is an open-source R package that synthesizes nonverbal vocalizations based on meaningful acoustic parameters, which can be specified from the command line or in an interactive app. This tool was validated by comparing the perceived emotion, valence, arousal, and authenticity of 60 recorded human nonverbal vocalizations (screams, moans, laughs, and so on) and their approximate synthetic reproductions. Each synthetic sound was created by manually specifying only a small number of high-level control parameters, such as syllable length and a few anchors for the intonation contour. Nevertheless, the valence and arousal ratings of synthetic sounds were similar to those of the original recordings, and the authenticity ratings were comparable, maintaining parity with the originals for less complex vocalizations. Manipulating the precise acoustic characteristics of synthetic sounds may shed light on the salient predictors of emotion in the human voice. More generally, soundgen may prove useful for any studies that require precise control over the acoustic features of nonspeech sounds, including research on animal vocalizations and auditory perception.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/c6f81cec-b319-45ce-8eaf-215955782de7

author

Anikin, Andrey (LU)

organization

publishing date

2019

type

Contribution to journal

publication status

published

subject

Computer and Information Sciences

in

Behavior Research Methods

volume

51

issue

2

pages

15 pages

publisher

Springer

external identifiers

scopus:85051106705
pmid:30054898

ISSN

1554-3528

DOI

10.3758/s13428-018-1095-7

language

English

LU publication?

yes

id

c6f81cec-b319-45ce-8eaf-215955782de7

date added to LUP

2018-07-29 07:41:00

date last changed

2025-11-12 14:13:55

@article{c6f81cec-b319-45ce-8eaf-215955782de7,
  abstract     = {{Voice synthesis is a useful method for investigating the communicative role of different acoustic features. Although many text-to-speech systems are available, researchers of human nonverbal vocalizations and bioacousticians may profit from a dedicated simple tool for synthesizing and manipulating natural-sounding vocalizations. Soundgen (https://CRAN.R-project.org/package=soundgen) is an open-source R package that synthesizes nonverbal vocalizations based on meaningful acoustic parameters, which can be specified from the command line or in an interactive app. This tool was validated by comparing the perceived emotion, valence, arousal, and authenticity of 60 recorded human nonverbal vocalizations (screams, moans, laughs, and so on) and their approximate synthetic reproductions. Each synthetic sound was created by manually specifying only a small number of high-level control parameters, such as syllable length and a few anchors for the intonation contour. Nevertheless, the valence and arousal ratings of synthetic sounds were similar to those of the original recordings, and the authenticity ratings were comparable, maintaining parity with the originals for less complex vocalizations. Manipulating the precise acoustic characteristics of synthetic sounds may shed light on the salient predictors of emotion in the human voice. More generally, soundgen may prove useful for any studies that require precise control over the acoustic features of nonspeech sounds, including research on animal vocalizations and auditory perception.}},
  author       = {{Anikin, Andrey}},
  issn         = {{1554-3528}},
  language     = {{eng}},
  number       = {{2}},
  pages        = {{778--792}},
  publisher    = {{Springer}},
  journal      = {{Behavior Research Methods}},
  title        = {{Soundgen : An open-source tool for synthesizing nonverbal vocalizations}},
  url          = {{http://dx.doi.org/10.3758/s13428-018-1095-7}},
  doi          = {{10.3758/s13428-018-1095-7}},
  volume       = {{51}},
  year         = {{2019}},
}

Lund University Publications