
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Interpolation of Perceived Gender in Speech Signals

Hagelborn, Alexander LU and Hulme Geber, Jack (2020) In Master's Theses in Mathematical Sciences FMSM01 20201
Mathematical Statistics
Abstract
For individuals with gender dysphoria, voice therapy can be an important tool for changing characteristics of their voice to align better with their gender identity. This is often done by practising with a speech therapist and can be a long and difficult process. A useful tool in this setting would be software that can generate a voice, based on the patient's voice, which lies slightly closer to their desired voice. The patient could then mimic the generated voice in order to train their own voice.

The purpose of this thesis is to explore how voices can be digitally modified in order to change how their gender is perceived. The aim is to find a method of interpolation where a voice could gradually be modified to sound like a target voice, and where all intermediate points on the path sound natural. Two methods were evaluated, but only one produced results adequate for evaluation with a participant survey.

Survey participants listened to voices that were mixes of female and male voices, and rated on a scale how they perceived the gender and whether the voices sounded natural. The results show that the modified voices sound less natural. On average there is a consensus that the perceived gender is changed; however, the individual participant results show that there is a need for improvement.
Popular Abstract
What if you could talk into a microphone and another person's voice would come out of the speaker? What would a voice in between yours and mine sound like? And what does a voice which is 50% male and 50% female sound like? These questions arose in our thesis, where we explored whether it is possible to gradually change the gender identity of a voice.

Gender dysphoria is a condition of psychological distress caused by a mismatch between a person's gender and biological sex. The voice is an important gender communicator, so people with gender dysphoria often consult a speech therapist for voice therapy. The motivation for our project was to create a tool that could assist in this setting. The idea was that a person could record their voice, which is then played back slightly moved towards their goal voice. The person can then train to sound like this voice.

Voice training takes a lot of practice and effort, and the tool could be helpful since learning can be easier when imitating. In addition, training the voice in smaller steps can prevent straining it.

As you might know, voices are composed of a fundamental tone and tones at multiples of its frequency (harmonics). The male voice is not simply a lower tone than the female voice. In addition to the tone being lower, the relative amplitudes of the harmonics in a male voice differ from those in a female voice. Knowing this, we focused on changing both the tone and the relation between the harmonics of the voices. A tone played by a guitar or piano, however, can also be described as a tone with harmonics in a certain relation. So the question is: how do we change this relation while making all intermediate signals sound human? We tried two different approaches to answer this question.
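To make the idea of a fundamental tone plus harmonics concrete, here is a minimal Python sketch that builds a tone from a fundamental frequency and a list of relative harmonic amplitudes, and blends two such descriptions. It is an illustration only, not the method used in the thesis: the function names, the example frequencies and amplitudes, and the choice to blend the fundamental geometrically and the amplitudes linearly are all assumptions made for this example. As the paragraph above notes, a tone built this way is not guaranteed to sound human.

```python
import numpy as np


def harmonic_tone(f0, amplitudes, duration=0.5, sample_rate=16000):
    """Sum of sinusoids at integer multiples of f0 (hypothetical helper)."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    k = np.arange(1, len(amplitudes) + 1)  # harmonic numbers 1, 2, 3, ...
    amps = np.asarray(amplitudes, dtype=float)
    # Rows are harmonics, columns are samples.
    partials = amps[:, None] * np.sin(2 * np.pi * f0 * k[:, None] * t[None, :])
    return partials.sum(axis=0)


def interpolate_voice(params_a, params_b, alpha):
    """Blend two (f0, amplitudes) pairs; alpha=0 gives A, alpha=1 gives B.

    Assumption: f0 is blended geometrically (evenly spaced in log-frequency)
    and the harmonic amplitudes are blended linearly.
    """
    f0_a, amps_a = params_a
    f0_b, amps_b = params_b
    f0 = f0_a ** (1 - alpha) * f0_b ** alpha
    amps = (1 - alpha) * np.asarray(amps_a, dtype=float) \
        + alpha * np.asarray(amps_b, dtype=float)
    return f0, amps


# Made-up illustrative values, not measurements from the thesis.
voice_a = (120.0, [1.0, 0.6, 0.4, 0.2])
voice_b = (210.0, [1.0, 0.3, 0.5, 0.1])

f0_mid, amps_mid = interpolate_voice(voice_a, voice_b, alpha=0.5)
signal_mid = harmonic_tone(f0_mid, amps_mid)
```

Stepping `alpha` from 0 to 1 in small increments gives a sequence of tones that moves gradually from the first description to the second.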

In our first approach we tried to teach a neural network to extract what makes a voice unique, known as an embedding, and then recreate the voice with only this knowledge. The idea was to then tell the network to create new voices by mixing the embeddings of speakers.

In our second approach we modeled each person's voice production organs using just a recording of their voice. The new voices were then generated by creating voice production organs from a mix of two people.

Out of the two approaches, the second one was the more successful. Using a survey, we were able to determine that people perceived the voices we had created as a mix of male and female characteristics. Future research could focus on improving the naturalness of the voices.
author
Hagelborn, Alexander LU and Hulme Geber, Jack
supervisor
organization
course
FMSM01 20201
year
type
H2 - Master's Degree (Two Years)
subject
keywords
speech modelling, speech morph, interpolation, gender dysphoria
publication/series
Master's Theses in Mathematical Sciences
report number
LUTFMS-3402-2020
ISSN
1404-6342
other publication id
2020:E81
language
English
id
9032262
date added to LUP
2021-05-12 09:59:31
date last changed
2021-06-04 16:34:59
@misc{9032262,
  abstract     = {{For individuals with gender dysphoria, voice therapy can be an important tool to change characteristics about their voice to align better with their gender identity. This is often done by practising with a speech therapist and can be a long and difficult process. A useful tool in this setting would be software that can generate a voice, based on the patient's voice, which lies slightly closer to their desired voice. The patient could then mimic the generated voice in order to train their voice.

The purpose of this thesis is to explore how voices can be digitally modified in order to change how their gender is perceived. The aim is to find a method of interpolation where a voice could gradually be modified to sound like a target voice, and where all intermediate points on the path sound natural. Two methods were evaluated, but only one produced adequate results that were evaluated with a participant survey.

Survey participants listened to voices that are a mix of female and male voices, and rated on a scale how they perceived the gender and if the voices sounded natural. The results show that there is a decrease in how natural the modified voices sound. On average there is a consensus that the perceived gender is changed; however, the individual participant results showed that there is a need for improvement.}},
  author       = {{Hagelborn, Alexander and Hulme Geber, Jack}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Interpolation of Perceived Gender in Speech Signals}},
  year         = {{2020}},
}