Burcas : A Simple Concatenation-based MIDI-to-Singing Voice Synthesis System for Swedish

Uneson, Marcus

Uneson, Marcus (2002)
General Linguistics

Abstract: After a brief outlook on the field of concatenative synthesis of singing, with emphasis on how it differs from synthesis of speech, the present paper gives an overview of a simple system for singing synthesis in Swedish based on concatenation of diphones. The system, called Burcas, accepts two inputs: a text file of lyrics, from which it extracts a target phoneme sequence using basic letter-to-sound conversion, and a MIDI file (possibly holding multiple parts), from which it extracts melodic information, i.e. note duration and frequency. After associating a syllable (or a part of a syllable) with each note, a simple model of segment durations is used to calculate the duration of each segment of the syllable. Finally, the segment data are used as control parameters (allophone, duration, frequency) for the MBROLA speech generator, which outputs sound files in a standard format, given a suitable diphone database. In a concluding section, the far more sophisticated corpus-based approach to concatenative synthesis of singing is considered.
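The pipeline described in the abstract (note to frequency, note duration split over segments, one control line per segment for MBROLA) can be illustrated with a minimal Python sketch. This is not code from Burcas: the duration model (a fixed consonant length, with vowels sharing the rest of the note), the phoneme labels, and all function names are assumptions made for illustration. The output follows the general shape of MBROLA `.pho` input, where each line gives a phoneme, a duration in milliseconds, and optional pairs of position (percent) and pitch (Hz).

```python
# Illustrative sketch only; names and the duration model are hypothetical,
# not taken from the thesis.

VOWELS = {"a", "e", "i", "o", "u"}   # placeholder phoneme labels (assumption)
CONSONANT_MS = 60                    # assumed fixed consonant duration


def midi_to_hz(note: int) -> float:
    """Equal-tempered frequency of a MIDI note number (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def segment_durations(phonemes, note_ms):
    """Split one note's duration over the segments of its syllable.

    Assumed model: consonants get a short fixed duration (capped so they
    never use more than half the note); vowels share whatever remains.
    """
    if not phonemes:
        return []
    n_vowels = sum(p in VOWELS for p in phonemes)
    n_cons = len(phonemes) - n_vowels
    if n_vowels == 0:
        # No vowel to stretch: divide the note evenly.
        return [note_ms / len(phonemes)] * len(phonemes)
    cons_ms = min(CONSONANT_MS, note_ms / (2 * max(n_cons, 1)))
    vowel_ms = (note_ms - cons_ms * n_cons) / n_vowels
    return [vowel_ms if p in VOWELS else cons_ms for p in phonemes]


def to_pho(phonemes, note, note_ms):
    """Build .pho-style lines: phoneme, duration (ms), flat pitch at 0% and 100%."""
    hz = round(midi_to_hz(note))
    return [
        f"{p} {round(d)} 0 {hz} 100 {hz}"
        for p, d in zip(phonemes, segment_durations(phonemes, note_ms))
    ]


if __name__ == "__main__":
    # The syllable "la" sung on A4 for half a second.
    print("\n".join(to_pho(["l", "a"], 69, 500)))
```

Holding the pitch flat across each segment is the simplest possible choice; a real singing synthesizer would also need to model transitions between notes, vibrato, and language-specific segment timing.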

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/1330121

author

Uneson, Marcus

supervisor

Joost van de Weijer

organization

General Linguistics

year

2002

type

H1 - Master's Degree (One Year)

subject

Languages and Literatures

keywords

Synthetic singing based on the Swedish language, Singing synthesis, Burcas, Letter - sound - song, The singing voice, Phonetics, phonology

language

English

id

1330121

date added to LUP

2005-05-30 00:00:00

date last changed

2005-07-01 00:00:00

@misc{1330121,
  abstract     = {{After a brief outlook on the field of concatenative synthesis of singing, with emphasis on the differences in comparison to synthesis of speech, the present paper gives an overview of a simple system for singing synthesis in Swedish based on concatenation of diphones. The system, called Burcas, accepts as input a text file for lyrics, from which it extracts a target phoneme sequence using basic letter-to-sound conversion, and a MIDI file (possibly holding multiple parts), from which it extracts melodical information, i.e. note duration and frequency. After associating a syllable (or a part of a syllable) to each note, a simple model of segment durations is used to calculate the duration of each segment of the syllable. Finally, segment data are then used as control parameters (allophone, duration, frequency) for the MBROLA speech generator. The speech generator outputs sound files in standard format, given a suitable diphone database. In a concluding section, the far more sophisticated corpus-based approach to concatenative synthesis of singing is considered.}},
  author       = {{Uneson, Marcus}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Burcas : A Simple Concatenation-based MIDI-to-Singing Voice Synthesis System for Swedish}},
  year         = {{2002}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
