Knowledge-light Letter-to-Sound Conversion for Swedish with FST and TBL
(2006) Fonetik 2006 p.141-144
- Abstract
- This paper describes some exploratory attempts to apply a combination of finite state
transducers (FST) and transformation-based learning (TBL, Brill 1992) to the problem of
letter-to-sound (LTS) conversion for Swedish. Following Bouma (2000) for Dutch, we employ
FST for segmentation of the textual input into groups of letters and a first transcription stage;
we feed the output of this step into a TBL system. With this setup, we reach 96.2% correctly
transcribed segments with rather restricted means (a small set of hand-crafted rules for the
FST stage; a set of 12 templates and a training set of 30kw for the TBL stage).
Observing that quantity is the major error source and that compound morpheme
boundaries can be useful for inferring quantity, we exploratively add good precision-low
recall compound splitting based on graphotactic constraints. With this simple-minded
method, targeting only a subset of the compounds, performance improves to 96.9%.
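The two-stage setup described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: a greedy longest-match segmenter stands in for the FST stage, and a single hand-written rule stands in for a rule that a TBL system might learn. The letter-group table, the symbols, the `Rule` type and the function names are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical letter groups with default symbols (toy stand-in for the FST stage).
DEFAULT_SYMBOLS = {"ck": "k", "sj": "S", "t": "t", "a": "a", "k": "k"}


class Rule(NamedTuple):
    """One toy transformation: rewrite `old` as `new` before letter group `next_group`."""
    old: str
    new: str
    next_group: str


# Hypothetical rule of the kind a TBL stage might learn: "a" is long before a single "k".
RULES = [Rule(old="a", new="a:", next_group="k")]


def segment(word: str) -> List[Tuple[str, str]]:
    """Greedy longest-match split into (letter group, default symbol) pairs."""
    max_len = max(len(group) for group in DEFAULT_SYMBOLS)
    segments = []
    i = 0
    while i < len(word):
        for size in range(min(max_len, len(word) - i), 0, -1):
            group = word[i:i + size]
            if group in DEFAULT_SYMBOLS:
                segments.append((group, DEFAULT_SYMBOLS[group]))
                i += size
                break
        else:
            # Unknown letter: pass it through unchanged.
            segments.append((word[i], word[i]))
            i += 1
    return segments


def transcribe(word: str, rules: List[Rule] = RULES) -> List[str]:
    """Apply the rules in order to the default transcription."""
    segments = segment(word)
    groups = [group for group, _ in segments]
    symbols = [symbol for _, symbol in segments]
    for rule in rules:
        for i in range(len(symbols) - 1):
            if symbols[i] == rule.old and groups[i + 1] == rule.next_group:
                symbols[i] = rule.new
    return symbols


print(transcribe("tak"))   # vowel rewritten as long before single "k"
print(transcribe("tack"))  # vowel left short before the group "ck"
```

In this toy, the letter grouping from the first stage is what lets the second stage tell the two words apart, which mirrors why the abstract treats segmentation and quantity as closely linked.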
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/538838
- author
- Uneson, Marcus LU
- organization
- publishing date
- 2006
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- LTS, Swedish, grapheme-to-phoneme conversion for Swedish, letter-to-sound conversion for Swedish
- host publication
- Proceedings of Fonetik 2006
- editor
- Ambrazaitis, Gilbert and Schötz, Susanne
- pages
- 141 - 144
- publisher
- Lund University
- conference name
- Fonetik 2006
- conference location
- Lund, Sweden
- conference dates
- 2006-06-07 - 2006-06-09
- language
- English
- LU publication?
- yes
- additional info
- The information about affiliations in this record was updated in December 2015. The record was previously connected to the following departments: Linguistics and Phonetics (015010003)
- id
- 8a7843d7-9be3-47bd-8d0a-84e299613bf3 (old id 538838)
- date added to LUP
- 2016-04-04 10:31:24
- date last changed
- 2018-11-21 20:59:14
@inproceedings{8a7843d7-9be3-47bd-8d0a-84e299613bf3,
  abstract     = {{This paper describes some exploratory attempts to apply a combination of finite state
                  transducers (FST) and transformation-based learning (TBL, Brill 1992) to the problem of
                  letter-to-sound (LTS) conversion for Swedish. Following Bouma (2000) for Dutch, we employ
                  FST for segmentation of the textual input into groups of letters and a first transcription stage;
                  we feed the output of this step into a TBL system. With this setup, we reach 96.2\% correctly
                  transcribed segments with rather restricted means (a small set of hand-crafted rules for the
                  FST stage; a set of 12 templates and a training set of 30kw for the TBL stage).
                  Observing that quantity is the major error source and that compound morpheme
                  boundaries can be useful for inferring quantity, we exploratively add good precision-low
                  recall compound splitting based on graphotactic constraints. With this simple-minded
                  method, targeting only a subset of the compounds, performance improves to 96.9\%.}},
  author       = {{Uneson, Marcus}},
  booktitle    = {{Proceedings of Fonetik 2006}},
  editor       = {{Ambrazaitis, Gilbert and Schötz, Susanne}},
  keywords     = {{LTS; Swedish; grapheme-to-phoneme conversion for Swedish; letter-to-sound conversion for Swedish}},
  language     = {{eng}},
  pages        = {{141--144}},
  publisher    = {{Lund University}},
  title        = {{Knowledge-light Letter-to-Sound Conversion for Swedish with FST and TBL}},
  url          = {{https://lup.lub.lu.se/search/files/5558836/625914.pdf}},
  year         = {{2006}},
}