Coreference Resolution for Swedish and German using Distant Supervision

Wallin, Alexander and Nugues, Pierre (2017). 21st Nordic Conference of Computational Linguistics. In Proceedings of the 21st Nordic Conference of Computational Linguistics, vol. 131.
Abstract
Coreference resolution is the identification of phrases that refer to the same entity in a text. Current techniques to solve coreferences use machine-learning algorithms, which require large annotated data sets. Such annotated resources are not available for most languages today. In this paper, we describe a method for solving coreferences for Swedish and German using distant supervision that does not use manually annotated texts. We generate a weakly labelled training set using parallel corpora, English-Swedish and English-German, where we solve the coreference for English using CoreNLP and transfer it to Swedish and German using word alignments. To carry this out, we identify mentions from dependency graphs in both target languages using hand-written rules. Finally, we evaluate the end-to-end results using the evaluation script from the CoNLL 2012 shared task, for which we obtain a score of 34.98 for Swedish and 13.16 for German and, respectively, 46.73 and 36.98 using gold mentions.
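The transfer step described in the abstract can be illustrated with a minimal sketch. It assumes coreference chains are given as token spans on the English side and that word alignments are (source index, target index) pairs. The function name `project_chains`, the data layout, and the min/max span heuristic are illustrative assumptions, not the authors' implementation, which identifies target-side mentions from dependency graphs with hand-written rules.

```python
from collections import defaultdict


def project_chains(source_chains, alignment):
    """Project coreference chains from source-side to target-side token spans.

    source_chains: dict mapping chain id -> list of (start, end) token spans
                   on the source side, with end inclusive.
    alignment:     iterable of (source_index, target_index) word-alignment pairs.
    Returns a dict mapping chain id -> list of (start, end) target-side spans.
    """
    # Index the alignment by source token for quick lookup.
    src_to_tgt = defaultdict(list)
    for src, tgt in alignment:
        src_to_tgt[src].append(tgt)

    projected = {}
    for chain_id, spans in source_chains.items():
        target_spans = []
        for start, end in spans:
            # Collect every target token aligned to any token of the mention.
            targets = [t for s in range(start, end + 1)
                       for t in src_to_tgt.get(s, [])]
            if targets:
                # Simplifying assumption: the projected mention is the
                # smallest target span covering all aligned tokens.
                target_spans.append((min(targets), max(targets)))
        # A chain needs at least two mentions to express a coreference link.
        if len(target_spans) >= 2:
            projected[chain_id] = target_spans
    return projected


# English: Anna(0) said(1) she(2) left(3)
# Swedish: Anna(0) sa(1) att(2) hon(3) gick(4)
alignment = [(0, 0), (1, 1), (2, 3), (3, 4)]
chains = {0: [(0, 0), (2, 2)]}  # "Anna" and "she" corefer
print(project_chains(chains, alignment))
```

Mentions with no aligned target token are dropped, and chains left with fewer than two mentions are discarded, which is one simple way to keep a weakly labelled set free of dangling links.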
author: Wallin, Alexander and Nugues, Pierre
publishing date: 2017
type: Chapter in Book/Report/Conference proceeding
publication status: published
in: Proceedings of the 21st Nordic Conference of Computational Linguistics
volume: 131
publisher: Linköping University Electronic Press
conference name: 21st Nordic Conference of Computational Linguistics
ISSN: 1650-3686, 1650-3740
ISBN: 978-91-7685-601-7
language: English
LU publication?: yes
id: 8ca7d9c9-0a52-466e-a18b-e85822fe4339
alternative location: http://www.ep.liu.se/ecp/131/006/ecp17131006.pdf
date added to LUP: 2017-05-23 11:41:53
date last changed: 2017-05-23 12:59:48
@inproceedings{8ca7d9c9-0a52-466e-a18b-e85822fe4339,
  abstract     = {Coreference resolution is the identification of phrases that refer to the same entity in a text. Current techniques to solve coreferences use machine-learning algorithms, which require large annotated data sets. Such annotated resources are not available for most languages today. In this paper, we describe a method for solving coreferences for Swedish and German using distant supervision that does not use manually annotated texts. We generate a weakly labelled training set using parallel corpora, English-Swedish and English-German, where we solve the coreference for English using CoreNLP and transfer it to Swedish and German using word alignments. To carry this out, we identify mentions from dependency graphs in both target languages using hand-written rules. Finally, we evaluate the end-to-end results using the evaluation script from the CoNLL 2012 shared task, for which we obtain a score of 34.98 for Swedish and 13.16 for German and, respectively, 46.73 and 36.98 using gold mentions.},
  author       = {Wallin, Alexander and Nugues, Pierre},
  booktitle    = {Proceedings of the 21st Nordic Conference of Computational Linguistics},
  isbn         = {978-91-7685-601-7},
  issn         = {1650-3686},
  language     = {eng},
  publisher    = {Linköping University Electronic Press},
  title        = {Coreference Resolution for Swedish and German using Distant Supervision},
  volume       = {131},
  year         = {2017},
}