Coreference Resolution for Swedish and German using Distant Supervision

Wallin, Alexander and Nugues, Pierre (2017) 21st Nordic Conference of Computational Linguistics. In Linköping Electronic Conference Proceedings 131.
Abstract
Coreference resolution is the identification of phrases that refer to the same entity in a text. Current techniques to solve coreferences use machine-learning algorithms, which require large annotated data sets. Such annotated resources are not available for most languages today. In this paper, we describe a method for solving coreferences for Swedish and German using distant supervision that does not use manually annotated texts. We generate a weakly labelled training set using parallel corpora, English-Swedish and English-German, where we solve the coreference for English using CoreNLP and transfer it to Swedish and German using word alignments. To carry this out, we identify mentions from dependency graphs in both target languages using hand-written rules. Finally, we evaluate the end-to-end results using the evaluation script from the CoNLL 2012 shared task, for which we obtain a score of 34.98 for Swedish and 13.16 for German and, respectively, 46.73 and 36.98 using gold mentions.
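The transfer step described in the abstract can be illustrated with a minimal sketch. It assumes that English coreference chains, word alignments, and candidate target-language mentions are already available as plain Python structures. The function name `project_chains`, the data layout, and the largest-overlap matching heuristic are hypothetical choices made for illustration. They are not taken from the paper, which may select and filter mentions differently.

```python
from collections import defaultdict


def project_chains(en_chains, alignment, target_mentions):
    """Project English coreference chains onto target-language mentions.

    en_chains: dict mapping chain id -> list of (start, end) English token
        spans, end exclusive.
    alignment: iterable of (english_index, target_index) word-alignment pairs.
    target_mentions: list of (start, end) candidate target spans, end exclusive.

    Assumption: each English mention is mapped to the candidate target
    mention that covers the most aligned tokens (illustrative heuristic).
    """
    en_to_tgt = defaultdict(set)
    for en_i, tgt_i in alignment:
        en_to_tgt[en_i].add(tgt_i)

    projected = defaultdict(list)
    for chain_id, spans in en_chains.items():
        for start, end in spans:
            # Collect every target token aligned to a token of this mention.
            aligned = set()
            for i in range(start, end):
                aligned |= en_to_tgt.get(i, set())
            if not aligned:
                continue  # unaligned mention: nothing to project
            best, best_overlap = None, 0
            for mention in target_mentions:
                overlap = len(aligned & set(range(mention[0], mention[1])))
                if overlap > best_overlap:
                    best, best_overlap = mention, overlap
            if best is not None and best not in projected[chain_id]:
                projected[chain_id].append(best)

    # A chain needs at least two mentions to express a coreference link.
    return {cid: ms for cid, ms in projected.items() if len(ms) >= 2}


# Toy example: "Anna said she left" / "Anna sa att hon gick"
en_chains = {0: [(0, 1), (2, 3)]}                 # Anna ... she
alignment = [(0, 0), (1, 1), (2, 3), (3, 4)]      # English -> Swedish tokens
target_mentions = [(0, 1), (3, 4)]                # Anna, hon
print(project_chains(en_chains, alignment, target_mentions))
```

In this toy example the English chain linking "Anna" and "she" is carried over to the Swedish mentions "Anna" and "hon". Chains that end up with fewer than two projected mentions are dropped, since they no longer provide a weak label.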
author
Wallin, Alexander and Nugues, Pierre
publishing date
2017
type
Chapter in Book/Report/Conference proceeding
publication status
published
host publication
Proceedings of the 21st Nordic Conference of Computational Linguistics
series title
Linköping Electronic Conference Proceedings
volume
131
publisher
Linköping University Electronic Press
conference name
21st Nordic Conference of Computational Linguistics
conference location
Gothenburg, Sweden
conference dates
2017-05-23 - 2017-05-24
external identifiers
  • scopus:85122991305
ISSN
1650-3740
1650-3686
ISBN
978-91-7685-601-7
language
English
LU publication?
yes
id
8ca7d9c9-0a52-466e-a18b-e85822fe4339
alternative location
http://www.ep.liu.se/ecp/131/006/ecp17131006.pdf
date added to LUP
2017-05-23 11:41:53
date last changed
2024-05-12 14:30:27
@inproceedings{8ca7d9c9-0a52-466e-a18b-e85822fe4339,
  abstract     = {{Coreference resolution is the identification of phrases that refer to the same entity in a text. Current techniques to solve coreferences use machine-learning algorithms, which require large annotated data sets. Such annotated resources are not available for most languages today. In this paper, we describe a method for solving coreferences for Swedish and German using distant supervision that does not use manually annotated texts. We generate a weakly labelled training set using parallel corpora, English-Swedish and English-German, where we solve the coreference for English using CoreNLP and transfer it to Swedish and German using word alignments. To carry this out, we identify mentions from dependency graphs in both target languages using hand-written rules. Finally, we evaluate the end-to-end results using the evaluation script from the CoNLL 2012 shared task, for which we obtain a score of 34.98 for Swedish and 13.16 for German and, respectively, 46.73 and 36.98 using gold mentions.}},
  author       = {{Wallin, Alexander and Nugues, Pierre}},
  booktitle    = {{Proceedings of the 21st Nordic Conference of Computational Linguistics}},
  isbn         = {{978-91-7685-601-7}},
  issn         = {{1650-3740}},
  language     = {{eng}},
  publisher    = {{Linköping University Electronic Press}},
  series       = {{Linköping Electronic Conference Proceedings}},
  title        = {{Coreference Resolution for Swedish and German using Distant Supervision}},
  url          = {{http://www.ep.liu.se/ecp/131/006/ecp17131006.pdf}},
  volume       = {{131}},
  year         = {{2017}},
}