Statistical Identification of Pleonastic Pronouns

Stamborg, Marcus; Nugues, Pierre

(2012) The Fourth Swedish Language Technology Conference, p. 67-68

Abstract: This paper describes an algorithm to identify pleonastic pronouns using statistical techniques. The training step uses a coreference-annotated corpus of English and focuses on a set of pronouns such as it. As far as we know, there is no corpus with a pleonastic annotation. The main idea of the algorithm was then to recast the definition of pleonastic pronouns as pronouns that never occur in a coreference chain. We integrated this algorithm in an existing coreference solver (Björkelund and Nugues, 2011) and measured the overall performance gains brought by the removal of pleonastic it. We observed an improvement of 0.42 over a CoNLL score of 59.15. The complete system (Stamborg et al., 2012) participated in the CoNLL 2012 shared task (Pradhan et al., 2012), where it ranked 4th.
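The labeling idea stated in the abstract (a pronoun counts as pleonastic when it never occurs in a coreference chain) can be illustrated with a small sketch. This is not the authors' implementation: the function name `label_pronouns`, the representation of chains as sets of single token indices, and the toy sentence are all assumptions made for illustration, and the features and classifier used in the paper are not shown.

```python
from typing import List, Set, Tuple

# Assumption: the abstract names "it" only as an example of the pronoun set.
TARGET_PRONOUNS = {"it"}


def label_pronouns(tokens: List[str], chains: List[Set[int]]) -> List[Tuple[int, bool]]:
    """Return (token_index, is_pleonastic) for every target pronoun.

    `chains` is a hypothetical, simplified coreference annotation: each chain
    is a set of token indices, one index per mention. A pronoun is labeled
    pleonastic when its index appears in no chain.
    """
    in_some_chain: Set[int] = set().union(*chains)
    return [
        (i, i not in in_some_chain)
        for i, tok in enumerate(tokens)
        if tok.lower() in TARGET_PRONOUNS
    ]


# Toy example: the first "It" is in no chain; the second "it" corefers with "umbrella".
tokens = "It is raining , but I bought an umbrella and it works".split()
chains = [{8, 10}]  # token 8 = "umbrella", token 10 = "it"
labels = label_pronouns(tokens, chains)
print(labels)
```

Labels derived this way could serve as training targets for a statistical classifier, which is the role the abstract assigns to the coreference-annotated corpus.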

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/3191726

author

Stamborg, Marcus and Nugues, Pierre (LU)

publishing date

2012

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer Sciences

host publication

SLTC 2012: The Fourth Swedish Language Technology Conference

pages

67 - 68

publisher

SLTC

conference name

The Fourth Swedish Language Technology Conference

conference location

Lund, Sweden

conference dates

2012-10-24 - 2012-10-26

language

English

LU publication?

yes

id

72f4afad-dfa7-4209-b5ba-d9961fb27163 (old id 3191726)

date added to LUP

2016-04-04 14:13:31

date last changed

2025-04-04 14:57:01

@inproceedings{72f4afad-dfa7-4209-b5ba-d9961fb27163,
  abstract     = {{This paper describes an algorithm to identify pleonastic pronouns using statistical techniques. The training step uses a coreference annotated corpus of English and focuses on a set of pronouns such as it. As far as we know, there is no corpus with a pleonastic annotation. The main idea of the algorithm was then to recast the definition of pleonastic pronouns as pronouns that never occur in a coreference chain. We integrated this algorithm in an existing coreference solver (Bj{\"o}rkelund and Nugues, 2011) and we measured the overall performance gains brought by the pleonastic it removal. We observed an improvement of 0.42 from 59.15 of the CoNLL score. The complete system (Stamborg et al., 2012) participated in the CoNLL 2012 shared task (Pradhan et al., 2012), where it obtained the 4th rank.}},
  author       = {{Stamborg, Marcus and Nugues, Pierre}},
  booktitle    = {{SLTC 2012 : The Fourth Swedish Language Technology Conference}},
  language     = {{eng}},
  pages        = {{67--68}},
  publisher    = {{SLTC}},
  title        = {{Statistical Identification of Pleonastic Pronouns}},
  url          = {{https://lup.lub.lu.se/search/files/19769020/3191727.pdf}},
  year         = {{2012}},
}
