Lund University Publications

PAL, a tool for pre-annotation and active learning

Skeppstedt, Maria ; Paradis, Carita and Kerren, Andreas (2017) In Journal for Language Technology and Computational Linguistics 31(1). p.91-110
Abstract
Many natural language processing systems rely on machine learning models that are trained on large amounts of manually annotated text data. The lack of sufficient amounts of annotated data is, however, a common obstacle for such systems, since manual annotation of text is often expensive and time-consuming. The aim of “PAL, a tool for Pre-annotation and Active Learning” is to provide a ready-made package that can be used to simplify annotation and to reduce the amount of annotated data required to train a machine learning classifier. The package provides support for two techniques that have been shown to be successful in previous studies, namely active learning and pre-annotation. The output of the pre-annotation is provided in the annotation format of the annotation tool BRAT, but PAL is a stand-alone package that can be adapted to other formats.
author
Skeppstedt, Maria ; Paradis, Carita and Kerren, Andreas
organization
publishing date
2017
type
Contribution to journal
publication status
published
subject
in
Journal for Language Technology and Computational Linguistics
volume
31
issue
1
pages
19 pages
ISSN
2190-6858
project
StaViCTA - Advances in the description and explanation of stance in discourse using visual and computational text analytics
language
English
LU publication?
yes
id
63763f56-a18c-47bb-9ede-a1f2f225778c
date added to LUP
2017-05-15 20:50:16
date last changed
2019-03-08 02:29:00
@article{63763f56-a18c-47bb-9ede-a1f2f225778c,
  abstract     = {{Many natural language processing systems rely on machine learning models that are trained on large amounts of manually annotated text data. The lack of sufficient amounts of annotated data is, however, a common obstacle for such systems, since manual annotation of text is often expensive and time-consuming. The aim of “PAL, a tool for Pre-annotation and Active Learning” is to provide a ready-made package that can be used to simplify annotation and to reduce the amount of annotated data required to train a machine learning classifier. The package provides support for two techniques that have been shown to be successful in previous studies, namely active learning and pre-annotation. The output of the pre-annotation is provided in the annotation format of the annotation tool BRAT, but PAL is a stand-alone package that can be adapted to other formats.}},
  author       = {{Skeppstedt, Maria and Paradis, Carita and Kerren, Andreas}},
  issn         = {{2190-6858}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{91--110}},
  journal      = {{Journal for Language Technology and Computational Linguistics}},
  title        = {{PAL, a tool for pre-annotation and active learning}},
  volume       = {{31}},
  year         = {{2017}},
}