Constructing Large Multilingual Proposition Databases
(2016)
- Abstract
- This thesis explores methods for generating proposition databases in a large-scale and multilingual setting. Our methods are centered on using semantic role labeling for extracting predicate-argument structures, and the subsequent transformation of such structures for knowledge base population and generation. By extending semantic role labeling with entity detection, we demonstrate how predicate-argument structures can be transformed to represent real-world concepts and also act as a bridge connecting relational facts in multiple languages.
We introduce a framework, KOSHIK, for large-scale extraction of propositions from unstructured text and an annotation model for the incremental addition of annotation layers. In addition, we introduce an alignment method based on entities for aligning disparate ontologies and also for generating ontologies for new proposition databases. Using KOSHIK, we perform large-scale natural language processing of the entire English, Swedish, and French editions of Wikipedia. By transforming the structures extracted from Wikipedias, we extend existing knowledge bases in addition to generating new proposition databases. We demonstrate how generated proposition databases in Swedish and French can be used to effectively train semantic role labelers.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/4def1d7b-1f45-4a95-9f7c-b2f45cfdb04a
- author
- Exner, Peter LU
- supervisor
- Pierre Nugues LU
- Görel Hedin LU
- opponent
- Professor Sebastian Pado, Universität Stuttgart, Germany
- organization
- publishing date
- 2016-10-14
- type
- Thesis
- publication status
- published
- pages
- 116 pages
- defense location
- Lecture hall E:1406, E-building, Ole Römers väg 3, Lund University, Faculty of Engineering
- defense date
- 2016-10-14 13:00:00
- ISBN
- 978-91-7623-955-1
- 978-91-7623-954-4
- language
- English
- LU publication?
- yes
- id
- 4def1d7b-1f45-4a95-9f7c-b2f45cfdb04a
- date added to LUP
- 2016-09-19 11:04:21
- date last changed
- 2021-05-06 08:28:09
@phdthesis{4def1d7b-1f45-4a95-9f7c-b2f45cfdb04a,
  abstract = {{This thesis explores methods for generating proposition databases in a large-scale and multilingual setting. Our methods are centered on using semantic role labeling for extracting predicate-argument structures, and the subsequent transformation of such structures for knowledge base population and generation. By extending semantic role labeling with entity detection, we demonstrate how predicate-argument structures can be transformed to represent real-world concepts and also act as a bridge connecting relational facts in multiple languages. We introduce a framework, KOSHIK, for large-scale extraction of propositions from unstructured text and an annotation model for the incremental addition of annotation layers. In addition, we introduce an alignment method based on entities for aligning disparate ontologies and also for generating ontologies for new proposition databases. Using KOSHIK, we perform large-scale natural language processing of the entire English, Swedish, and French editions of Wikipedia. By transforming the structures extracted from Wikipedias, we extend existing knowledge bases in addition to generating new proposition databases. We demonstrate how generated proposition databases in Swedish and French can be used to effectively train semantic role labelers.}},
  author = {{Exner, Peter}},
  isbn = {{978-91-7623-955-1}},
  language = {{eng}},
  month = {{10}},
  school = {{Lund University}},
  title = {{Constructing Large Multilingual Proposition Databases}},
  year = {{2016}},
}