Lund University Publications

Connecting a French Dictionary from the Beginning of the 20th Century to Wikidata

Nugues, Pierre (2022). 13th International Conference on Language Resources and Evaluation Conference, LREC 2022. In 2022 Language Resources and Evaluation Conference, LREC 2022, p. 2548-2555
Abstract

The Petit Larousse illustré is a French dictionary first published in 1905. Its division into two main parts, one on language and one on history and geography, marks a major milestone in French lexicography and makes it a repository of general knowledge from this period. Although the value of many entries from 1905 remains intact, some descriptions now have a dimension that is more historical than contemporary. They are nonetheless significant for analyzing and understanding cultural representations from this time. Comparing these entries with more recent information, or verifying them, would require tedious manual work. In this paper, we describe a new lexical resource in which we connected all the dictionary entries of the history and geography part to current data sources. To do so, we linked each of these entries to a Wikidata identifier. Using the Wikidata links, we can more easily automate the identification, comparison, and verification of historically situated representations. We give a few examples of how to process Wikidata identifiers, and we carried out a small analysis of the entities described in the dictionary to outline possible applications. The resource, i.e. the annotation of 20,245 dictionary entries with Wikidata links, is available from GitHub (https://github.com/pnugues/petit_larousse_1905/).
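
As a rough illustration of what processing Wikidata identifiers can involve, the sketch below checks that a string has the shape of a Wikidata item identifier and builds the public Wikidata URLs for it. It is not taken from the paper or from the GitHub repository: the helper names `is_qid` and `entity_urls` and the sample (headword, identifier) pairs are hypothetical, and the actual file layout of the resource may differ.

```python
import re

# Wikidata item identifiers are "Q" followed by a positive integer, e.g. Q90.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def is_qid(value: str) -> bool:
    """Return True if the whole string has the shape of a Wikidata item ID."""
    return QID_PATTERN.fullmatch(value) is not None


def entity_urls(qid: str) -> dict:
    """Build the public Wikidata URLs for one item identifier."""
    if not is_qid(qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    return {
        # Human-readable item page
        "page": f"https://www.wikidata.org/wiki/{qid}",
        # Concept URI used in the RDF/SPARQL data model
        "concept_uri": f"http://www.wikidata.org/entity/{qid}",
        # Machine-readable JSON export of the item
        "json": f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json",
    }


# Hypothetical (headword, identifier) pairs for illustration only;
# they are NOT copied from the published resource.
sample_entries = [("Paris", "Q90"), ("France", "Q142"), ("Inconnu", "")]

# Keep only the entries that carry a well-formed identifier.
linked = {
    headword: entity_urls(qid)
    for headword, qid in sample_entries
    if is_qid(qid)
}
```

With such a mapping in hand, the JSON URL of each linked entry could be fetched to compare the 1905 description with current data; the sketch deliberately stops before any network access.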

author
Nugues, Pierre
organization
publishing date
2022
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
keywords
digital humanities, entity annotation, entity linking
host publication
2022 Language Resources and Evaluation Conference, LREC 2022
series title
2022 Language Resources and Evaluation Conference, LREC 2022
editor
Calzolari, Nicoletta ; Bechet, Frederic ; Blache, Philippe ; Choukri, Khalid ; Cieri, Christopher ; Declerck, Thierry ; Goggi, Sara ; Isahara, Hitoshi ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Helene ; Odijk, Jan and Piperidis, Stelios
pages
8 pages
publisher
European Language Resources Association
conference name
13th International Conference on Language Resources and Evaluation Conference, LREC 2022
conference location
Marseille, France
conference dates
2022-06-20 - 2022-06-25
external identifiers
  • scopus:85144479024
ISBN
9791095546726
language
English
LU publication?
yes
id
d40a62e9-3a50-4301-851a-c370b481ce56
date added to LUP
2023-01-12 10:45:02
date last changed
2023-01-12 10:45:02
@inproceedings{d40a62e9-3a50-4301-851a-c370b481ce56,
  abstract     = {{The Petit Larousse illustré is a French dictionary first published in 1905. Its division into two main parts, one on language and one on history and geography, marks a major milestone in French lexicography and makes it a repository of general knowledge from this period. Although the value of many entries from 1905 remains intact, some descriptions now have a dimension that is more historical than contemporary. They are nonetheless significant for analyzing and understanding cultural representations from this time. Comparing these entries with more recent information, or verifying them, would require tedious manual work. In this paper, we describe a new lexical resource in which we connected all the dictionary entries of the history and geography part to current data sources. To do so, we linked each of these entries to a Wikidata identifier. Using the Wikidata links, we can more easily automate the identification, comparison, and verification of historically situated representations. We give a few examples of how to process Wikidata identifiers, and we carried out a small analysis of the entities described in the dictionary to outline possible applications. The resource, i.e. the annotation of 20,245 dictionary entries with Wikidata links, is available from GitHub (https://github.com/pnugues/petit_larousse_1905/).}},
  author       = {{Nugues, Pierre}},
  booktitle    = {{2022 Language Resources and Evaluation Conference, LREC 2022}},
  editor       = {{Calzolari, Nicoletta and Bechet, Frederic and Blache, Philippe and Choukri, Khalid and Cieri, Christopher and Declerck, Thierry and Goggi, Sara and Isahara, Hitoshi and Maegaard, Bente and Mariani, Joseph and Mazo, Helene and Odijk, Jan and Piperidis, Stelios}},
  isbn         = {{9791095546726}},
  keywords     = {{digital humanities; entity annotation; entity linking}},
  language     = {{eng}},
  pages        = {{2548--2555}},
  publisher    = {{European Language Resources Association}},
  series       = {{2022 Language Resources and Evaluation Conference, LREC 2022}},
  title        = {{Connecting a French Dictionary from the Beginning of the 20th Century to Wikidata}},
  year         = {{2022}},
}