Mining semantics for culturomics: towards a knowledge-based approach

Borin, Lars; Dubhashi, Devdatt; Forsberg, Markus; Johansson, Richard; Kokkinakis, Dimitrios; Nugues, Pierre

(2013) International workshop on Mining unstructured big data using natural language processing, MNLP '13 p.3-10

Abstract: The massive amounts of text data made available through the Google Books digitization project have inspired a new field of big-data textual research. Named culturomics, this field has attracted the attention of a growing number of scholars over recent years. However, initial studies based on these data have been criticized for not referring to relevant work in linguistics and language technology. This paper provides some ideas, thoughts and first steps towards a new culturomics initiative, based this time on Swedish data, which pursues a more knowledge-based approach than previous work in this emerging field. The amount of new Swedish text produced daily and older texts being digitized in cultural heritage projects grows at an accelerating rate. These volumes of text being available in digital form have grown far beyond the capacity of human readers, leaving automated semantic processing of the texts as the only realistic option for accessing and using the information contained in them. The aim of our recently initiated research program is to advance the state of the art in language technology resources and methods for semantic processing of Big Swedish text and focus on the theoretical and methodological advancement of the state of the art in extracting and correlating information from large volumes of Swedish text using a combination of knowledge-based and statistical methods.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/4191700

author

Borin, Lars; Dubhashi, Devdatt; Forsberg, Markus; Johansson, Richard; Kokkinakis, Dimitrios; Nugues, Pierre

publishing date

2013

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer Sciences

host publication

UnstructureNLP '13 Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing

pages

3 - 10

publisher

Association for Computing Machinery (ACM)

conference name

International workshop on Mining unstructured big data using natural language processing, MNLP '13

conference dates

2013-10-28

external identifiers

scopus:84889587228

ISBN

978-1-4503-2415-1

DOI

10.1145/2513549.2513551

language

English

LU publication?

yes

id

3e68bb1b-b5d9-4bc5-b5c5-76a2f470efac (old id 4191700)

date added to LUP

2016-04-04 14:30:14

date last changed

2025-10-14 13:03:16

@inproceedings{3e68bb1b-b5d9-4bc5-b5c5-76a2f470efac,
  abstract     = {{The massive amounts of text data made available through the Google Books digitization project have inspired a new field of big-data textual research. Named culturomics, this field has attracted the attention of a growing number of scholars over recent years. However, initial studies based on these data have been criticized for not referring to relevant work in linguistics and language technology. This paper provides some ideas, thoughts and first steps towards a new culturomics initiative, based this time on Swedish data, which pursues a more knowledge-based approach than previous work in this emerging field. The amount of new Swedish text produced daily and older texts being digitized in cultural heritage projects grows at an accelerating rate. These volumes of text being available in digital form have grown far beyond the capacity of human readers, leaving automated semantic processing of the texts as the only realistic option for accessing and using the information contained in them. The aim of our recently initiated research program is to advance the state of the art in language technology resources and methods for semantic processing of Big Swedish text and focus on the theoretical and methodological advancement of the state of the art in extracting and correlating information from large volumes of Swedish text using a combination of knowledge-based and statistical methods.}},
  author       = {{Borin, Lars and Dubhashi, Devdatt and Forsberg, Markus and Johansson, Richard and Kokkinakis, Dimitrios and Nugues, Pierre}},
  booktitle    = {{UnstructureNLP '13 Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing}},
  isbn         = {{978-1-4503-2415-1}},
  language     = {{eng}},
  pages        = {{3--10}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  title        = {{Mining semantics for culturomics: towards a knowledge-based approach}},
  url          = {{https://lup.lub.lu.se/search/files/19729378/4191706.pdf}},
  doi          = {{10.1145/2513549.2513551}},
  year         = {{2013}},
}
