Semantic Similarity Analysis on English Translations of the Iliad

Bijkerk, Maria

Bijkerk, Maria (LU) (2024) DABN01 20241
Department of Economics
Department of Statistics

Abstract: Studying translations gives us more insight into cultures and languages. Machine Translation is an application area of the field of Natural Language Processing (NLP), used to transfer information from one language to another. Creating these tools requires a lot of data, including data about the semantic relationships of the texts, and for languages that are no longer spoken, like Ancient Greek, little (digital) data exists. In this study, we explore 16 different English translations of the first book of the Iliad, an Ancient Greek epic seen as one of the most influential literary works on modern western literature. We use three different algorithms (GloVe, Word2Vec, and BERT) to create document embeddings for each translation. We then analyse how three features (publication year, genre, name versions) influence the cosine similarity scores between the documents. We also use hierarchical clustering to group the translations together without needing a pre-determined number of clusters, to see how the full document embeddings relate to each other. We find that the publication year does not have a significant influence on the similarity scores, but the genre and name versions do seem to have a significant influence.
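
The two analysis steps named in the abstract, cosine similarity between document embeddings and hierarchical clustering, can be illustrated with a minimal sketch. This is not the thesis's implementation: the three-dimensional vectors are made-up toy values standing in for real GloVe, Word2Vec, or BERT document embeddings, the names `translation_a` to `translation_c` are hypothetical, and average linkage on cosine distance is an assumption, since the abstract does not state which linkage was used.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_linkage_merges(embeddings):
    """Agglomerative clustering on cosine distance (1 - similarity).

    Returns the merges in the order they happen, as tuples
    (cluster_1, cluster_2, distance). No cluster count is fixed in advance;
    merging simply continues until one cluster remains.
    """
    clusters = [(name,) for name in embeddings]

    def dist(c1, c2):
        # Average pairwise cosine distance between members of two clusters
        # (average linkage; an assumption, not taken from the thesis).
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(
            1 - cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs
        )
        return total / len(pairs)

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy "document embeddings" (made-up numbers, hypothetical translation names).
embeddings = {
    "translation_a": [1.0, 0.0, 0.0],
    "translation_b": [0.9, 0.1, 0.0],
    "translation_c": [0.0, 1.0, 0.0],
}

merges = average_linkage_merges(embeddings)
for c1, c2, d in merges:
    print(c1, "+", c2, "at cosine distance", round(d, 3))
```

Reading the merge list from first to last gives the dendrogram from the bottom up: the earliest merges join the most similar translations, and cutting the list at any distance threshold yields a grouping without choosing the number of clusters beforehand.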

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9156845

author

Bijkerk, Maria (LU)

supervisor

Jakob Bergman

organization

course

DABN01 20241

year

2024

type

H1 - Master's Degree (One Year)

subject

Business and Economics

keywords

natural language processing, textual analysis, document embedding, multidimensional scaling, hierarchical clustering

language

English

id

9156845

date added to LUP

2024-09-24 08:32:33

date last changed

2024-09-24 08:32:33

@misc{9156845,
  abstract     = {{Studying translations gives us more insight into cultures and languages. Machine Translation is an application area of the field of Natural Language Processing (NLP), used to transfer information from one language to another. Creating these tools requires a lot of data, including data about the semantic relationships of the texts, and for languages that are no longer spoken, like Ancient Greek, little (digital) data exists. In this study, we explore 16 different English translations of the first book of the Iliad, an Ancient Greek epic seen as one of the most influential literary works on modern western literature. We use three different algorithms (GloVe, Word2Vec, and BERT) to create document embeddings for each translation. We then analyse how three features (publication year, genre, name versions) influence the cosine similarity scores between the documents. We also use hierarchical clustering to group the translations together without needing a pre-determined number of clusters, to see how the full document embeddings relate to each other. We find that the publication year does not have a significant influence on the similarity scores, but the genre and name versions do seem to have a significant influence.}},
  author       = {{Bijkerk, Maria}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Semantic Similarity Analysis on English Translations of the Iliad}},
  year         = {{2024}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
