Comparison of traditional machine learning algorithms with large language models for developing personalized recommender systems for enhancing passenger experience on flights

Mähl, Joshua


Mähl, Joshua ^LU (2024) DABN01 20241
Department of Economics
Department of Statistics

Abstract: This thesis compares traditional machine learning algorithms and large language models (LLMs) in developing personalized recommender systems to enhance passenger experiences on flights. Traditional methods like collaborative filtering have been widely used but face challenges such as data sparsity and cold-start problems. LLMs, with their advanced natural language processing capabilities, offer promising solutions to these issues. The study uses a synthetic dataset simulating unary passenger interactions with in-flight entertainment and services, including features like movie preferences, meal choices, and beverage selections. Traditional models, including cosine similarity, singular value decomposition and Bayesian personalized rankings, were benchmarked against a recommendation system developed using the GPT-3.5 Turbo model. Evaluation metrics such as accuracy, precision, recall, F1-score, diversity, and coverage were used to assess performance.
Traditional models showed high precision but struggled with recall, leading to lower coverage and diversity. In contrast, the LLM-based system demonstrated superior recall and diversity. This indicates the LLM's effectiveness in identifying a broader range of relevant items, even with sparse data. However, the LLM approach also had trade-offs, including a higher rate of false positives and still significant popularity bias. Despite these challenges, the LLM outperformed traditional methods in providing diverse and accurate recommendations.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9161962

author: Mähl, Joshua ^LU
supervisor: Simon Reese ^LU
organization:
course: DABN01 20241
year: 2024
type: H1 - Master's Degree (One Year)
subject: Business and Economics
keywords: recommendation systems, large language models, collaborative filtering
language: English
id: 9161962
date added to LUP: 2024-09-24 08:35:47
date last changed: 2024-09-24 08:35:47

@misc{9161962,
  abstract     = {{This thesis compares traditional machine learning algorithms and large language models (LLMs) in developing personalized recommender systems to enhance passenger experiences on flights. Traditional methods like collaborative filtering have been widely used but face challenges such as data sparsity and cold-start problems. LLMs, with their advanced natural language processing capabilities, offer promising solutions to these issues. The study uses a synthetic dataset simulating unary passenger interactions with in-flight entertainment and services, including features like movie preferences, meal choices, and beverage selections. Traditional models, including cosine similarity, singular value decomposition and Bayesian personalized rankings, were benchmarked against a recommendation system developed using the GPT-3.5 Turbo model. Evaluation metrics such as accuracy, precision, recall, F1-score, diversity, and coverage were used to assess performance.
Traditional models showed high precision but struggled with recall, leading to lower coverage and diversity. In contrast, the LLM-based system demonstrated superior recall and diversity. This indicates the LLM's effectiveness in identifying a broader range of relevant items, even with sparse data. However, the LLM approach also had trade-offs, including a higher rate of false positives and still significant popularity bias. Despite these challenges, the LLM outperformed traditional methods in providing diverse and accurate recommendations.}},
  author       = {{Mähl, Joshua}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Comparison of traditional machine learning algorithms with large language models for developing personalized recommender systems for enhancing passenger experience on flights}},
  year         = {{2024}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
