
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Decoding the News with Transformers

Nordansjö, William LU and Fourong, Fredrik (2025) DABN01 20251
Department of Statistics
Department of Economics
Abstract
This thesis evaluates an ensemble-based framework for predicting stock prices using pretrained transformer models for sentiment analysis and locally trained transformers for time series forecasting. Sentiment signals were generated from financial news using FinBERT, RoBERTa, and VADER, then combined with historical stock price data from Refinitiv Eikon. To enhance label quality, a custom ensemble labeling pipeline was developed, combining weak keyword-based labeling methods with the spaCy transformer-based named entity recognition model, using weighted voting. Manual evaluation of 1,400 article-label pairs ensured a balance between precision and recall. The study focuses on "The Magnificent Seven" companies and evaluates 42 model configurations, combining three sentiment models, two transformer architectures, and seven company-specific datasets. Forecasting was performed using Informer and PatchTST, with hyperparameter tuning. Evaluation included regression and directional metrics as well as a modified Diebold-Mariano test. Due to the inclusion of multiple company-specific models, results are also reported as average scores across the seven companies. Results show that PatchTST consistently outperformed Informer in regression tasks, while Informer achieved marginally higher directional accuracy. These outcomes were partly driven by systematic differences in sentiment strength across models, with FinBERT outputting more optimistic signals than RoBERTa and VADER. Key contributions include the end-to-end use of state-of-the-art transformers, a custom ensemble labeling pipeline, and extensive empirical evaluation.
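The weighted voting step mentioned in the abstract could look roughly like the sketch below. This is only an illustration under stated assumptions, not the thesis's actual pipeline: the labeler names, the weights, the 0.5 acceptance threshold, and the function `weighted_vote` are all hypothetical, and the record does not say how the authors set or tuned them.

```python
from collections import defaultdict


def weighted_vote(votes, weights, threshold=0.5):
    """Combine per-labeler company votes into a single label.

    votes:     dict of labeler name -> company label, or None to abstain
    weights:   dict of labeler name -> non-negative weight (hypothetical values)
    threshold: minimum share of the total weight the winner must hold

    Returns the winning label, or None if nobody voted or the winner's
    share of the total weight is below the threshold.
    """
    totals = defaultdict(float)
    for labeler, label in votes.items():
        if label is not None:
            totals[label] += weights[labeler]

    if not totals:
        return None  # every labeler abstained

    # Abstaining labelers still count toward the total weight, so a lone
    # low-weight vote cannot decide the label on its own.
    total_weight = sum(weights[labeler] for labeler in votes)
    best = max(totals, key=totals.get)
    return best if totals[best] / total_weight >= threshold else None


# Hypothetical labelers: two keyword matchers and one NER-based labeler.
example_weights = {"keyword_title": 1.0, "keyword_body": 0.5, "ner": 2.0}
example_votes = {"keyword_title": "AAPL", "keyword_body": "MSFT", "ner": "AAPL"}
label = weighted_vote(example_votes, example_weights)
```

Counting abstentions in the denominator is one possible design choice: it trades recall for precision, which is the kind of balance the manual evaluation of article-label pairs would be used to check.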
author
Nordansjö, William LU and Fourong, Fredrik
supervisor
organization
alternative title
Evaluating Sentiment-Driven Stock Price Prediction
course
DABN01 20251
year
2025
type
H1 - Master's Degree (One Year)
subject
keywords
sentiment analysis, stock prediction, labeling, transformers, deep learning
language
English
id
9199576
date added to LUP
2025-09-12 09:05:03
date last changed
2025-09-12 09:05:03
@misc{9199576,
  abstract     = {{This thesis evaluates an ensemble-based framework for predicting stock prices using pretrained transformer models for sentiment analysis and locally trained transformers for time series forecasting. Sentiment signals were generated from financial news using FinBERT, RoBERTa, and VADER, then combined with historical stock price data from Refinitiv Eikon. To enhance label quality, a custom ensemble labeling pipeline was developed, combining weak keyword-based labeling methods with the spaCy transformer-based named entity recognition model, using weighted voting. Manual evaluation of 1,400 article-label pairs ensured a balance between precision and recall. The study focuses on ``The Magnificent Seven'' companies and evaluates 42 model configurations, combining three sentiment models, two transformer architectures, and seven company-specific datasets. Forecasting was performed using Informer and PatchTST, with hyperparameter tuning. Evaluation included regression and directional metrics as well as a modified Diebold-Mariano test. Due to the inclusion of multiple company-specific models, results are also reported as average scores across the seven companies. Results show that PatchTST consistently outperformed Informer in regression tasks, while Informer achieved marginally higher directional accuracy. These outcomes were partly driven by systematic differences in sentiment strength across models, with FinBERT outputting more optimistic signals than RoBERTa and VADER. Key contributions include the end-to-end use of state-of-the-art transformers, a custom ensemble labeling pipeline, and extensive empirical evaluation.}},
  author       = {{Nordansjö, William and Fourong, Fredrik}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Decoding the News with Transformers}},
  year         = {{2025}},
}