Decoding the News with Transformers

Nordansjö, William; Fourong, Fredrik


Nordansjö, William (LU) and Fourong, Fredrik (2025) DABN01 20251
Department of Statistics
Department of Economics

Abstract: This thesis evaluates an ensemble-based framework for predicting stock prices using pretrained transformer models for sentiment analysis and locally trained transformers for time series forecasting. Sentiment signals were generated from financial news using FinBERT, RoBERTa, and VADER, then combined with historical stock price data from Refinitiv Eikon. To enhance label quality, a custom ensemble labeling pipeline was developed, combining weak keyword-based labeling methods with the spaCy transformer-based named entity recognition model, using weighted voting. Manual evaluation of 1,400 article-label pairs ensured a balance between precision and recall. The study focuses on "The Magnificent Seven" companies and evaluates 42 model configurations, combining three sentiment models, two transformer architectures, and seven company-specific datasets. Forecasting was performed using Informer and PatchTST, with hyperparameter tuning. Evaluation included regression and directional metrics as well as a modified Diebold-Mariano test. Due to the inclusion of multiple company-specific models, results are also reported as average scores across the seven companies.
Results show that PatchTST consistently outperformed Informer in regression tasks, while Informer achieved marginally higher directional accuracy. These outcomes were partly driven by systematic differences in sentiment strength across models, with FinBERT outputting more optimistic signals than RoBERTa and VADER. Key contributions include the end-to-end use of state-of-the-art transformers, a custom ensemble labeling pipeline, and extensive empirical evaluation.
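
As a rough illustration of the experimental grid described in the abstract, the sketch below enumerates the 42 configurations (three sentiment models, two forecasting architectures, seven companies) and computes a simple directional accuracy score. The ticker symbols, the function names, and the exact definition of directional accuracy are assumptions made for illustration; the record does not specify how the thesis implements any of them.

```python
from itertools import product

# Names taken from the abstract.
SENTIMENT_MODELS = ["FinBERT", "RoBERTa", "VADER"]
FORECASTERS = ["Informer", "PatchTST"]

# Assumed tickers for the "Magnificent Seven"; the record does not list them.
COMPANIES = ["AAPL", "MSFT", "GOOGL", "AMZN", "META", "NVDA", "TSLA"]


def enumerate_configurations():
    """Return every (sentiment model, forecaster, company) combination."""
    return list(product(SENTIMENT_MODELS, FORECASTERS, COMPANIES))


def directional_accuracy(actual, predicted):
    """Share of steps where the predicted move has the same direction
    (up vs. not up) as the actual move, both measured against the
    previous actual price. This is one common definition, assumed here.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    hits = 0
    for t in range(1, len(actual)):
        actual_up = (actual[t] - actual[t - 1]) > 0
        predicted_up = (predicted[t] - actual[t - 1]) > 0
        hits += actual_up == predicted_up
    return hits / (len(actual) - 1)


if __name__ == "__main__":
    configs = enumerate_configurations()
    print(len(configs))  # 3 * 2 * 7 = 42
    print(configs[0])
```

Averaging such a score over the seven company-specific models for a fixed sentiment model and forecaster would give the kind of cross-company summary the abstract mentions.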

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9199576

author

Nordansjö, William (LU) and Fourong, Fredrik

supervisor

Muhammad Qasim

organization

alternative title

Evaluating Sentiment-Driven Stock Price Prediction

course

DABN01 20251

year

2025

type

H1 - Master's Degree (One Year)

subject

Business and Economics

keywords

sentiment analysis, stock prediction, labeling, transformers, deep learning

language

English

id

9199576

date added to LUP

2025-09-12 09:05:03

date last changed

2025-09-12 09:05:03

@misc{9199576,
  abstract     = {{This thesis evaluates an ensemble-based framework for predicting stock prices using pretrained transformer models for sentiment analysis and locally trained transformers for time series forecasting. Sentiment signals were generated from financial news using FinBERT, RoBERTa, and VADER, then combined with historical stock price data from Refinitiv Eikon. To enhance label quality, a custom ensemble labeling pipeline was developed, combining weak keyword-based labeling methods with the spaCy transformer-based named entity recognition model, using weighted voting. Manual evaluation of 1,400 article-label pairs ensured a balance between precision and recall. The study focuses on ``The Magnificent Seven'' companies and evaluates 42 model configurations, combining three sentiment models, two transformer architectures, and seven company-specific datasets. Forecasting was performed using Informer and PatchTST, with hyperparameter tuning. Evaluation included regression and directional metrics as well as a modified Diebold-Mariano test. Due to the inclusion of multiple company-specific models, results are also reported as average scores across the seven companies. Results show that PatchTST consistently outperformed Informer in regression tasks, while Informer achieved marginally higher directional accuracy. These outcomes were partly driven by systematic differences in sentiment strength across models, with FinBERT outputting more optimistic signals than RoBERTa and VADER. Key contributions include the end-to-end use of state-of-the-art transformers, a custom ensemble labeling pipeline, and extensive empirical evaluation.}},
  author       = {{Nordansjö, William and Fourong, Fredrik}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Decoding the News with Transformers}},
  year         = {{2025}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
