Decoding the News with Transformers
(2025) DABN01 20251
Department of Statistics
Department of Economics
- Abstract
- This thesis evaluates an ensemble-based framework for predicting stock prices using pretrained transformer models for sentiment analysis and locally trained transformers for time series forecasting. Sentiment signals were generated from financial news using FinBERT, RoBERTa, and VADER, then combined with historical stock price data from Refinitiv Eikon. To enhance label quality, a custom ensemble labeling pipeline was developed, combining weak keyword-based labeling methods with the spaCy transformer-based named entity recognition model, using weighted voting. Manual evaluation of 1,400 article-label pairs ensured a balance between precision and recall. The study focuses on "The Magnificent Seven" companies and evaluates 42 model configurations, combining three sentiment models, two transformer architectures, and seven company-specific datasets. Forecasting was performed using Informer and PatchTST, with hyperparameter tuning. Evaluation included regression and directional metrics as well as a modified Diebold-Mariano test. Due to the inclusion of multiple company-specific models, results are also reported as average scores across the seven companies. Results show that PatchTST consistently outperformed Informer in regression tasks, while Informer achieved marginally higher directional accuracy. These outcomes were partly driven by systematic differences in sentiment strength across models, with FinBERT outputting more optimistic signals than RoBERTa and VADER. Key contributions include the end-to-end use of state-of-the-art transformers, a custom ensemble labeling pipeline, and extensive empirical evaluation.
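As a rough illustration of the experimental grid described in the abstract, the sketch below enumerates the 42 configurations (three sentiment models, two forecasting architectures, seven companies) and computes one possible directional metric. It is not taken from the thesis: the ticker symbols, the function names `configurations` and `directional_accuracy`, and the exact definition of directional accuracy used here are assumptions made for illustration only.

```python
from itertools import product

SENTIMENT_MODELS = ["FinBERT", "RoBERTa", "VADER"]
FORECASTERS = ["Informer", "PatchTST"]
# Assumption: the abstract names only "The Magnificent Seven";
# these ticker symbols are illustrative stand-ins for the seven datasets.
COMPANIES = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA", "META", "TSLA"]


def configurations():
    """Return every (sentiment model, forecaster, company) combination."""
    return list(product(SENTIMENT_MODELS, FORECASTERS, COMPANIES))


def directional_accuracy(actual, predicted):
    """Share of steps where the predicted move has the same direction as the actual move.

    Assumed definition: both moves are measured from the previous actual
    price, and a zero move counts as "not up".
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long series with at least two points")
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] - actual[t - 1] > 0
        predicted_up = predicted[t] - actual[t - 1] > 0
        hits += actual_up == predicted_up
    return hits / (len(actual) - 1)


if __name__ == "__main__":
    print(len(configurations()))
    print(directional_accuracy([100, 101, 100, 102], [100, 100.5, 101.5, 101]))
```

Averaging such a metric over the seven company entries for a fixed sentiment model and forecaster would correspond to the cross-company average scores the abstract mentions.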
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9199576
- author
- Nordansjö, William LU and Fourong, Fredrik
- supervisor
- organization
- alternative title
- Evaluating Sentiment-Driven Stock Price Prediction
- course
- DABN01 20251
- year
- 2025
- type
- H1 - Master's Degree (One Year)
- subject
- keywords
- sentiment analysis, stock prediction, labeling, transformers, deep learning
- language
- English
- id
- 9199576
- date added to LUP
- 2025-09-12 09:05:03
- date last changed
- 2025-09-12 09:05:03
@misc{9199576,
  abstract = {{This thesis evaluates an ensemble-based framework for predicting stock prices using pretrained transformer models for sentiment analysis and locally trained transformers for time series forecasting. Sentiment signals were generated from financial news using FinBERT, RoBERTa, and VADER, then combined with historical stock price data from Refinitiv Eikon. To enhance label quality, a custom ensemble labeling pipeline was developed, combining weak keyword-based labeling methods with the spaCy transformer-based named entity recognition model, using weighted voting. Manual evaluation of 1,400 article-label pairs ensured a balance between precision and recall. The study focuses on ``The Magnificent Seven'' companies and evaluates 42 model configurations, combining three sentiment models, two transformer architectures, and seven company-specific datasets. Forecasting was performed using Informer and PatchTST, with hyperparameter tuning. Evaluation included regression and directional metrics as well as a modified Diebold-Mariano test. Due to the inclusion of multiple company-specific models, results are also reported as average scores across the seven companies. Results show that PatchTST consistently outperformed Informer in regression tasks, while Informer achieved marginally higher directional accuracy. These outcomes were partly driven by systematic differences in sentiment strength across models, with FinBERT outputting more optimistic signals than RoBERTa and VADER. Key contributions include the end-to-end use of state-of-the-art transformers, a custom ensemble labeling pipeline, and extensive empirical evaluation.}},
  author = {{Nordansjö, William and Fourong, Fredrik}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Decoding the News with Transformers}},
  year = {{2025}},
}