
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Portfolio Optimization using Deep Reinforcement Learning models

Hedlund, Jesper LU (2024) NEKP01 20241
Department of Economics
Abstract
Portfolio optimization involves selecting assets to maximize risk-adjusted returns, typically using linear methods such as Mean-Variance Optimization (MVO). However, such approaches may not fully capture the complexities of financial markets. This study leverages recent advances in machine learning, specifically reinforcement learning with deep neural networks, to identify alternative methods that may improve upon MVO. Using data from the broad S&P 500 index, we compare the performance of five modern deep reinforcement learning (DRL) models against MVO, with a focus on risk-adjusted returns. Additionally, we assess whether incorporating a goal-oriented reward function, explicitly designed to maximize risk-adjusted returns, improves DRL performance. To account for the stochastic nature of DRL training, we report the mean performance across 10 independent runs to ensure result stability, and we include transaction costs for greater real-world applicability.
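To make the two approaches concrete, the sketches below illustrate, first, a long-only Maximum Sharpe allocation in the MVO spirit and, second, a per-step reward based on the Differential Sharpe Ratio (listed among the keywords below), which is one natural way to build a goal-oriented reward targeting risk-adjusted returns. These are minimal illustrations, not the thesis's actual code: the function names, the long-only constraint, and the adaptation rate eta are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def max_sharpe_weights(mu, cov, rf=0.0):
    """MVO-style long-only weights that maximize the Sharpe ratio.

    mu  : (n,) expected asset returns
    cov : (n, n) covariance matrix of returns
    rf  : risk-free rate, same periodicity as mu
    """
    n = len(mu)

    def neg_sharpe(w):
        excess = w @ mu - rf
        vol = np.sqrt(w @ cov @ w)
        return -excess / vol

    constraints = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)  # fully invested
    bounds = [(0.0, 1.0)] * n                                        # long-only
    w0 = np.full(n, 1.0 / n)                                         # equal-weight start
    res = minimize(neg_sharpe, w0, method="SLSQP",
                   bounds=bounds, constraints=constraints)
    return res.x
```

A reward aimed directly at risk-adjusted returns can be built from the Differential Sharpe Ratio of Moody and Saffell (1998), which measures the marginal change in the Sharpe ratio caused by each new portfolio return:

```python
class DifferentialSharpeReward:
    """Per-step reward based on the Differential Sharpe Ratio.

    A and B are exponential moving estimates of the first and second
    moments of portfolio returns; the reward is the marginal change in
    the Sharpe ratio induced by the newest return r.
    """

    def __init__(self, eta=0.01):
        self.eta = eta  # adaptation rate of the moving moments (assumed value)
        self.A = 0.0    # EMA of returns
        self.B = 0.0    # EMA of squared returns

    def step(self, r):
        dA = r - self.A
        dB = r * r - self.B
        denom = (self.B - self.A ** 2) ** 1.5
        reward = 0.0 if denom <= 0.0 else (self.B * dA - 0.5 * self.A * dB) / denom
        # update the moving moments only after computing the reward
        self.A += self.eta * dA
        self.B += self.eta * dB
        return reward
```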

Our findings indicate that DRL models generally outperformed the MVO benchmark, especially in the Maximum Sharpe and ETF portfolios, by achieving higher Sharpe ratios. However, no significant improvement was observed when using the goal-oriented reward function. A robustness test using a Minimum Variance portfolio revealed that DRL models did not clearly surpass MVO, suggesting that the effectiveness of DRL models may depend on the portfolio strategy employed. Despite these mixed results, DRL continues to show potential for enhanced portfolio optimization, though its practical applications warrant further exploration, especially considering factors such as model complexity and transaction costs.
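As a concrete sketch of the evaluation protocol described above, the snippet below computes an annualized Sharpe ratio net of proportional transaction costs and averages it over independent training runs. The cost rate, annualization factor, and data layout are illustrative assumptions; the abstract does not state the thesis's exact settings.

```python
import numpy as np

def sharpe_net_of_costs(asset_returns, weights, cost_rate=0.001, periods_per_year=252):
    """Annualized Sharpe ratio of a rebalanced portfolio after proportional costs.

    asset_returns : (T, n) per-period asset returns
    weights       : (T, n) portfolio weights held in each period
    cost_rate     : proportional cost charged on turnover (assumed value)
    """
    gross = (weights * asset_returns).sum(axis=1)             # per-period portfolio return
    turnover = np.abs(np.diff(weights, axis=0)).sum(axis=1)   # |dw| at each rebalance
    net = gross.copy()
    net[1:] -= cost_rate * turnover                           # charge costs on trades
    return np.sqrt(periods_per_year) * net.mean() / net.std(ddof=1)

# Mean performance across independent runs (the abstract averages 10):
# runs = [(asset_returns, weights_run_1), ..., (asset_returns, weights_run_10)]
# mean_sharpe = np.mean([sharpe_net_of_costs(r, w) for r, w in runs])
```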
author
Hedlund, Jesper LU
supervisor
organization
Department of Economics
course
NEKP01 20241
year
2024
type
H2 - Master's Degree (Two Years)
subject
keywords
Mean-Variance Optimization, Deep Reinforcement Learning, Portfolio Optimization, Differential Sharpe Ratio, Machine Learning
language
English
id
9178260
date added to LUP
2025-01-20 12:18:31
date last changed
2025-01-20 12:18:31
@misc{9178260,
  abstract     = {{Portfolio optimization involves selecting assets to maximize risk-adjusted returns, typically using linear methods such as Mean-Variance Optimization (MVO). However, such approaches may not fully capture the complexities of financial markets. This study leverages recent advances in machine learning, specifically reinforcement learning with deep neural networks, to identify alternative methods that may improve upon MVO. Using data from the broad S&P 500 index, we compare the performance of five modern deep reinforcement learning (DRL) models against MVO, with a focus on risk-adjusted returns. Additionally, we assess whether incorporating a goal-oriented reward function, explicitly designed to maximize risk-adjusted returns, improves DRL performance. To account for the stochastic nature of DRL training, we report the mean performance across 10 independent runs to ensure result stability, and we include transaction costs for greater real-world applicability.

Our findings indicate that DRL models generally outperformed the MVO benchmark, especially in the Maximum Sharpe and ETF portfolios, by achieving higher Sharpe ratios. However, no significant improvement was observed when using the goal-oriented reward function. A robustness test using a Minimum Variance portfolio revealed that DRL models did not clearly surpass MVO, suggesting that the effectiveness of DRL models may depend on the portfolio strategy employed. Despite these mixed results, DRL continues to show potential for enhanced portfolio optimization, though its practical applications warrant further exploration, especially considering factors such as model complexity and transaction costs.}},
  author       = {{Hedlund, Jesper}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Portfolio Optimization using Deep Reinforcement Learning models}},
  year         = {{2024}},
}