
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Portfolio Optimization using Deep Reinforcement Learning models

Hedlund, Jesper LU (2024) NEKP01 20241
Department of Economics
Abstract
Portfolio optimization involves selecting assets to maximize risk-adjusted returns, typically using linear methods such as Mean-Variance Optimization (MVO). However, such approaches may not fully capture the complexities of financial markets. This study leverages recent advances in machine learning, specifically reinforcement learning with deep neural networks, to identify alternative methods that may improve upon MVO. Using data from the broad S&P 500 index, we compare the performance of five modern deep reinforcement learning (DRL) models against MVO, with a focus on risk-adjusted returns. Additionally, we assess whether incorporating a goal-oriented reward function, explicitly designed to maximize risk-adjusted returns, improves DRL performance. To account for the stochastic nature of DRL training, we report the mean performance across 10 independent runs to ensure result stability, and we include transaction costs for greater real-world applicability.
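To make the two approaches concrete, the sketches below illustrate, first, a long-only Maximum Sharpe allocation in the MVO spirit and, second, a per-step reward based on the Differential Sharpe Ratio (listed among the keywords below), which is one natural way to build a goal-oriented reward targeting risk-adjusted returns. These are minimal illustrations, not the thesis's actual code: the function names, the long-only constraint, and the adaptation rate eta are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def max_sharpe_weights(mu, cov, rf=0.0):
    """MVO-style long-only weights that maximize the Sharpe ratio.

    mu  : (n,) expected asset returns
    cov : (n, n) covariance matrix of returns
    rf  : risk-free rate, same periodicity as mu
    """
    n = len(mu)

    def neg_sharpe(w):
        excess = w @ mu - rf
        vol = np.sqrt(w @ cov @ w)
        return -excess / vol

    constraints = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)  # fully invested
    bounds = [(0.0, 1.0)] * n                                        # long-only
    w0 = np.full(n, 1.0 / n)                                         # equal-weight start
    res = minimize(neg_sharpe, w0, method="SLSQP",
                   bounds=bounds, constraints=constraints)
    return res.x
```

A reward aimed directly at risk-adjusted returns can be built from the Differential Sharpe Ratio of Moody and Saffell (1998), which measures the marginal change in the Sharpe ratio caused by each new portfolio return:

```python
class DifferentialSharpeReward:
    """Per-step reward based on the Differential Sharpe Ratio.

    A and B are exponential moving estimates of the first and second
    moments of portfolio returns; the reward is the marginal change in
    the Sharpe ratio induced by the newest return r.
    """

    def __init__(self, eta=0.01):
        self.eta = eta  # adaptation rate of the moving moments (assumed value)
        self.A = 0.0    # EMA of returns
        self.B = 0.0    # EMA of squared returns

    def step(self, r):
        dA = r - self.A
        dB = r * r - self.B
        denom = (self.B - self.A ** 2) ** 1.5
        reward = 0.0 if denom <= 0.0 else (self.B * dA - 0.5 * self.A * dB) / denom
        # update the moving moments only after computing the reward
        self.A += self.eta * dA
        self.B += self.eta * dB
        return reward
```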

Our findings indicate that DRL models generally outperformed the MVO benchmark, especially in the Maximum Sharpe and ETF portfolios, by achieving higher Sharpe ratios. However, no significant improvement was observed when using the goal-oriented reward function. A robustness test using a Minimum Variance portfolio revealed that DRL models did not clearly surpass MVO, suggesting that the effectiveness of DRL models may depend on the portfolio strategy employed. Despite these mixed results, DRL continues to show potential for enhanced portfolio optimization, though its practical applications warrant further exploration, especially considering factors such as model complexity and transaction costs.
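As a concrete sketch of the evaluation protocol described above, the snippet below computes an annualized Sharpe ratio net of proportional transaction costs and averages it over independent training runs. The cost rate, annualization factor, and data layout are illustrative assumptions; the abstract does not state the thesis's exact settings.

```python
import numpy as np

def sharpe_net_of_costs(asset_returns, weights, cost_rate=0.001, periods_per_year=252):
    """Annualized Sharpe ratio of a rebalanced portfolio after proportional costs.

    asset_returns : (T, n) per-period asset returns
    weights       : (T, n) portfolio weights held in each period
    cost_rate     : proportional cost charged on turnover (assumed value)
    """
    gross = (weights * asset_returns).sum(axis=1)             # per-period portfolio return
    turnover = np.abs(np.diff(weights, axis=0)).sum(axis=1)   # |dw| at each rebalance
    net = gross.copy()
    net[1:] -= cost_rate * turnover                           # charge costs on trades
    return np.sqrt(periods_per_year) * net.mean() / net.std(ddof=1)

# Mean performance across independent runs (the abstract averages 10):
# runs = [(asset_returns, weights_run_1), ..., (asset_returns, weights_run_10)]
# mean_sharpe = np.mean([sharpe_net_of_costs(r, w) for r, w in runs])
```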
author
Hedlund, Jesper LU
supervisor
organization
Department of Economics
course
NEKP01 20241
year
2024
type
H2 - Master's Degree (Two Years)
subject
keywords
Mean-Variance Optimization, Deep Reinforcement Learning, Portfolio Optimization, Differential Sharpe Ratio, Machine Learning
language
English
id
9178260
date added to LUP
2025-01-20 12:18:31
date last changed
2025-01-20 12:18:31
@misc{9178260,
  abstract     = {{Portfolio optimization involves selecting assets to maximize risk-adjusted returns, typically using linear methods such as Mean-Variance Optimization (MVO). However, such approaches may not fully capture the complexities of financial markets. This study leverages recent advances in machine learning, specifically reinforcement learning with deep neural networks, to identify alternative methods that may improve upon MVO. Using data from the broad S&P 500 index, we compare the performance of five modern deep reinforcement learning (DRL) models against MVO, with a focus on risk-adjusted returns. Additionally, we assess whether incorporating a goal-oriented reward function, explicitly designed to maximize risk-adjusted returns, improves DRL performance. To account for the stochastic nature of DRL training, we report the mean performance across 10 independent runs to ensure result stability, and we include transaction costs for greater real-world applicability.

Our findings indicate that DRL models generally outperformed the MVO benchmark, especially in the Maximum Sharpe and ETF portfolios, by achieving higher Sharpe ratios. However, no significant improvement was observed when using the goal-oriented reward function. A robustness test using a Minimum Variance portfolio revealed that DRL models did not clearly surpass MVO, suggesting that the effectiveness of DRL models may depend on the portfolio strategy employed. Despite these mixed results, DRL continues to show potential for enhanced portfolio optimization, though its practical applications warrant further exploration, especially considering factors such as model complexity and transaction costs.}},
  author       = {{Hedlund, Jesper}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Portfolio Optimization using Deep Reinforcement Learning models}},
  year         = {{2024}},
}