Better hedging of CVA with reinforcement learning

Byman, Isabelle; Karlsson, Oscar

Byman, Isabelle and Karlsson, Oscar (2025) In Master's Thesis in Mathematical Sciences FMSM01 20251
Mathematical Statistics

Abstract: This thesis investigates the application of reinforcement learning (RL) to the problem of hedging Credit Valuation Adjustment (CVA), a key component of counterparty credit risk in over-the-counter (OTC) derivatives. While prior studies have demonstrated the effectiveness of RL in hedging simple financial instruments, this work extends the literature by exploring whether RL, specifically the Proximal Policy Optimization (PPO) algorithm, can effectively hedge CVA in the presence of realistic market conditions, including transaction costs and wrong-way risk. The study is conducted within a simulated environment that models interest rates using a one-factor Hull-White model and default intensity through a Jump-extended Cox–Ingersoll–Ross (JCIR) process. Multiple PPO variants are tested, evaluating the impact of reward functions, observation and action spaces, and hyperparameter settings. Results show that PPO is capable of learning competitive hedging strategies and can outperform traditional delta hedging under certain market frictions. However, further research is required for the model to be applicable in the industry. Directions for future research include incorporating funding costs, liquidity constraints, and portfolio-level netting effects.
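
As a rough illustration of the kind of simulated environment the abstract describes, the sketch below steps a one-factor Hull-White short rate and a JCIR default intensity forward with a simple Euler scheme. It is not taken from the thesis: the function names, all parameter values, the constant drift level `theta` (standing in for the time-dependent Hull-White drift), the exponential jump sizes, the independent shocks, and the clipping of the intensity at zero are assumptions chosen only to keep the example short.

```python
import math
import random


def simulate_paths(n_steps=250, dt=1.0 / 250, seed=0,
                   a=0.1, theta=0.003, sigma=0.01, r0=0.03,
                   kappa=0.5, mu=0.02, nu=0.1, lam0=0.02,
                   jump_rate=0.5, jump_mean=0.01):
    """Euler paths for a short rate and a default intensity.

    All parameter values are illustrative assumptions, not thesis values.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    rates, intensities = [r0], [lam0]
    for _ in range(n_steps):
        r, lam = rates[-1], intensities[-1]
        # Hull-White: dr = (theta - a*r) dt + sigma dW
        # (theta held constant here, which is an assumption)
        r_next = r + (theta - a * r) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        # JCIR: dlam = kappa*(mu - lam) dt + nu*sqrt(lam) dZ + dJ,
        # with J a compound Poisson process with exponential jump sizes
        jump = rng.expovariate(1.0 / jump_mean) if rng.random() < jump_rate * dt else 0.0
        lam_next = (lam + kappa * (mu - lam) * dt
                    + nu * math.sqrt(lam) * sqrt_dt * rng.gauss(0.0, 1.0) + jump)
        rates.append(r_next)
        # Clip at zero so the square root stays defined on the next step
        intensities.append(max(lam_next, 0.0))
    return rates, intensities


def survival_probability(intensities, dt=1.0 / 250):
    """Left-point approximation of exp(-integral of the intensity) along one path."""
    return math.exp(-sum(intensities[:-1]) * dt)


if __name__ == "__main__":
    rates, intensities = simulate_paths()
    print(rates[-1], intensities[-1], survival_probability(intensities))
```

An RL hedging agent would typically observe quantities derived from such paths (for example the current rate, intensity, and CVA value) at each step; how the thesis constructs its observation and action spaces is not specified in this record.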

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9197473

author

Byman, Isabelle and Karlsson, Oscar

supervisor

Magnus Wiktorsson

organization

Mathematical Statistics

course

FMSM01 20251

year

2025

type

H2 - Master's Degree (Two Years)

subject

Mathematics and Statistics

keywords

OTC, xVA, CVA, Counterparty Credit Risk, Interest Rate Swaps, Hull-White Model, Machine Learning, Reinforcement Learning

publication/series

Master's Thesis in Mathematical Sciences

report number

LUTFMS-3520-2025

ISSN

1404-6342

other publication id

2025:E45

language

English

id

9197473

date added to LUP

2025-06-12 09:52:08

date last changed

2025-06-13 09:37:30

@misc{9197473,
  abstract     = {{This thesis investigates the application of reinforcement learning (RL) to the problem of hedging Credit Valuation Adjustment (CVA), a key component of counterparty credit risk in over-the-counter (OTC) derivatives. While prior studies have demonstrated the effectiveness of RL in hedging simple financial instruments, this work extends the literature by exploring whether RL, specifically the Proximal Policy Optimization (PPO) algorithm, can effectively
hedge CVA in the presence of realistic market conditions, including transaction costs and wrong-way risk. The study is conducted within a simulated environment that models interest rates using a one-factor Hull-White model and default intensity through a Jump-extended Cox–Ingersoll–Ross (JCIR) process. Multiple PPO variants are tested, evaluating the impact of reward functions, observation and action spaces, and hyperparameter settings. Results show that PPO is capable of learning competitive hedging strategies and can outperform traditional delta hedging under certain market
frictions. However, further research is required for the model to be applicable in the industry. Directions for future research include incorporating funding costs, liquidity constraints, and portfolio-level netting effects.}},
  author       = {{Byman, Isabelle and Karlsson, Oscar}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Thesis in Mathematical Sciences}},
  title        = {{Better hedging of CVA with reinforcement learning}},
  year         = {{2025}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
