Robustness, Stability and Performance of Optimization Algorithms for GAN Training
(2021) Department of Automatic Control
- Abstract
- Training Generative Adversarial Networks (GANs) for image synthesis problems is a largely heuristic process that is known to be fickle and difficult to set up reliably. To avoid common failure modes and succeed with GAN training, one needs to find very specific hyperparameter settings carefully tuned to the model architectures and datasets at hand. Standard GAN training optimization methods such as Stochastic Gradient Descent Ascent (SGDA) and Adam do not even converge on some very simple min-max problems, which may in part explain the unreliable results that occur when applying them in the more complicated GAN setting.
This thesis compares training GANs using SGDA and Adam with optimization schemes that do converge in convex-concave min-max settings. Specifically, Optimistic Gradient Descent Ascent (OGDA), the ExtraGradient method, and their Adam-variants are treated. Their robustness, stability and performance are evaluated and compared on the state-of-the-art GAN architectures U-Net GAN and StyleGAN 2.
The empirical results on U-Net GAN strongly indicate that the Adam-variants of OGDA and ExtraGradient are more robust to varying choices of hyperparameter settings, and less prone to collapse mid-training, than the most commonly used optimization schemes in contemporary GAN training, regular SGDA and Adam. However, for an already fine-tuned, well-working Adam setup such as StyleGAN 2, end results were neither improved nor worsened by using the Adam-variants of OGDA and ExtraGradient. This indicates that these algorithms are more reliable and easier to work with than SGDA and Adam, but not necessary for achieving state-of-the-art performance.
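The convergence gap described in the abstract can be seen on a toy problem. The following sketch (not from the thesis; parameter values are illustrative) runs plain gradient descent-ascent and the ExtraGradient method on the bilinear min-max problem f(x, y) = xy, whose saddle point is (0, 0): descent-ascent spirals outward, while ExtraGradient contracts toward the saddle point.

```python
# Illustrative sketch: simultaneous gradient descent-ascent (GDA) vs.
# ExtraGradient on f(x, y) = x*y. For this objective df/dx = y and df/dy = x.

def gda_step(x, y, lr):
    # Descent on x, ascent on y, using gradients at the current point.
    gx, gy = y, x
    return x - lr * gx, y + lr * gy

def extragradient_step(x, y, lr):
    # Extrapolate with an ordinary GDA step to a midpoint...
    xm, ym = gda_step(x, y, lr)
    # ...then update the true iterate using the gradients at that midpoint.
    gx, gy = ym, xm
    return x - lr * gx, y + lr * gy

def run(step, iters=200, lr=0.1):
    x, y = 1.0, 1.0
    for _ in range(iters):
        x, y = step(x, y, lr)
    # Distance from the saddle point (0, 0).
    return (x ** 2 + y ** 2) ** 0.5

print(f"GDA distance after 200 steps:           {run(gda_step):.3f}")
print(f"ExtraGradient distance after 200 steps: {run(extragradient_step):.3f}")
```

For this objective, one GDA step multiplies the squared distance to the saddle point by (1 + lr²), so the iterates diverge for every step size, while the ExtraGradient step multiplies it by (1 - lr² + lr⁴) < 1, so the iterates converge. This is the kind of simple min-max problem on which SGDA fails but ExtraGradient (and OGDA) succeed.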
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9064517
- author
- Larsson, Oskar
- supervisor
- Martin Morin LU
- Pontus Giselsson LU
- Bo Bernhardsson LU
- organization
- Department of Automatic Control
- year
- 2021
- type
- H3 - Professional qualifications (4 Years - )
- subject
- report number
- TFRT-6136
- other publication id
- 0280-5316
- language
- English
- id
- 9064517
- date added to LUP
- 2021-09-01 15:49:37
- date last changed
- 2021-09-01 15:49:37
@misc{9064517,
  abstract = {{Training Generative Adversarial Networks (GANs) for image synthesis problems is a largely heuristic process that is known to be fickle and difficult to set up reliably. To avoid common failure modes and succeed with GAN training, one needs to find very specific hyperparameter settings carefully tuned to the model architectures and datasets at hand. Standard GAN training optimization methods such as Stochastic Gradient Descent Ascent (SGDA) and Adam do not even converge on some very simple min-max problems, which may in part explain the unreliable results that occur when applying them in the more complicated GAN setting. This thesis compares training GANs using SGDA and Adam with optimization schemes that do converge in convex-concave min-max settings. Specifically, Optimistic Gradient Descent Ascent (OGDA), the ExtraGradient method, and their Adam-variants are treated. Their robustness, stability and performance are evaluated and compared on the state-of-the-art GAN architectures U-Net GAN and StyleGAN 2. The empirical results on U-Net GAN strongly indicate that the Adam-variants of OGDA and ExtraGradient are more robust to varying choices of hyperparameter settings and less prone to collapse mid-training than the most commonly used optimization schemes in contemporary GAN training, regular SGDA and Adam. However, for an already fine-tuned, well-working Adam setup such as StyleGAN 2, end results were neither improved nor worsened by using the Adam-variants of OGDA and ExtraGradient. This indicates that these algorithms are more reliable and easier to work with than SGDA and Adam, but not necessary for achieving state-of-the-art performance.}},
  author = {{Larsson, Oskar}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Robustness, Stability and Performance of Optimization Algorithms for GAN Training}},
  year = {{2021}},
}