
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Robustness, Stability and Performance of Optimization Algorithms for GAN Training

Larsson, Oskar (2021)
Department of Automatic Control
Abstract
Training Generative Adversarial Networks (GANs) for image synthesis is a largely heuristic process that is known to be fickle and difficult to set up reliably. To avoid common failure modes and succeed with GAN training, one needs very specific hyperparameter settings, carefully tuned to the model architectures and datasets at hand. Standard GAN training methods such as Stochastic Gradient Descent Ascent (SGDA) and Adam fail to converge even on some very simple min-max problems, which may in part explain the unreliable results obtained when applying them in the more complicated GAN setting.
This thesis compares training GANs with SGDA and Adam against optimization schemes that do converge in convex-concave min-max settings: specifically, Optimistic Gradient Descent Ascent (OGDA), the ExtraGradient method, and their Adam variants. Their robustness, stability and performance are evaluated and compared on the state-of-the-art GAN architectures U-Net GAN and StyleGAN 2.
The empirical results on U-Net GAN strongly indicate that the Adam variants of OGDA and ExtraGradient are more robust to varying hyperparameter settings and less prone to collapse mid-training than the optimization schemes most commonly used in contemporary GAN training, plain SGDA and Adam. However, for an already fine-tuned, well-working Adam setup such as StyleGAN 2, end results were neither improved nor worsened by switching to the Adam variants of OGDA and ExtraGradient. This indicates that these algorithms are more reliable and easier to work with than SGDA and Adam, but not necessary for achieving state-of-the-art performance.
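
To illustrate the convergence claim in the abstract (this sketch is not part of the thesis itself), the following minimal Python example applies the SGDA, OGDA and ExtraGradient update rules to the standard toy bilinear problem min_x max_y f(x, y) = x * y, whose unique saddle point is (0, 0). The step size eta, the initialization and the iteration count are arbitrary illustrative choices; on this problem SGDA is known to spiral away from the saddle point, while OGDA and ExtraGradient contract towards it.

# Minimal sketch (not from the thesis): SGDA, OGDA and ExtraGradient on the
# bilinear min-max problem min_x max_y x * y, with saddle point at (0, 0).

def sgda(x, y, eta):
    # Simultaneous gradient descent ascent: known to spiral away from (0, 0).
    return x - eta * y, y + eta * x

def ogda(x, y, x_prev, y_prev, eta):
    # Optimistic GDA: subtracts twice the current gradient and adds back
    # the previous one as a correction term.
    return x - 2 * eta * y + eta * y_prev, y + 2 * eta * x - eta * x_prev

def extragradient(x, y, eta):
    # ExtraGradient: a lookahead half-step, then an update using the
    # gradients evaluated at the half-step point.
    x_half, y_half = x - eta * y, y + eta * x
    return x - eta * y_half, y + eta * x_half

if __name__ == "__main__":
    eta, steps = 0.1, 200                      # hypothetical step size / horizon
    x_s = y_s = x_o = y_o = x_e = y_e = 1.0    # common starting point (1, 1)
    x_op, y_op = x_o, y_o                      # previous iterate for OGDA
    for _ in range(steps):
        x_s, y_s = sgda(x_s, y_s, eta)
        (x_o, y_o), (x_op, y_op) = ogda(x_o, y_o, x_op, y_op, eta), (x_o, y_o)
        x_e, y_e = extragradient(x_e, y_e, eta)
    # Distance to the saddle point: SGDA grows, OGDA and ExtraGradient shrink.
    print("SGDA          ", (x_s ** 2 + y_s ** 2) ** 0.5)
    print("OGDA          ", (x_o ** 2 + y_o ** 2) ** 0.5)
    print("ExtraGradient ", (x_e ** 2 + y_e ** 2) ** 0.5)

The Adam variants studied in the thesis combine these update patterns with Adam's adaptive moment estimates; they are not reproduced in this sketch.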
author: Larsson, Oskar
supervisor:
organization: Department of Automatic Control
year: 2021
type: H3 - Professional qualifications (4 Years - )
subject:
report number: TFRT-6136
other publication id: 0280-5316
language: English
id: 9064517
date added to LUP: 2021-09-01 15:49:37
date last changed: 2021-09-01 15:49:37
@misc{9064517,
  abstract     = {{Training Generative Adversarial Networks (GANs) for image synthesis is a largely heuristic process that is known to be fickle and difficult to set up reliably. To avoid common failure modes and succeed with GAN training, one needs very specific hyperparameter settings, carefully tuned to the model architectures and datasets at hand. Standard GAN training methods such as Stochastic Gradient Descent Ascent (SGDA) and Adam fail to converge even on some very simple min-max problems, which may in part explain the unreliable results obtained when applying them in the more complicated GAN setting.
 This thesis compares training GANs with SGDA and Adam against optimization schemes that do converge in convex-concave min-max settings: specifically, Optimistic Gradient Descent Ascent (OGDA), the ExtraGradient method, and their Adam variants. Their robustness, stability and performance are evaluated and compared on the state-of-the-art GAN architectures U-Net GAN and StyleGAN 2.
 The empirical results on U-Net GAN strongly indicate that the Adam variants of OGDA and ExtraGradient are more robust to varying hyperparameter settings and less prone to collapse mid-training than the optimization schemes most commonly used in contemporary GAN training, plain SGDA and Adam. However, for an already fine-tuned, well-working Adam setup such as StyleGAN 2, end results were neither improved nor worsened by switching to the Adam variants of OGDA and ExtraGradient. This indicates that these algorithms are more reliable and easier to work with than SGDA and Adam, but not necessary for achieving state-of-the-art performance.}},
  author       = {{Larsson, Oskar}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Robustness, Stability and Performance of Optimization Algorithms for GAN Training}},
  year         = {{2021}},
}