Convergence and stability analysis of stochastic optimization algorithms

Williamson, Måns

Convergence and stability analysis of stochastic optimization algorithms

Mark

Williamson, Måns ^LU

(2023) In Licentiate Theses in Mathematical Sciences 2023(1).

Abstract: This thesis is concerned with stochastic optimization methods. The pioneering work in the field is the article “A stochastic approximation algorithm” by Robbins and Monro [1], in which they proposed the stochastic gradient descent; a stochastic version of the classical gradient descent algorithm. Since then, many improvements and extensions of the theory have been published, as well as new versions of the original algorithm. Despite this, a problem that many stochastic algorithms still share, is the sensitivity to the choice of the step size/learning rate. One can view the stochastic gradient descent algorithm as a stochastic version of the explicit Euler scheme applied to the gradient flow equation. There are other schemes for solving... (More); This thesis is concerned with stochastic optimization methods. The pioneering work in the field is the article “A stochastic approximation algorithm” by Robbins and Monro [1], in which they proposed the stochastic gradient descent; a stochastic version of the classical gradient descent algorithm. Since then, many improvements and extensions of the theory have been published, as well as new versions of the original algorithm. Despite this, a problem that many stochastic algorithms still share, is the sensitivity to the choice of the step size/learning rate. One can view the stochastic gradient descent algorithm as a stochastic version of the explicit Euler scheme applied to the gradient flow equation. There are other schemes for solving differential equations numerically that allow for larger step sizes. In this thesis, we investigate the properties of some of these methods, and how they perform, when applied to stochastic optimization problems. (Less)
Abstract (Swedish): Den här uppsatsen handlar om stokastiska optimerings metoder. Inom områdena Artificiell intelligens och Maskininlärning är det vanligt att man vill skatta parametrar i en statistisk modell för att kunna göra prediktioner. Man använder sig då av en kostnadsfunktion som bestraffar prediktioner som ligger långt ifrån det riktiga värdet och minimerar denna med avseende på de statistiska parametrarna. Ett exempel på detta är artificiella neurala nätverk, som ofta är väldigt komplexa och svåra att minimera effektivt. Vanliga optimeringsmetoder är då inte lämpliga eftersom de är för beräkningsmässigt krävande och man använder sig istället av så kallade stokastiska optimerings metoder, som är mindre kostsamma och snabbare att använda. Även om... (More); Den här uppsatsen handlar om stokastiska optimerings metoder. Inom områdena Artificiell intelligens och Maskininlärning är det vanligt att man vill skatta parametrar i en statistisk modell för att kunna göra prediktioner. Man använder sig då av en kostnadsfunktion som bestraffar prediktioner som ligger långt ifrån det riktiga värdet och minimerar denna med avseende på de statistiska parametrarna. Ett exempel på detta är artificiella neurala nätverk, som ofta är väldigt komplexa och svåra att minimera effektivt. Vanliga optimeringsmetoder är då inte lämpliga eftersom de är för beräkningsmässigt krävande och man använder sig istället av så kallade stokastiska optimerings metoder, som är mindre kostsamma och snabbare att använda. Även om dessa har visat sig fungera bra för dylika problem, är de ofta känsliga för valet av steglängd. Väljs den för liten tar det en evighet att hitta minimat till kostnadsfunktionen och väljs den för stor kan algoritmen explodera. I den här uppsatsen undersöks olika metoder för att stabilisera stokastiska optimeringsalgoritmer. Mer exakt tittar vi på metoder för att lösa differentialekvationer numeriskt, som har visat sig ha väldigt bra stabilitetsegenskaper och omformulerar dessa som stokastiska optimeringsalgoritmer.
(Less)

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/49ab007a-f98f-4afc-bc59-ea4e228db6ec

author

Williamson, Måns ^LU

supervisor

Tony Stillfjord ^LU

organization

Mathematics (Faculty of Engineering)

publishing date

2023-02-21

type

Thesis

publication status

published

subject

Computational Mathematics

keywords

numerical analysis, optimization, stochastic optimization, machine learning

in

Licentiate Theses in Mathematical Sciences

volume

2023

issue

1

pages

104 pages

publisher

Lund University

ISSN

1404-028X

ISBN

978-91-8039-558-8

978-91-8039-559-5

language

English

LU publication?

yes

id

49ab007a-f98f-4afc-bc59-ea4e228db6ec

date added to LUP

2023-02-21 11:03:17

date last changed

2025-04-04 15:06:54

@misc{49ab007a-f98f-4afc-bc59-ea4e228db6ec,
  abstract     = {{This thesis is concerned with stochastic optimization methods. The pioneering work in the field is the article “A stochastic approximation algorithm” by Robbins and Monro [1], in which they proposed the stochastic gradient descent; a stochastic version of the classical gradient descent algorithm. Since then, many improvements and extensions of the theory have been published, as well as new versions of the original algorithm. Despite this, a problem that many stochastic algorithms still share, is the sensitivity to the choice of the step size/learning rate. One can view the stochastic gradient descent algorithm as a stochastic version of the explicit Euler scheme applied to the gradient flow equation. There are other schemes for solving differential equations numerically that allow for larger step sizes. In this thesis, we investigate the properties of some of these methods, and how they perform, when applied to stochastic optimization problems.}},
  author       = {{Williamson, Måns}},
  isbn         = {{978-91-8039-558-8}},
  issn         = {{1404-028X}},
  keywords     = {{numerical analysis; optimization; stochastic optimization; machine learning}},
  language     = {{eng}},
  month        = {{02}},
  note         = {{Licentiate Thesis}},
  number       = {{1}},
  publisher    = {{Lund University}},
  series       = {{Licentiate Theses in Mathematical Sciences}},
  title        = {{Convergence and stability analysis of stochastic optimization algorithms}},
  url          = {{https://lup.lub.lu.se/search/files/138524260/Avhandling_M_ns_Williamson.pdf}},
  volume       = {{2023}},
  year         = {{2023}},
}

Lund University Publications

LUND UNIVERSITY LIBRARIES

Convergence and stability analysis of stochastic optimization algorithms