
LUP Student Papers

LUND UNIVERSITY LIBRARIES

A Study of Time-Stepping Methods for Optimization in Supervised Learning

Shehadeh, Kassem LU (2020) In Master's Theses in Mathematical Sciences NUMM11 20201
Mathematics (Faculty of Engineering)
Centre for Mathematical Sciences
Abstract
In supervised machine learning, training neural networks requires solving optimization problems. These problems can be restated as gradient flow problems and solved numerically using time-stepping methods.

To study the behaviour of these time-stepping methods in the context of such optimization problems, a framework was developed in Python. The framework builds on the machine-learning library scikit-learn and implements stochastic variants of several time-stepping methods: the explicit Euler method, the implicit Euler method, and the classical fourth-order Runge-Kutta method.

The framework was used to run numerical tests studying the behaviour of these methods when training neural networks on a small-scale data set as well as on the MNIST data set. The tests revealed that the explicit Euler method and the Runge-Kutta method perform comparably, with the explicit Euler method being slightly better at minimizing the loss function per unit of computation time. The implicit Euler method, on the other hand, proved impractical, especially for high-dimensional training data, because it requires solving computationally expensive nonlinear equations at every iteration.

Finally, a convergence analysis was carried out for the explicit Euler method and the classical fourth-order Runge-Kutta method, showing that both converge in expectation to a neighborhood of the optimal solution.
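The connection the abstract draws between gradient descent and time-stepping can be sketched on a one-dimensional toy problem: for the gradient flow dθ/dt = −∇L(θ), one explicit Euler step with step size h is exactly a gradient-descent update with learning rate h, while the classical Runge-Kutta method evaluates the gradient four times per step. The sketch below is purely illustrative and is not the thesis framework; the quadratic loss and the names `grad_loss`, `euler_step`, and `rk4_step` are hypothetical.

```python
import math

def grad_loss(theta):
    # Gradient of the toy loss L(theta) = 0.5 * theta**2 (hypothetical example),
    # so the gradient flow d(theta)/dt = -theta contracts toward the minimum at 0.
    return theta

def euler_step(theta, h):
    # One explicit Euler step on the gradient flow: a plain gradient-descent update.
    return theta - h * grad_loss(theta)

def rk4_step(theta, h):
    # One step of the classical fourth-order Runge-Kutta method,
    # using four gradient evaluations per step.
    k1 = -grad_loss(theta)
    k2 = -grad_loss(theta + 0.5 * h * k1)
    k3 = -grad_loss(theta + 0.5 * h * k2)
    k4 = -grad_loss(theta + h * k3)
    return theta + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

theta_e = theta_r = 1.0
h, steps = 0.1, 50
for _ in range(steps):
    theta_e = euler_step(theta_e, h)
    theta_r = rk4_step(theta_r, h)

# For this linear flow the exact solution is theta(t) = exp(-t) * theta0,
# so RK4 should track exp(-h * steps) much more closely than explicit Euler.
exact = math.exp(-h * steps)
print(theta_e, theta_r, exact)
```

On this toy flow both methods decay toward the minimum, and the fourth-order method follows the exact trajectory far more accurately per step; the thesis's point is that in stochastic training this extra accuracy must be weighed against the cost of four gradient evaluations per step.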
Popular Abstract (translated from Swedish)
Machine learning is a field within artificial intelligence concerned with methods for training computers to learn rules for solving a task. These rules are not programmed by a human; instead, the computers must discover them by solving optimization problems.

These optimization problems can also be viewed as gradient flow problems, a type of differential equation, which can be solved using time-stepping methods. This thesis studies these time-stepping methods from both a theoretical and a practical perspective.
author
Shehadeh, Kassem LU
supervisor
organization
course
NUMM11 20201
year
2020
type
H2 - Master's Degree (Two Years)
subject
keywords
Supervised machine learning, optimization, time-stepping methods, neural networks
publication/series
Master's Theses in Mathematical Sciences
report number
LUNFNA-3031-2020
ISSN
1404-6342
other publication id
2020:E30
language
English
id
9028003
date added to LUP
2020-09-10 14:55:29
date last changed
2020-09-10 14:55:29
@misc{9028003,
  abstract     = {{In supervised machine learning, training neural networks requires solving optimization problems. These problems can be restated as gradient flow problems and solved numerically using time-stepping methods.

To study the behaviour of these time-stepping methods in the context of such optimization problems, a framework was developed in Python. The framework builds on the machine-learning library scikit-learn and implements stochastic variants of several time-stepping methods: the explicit Euler method, the implicit Euler method, and the classical fourth-order Runge-Kutta method.

The framework was used to run numerical tests studying the behaviour of these methods when training neural networks on a small-scale data set as well as on the MNIST data set. The tests revealed that the explicit Euler method and the Runge-Kutta method perform comparably, with the explicit Euler method being slightly better at minimizing the loss function per unit of computation time. The implicit Euler method, on the other hand, proved impractical, especially for high-dimensional training data, because it requires solving computationally expensive nonlinear equations at every iteration.

Finally, a convergence analysis was carried out for the explicit Euler method and the classical fourth-order Runge-Kutta method, showing that both converge in expectation to a neighborhood of the optimal solution.}},
  author       = {{Shehadeh, Kassem}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{A Study of Time-Stepping Methods for Optimization in Supervised Learning}},
  year         = {{2020}},
}