A Study of Time-Stepping Methods for Optimization in Supervised Learning
(2020) In Master's Theses in Mathematical Sciences, NUMM11 20201, Mathematics (Faculty of Engineering)
Centre for Mathematical Sciences
- Abstract
- In supervised machine learning, training neural networks requires solving optimization problems. These problems can be restated as gradient flow problems and solved numerically using time-stepping methods.
In order to study the behaviour of these time-stepping methods in the context of solving the aforementioned optimization problems, a framework was developed in Python. This framework is based on the machine-learning library scikit-learn and implements stochastic versions of several time-stepping methods: the explicit Euler method, the implicit Euler method and the classical fourth-order Runge-Kutta method.
This framework was used to run numerical tests studying the behaviour of these methods when used to train neural networks on a small-scale data set as well as on the MNIST data set. These tests revealed that the performance of the explicit Euler method and of the Runge-Kutta method is comparable, with the explicit Euler method minimizing the loss function slightly faster in terms of computation time. The implicit Euler method, on the other hand, proved impractical, especially when the training data set is high-dimensional, because it requires solving computationally expensive nonlinear equations at every iteration.
Finally, a convergence analysis was carried out for the explicit Euler method and the classical fourth-order Runge-Kutta method, showing that they converge (in expectation) to a neighborhood of the optimal solution.
- Popular Abstract (originally in Swedish)
- Machine learning is a field within artificial intelligence concerned with methods for training computers to learn rules for solving a task. These rules are not programmed by a human; instead, the computers must discover them by solving optimization problems.
These optimization problems can also be viewed as gradient flow problems, a type of differential equation, which can be solved using time-stepping methods. This thesis studies these time-stepping methods from both a theoretical and a practical perspective.
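The gradient-flow view described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration on a quadratic loss, not the thesis framework (which builds on scikit-learn and trains actual neural networks): one explicit Euler step on the flow θ'(t) = -∇L(θ) is exactly a gradient-descent update with learning rate h, an RK4 step combines four gradient evaluations, and an implicit Euler step requires solving a nonlinear equation at every iteration, here done by fixed-point iteration, which is what makes the method expensive in practice.

```python
# Minimal sketch of three time-stepping methods on the gradient flow
#   theta'(t) = -grad L(theta),  for  L(theta) = 0.5 * ||A @ theta - b||^2.
# Illustrative only; the problem and step size h are chosen for this demo.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))   # toy least-squares problem
b = rng.standard_normal(20)

def grad(theta):
    """Gradient of L(theta) = 0.5 * ||A @ theta - b||^2."""
    return A.T @ (A @ theta - b)

def explicit_euler_step(theta, h):
    # One explicit Euler step on theta' = -grad L(theta):
    # identical to a gradient-descent update with learning rate h.
    return theta - h * grad(theta)

def rk4_step(theta, h):
    # Classical fourth-order Runge-Kutta step: four gradient evaluations.
    k1 = -grad(theta)
    k2 = -grad(theta + 0.5 * h * k1)
    k3 = -grad(theta + 0.5 * h * k2)
    k4 = -grad(theta + h * k3)
    return theta + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def implicit_euler_step(theta, h, n_inner=50):
    # Implicit Euler must solve  theta_new = theta - h * grad L(theta_new),
    # a (generally nonlinear) equation in theta_new; here it is solved by
    # fixed-point iteration. This inner solve is the per-step cost that
    # makes the method impractical for high-dimensional training data.
    theta_new = theta.copy()
    for _ in range(n_inner):
        theta_new = theta - h * grad(theta_new)
    return theta_new

# Drive the flow toward the minimizer with explicit Euler.
theta = np.zeros(5)
h = 0.01
for _ in range(200):
    theta = explicit_euler_step(theta, h)
loss = 0.5 * np.sum((A @ theta - b) ** 2)
```

For a full batch this explicit Euler loop is plain gradient descent; the stochastic variants studied in the thesis replace the full gradient with a mini-batch estimate at each step.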
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9028003
- author
- Shehadeh, Kassem LU
- supervisor
- organization
- course
- NUMM11 20201
- year
- 2020
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Supervised machine learning, optimization, time-stepping methods, neural networks
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUNFNA-3031-2020
- ISSN
- 1404-6342
- other publication id
- 2020:E30
- language
- English
- id
- 9028003
- date added to LUP
- 2020-09-10 14:55:29
- date last changed
- 2020-09-10 14:55:29
@misc{9028003,
  abstract = {{In supervised machine learning, training neural networks requires solving optimization problems. These problems can be restated as gradient flow problems and solved numerically using time-stepping methods. In order to study the behavior of these time-stepping methods, in the context of solving the aforementioned optimization problems, a framework in Python was developed. This framework is based on the software for machine learning Scikit Learn and implements a stochastic approach of different time-stepping methods, namely the explicit Euler method, the implicit Euler method and the classical fourth-order Runge-Kutta method. This framework was used to run numerical tests in order to study the behaviour of these methods when used to train neural networks on a small scale data set, as well as the MNIST data set. These tests revealed that the performance of the explicit Euler method and the Runge-Kutta method is comparable, with the explicit Euler method being slightly better at minimizing the loss function in terms of computation time. On the other hand, the implicit Euler method was found to be impractical, especially when the training data set is high dimensional. This is due to the fact that this method requires the solving of computationally expensive nonlinear equations at every iteration. Finally, a convergence analysis was done for the explicit Euler method and the classical fourth-order Runge-Kutta method to show that they converge (in expectation) to a neighborhood of the optimal solution.}},
  author = {{Shehadeh, Kassem}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's Theses in Mathematical Sciences}},
  title = {{A Study of Time-Stepping Methods for Optimization in Supervised Learning}},
  year = {{2020}},
}