A Study of Time-Stepping Methods for Optimization in Supervised Learning
(2020) In Master's Theses in Mathematical Sciences, NUMM11 20201, Mathematics (Faculty of Engineering)
Centre for Mathematical Sciences
- Abstract
- In supervised machine learning, training neural networks requires solving optimization problems. These problems can be restated as gradient flow problems and solved numerically using time-stepping methods.
In order to study the behaviour of these time-stepping methods in the context of solving the aforementioned optimization problems, a framework was developed in Python. This framework is based on the machine-learning library scikit-learn and implements stochastic versions of several time-stepping methods: the explicit Euler method, the implicit Euler method and the classical fourth-order Runge-Kutta method.
This framework was used to run numerical tests studying the behaviour of these methods when used to train neural networks on a small-scale data set as well as on the MNIST data set. These tests revealed that the performance of the explicit Euler method and of the Runge-Kutta method is comparable, with the explicit Euler method minimizing the loss function slightly faster in terms of computation time. The implicit Euler method, on the other hand, proved impractical, especially when the training data set is high-dimensional, because it requires solving computationally expensive nonlinear equations at every iteration.
Finally, a convergence analysis was carried out for the explicit Euler method and the classical fourth-order Runge-Kutta method, showing that they converge (in expectation) to a neighborhood of the optimal solution.
- Popular Abstract (originally in Swedish)
- Machine learning is a field within artificial intelligence concerned with methods for training computers to learn rules for solving a task. These rules are not programmed by a human; instead, the computers must discover them by solving optimization problems.
These optimization problems can also be viewed as gradient flow problems, a type of differential equation, which can be solved using time-stepping methods. This thesis studies these time-stepping methods from both a theoretical and a practical perspective.
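The gradient-flow view described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration on a quadratic loss, not the thesis framework (which builds on scikit-learn and trains actual neural networks): one explicit Euler step on the flow θ'(t) = -∇L(θ) is exactly a gradient-descent update with learning rate h, an RK4 step combines four gradient evaluations, and an implicit Euler step requires solving a nonlinear equation at every iteration, here done by fixed-point iteration, which is what makes the method expensive in practice.

```python
# Minimal sketch of three time-stepping methods on the gradient flow
#   theta'(t) = -grad L(theta),  for  L(theta) = 0.5 * ||A @ theta - b||^2.
# Illustrative only; the problem and step size h are chosen for this demo.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))   # toy least-squares problem
b = rng.standard_normal(20)

def grad(theta):
    """Gradient of L(theta) = 0.5 * ||A @ theta - b||^2."""
    return A.T @ (A @ theta - b)

def explicit_euler_step(theta, h):
    # One explicit Euler step on theta' = -grad L(theta):
    # identical to a gradient-descent update with learning rate h.
    return theta - h * grad(theta)

def rk4_step(theta, h):
    # Classical fourth-order Runge-Kutta step: four gradient evaluations.
    k1 = -grad(theta)
    k2 = -grad(theta + 0.5 * h * k1)
    k3 = -grad(theta + 0.5 * h * k2)
    k4 = -grad(theta + h * k3)
    return theta + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def implicit_euler_step(theta, h, n_inner=50):
    # Implicit Euler must solve  theta_new = theta - h * grad L(theta_new),
    # a (generally nonlinear) equation in theta_new; here it is solved by
    # fixed-point iteration. This inner solve is the per-step cost that
    # makes the method impractical for high-dimensional training data.
    theta_new = theta.copy()
    for _ in range(n_inner):
        theta_new = theta - h * grad(theta_new)
    return theta_new

# Drive the flow toward the minimizer with explicit Euler.
theta = np.zeros(5)
h = 0.01
for _ in range(200):
    theta = explicit_euler_step(theta, h)
loss = 0.5 * np.sum((A @ theta - b) ** 2)
```

For a full batch this explicit Euler loop is plain gradient descent; the stochastic variants studied in the thesis replace the full gradient with a mini-batch estimate at each step.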
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9028003
- author
- Shehadeh, Kassem LU
- supervisor
- organization
- course
- NUMM11 20201
- year
- 2020
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Supervised machine learning, optimization, time-stepping methods, neural networks
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUNFNA-3031-2020
- ISSN
- 1404-6342
- other publication id
- 2020:E30
- language
- English
- id
- 9028003
- date added to LUP
- 2020-09-10 14:55:29
- date last changed
- 2020-09-10 14:55:29
@misc{9028003,
  abstract = {{In supervised machine learning, training neural networks requires solving optimization problems. These problems can be restated as gradient flow problems and solved numerically using time-stepping methods. In order to study the behavior of these time-stepping methods, in the context of solving the aforementioned optimization problems, a framework in Python was developed. This framework is based on the software for machine learning Scikit Learn and implements a stochastic approach of different time-stepping methods, namely the explicit Euler method, the implicit Euler method and the classical fourth-order Runge-Kutta method. This framework was used to run numerical tests in order to study the behaviour of these methods when used to train neural networks on a small scale data set, as well as the MNIST data set. These tests revealed that the performance of the explicit Euler method and the Runge-Kutta method is comparable, with the explicit Euler method being slightly better at minimizing the loss function in terms of computation time. On the other hand, the implicit Euler method was found to be impractical, especially when the training data set is high dimensional. This is due to the fact that this method requires the solving of computationally expensive nonlinear equations at every iteration. Finally, a convergence analysis was done for the explicit Euler method and the classical fourth-order Runge-Kutta method to show that they converge (in expectation) to a neighborhood of the optimal solution.}},
  author = {{Shehadeh, Kassem}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's Theses in Mathematical Sciences}},
  title = {{A Study of Time-Stepping Methods for Optimization in Supervised Learning}},
  year = {{2020}},
}