Adapting Individual Learning Rates for SGD and ADAM Optimizers
(2025) In Master’s Theses in Mathematical Sciences, BERM01 20251, Mathematics (Faculty of Sciences)
- Abstract
- This thesis investigates the integration of the individual learning rates from Rprop into SGD and Adam optimization methods. The individual learning rates are updated separately from the weights. A major focus of this work is the role of mini-batch size in updating these learning rates. Empirical analysis demonstrates that selecting an appropriate mini-batch size can accelerate convergence and improve model performance. Additionally, an algorithm is introduced that dynamically adjusts the batch size used for learning rate updates.
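The thesis text itself is not included in this record, so the sketch below only illustrates the general idea stated in the abstract: an SGD step whose per-weight learning rates are adapted with an Rprop-style sign rule, with the rate update kept separate from the weight update. The function name, hyperparameter values (eta_plus, eta_minus, the clipping bounds), and the omission of the separate mini-batch used for the learning-rate update are all assumptions for illustration, not the author's exact algorithm.

```python
import numpy as np

# Minimal sketch (assumptions, not the thesis's exact algorithm) of an SGD
# step with Rprop-style individual (per-weight) learning rates.

def sgd_individual_lr_step(w, grad, lr, prev_grad,
                           eta_plus=1.2, eta_minus=0.5,
                           lr_min=1e-6, lr_max=1.0):
    """Update weights `w` with per-weight learning rates `lr`.

    The rates follow the Rprop sign rule: grow where the gradient
    keeps its sign across steps, shrink where the sign flips, then
    take an ordinary SGD step with the adapted rates.
    """
    sign_change = np.sign(grad) * np.sign(prev_grad)

    # Grow rates where consecutive gradients agree in sign,
    # shrink them where the sign flipped (likely overshoot).
    lr = np.where(sign_change > 0, lr * eta_plus, lr)
    lr = np.where(sign_change < 0, lr * eta_minus, lr)
    lr = np.clip(lr, lr_min, lr_max)

    w = w - lr * grad          # SGD step with the individual rates
    return w, lr, grad         # current grad becomes prev_grad next call


# Toy usage on the quadratic loss 0.5 * ||w||^2, whose gradient is w.
w = np.array([1.0, -2.0, 0.5])
lr = np.full_like(w, 0.01)
prev_grad = np.zeros_like(w)
for _ in range(100):
    grad = w                    # gradient of the toy loss
    w, lr, prev_grad = sgd_individual_lr_step(w, grad, lr, prev_grad)
```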
- Popular Abstract
- In machine learning, a computer learns to perform tasks, like identifying what is on an image. This is done by training a type of model called an artificial neural network (ANN). These networks are inspired by the structure of the human brain, where neurons are connected to one another. Similarly, an ANN consists of "weights" that connect different parts of the network, and these weights adjust over time as the model learns.
How much each weight changes during training depends in part on something called the learning rate. This is a number that controls how quickly or slowly the model learns. If the model learns too fast, it might miss the best solution, but if it learns too slowly, it might take a long time to learn anything useful.
Neural networks are trained using algorithms called optimizers. These optimizers guide how the weights should be updated. Most optimizers use a single learning rate for all weights. In my project, I apply an individual learning rate to each weight. This means every connection in the network can learn at its own pace, which can lead to faster training and improved results.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9206380
- author
- Schijf, Steffi LU
- supervisor
- Patrik Edén LU
- organization
- course
- BERM01 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- machine learning, stochastic gradient descent, SGD, Rprop, Adam, learning rate, individual learning rates, optimization, artificial neural networks
- publication/series
- Master’s Theses in Mathematical Sciences
- report number
- LUNFBV-3006-2025
- ISSN
- 1404-6342
- other publication id
- 2025:E82
- language
- English
- id
- 9206380
- date added to LUP
- 2025-08-06 15:15:09
- date last changed
- 2025-08-06 15:15:09
@misc{9206380,
  abstract = {{This thesis investigates the integration of the individual learning rates from Rprop into SGD and Adam optimization methods. The individual learning rates are updated separately from the weights. A major focus of this work is the role of mini-batch size in updating these learning rates. Empirical analysis demonstrates that selecting an appropriate mini-batch size can accelerate convergence and improve model performance. Additionally, an algorithm is introduced that dynamically adjusts the batch size used for learning rate updates.}},
  author   = {{Schijf, Steffi}},
  issn     = {{1404-6342}},
  language = {{eng}},
  note     = {{Student Paper}},
  series   = {{Master’s Theses in Mathematical Sciences}},
  title    = {{Adapting Individual Learning Rates for SGD and ADAM Optimizers}},
  year     = {{2025}},
}