
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Adapting Individual Learning Rates for SGD and ADAM Optimizers

Schijf, Steffi LU (2025) In Master’s Theses in Mathematical Sciences BERM01 20251
Mathematics (Faculty of Sciences)
Abstract
This thesis investigates the integration of the individual learning rates from Rprop into SGD and Adam optimization methods. The individual learning rates are updated separately from the weights. A major focus of this work is the role of mini-batch size in updating these learning rates. Empirical analysis demonstrates that selecting an appropriate mini-batch size can accelerate convergence and improve model performance. Additionally, an algorithm is introduced that dynamically adjusts the batch size used for learning rate updates.
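To illustrate the general idea (a minimal sketch, not the thesis's exact algorithm), the Python snippet below combines plain SGD with Rprop-style individual learning rates, one per weight, adapted on their own mini-batch of gradients and separately from the weight update. The names grad_fn, lr_batches and weight_batches, as well as all hyperparameter values, are illustrative assumptions; the sign-based adaptation uses the standard Rprop factors eta_plus = 1.2 and eta_minus = 0.5.

import numpy as np

def sgd_with_individual_lrs(w, grad_fn, lr_batches, weight_batches,
                            lr0=0.01, eta_plus=1.2, eta_minus=0.5,
                            lr_min=1e-6, lr_max=1.0):
    # One learning rate per weight, initialised to a common value.
    lrs = np.full_like(w, lr0)
    # Sign of the gradient seen at the previous learning-rate update.
    prev_sign = np.zeros_like(w)

    for lr_batch, w_batch in zip(lr_batches, weight_batches):
        # 1) Adapt the individual learning rates on their own mini-batch,
        #    using the classic Rprop sign rule: grow a rate while its
        #    gradient sign stays stable, shrink it when the sign flips.
        g_lr = grad_fn(w, lr_batch)
        sign = np.sign(g_lr)
        agree = prev_sign * sign
        lrs = np.where(agree > 0, np.minimum(lrs * eta_plus, lr_max), lrs)
        lrs = np.where(agree < 0, np.maximum(lrs * eta_minus, lr_min), lrs)
        prev_sign = sign

        # 2) Take an SGD step on a (possibly different-sized) mini-batch,
        #    with a separate learning rate for every weight.
        g_w = grad_fn(w, w_batch)
        w = w - lrs * g_w
    return w

In this sketch, the size of the mini-batches in lr_batches is the quantity the thesis studies empirically; the adaptive batch-size scheme introduced in the thesis is not reproduced here.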
Popular Abstract
In machine learning, a computer learns to perform tasks, like identifying what is on an image. This is done by training a type of model called an artificial neural network (ANN). These networks are inspired by the structure of the human brain, where neurons are connected together. Similarly, an ANN consists of "weights" that connect different parts of the network, and these weights adjust over time as the model learns.

How much each weight changes during training depends in part on something called the learning rate. This is a number that controls how quickly or slowly the model learns. If the model learns too fast, it might miss the best solution, but if it learns too slowly, it might take a long time to learn anything useful.

Neural networks are trained using algorithms called optimizers. These optimizers guide how the weights should be updated. Most optimizers use a single learning rate for all weights. In my project, I apply an individual learning rate to each weight. This means every connection in the network can learn at its own pace, which can lead to faster training and improved results.
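As a concrete, purely illustrative example of the difference, the following NumPy lines contrast a gradient step with one shared learning rate against a step where every weight has its own learning rate; all numbers are made up.

import numpy as np

weights = np.array([0.5, -1.2, 3.0])
gradients = np.array([0.1, -0.4, 0.02])

# Standard SGD: one learning rate shared by every weight.
shared_lr = 0.01
weights_shared = weights - shared_lr * gradients

# Individual learning rates: each weight moves at its own pace.
individual_lrs = np.array([0.01, 0.001, 0.1])
weights_individual = weights - individual_lrs * gradients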
author: Schijf, Steffi LU
supervisor:
organization: Mathematics (Faculty of Sciences)
course: BERM01 20251
year: 2025
type: H2 - Master's Degree (Two Years)
subject:
keywords: machine learning, stochastic gradient descent, SGD, Rprop, Adam, learning rate, individual learning rates, optimization, artificial neural networks
publication/series: Master’s Theses in Mathematical Sciences
report number: LUNFBV-3006-2025
ISSN: 1404-6342
other publication id: 2025:E82
language: English
id: 9206380
date added to LUP: 2025-08-06 15:15:09
date last changed: 2025-08-06 15:15:09
@misc{9206380,
  abstract     = {{This thesis investigates the integration of the individual learning rates from Rprop into SGD and Adam optimization methods. The individual learning rates are updated separately from the weights. A major focus of this work is the role of mini-batch size in updating these learning rates. Empirical analysis demonstrates that selecting an appropriate mini-batch size can accelerate convergence and improve model performance. Additionally, an algorithm is introduced that dynamically adjusts the batch size used for learning rate updates.}},
  author       = {{Schijf, Steffi}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master’s Theses in Mathematical Sciences}},
  title        = {{Adapting Individual Learning Rates for SGD and ADAM Optimizers}},
  year         = {{2025}},
}