Adapting Individual Learning Rates for SGD and ADAM Optimizers
(2025) In Master’s Theses in Mathematical Sciences, BERM01 20251, Mathematics (Faculty of Sciences)
- Abstract
- This thesis investigates the integration of the individual learning rates from Rprop into SGD and Adam optimization methods. The individual learning rates are updated separately from the weights. A major focus of this work is the role of mini-batch size in updating these learning rates. Empirical analysis demonstrates that selecting an appropriate mini-batch size can accelerate convergence and improve model performance. Additionally, an algorithm is introduced that dynamically adjusts the batch size used for learning rate updates.
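The thesis text itself is not included in this record, so the sketch below only illustrates the general idea stated in the abstract: an SGD step whose per-weight learning rates are adapted with an Rprop-style sign rule, with the rate update kept separate from the weight update. The function name, hyperparameter values (eta_plus, eta_minus, the clipping bounds), and the omission of the separate mini-batch used for the learning-rate update are all assumptions for illustration, not the author's exact algorithm.

```python
import numpy as np

# Minimal sketch (assumptions, not the thesis's exact algorithm) of an SGD
# step with Rprop-style individual (per-weight) learning rates.

def sgd_individual_lr_step(w, grad, lr, prev_grad,
                           eta_plus=1.2, eta_minus=0.5,
                           lr_min=1e-6, lr_max=1.0):
    """Update weights `w` with per-weight learning rates `lr`.

    The rates follow the Rprop sign rule: grow where the gradient
    keeps its sign across steps, shrink where the sign flips, then
    take an ordinary SGD step with the adapted rates.
    """
    sign_change = np.sign(grad) * np.sign(prev_grad)

    # Grow rates where consecutive gradients agree in sign,
    # shrink them where the sign flipped (likely overshoot).
    lr = np.where(sign_change > 0, lr * eta_plus, lr)
    lr = np.where(sign_change < 0, lr * eta_minus, lr)
    lr = np.clip(lr, lr_min, lr_max)

    w = w - lr * grad          # SGD step with the individual rates
    return w, lr, grad         # current grad becomes prev_grad next call


# Toy usage on the quadratic loss 0.5 * ||w||^2, whose gradient is w.
w = np.array([1.0, -2.0, 0.5])
lr = np.full_like(w, 0.01)
prev_grad = np.zeros_like(w)
for _ in range(100):
    grad = w                    # gradient of the toy loss
    w, lr, prev_grad = sgd_individual_lr_step(w, grad, lr, prev_grad)
```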
- Popular Abstract
- In machine learning, a computer learns to perform tasks, like identifying what is on an image. This is done by training a type of model called an artificial neural network (ANN). These networks are inspired by the structure of the human brain, where neurons are connected to one another. Similarly, an ANN consists of "weights" that connect different parts of the network, and these weights adjust over time as the model learns.
How much each weight changes during training depends in part on something called the learning rate. This is a number that controls how quickly or slowly the model learns. If the model learns too fast, it might miss the best solution, but if it learns too slowly, it might take a long time to learn anything useful.
Neural networks are trained using algorithms called optimizers. These optimizers guide how the weights should be updated. Most optimizers use a single learning rate for all weights. In my project, I apply an individual learning rate to each weight. This means every connection in the network can learn at its own pace, which can lead to faster training and improved results.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9206380
- author
- Schijf, Steffi LU
- supervisor
- Patrik Edén LU
- organization
- course
- BERM01 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- machine learning, stochastic gradient descent, SGD, Rprop, Adam, learning rate, individual learning rates, optimization, artificial neural networks
- publication/series
- Master’s Theses in Mathematical Sciences
- report number
- LUNFBV-3006-2025
- ISSN
- 1404-6342
- other publication id
- 2025:E82
- language
- English
- id
- 9206380
- date added to LUP
- 2025-08-06 15:15:09
- date last changed
- 2025-08-06 15:15:09
@misc{9206380,
  abstract = {{This thesis investigates the integration of the individual learning rates from Rprop into SGD and Adam optimization methods. The individual learning rates are updated separately from the weights. A major focus of this work is the role of mini-batch size in updating these learning rates. Empirical analysis demonstrates that selecting an appropriate mini-batch size can accelerate convergence and improve model performance. Additionally, an algorithm is introduced that dynamically adjusts the batch size used for learning rate updates.}},
  author   = {{Schijf, Steffi}},
  issn     = {{1404-6342}},
  language = {{eng}},
  note     = {{Student Paper}},
  series   = {{Master’s Theses in Mathematical Sciences}},
  title    = {{Adapting Individual Learning Rates for SGD and ADAM Optimizers}},
  year     = {{2025}},
}