LUP Student Papers

LUND UNIVERSITY LIBRARIES

Prediction of appropriate L2 regularization strengths through Bayesian formalism

Degener, Alexander LU (2022) FYTK02 20221
Computational Biology and Biological Physics - Undergoing reorganization
Department of Astronomy and Theoretical Physics - Undergoing reorganization
Abstract
This paper proposes and investigates a Bayesian relation between optimal L2 regularization strengths and the number of training patterns and hidden nodes used for an artificial neural network. The results support the proposed dependence on the number of training patterns, while the dependence on the hidden architecture was less clear. Finally, applying different regularization strengths on different layers, rather than the same on all, resulted in better validation performance. The essential programs for training ANNs were developed for these studies, along with functionality for synthetic data generation, which together provided a controlled and flexible environment.
Popular Abstract
Imagine you are trying to come up with a plan for solving a problem, but the best strategy you can devise is to test every possible approach. This could work if you had as many tries as you wanted, but it would take quite some time, and if you eventually settled on an approach you would likely not know whether a better one exists. Analyzing the problem to instead figure out an appropriate approach, given its nature, can be both more efficient and more effective. In this study, we analyze some strategies for machine learning, hoping to reduce the need for a trial-and-error approach.

Machine learning can perform well on certain tasks that would be very demanding to solve in other ways. An example of such a task is finding relations between individuals' medical data and their health conditions. This can be done with a type of machine learning called "Artificial Neural Networks" (ANNs), which uses models inspired by the neurons in the human brain. These models are implemented as programs that can be trained through many iterations of adaptation.

When training the program to solve difficult problems, we often use some information about them to set parameter values for a good starting point. One parameter that is currently laborious to determine is the L2 regularization strength, which helps the program handle other, similar tasks. Our research aims to find a way to set this parameter, given certain knowledge about the initial problem.

A common practice for finding a useful L2 regularization strength is trial and error, which can yield quite useful results given enough time. We would like to reduce the time required by finding at least a narrower range in which the most appropriate value lies. To do this, we investigate a proposed dependence of the L2 regularization strength on the data size and the ANN size.
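To make the trial-and-error search concrete, here is a minimal illustrative sketch, not the thesis code: all names, sizes, and the candidate grid are assumptions, and a closed-form linear (ridge) model stands in for an ANN to keep the example short. It generates synthetic data, trains one model per candidate L2 strength, and keeps the strength with the lowest validation error.

```python
import numpy as np

# Hypothetical sketch (not the thesis implementation): choosing an L2
# regularization strength by scanning candidates on synthetic data.
rng = np.random.default_rng(0)

# Synthetic linear data with noise; sizes are arbitrary assumptions.
n_train, n_val, n_features = 40, 200, 20
w_true = rng.normal(size=n_features)
X_train = rng.normal(size=(n_train, n_features))
y_train = X_train @ w_true + rng.normal(scale=2.0, size=n_train)
X_val = rng.normal(size=(n_val, n_features))
y_val = X_val @ w_true + rng.normal(scale=2.0, size=n_val)

def fit_l2(X, y, lam):
    """Closed-form minimizer of ||Xw - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Trial and error: train once per candidate strength, compare on validation.
candidates = [10.0 ** k for k in range(-4, 3)]
val_mse = {lam: np.mean((X_val @ fit_l2(X_train, y_train, lam) - y_val) ** 2)
           for lam in candidates}
best_lam = min(val_mse, key=val_mse.get)
```

The thesis instead asks whether `best_lam` can be predicted up front from the number of training patterns and hidden nodes, shrinking or removing the candidate grid this loop has to cover.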
author
Degener, Alexander LU
supervisor
organization
course
FYTK02 20221
year
type
M2 - Bachelor Degree
subject
keywords
Machine learning, Artificial Neural Network, L2 regularization strength, Bayesian formalism, Classification tasks
language
English
id
9092637
date added to LUP
2022-06-28 11:13:31
date last changed
2022-06-28 11:13:58
@misc{9092637,
  abstract     = {{This paper proposes and investigates a Bayesian relation between optimal L2 regularization strengths and the number of training patterns and hidden nodes used for an artificial neural network. The results support the proposed dependence for number of training patterns, while the dependence on hidden architecture was less clear. Finally, applying different regularization strengths on different layers, rather than the same on all, resulted in better validation performances. The essential programs for training ANNs were developed for these studies, along with functionality for synthetic data generation, which together provided a controlled and flexible environment.}},
  author       = {{Degener, Alexander}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Prediction of appropriate L2 regularization strengths through Bayesian formalism}},
  year         = {{2022}},
}