LUP Student Papers

LUND UNIVERSITY LIBRARIES

Prediction of appropriate L2 regularization strengths through Bayesian formalism

Degener, Alexander LU (2022) FYTK02 20221
Computational Biology and Biological Physics - Undergoing reorganization
Department of Astronomy and Theoretical Physics - Undergoing reorganization
Abstract
This paper proposes and investigates a Bayesian relation between optimal L2 regularization strengths and the number of training patterns and hidden nodes used for an artificial neural network. The results support the proposed dependence on the number of training patterns, while the dependence on the hidden architecture was less clear. Finally, applying different regularization strengths on different layers, rather than the same on all, resulted in better validation performance. The essential programs for training ANNs were developed for these studies, along with functionality for synthetic data generation, which together provided a controlled and flexible environment.
Popular Abstract
Imagine you are trying to come up with a plan for solving a problem, but the best strategy you can devise is to test every possible approach. This could work if you had as many tries as you wanted, but it would take quite some time, and if you eventually settled on an approach you would likely not know whether a better one exists. Analyzing the problem to instead figure out an appropriate approach, given its nature, can be both more efficient and more effective. In this study, we analyze some strategies for machine learning, hoping to reduce the need for a trial-and-error approach.

Machine learning can perform well on certain tasks that would be very demanding to solve in other ways. An example of such a task is finding relations between individuals' medical data and their health conditions. This can be done with a type of machine learning called "Artificial Neural Networks" (ANNs), which uses models inspired by the neurons in the human brain. These models are implemented as programs that can be trained through many iterations of adaptation.

When training the program to solve difficult problems, we often use some information about them to set parameter values for a good starting point. One parameter that is currently laborious to determine is the L2 regularization strength, which helps the program handle other, similar tasks. Our research aims to find a way to set this parameter, given certain knowledge about the initial problem.

A common practice for finding a useful L2 regularization strength is trial and error, which can yield quite useful results given enough time. We would like to reduce the time required by finding at least a narrower range in which the most appropriate value lies. To do this, we investigate a proposed dependence of the L2 regularization strength on the data size and the ANN size.
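To make the trial-and-error search concrete, here is a minimal illustrative sketch, not the thesis code: all names, sizes, and the candidate grid are assumptions, and a closed-form linear (ridge) model stands in for an ANN to keep the example short. It generates synthetic data, trains one model per candidate L2 strength, and keeps the strength with the lowest validation error.

```python
import numpy as np

# Hypothetical sketch (not the thesis implementation): choosing an L2
# regularization strength by scanning candidates on synthetic data.
rng = np.random.default_rng(0)

# Synthetic linear data with noise; sizes are arbitrary assumptions.
n_train, n_val, n_features = 40, 200, 20
w_true = rng.normal(size=n_features)
X_train = rng.normal(size=(n_train, n_features))
y_train = X_train @ w_true + rng.normal(scale=2.0, size=n_train)
X_val = rng.normal(size=(n_val, n_features))
y_val = X_val @ w_true + rng.normal(scale=2.0, size=n_val)

def fit_l2(X, y, lam):
    """Closed-form minimizer of ||Xw - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Trial and error: train once per candidate strength, compare on validation.
candidates = [10.0 ** k for k in range(-4, 3)]
val_mse = {lam: np.mean((X_val @ fit_l2(X_train, y_train, lam) - y_val) ** 2)
           for lam in candidates}
best_lam = min(val_mse, key=val_mse.get)
```

The thesis instead asks whether `best_lam` can be predicted up front from the number of training patterns and hidden nodes, shrinking or removing the candidate grid this loop has to cover.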
author
Degener, Alexander LU
supervisor
organization
course
FYTK02 20221
year
type
M2 - Bachelor Degree
subject
keywords
Machine learning, Artificial Neural Network, L2 regularization strength, Bayesian formalism, Classification tasks
language
English
id
9092637
date added to LUP
2022-06-28 11:13:31
date last changed
2022-06-28 11:13:58
@misc{9092637,
  abstract     = {{This paper proposes and investigates a Bayesian relation between optimal L2 regularization strengths and the number of training patterns and hidden nodes used for an artificial neural network. The results support the proposed dependence for number of training patterns, while the dependence on hidden architecture was less clear. Finally, applying different regularization strengths on different layers, rather than the same on all, resulted in better validation performances. The essential programs for training ANNs were developed for these studies, along with functionality for synthetic data generation, which together provided a controlled and flexible environment.}},
  author       = {{Degener, Alexander}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Prediction of appropriate L2 regularization strengths through Bayesian formalism}},
  year         = {{2022}},
}