Trainable Activation Functions For Artificial Neural Networks

Öhman, Jim

Öhman, Jim (LU) (2018) FYTK02 20181
Computational Biology and Biological Physics - Has been reorganised
Department of Astronomy and Theoretical Physics - Has been reorganised

Abstract: Artificial Neural Networks (ANNs) are widely used information processing algorithms loosely based on biological neural networks. These networks can be trained to find complex patterns in datasets and to produce certain output signals given a set of input signals. A key element of ANNs is their activation functions, which control the signal strengths between the artificial neurons in a network and which are normally chosen from a standard set of functions. This thesis investigates the performance of small networks with a new activation function, named the Diversifier, which differs from the common ones in that its shape is trainable, whereas theirs generally is not. Additionally, a new method is introduced that helps to avoid the well-known issue of overtraining. Networks with the Diversifier performed slightly better than networks using two of the most common activation functions, the rectifier and the hyperbolic tangent, when trained on two different datasets.

Other articles have explored different kinds of trainable activation functions, including a trainable rectifier in a convolutional neural network (CNN) [5]. These also reported an improvement in performance. However, none of the articles read introduced something similar to the Diversifier.
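To make the idea of a trainable activation shape concrete, here is a minimal Python sketch of a rectifier whose negative-side slope is learned by gradient descent, in the spirit of the trainable rectifier mentioned above. It is not the Diversifier, whose definition is not given in this record. The names `trainable_rectifier`, `grad_wrt_slope`, and `fit_slope`, the toy data, and the hyperparameters are all assumptions made for illustration.

```python
# Hypothetical illustration: a rectifier with a trainable negative slope `a`.
# This is NOT the thesis's Diversifier; its definition is not part of this record.

def trainable_rectifier(x, a):
    """Return x for x > 0 and a * x otherwise; `a` is the trainable shape parameter."""
    return x if x > 0 else a * x


def grad_wrt_slope(x):
    """Derivative of the activation with respect to `a`: 0 for x > 0, x otherwise."""
    return 0.0 if x > 0 else x


def fit_slope(inputs, targets, a=0.25, learning_rate=0.1, epochs=200):
    """Fit the slope `a` by plain gradient descent on the mean squared error."""
    n = len(inputs)
    for _ in range(epochs):
        grad = 0.0
        for x, t in zip(inputs, targets):
            error = trainable_rectifier(x, a) - t
            # Chain rule: d(error^2)/da = 2 * error * d(activation)/da
            grad += 2.0 * error * grad_wrt_slope(x) / n
        a -= learning_rate * grad
    return a


# Toy data (assumed): targets were generated with slope 0.5 on the negative side.
inputs = [-2.0, -1.0, 0.5, 1.0, 2.0]
targets = [-1.0, -0.5, 0.5, 1.0, 2.0]
learned_a = fit_slope(inputs, targets)
```

In this toy setup only the negative inputs contribute to the gradient, so the fitted slope moves from its starting value of 0.25 towards 0.5. In a full network, such a shape parameter would be updated alongside the weights during training.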
Popular Abstract (translated from Swedish): Machine learning plays a significant role in today's society, especially the branch known as artificial neural networks (ANNs). These are useful algorithms that loosely mimic how our biological networks work. They have proven skilled at learning to classify and, more generally, at learning relationships between input data and output data. ANNs are used extensively for information processing by giants such as Facebook, YouTube, and Google. Some application areas are image recognition, image enhancement, speech recognition, and data sorting, among others.

A Google-owned company (DeepMind) recently released an ANN algorithm that can learn various board games solely by playing against itself, having been given only the rules. This algorithm was tested on the classic board games go, shogi, and chess, and after only a short period of training it reached, and surpassed, the human skill level. This impressive achievement shows that these networks are capable of solving complex and abstract problems.

One aspect that has recently made ANNs more useful, and more popular, is not only the increase in computing power but also how the networks are trained. This bachelor's thesis investigates whether there are any benefits to introducing one more trainable part of the networks. This part is the so-called activation functions, which are an important component of artificial neurons because they determine the strength of the outgoing signals depending on the strength of the incoming signals. How the network's activation functions treat the incoming signals is something that has normally not been changed during training.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8962971

author: Öhman, Jim (LU)
supervisor: Mattias Ohlsson
organization:
course: FYTK02 20181
year: 2018
type: M2 - Bachelor Degree
subject: Science General
keywords: ANN, ARTIFICIAL NEURAL NETWORKS, ACTIVATION FUNCTIONS, REGULARIZATION
language: English
id: 8962971
date added to LUP: 2018-11-12 13:42:00
date last changed: 2019-06-19 11:30:45

@misc{8962971,
  abstract     = {{Artificial Neural Networks (ANNs) are widely used information processing algorithms loosely based on biological neural networks. These networks can be trained to find complex patterns in datasets and to produce certain output signals given a set of input signals. A key element of ANNs is their activation functions, which control the signal strengths between the artificial neurons in a network and which are normally chosen from a standard set of functions. This thesis investigates the performance of small networks with a new activation function, named the Diversifier, which differs from the common ones in that its shape is trainable, whereas theirs generally is not. Additionally, a new method is introduced that helps to avoid the well-known issue of overtraining. Networks with the Diversifier performed slightly better than networks using two of the most common activation functions, the rectifier and the hyperbolic tangent, when trained on two different datasets.

Other articles have explored different kinds of trainable activation functions, including a trainable rectifier in a convolutional neural network (CNN) [5]. These also reported an improvement in performance. However, none of the articles read introduced something similar to the Diversifier.}},
  author       = {{Öhman, Jim}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Trainable Activation Functions For Artificial Neural Networks}},
  year         = {{2018}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
