Training Bayesian Neural Networks

Book, Johan

Book, Johan (LU) (2020) FYTM04 20192
Computational Biology and Biological Physics - Has been reorganised

Abstract: Although deep learning has made advances in a plethora of fields, ranging from financial analysis to image classification, it has some shortcomings for cases of limited data and complex models. In these cases the networks tend to be overconfident in their predictions even when erroneous - something that exposes their applications to risk. One way to incorporate an uncertainty measure is to let the network weights be described by probability distributions rather than point estimates. These networks, known as Bayesian neural networks, can be trained using a method called variational inference, allowing one to utilize standard optimization tools, such as SGD, Adam and learning rate schedules. Although these tools were not developed with Bayesian neural networks in mind, we will show that they behave similarly. We will confirm some best practices for training these networks, such as how the loss should be scaled and evaluated. Moreover, we see that one should avoid using Adam in favor of SGD and AdaBound. We see that one should also group the learnable parameters in order to use custom learning rates for the different groups.
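The ideas named in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the toy model (a single weight with a Gaussian variational posterior and a standard normal prior), the function names make_data and train, the plain SGD update and all learning-rate values are assumptions chosen for illustration. The sketch shows a per-mini-batch loss in which the KL term is scaled by the number of mini-batches, and two parameter groups (mean and spread) updated with separate learning rates.

```python
import math
import random


def softplus(x):
    return math.log1p(math.exp(x))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_data(n, w_true=2.0, noise_std=0.5, seed=1):
    """Hypothetical toy data: y = w_true * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [w_true * x + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def train(xs, ys, lr_mu=0.01, lr_rho=0.01, epochs=200,
          batch_size=10, noise_std=0.5, seed=0):
    """Variational inference for one weight w with q(w) = N(mu, sigma^2).

    sigma = softplus(rho) keeps the spread positive. The prior is N(0, 1).
    lr_mu and lr_rho are the learning rates of the two parameter groups
    (assumed values, not recommendations from the thesis).
    """
    rng = random.Random(seed)
    mu, rho = 0.0, -3.0
    n = len(xs)
    num_batches = math.ceil(n / batch_size)
    kl_scale = 1.0 / num_batches  # KL is spread evenly over the mini-batches
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)
        for b in range(num_batches):
            idx = order[b * batch_size:(b + 1) * batch_size]
            sigma = softplus(rho)
            eps = rng.gauss(0.0, 1.0)
            w = mu + sigma * eps  # reparameterised weight sample
            # Gradient of the mini-batch negative log-likelihood w.r.t. w.
            g_w = sum((w * xs[i] - ys[i]) * xs[i] for i in idx) / noise_std ** 2
            # KL(N(mu, sigma^2) || N(0, 1)) = 0.5*(sigma^2 + mu^2 - 1) - log(sigma)
            g_mu = g_w + kl_scale * mu
            g_sigma = g_w * eps + kl_scale * (sigma - 1.0 / sigma)
            g_rho = g_sigma * sigmoid(rho)  # d softplus(rho) / d rho
            # Plain SGD step with a separate learning rate per group.
            mu -= lr_mu * g_mu
            rho -= lr_rho * g_rho
    return mu, softplus(rho)


if __name__ == "__main__":
    xs, ys = make_data(100)
    mu, sigma = train(xs, ys)
    print("posterior mean:", mu, "posterior std:", sigma)
```

Summed over one epoch, the per-batch losses add up to the full negative evidence lower bound, which is one common way to scale the loss. The same grouping idea carries over to larger models by collecting all mean parameters in one group and all spread parameters in another.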
Popular Abstract: Machine learning makes it possible to solve problems we previously thought were unsolvable. Things like face recognition, chatbots and machines that teach themselves to walk feel entirely trivial today, but were an enormous challenge only a few decades ago. Machine learning is used in every corner of our society, from medicine and finance to the automotive industry and telecommunications. What all these sectors have in common is that one needs to keep full track of the uncertainty in every decision that is made. One example is a doctor with a patient with cancer. How certain the doctor is that the patient actually has cancer can determine whether the patient will be treated with chemotherapy or not.

Artificial neural networks, which are the backbone of machine learning, currently have no measure of how certain they are in the decisions they make. Nevertheless, they are used, for example, to diagnose cancer and in self-driving cars. So what can we do to the networks to make them more uncertain? One possibility is something called Bayesian neural networks. These networks are aware of how uncertain they are when they make decisions. Bayesian neural networks are, however, harder to train (teaching someone who is critical of everything is rather tricky). This work investigates whether there are tricks that can be used to speed up the training process.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9008040

author

Book, Johan (LU)

supervisor

organization

Computational Biology and Biological Physics - Has been reorganised

course

FYTM04 20192

year

2020

type

H1 - Master's Degree (One Year)

subject

Physics and Astronomy

keywords

artificial neural networks, ANN, convolutional neural networks

report number

TP 20-07

language

English

id

9008040

date added to LUP

2020-04-27 16:11:43

date last changed

2020-04-27 16:11:43

@misc{9008040,
  abstract     = {{Although deep learning has made advances in a plethora of fields, ranging from financial analysis to image classification, it has some shortcomings for cases of limited data and complex models. In these cases the networks tend to be overconfident in their predictions even when erroneous - something that exposes their applications to risk. One way to incorporate an uncertainty measure is to let the network weights be described by probability distributions rather than point estimates. These networks, known as Bayesian neural networks, can be trained using a method called variational inference, allowing one to utilize standard optimization tools, such as SGD, Adam and learning rate schedules. Although these tools were not developed with Bayesian neural networks in mind, we will show that they behave similarly. We will confirm some best practices for training these networks, such as how the loss should be scaled and evaluated. Moreover, we see that one should avoid using Adam in favor of SGD and AdaBound. We see that one should also group the learnable parameters in order to use custom learning rates for the different groups.}},
  author       = {{Book, Johan}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Training Bayesian Neural Networks}},
  year         = {{2020}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
