Inflated Multinomial Matching for Anchor-Free Object Detection

Hiersemann, Cesar LU (2018) In Master’s Theses in Mathematical Sciences FMA820 20171
Mathematics (Faculty of Engineering)
Abstract
This thesis presents a novel matching strategy, Inflated Multinomial Matching, which enables training of anchor-free object detection models based on convolutional neural networks. An important aspect of detection models is their integral use of anchor boxes, where an anchor box is a bounding box with a preset, constant location, size and shape in the image. The presented matching strategy removes the need for anchor boxes. Instead, it uses the similarity scores between ground truths and predictions in a stochastic way, which lets a detection model develop several independent submodels, each specializing in predicting objects of a certain size and shape. The resulting properties essentially mimic the main benefits of anchor boxes in an unsupervised way. The intended behavior of the matching strategy is confirmed through a number of indicators monitored throughout the training process. Finally, a full-scale object detection model is trained with Inflated Multinomial Matching, and example detection results are showcased.
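The abstract's idea of using similarity scores "in a stochastic way" can be illustrated with a small sketch. This is not the thesis's algorithm, whose details are not given in this record. It assumes that similarity is measured by intersection over union (IoU) and that each ground truth draws one prediction with probability proportional to its score raised to a power. The function names `iou` and `sample_matches` and the `power` parameter are hypothetical, introduced only for illustration.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sample_matches(ground_truths, predictions, power=2.0, rng=None):
    """Draw one prediction index per ground truth, with probability
    proportional to IoU ** power.

    ASSUMPTION: `power` is a hypothetical sharpening exponent chosen for
    this sketch; it is not taken from the thesis.
    """
    rng = rng or random.Random()
    matches = []
    for gt in ground_truths:
        weights = [iou(gt, p) ** power for p in predictions]
        if sum(weights) == 0:
            matches.append(None)  # no overlapping prediction to match
            continue
        # A single weighted draw, i.e. a multinomial sample of size one.
        index = rng.choices(range(len(predictions)), weights=weights, k=1)[0]
        matches.append(index)
    return matches


if __name__ == "__main__":
    gts = [(0, 0, 10, 10)]
    preds = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
    print(sample_matches(gts, preds, rng=random.Random(0)))
```

In this sketch, a prediction that overlaps a ground truth well is matched to it most of the time, while weaker overlaps are still chosen occasionally. Predictions with zero overlap are never chosen.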
Popular Abstract (translated from Swedish)
Algorithms for automatically understanding the content of images and videos have seen enormous development in recent years. Despite progress in how well the algorithms can learn to understand image content on their own, a certain degree of hand-holding and guidance is still required from the domain experts who develop and train them. When the position and dimensions of an object of interest are to be localized in an image, the algorithm is given an extensive template of predetermined positions and dimensions, so-called anchors, which it can then use to simplify its own search and training process. In this work, an original method has been developed for training the algorithms entirely without anchors, which can ultimately contribute to a more automated training process that requires less manual guidance from experts to achieve the best results.
author
Hiersemann, Cesar LU
supervisor
organization
alternative title
Inflaterad Multinomial Matchning för Ankarfri Objektdetektion
course
FMA820 20171
year
2018
type
H2 - Master's Degree (Two Years)
subject
keywords
Computer Vision, Machine Learning, Deep Learning, Convolutional Neural Networks, Object Recognition, Object Detection, Anchors, Anchor Boxes
publication/series
Master’s Theses in Mathematical Sciences
report number
LUTFMA-3366-2018
ISSN
1404-6342
other publication id
2018:E66
language
English
id
8959431
date added to LUP
2018-11-19 14:38:40
date last changed
2018-11-19 14:38:40
@misc{8959431,
  abstract     = {This thesis presents a novel matching strategy, Inflated Multinomial Matching, which enables training of anchor-free object detection models based on convolutional neural networks. An important aspect of detection models is their integral use of anchor boxes, where an anchor box is a bounding box with a preset, constant location, size and shape in the image. The presented matching strategy removes the need for anchor boxes. Instead, it uses the similarity scores between ground truths and predictions in a stochastic way, which lets a detection model develop several independent submodels, each specializing in predicting objects of a certain size and shape. The resulting properties essentially mimic the main benefits of anchor boxes in an unsupervised way. The intended behavior of the matching strategy is confirmed through a number of indicators monitored throughout the training process. Finally, a full-scale object detection model is trained with Inflated Multinomial Matching, and example detection results are showcased.},
  author       = {Hiersemann, Cesar},
  issn         = {1404-6342},
  keyword      = {Computer Vision,Machine Learning,Deep Learning,Convolutional Neural Networks,Object Recognition,Object Detection,Anchors,Anchor Boxes},
  language     = {eng},
  note         = {Student Paper},
  series       = {Master’s Theses in Mathematical Sciences},
  title        = {Inflated Multinomial Matching for Anchor-Free Object Detection},
  year         = {2018},
}