Inflated Multinomial Matching for Anchor-Free Object Detection

Hiersemann, Cesar

Hiersemann, Cesar ^LU (2018). In Master’s Theses in Mathematical Sciences, FMA820 20171
Mathematics (Faculty of Engineering)

Abstract: This thesis presents a novel matching strategy, Inflated Multinomial Matching, which enables training of anchor-free object detection models based on convolutional neural networks. An important aspect of detection models is their integral use of anchor boxes, where an anchor box is a bounding box with a preset and constant location, size and shape in the image. The matching strategy presented removes the need for anchor boxes. Instead, it uses the similarity scores between ground truths and predictions in a stochastic way, which lets detection models obtain several independent submodels, each specializing in predicting objects of a certain size and shape. The resulting properties essentially mimic the main benefits of anchor boxes in an unsupervised way. The intended behavior of the matching strategy is confirmed through a number of indicators monitored throughout the training process. Finally, a full-scale object detection model is trained with Inflated Multinomial Matching, and example detection results are showcased.
Popular Abstract (translated from Swedish): Algorithms for automatically understanding the content of images and videos have developed enormously in recent years. Despite progress in how well these algorithms can learn to understand image content on their own, a certain degree of hand-holding and guidance is still required from the domain experts who develop and train them. When the position and dimensions of an object of interest are to be located in an image, the algorithm is given an extensive template of predetermined positions and dimensions, so-called anchors, which it can then use to simplify its own search and training process. In this work, an original method has been developed for training the algorithms entirely without anchors, which may ultimately contribute to a more automated training process that requires less manual guidance from experts to achieve the best results.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8959431

author

Hiersemann, Cesar ^LU

supervisor

Anders Heyden ^LU
Karl Åström ^LU

organization

Mathematics (Faculty of Engineering)

alternative title

Inflaterad Multinomial Matchning för Ankarfri Objektdetektion

course

FMA820 20171

year

2018

type

H2 - Master's Degree (Two Years)

subject

Mathematics and Statistics

keywords

Computer Vision, Machine Learning, Deep Learning, Convolutional Neural Networks, Object Recognition, Object Detection, Anchors, Anchor Boxes

publication/series

Master’s Theses in Mathematical Sciences

report number

LUTFMA-3366-2018

ISSN

1404-6342

other publication id

2018:E66

language

English

id

8959431

date added to LUP

2018-11-19 14:38:40

date last changed

2018-11-19 14:38:40

@misc{8959431,
  abstract     = {{This thesis presents a novel matching strategy, Inflated Multinomial Matching, which enables training of anchor-free object detection models based on convolutional neural networks. An important aspect of detection models is the integral usage of anchor boxes, where an anchor box is a bounding box with a preset and constant location, size and shape in the image. The matching strategy presented removes the need for anchor boxes, where it instead utilizes the similarity scores between ground truths and predictions in a stochastic way which lets detection models obtain several independent submodels where each submodel specializes towards predicting objects of a certain size and shape. The resulting properties is essentially mimicking the main benefits of anchor boxes in an unsupervised way. The intended behavior of the matching strategy is confirmed through a number of indicators monitored throughout the training process. Finally, a full scale object detection model is trained with Inflated Multinomial Matching and example detection results are showcased.}},
  author       = {{Hiersemann, Cesar}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master’s Theses in Mathematical Sciences}},
  title        = {{Inflated Multinomial Matching for Anchor-Free Object Detection}},
  year         = {{2018}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
