Text-aided object segmentation and classification in images

Tegen, Agnes

Tegen, Agnes (LU) (2014). In Master's Theses in Mathematical Sciences, FMA820 20132
Mathematics (Faculty of Engineering)

Abstract: Object recognition in images is a popular research field with many applications, including medicine, robotics and face recognition. The task of automatically finding and identifying objects in an image is extremely challenging. Approaching the problem from a new angle and including information beyond the visual makes it less ill-posed.

In this thesis we investigate how the addition of text annotations to images affects the classification process. Classifications of different sets of labels as well as clusters of labels were carried out. A comparison between the results from using only visual information and from also including information from an image description is given. In most cases the additional information improved the accuracy of the classification.

The results were then used to design an algorithm that, given an image with a description, finds relevant words in the text and marks their presence in the image. A large set of overlapping segments is generated, and each segment is classified into a set of categories. The image descriptions are parsed by a so-called chunker, and visually relevant words (key-nouns) are extracted from the text. These key-nouns are then connected to the categories using metrics from WordNet. Combinatorial optimization is used to create an optimal assignment of the visual segments to the key-nouns. The resulting system was compared to manually segmented and classified images.
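
The assignment step can be illustrated with a minimal sketch. The abstract does not say which combinatorial optimization method the thesis uses, so the brute-force search below is only an assumption chosen for clarity. The function name `best_assignment`, the key-nouns and the score values are all hypothetical; in the described system the scores would presumably combine the segment classifications with the WordNet-based metrics.

```python
from itertools import permutations


def best_assignment(scores):
    """Assign each key-noun to a distinct segment so the total score is maximal.

    scores[i][j] is a hypothetical compatibility score between key-noun i
    and segment j (higher is better). Brute force over all injective
    assignments, so it is only practical for small inputs. Assumes there
    are at least as many segments as key-nouns.
    """
    n_nouns = len(scores)
    n_segments = len(scores[0]) if scores else 0
    best_total, best_perm = float("-inf"), ()
    # permutations(range(n_segments), n_nouns) enumerates every way of
    # picking a distinct segment for each key-noun.
    for perm in permutations(range(n_segments), n_nouns):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm


# Illustrative data only: two key-nouns, three candidate segments.
key_nouns = ["dog", "ball"]
scores = [
    [0.90, 0.80, 0.10],  # "dog" vs. segments 0, 1, 2
    [0.85, 0.10, 0.20],  # "ball" vs. segments 0, 1, 2
]
total, assignment = best_assignment(scores)
for noun, segment in zip(key_nouns, assignment):
    print(f"{noun} -> segment {segment}")
```

With these made-up scores, picking the best segment for each noun in turn would give "dog" segment 0 and leave "ball" with a poor match. The joint search instead assigns "dog" to segment 1 and "ball" to segment 0, which is the point of solving the assignment as a single optimization problem.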

The results are promising and have given rise to several new ideas for continued research.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/4317091

author: Tegen, Agnes (LU)
supervisor:
organization: Mathematics (Faculty of Engineering)
course: FMA820 20132
year: 2014
type: H2 - Master's Degree (Two Years)
subject:
publication/series: Master's Theses in Mathematical Sciences
report number: LUTFMA-3258-2014
ISSN: 1404-6342
other publication id: 2014:E5
language: English
id: 4317091
date added to LUP: 2014-05-16 17:05:01
date last changed: 2014-05-27 11:18:40

@misc{4317091,
  abstract     = {{Object recognition in images is a popular research field with many applications, including medicine, robotics and face recognition. The task of automatically finding and identifying objects in an image is extremely challenging. Approaching the problem from a new angle and including information beyond the visual makes it less ill-posed.

In this thesis we investigate how the addition of text annotations to images affects the classification process. Classifications of different sets of labels as well as clusters of labels were carried out. A comparison between the results from using only visual information and from also including information from an image description is given. In most cases the additional information improved the accuracy of the classification.

The results were then used to design an algorithm that, given an image with a description, finds relevant words in the text and marks their presence in the image. A large set of overlapping segments is generated, and each segment is classified into a set of categories. The image descriptions are parsed by a so-called chunker, and visually relevant words (key-nouns) are extracted from the text. These key-nouns are then connected to the categories using metrics from WordNet. Combinatorial optimization is used to create an optimal assignment of the visual segments to the key-nouns. The resulting system was compared to manually segmented and classified images.

The results are promising and have given rise to several new ideas for continued research.}},
  author       = {{Tegen, Agnes}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Text-aided object segmentation and classification in images}},
  year         = {{2014}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
