Free-Form Region Description with Second-Order Pooling

Carreira, Joao; Caseiro, Rui; Batista, Jorge and Sminchisescu, Cristian (2015). In IEEE Transactions on Pattern Analysis and Machine Intelligence 37(6), p. 1177-1189
Abstract
Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.
type
Contribution to journal
publication status
published
keywords
Recognition, image descriptors, second-order statistics, segmentation, regression, pooling, differential geometry
in
IEEE Transactions on Pattern Analysis and Machine Intelligence
volume
37
issue
6
pages
1177 - 1189
publisher
IEEE--Institute of Electrical and Electronics Engineers Inc.
external identifiers
  • wos:000354377100005
  • scopus:84929223025
ISSN
1939-3539
DOI
10.1109/TPAMI.2014.2361137
language
English
LU publication?
yes
id
34d64813-cd25-4f42-b293-843ce7399184 (old id 7425226)
date added to LUP
2015-06-25 14:53:57
date last changed
2017-11-05 03:50:55
@article{34d64813-cd25-4f42-b293-843ce7399184,
  abstract     = {Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.},
  author       = {Carreira, Joao and Caseiro, Rui and Batista, Jorge and Sminchisescu, Cristian},
  doi          = {10.1109/TPAMI.2014.2361137},
  issn         = {1939-3539},
  keywords     = {Recognition, image descriptors, second-order statistics, segmentation, regression, pooling, differential geometry},
  language     = {eng},
  number       = {6},
  pages        = {1177--1189},
  publisher    = {IEEE--Institute of Electrical and Electronics Engineers Inc.},
  journal      = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title        = {Free-Form Region Description with Second-Order Pooling},
  url          = {http://dx.doi.org/10.1109/TPAMI.2014.2361137},
  volume       = {37},
  year         = {2015},
}