Free-Form Region Description with Second-Order Pooling

Carreira, Joao; Caseiro, Rui; Batista, Jorge; Sminchisescu, Cristian

Carreira, Joao; Caseiro, Rui; Batista, Jorge and Sminchisescu, Cristian (LU) (2015). In IEEE Transactions on Pattern Analysis and Machine Intelligence 37(6), p. 1177-1189

Abstract: Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.
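
The abstract names the pooling operators but gives no formulas, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes second-order average pooling is the mean of the outer products x xᵀ of the local descriptors inside a region, second-order max-pooling is their element-wise maximum, and the non-linearities are a matrix logarithm followed by signed power normalization. All function names, the regularizer `eps`, and the exponent `h` are hypothetical choices made for this example.

```python
import numpy as np


def second_order_avg_pool(X):
    """Mean of the outer products x x^T over the rows of X, shape (n, d) -> (d, d)."""
    X = np.asarray(X, dtype=float)
    return X.T @ X / X.shape[0]


def second_order_max_pool(X):
    """Element-wise maximum of the outer products x x^T over the rows of X."""
    X = np.asarray(X, dtype=float)
    outer = X[:, :, None] * X[:, None, :]  # shape (n, d, d)
    return outer.max(axis=0)


def log_map(G, eps=1e-6):
    """Matrix logarithm of a symmetric PSD matrix via eigendecomposition.

    eps (an assumed regularizer) keeps every eigenvalue strictly positive.
    """
    w, V = np.linalg.eigh(G + eps * np.eye(G.shape[0]))
    return (V * np.log(np.maximum(w, eps))) @ V.T


def power_normalize(v, h=0.75):
    """Signed power normalization sign(v) * |v|^h; h is an assumed exponent."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.abs(v) ** h


def describe_region(descriptors, in_region, h=0.75):
    """Pool only the descriptors flagged as inside the free-form region."""
    X = np.asarray(descriptors, dtype=float)[np.asarray(in_region, dtype=bool)]
    G = log_map(second_order_avg_pool(X))
    upper = np.triu_indices(G.shape[0])  # G is symmetric: keep the upper triangle
    return power_normalize(G[upper], h)
```

Calling `describe_region(descriptors, in_region)` with an (n, d) array of local descriptors and a length-n boolean mask yields a vector of length d(d+1)/2, which could then be passed to a linear classifier or regressor.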

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/7425226

author

Carreira, Joao; Caseiro, Rui; Batista, Jorge and Sminchisescu, Cristian (LU)

publishing date

2015

type

Contribution to journal

publication status

published

subject

Computer graphics and computer vision

keywords

Recognition, image descriptors, second-order statistics, segmentation, regression, pooling, differential geometry

in

IEEE Transactions on Pattern Analysis and Machine Intelligence

volume

37

issue

6

pages

1177 - 1189

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

external identifiers

wos:000354377100005
scopus:84929223025
pmid:26357341

ISSN

1939-3539

DOI

10.1109/TPAMI.2014.2361137

language

English

LU publication?

yes

id

34d64813-cd25-4f42-b293-843ce7399184 (old id 7425226)

date added to LUP

2016-04-01 12:55:31

date last changed

2025-10-14 13:13:52

@article{34d64813-cd25-4f42-b293-843ce7399184,
  abstract     = {{Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.}},
  author       = {{Carreira, Joao and Caseiro, Rui and Batista, Jorge and Sminchisescu, Cristian}},
  issn         = {{1939-3539}},
  keywords     = {{Recognition; image descriptors; second-order statistics; segmentation; regression; pooling; differential geometry}},
  language     = {{eng}},
  number       = {{6}},
  pages        = {{1177--1189}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  journal      = {{IEEE Transactions on Pattern Analysis and Machine Intelligence}},
  title        = {{Free-Form Region Description with Second-Order Pooling}},
  url          = {{http://dx.doi.org/10.1109/TPAMI.2014.2361137}},
  doi          = {{10.1109/TPAMI.2014.2361137}},
  volume       = {{37}},
  year         = {{2015}},
}
