Free-Form Region Description with Second-Order Pooling
(2015) In IEEE Transactions on Pattern Analysis and Machine Intelligence 37(6). p.1177-1189
- Abstract
- Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.
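The abstract names second-order average and max-pooling and their non-linearities without giving formulas. The sketch below is one plausible reading, not the authors' implementation: it assumes average pooling is the mean of descriptor outer products, max-pooling is the element-wise maximum of those outer products, and the non-linearities are a matrix logarithm (for the averaged matrix) and a signed power (for the max-pooled one). All function names, the regularizer `eps`, and the exponent `h` are hypothetical choices made for illustration.

```python
import numpy as np

def second_order_avg_pool(X):
    """Mean of outer products x_i x_i^T over the n descriptors in a region.

    X has shape (n, d); the result is a symmetric (d, d) matrix.
    """
    X = np.asarray(X, dtype=float)
    return X.T @ X / X.shape[0]

def second_order_max_pool(X):
    """Element-wise maximum of the outer products x_i x_i^T."""
    X = np.asarray(X, dtype=float)
    outer = X[:, :, None] * X[:, None, :]  # shape (n, d, d)
    return outer.max(axis=0)

def log_map_spd(G, eps=1e-6):
    """Matrix logarithm of a symmetric matrix via eigendecomposition.

    eps * I is an assumed regularizer that keeps eigenvalues positive.
    """
    w, V = np.linalg.eigh(G + eps * np.eye(G.shape[0]))
    return (V * np.log(w)) @ V.T

def power_normalize(G, h=0.75):
    """Signed element-wise power; the exponent h is an assumed value."""
    return np.sign(G) * np.abs(G) ** h

def upper_triangle(G):
    """Flatten a symmetric matrix to its upper triangle (incl. diagonal)."""
    return G[np.triu_indices(G.shape[0])]

# Toy region with two 2-D descriptors (made-up numbers).
X = np.array([[1.0, 0.0],
              [0.0, 2.0]])
avg_desc = upper_triangle(log_map_spd(second_order_avg_pool(X)))
max_desc = upper_triangle(power_normalize(second_order_max_pool(X)))
```

Because the rows of `X` can come from any set of pixels, the same functions apply to a free-form region mask as readily as to a rectangular patch; only the selection of descriptors changes.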
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/7425226
- author
- Carreira, Joao ; Caseiro, Rui ; Batista, Jorge and Sminchisescu, Cristian LU
- organization
- publishing date
- 2015
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Recognition, image descriptors, second-order statistics, segmentation, regression, pooling, differential geometry
- in
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- volume
- 37
- issue
- 6
- pages
- 1177 - 1189
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- external identifiers
- wos:000354377100005
- scopus:84929223025
- pmid:26357341
- ISSN
- 1939-3539
- DOI
- 10.1109/TPAMI.2014.2361137
- language
- English
- LU publication?
- yes
- id
- 34d64813-cd25-4f42-b293-843ce7399184 (old id 7425226)
- date added to LUP
- 2016-04-01 12:55:31
- date last changed
- 2022-05-07 06:26:57
@article{34d64813-cd25-4f42-b293-843ce7399184,
  abstract = {{Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.}},
  author = {{Carreira, Joao and Caseiro, Rui and Batista, Jorge and Sminchisescu, Cristian}},
  issn = {{1939-3539}},
  keywords = {{Recognition; image descriptors; second-order statistics; segmentation; regression; pooling; differential geometry}},
  language = {{eng}},
  number = {{6}},
  pages = {{1177--1189}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series = {{IEEE Transactions on Pattern Analysis and Machine Intelligence}},
  title = {{Free-Form Region Description with Second-Order Pooling}},
  url = {{http://dx.doi.org/10.1109/TPAMI.2014.2361137}},
  doi = {{10.1109/TPAMI.2014.2361137}},
  volume = {{37}},
  year = {{2015}},
}