Probabilistic Joint Image Segmentation and Labeling by Figure-Ground Composition
(2014) In International Journal of Computer Vision 107(1). p.40-57
- Abstract
- We propose a layered statistical model for image segmentation and labeling obtained by combining independently extracted, possibly overlapping sets of figure-ground (FG) segmentations. The process of constructing consistent image segmentations, called tilings, is cast as optimization over sets of maximal cliques sampled from a graph connecting all non-overlapping figure-ground segment hypotheses. Potential functions over cliques combine unary, Gestalt-based figure qualities, and pairwise compatibilities among spatially neighboring segments, constrained by T-junctions and the boundary interface statistics of real scenes. Building on the segmentation layer, we further derive a joint image segmentation and labeling model (JSL) which, given a bag of FGs, constructs a joint probability distribution over both the compatible image interpretations (tilings) composed from those segments, and over their labeling into categories. The process of drawing samples from the joint distribution can be interpreted as first sampling tilings, followed by sampling labelings conditioned on the choice of a particular tiling. We learn the segmentation and labeling parameters jointly, based on maximum likelihood with a novel estimation procedure we refer to as incremental saddle-point approximation. The partition function over tilings and labelings is increasingly more accurately approximated by including incorrect configurations that are rated as probable by candidate models during learning. State of the art results are reported on the Berkeley, Stanford and Pascal VOC datasets, where an improvement of 28 % was achieved for the segmentation task only (tiling), and an accuracy of 47.8 % was obtained on the test set of VOC12 for semantic labeling (JSL).
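The clique formulation in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes segments are given as sets of pixel indices, enumerates maximal cliques exhaustively with a basic Bron-Kerbosch recursion instead of sampling them, and omits the learned unary and pairwise potentials. The names `compatibility_graph`, `maximal_cliques` and the example segments are hypothetical.

```python
from itertools import combinations


def compatibility_graph(segments):
    """Connect every pair of segments that share no pixel (assumed overlap test)."""
    adj = {i: set() for i in range(len(segments))}
    for i, j in combinations(range(len(segments)), 2):
        if segments[i].isdisjoint(segments[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that extend it, x: already explored
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy "image" with pixels 0..3 and five hypothetical figure-ground segments.
segments = [
    frozenset({0, 1}),  # 0
    frozenset({2, 3}),  # 1
    frozenset({1, 2}),  # 2: overlaps segments 0 and 1
    frozenset({3}),     # 3: overlaps segment 1
    frozenset({0}),     # 4: overlaps segment 0
]

# Each maximal clique is a set of mutually non-overlapping segments,
# i.e. a candidate tiling in the sense used by the abstract.
tilings = maximal_cliques(compatibility_graph(segments))
```

In this sketch a maximal clique need not cover every pixel; scoring and selecting among the candidates is where the potentials described in the abstract would enter.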
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/4411072
- author
- Ion, Adrian ; Carreira, Joao and Sminchisescu, Cristian LU
- organization
- publishing date
- 2014
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Image segmentation, Image labeling, Semantic segmentation, Statistical models, Learning and categorization
- in
- International Journal of Computer Vision
- volume
- 107
- issue
- 1
- pages
- 40 - 57
- publisher
- Springer
- external identifiers
- wos:000331640500003
- scopus:84894475814
- ISSN
- 1573-1405
- DOI
- 10.1007/s11263-013-0663-7
- language
- English
- LU publication?
- yes
- id
- 83f80c8e-f6d7-49e5-ad21-28165bf589dc (old id 4411072)
- date added to LUP
- 2016-04-01 10:20:32
- date last changed
- 2022-01-25 22:18:39
@article{83f80c8e-f6d7-49e5-ad21-28165bf589dc,
  abstract = {{We propose a layered statistical model for image segmentation and labeling obtained by combining independently extracted, possibly overlapping sets of figure-ground (FG) segmentations. The process of constructing consistent image segmentations, called tilings, is cast as optimization over sets of maximal cliques sampled from a graph connecting all non-overlapping figure-ground segment hypotheses. Potential functions over cliques combine unary, Gestalt-based figure qualities, and pairwise compatibilities among spatially neighboring segments, constrained by T-junctions and the boundary interface statistics of real scenes. Building on the segmentation layer, we further derive a joint image segmentation and labeling model (JSL) which, given a bag of FGs, constructs a joint probability distribution over both the compatible image interpretations (tilings) composed from those segments, and over their labeling into categories. The process of drawing samples from the joint distribution can be interpreted as first sampling tilings, followed by sampling labelings conditioned on the choice of a particular tiling. We learn the segmentation and labeling parameters jointly, based on maximum likelihood with a novel estimation procedure we refer to as incremental saddle-point approximation. The partition function over tilings and labelings is increasingly more accurately approximated by including incorrect configurations that are rated as probable by candidate models during learning. State of the art results are reported on the Berkeley, Stanford and Pascal VOC datasets, where an improvement of 28 % was achieved for the segmentation task only (tiling), and an accuracy of 47.8 % was obtained on the test set of VOC12 for semantic labeling (JSL).}},
  author = {{Ion, Adrian and Carreira, Joao and Sminchisescu, Cristian}},
  issn = {{1573-1405}},
  keywords = {{Image segmentation; Image labeling; Semantic segmentation; Statistical models; Learning and categorization}},
  language = {{eng}},
  number = {{1}},
  pages = {{40--57}},
  publisher = {{Springer}},
  journal = {{International Journal of Computer Vision}},
  title = {{Probabilistic Joint Image Segmentation and Labeling by Figure-Ground Composition}},
  url = {{http://dx.doi.org/10.1007/s11263-013-0663-7}},
  doi = {{10.1007/s11263-013-0663-7}},
  volume = {{107}},
  year = {{2014}},
}