Lund University Publications

Probabilistic Joint Image Segmentation and Labeling by Figure-Ground Composition

Ion, Adrian ; Carreira, Joao and Sminchisescu, Cristian LU (2014) In International Journal of Computer Vision 107(1). p.40-57
Abstract
We propose a layered statistical model for image segmentation and labeling obtained by combining independently extracted, possibly overlapping sets of figure-ground (FG) segmentations. The process of constructing consistent image segmentations, called tilings, is cast as optimization over sets of maximal cliques sampled from a graph connecting all non-overlapping figure-ground segment hypotheses. Potential functions over cliques combine unary, Gestalt-based figure qualities, and pairwise compatibilities among spatially neighboring segments, constrained by T-junctions and the boundary interface statistics of real scenes. Building on the segmentation layer, we further derive a joint image segmentation and labeling model (JSL) which, given a bag of FGs, constructs a joint probability distribution over both the compatible image interpretations (tilings) composed from those segments, and over their labeling into categories. The process of drawing samples from the joint distribution can be interpreted as first sampling tilings, followed by sampling labelings conditioned on the choice of a particular tiling. We learn the segmentation and labeling parameters jointly, based on maximum likelihood with a novel estimation procedure we refer to as incremental saddle-point approximation. The partition function over tilings and labelings is increasingly more accurately approximated by including incorrect configurations that are rated as probable by candidate models during learning. State of the art results are reported on the Berkeley, Stanford and Pascal VOC datasets, where an improvement of 28 % was achieved for the segmentation task only (tiling), and an accuracy of 47.8 % was obtained on the test set of VOC12 for semantic labeling (JSL).
author
Ion, Adrian ; Carreira, Joao and Sminchisescu, Cristian LU
organization
publishing date
2014
type
Contribution to journal
publication status
published
subject
keywords
Image segmentation, Image labeling, Semantic segmentation, Statistical models, Learning and categorization
in
International Journal of Computer Vision
volume
107
issue
1
pages
40 - 57
publisher
Springer
external identifiers
  • wos:000331640500003
  • scopus:84894475814
ISSN
1573-1405
DOI
10.1007/s11263-013-0663-7
language
English
LU publication?
yes
id
83f80c8e-f6d7-49e5-ad21-28165bf589dc (old id 4411072)
date added to LUP
2016-04-01 10:20:32
date last changed
2022-01-25 22:18:39
@article{83f80c8e-f6d7-49e5-ad21-28165bf589dc,
  abstract     = {{We propose a layered statistical model for image segmentation and labeling obtained by combining independently extracted, possibly overlapping sets of figure-ground (FG) segmentations. The process of constructing consistent image segmentations, called tilings, is cast as optimization over sets of maximal cliques sampled from a graph connecting all non-overlapping figure-ground segment hypotheses. Potential functions over cliques combine unary, Gestalt-based figure qualities, and pairwise compatibilities among spatially neighboring segments, constrained by T-junctions and the boundary interface statistics of real scenes. Building on the segmentation layer, we further derive a joint image segmentation and labeling model (JSL) which, given a bag of FGs, constructs a joint probability distribution over both the compatible image interpretations (tilings) composed from those segments, and over their labeling into categories. The process of drawing samples from the joint distribution can be interpreted as first sampling tilings, followed by sampling labelings conditioned on the choice of a particular tiling. We learn the segmentation and labeling parameters jointly, based on maximum likelihood with a novel estimation procedure we refer to as incremental saddle-point approximation. The partition function over tilings and labelings is increasingly more accurately approximated by including incorrect configurations that are rated as probable by candidate models during learning. State of the art results are reported on the Berkeley, Stanford and Pascal VOC datasets, where an improvement of 28 % was achieved for the segmentation task only (tiling), and an accuracy of 47.8 % was obtained on the test set of VOC12 for semantic labeling (JSL).}},
  author       = {{Ion, Adrian and Carreira, Joao and Sminchisescu, Cristian}},
  issn         = {{1573-1405}},
  keywords     = {{Image segmentation; Image labeling; Semantic segmentation; Statistical models; Learning and categorization}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{40--57}},
  publisher    = {{Springer}},
  journal      = {{International Journal of Computer Vision}},
  title        = {{Probabilistic Joint Image Segmentation and Labeling by Figure-Ground Composition}},
  url          = {{http://dx.doi.org/10.1007/s11263-013-0663-7}},
  doi          = {{10.1007/s11263-013-0663-7}},
  volume       = {{107}},
  year         = {{2014}},
}