Max-margin learning of deep structured models for semantic segmentation

Larsson, Måns; Alvén, Jennifer; Kahl, Fredrik

Larsson, Måns; Alvén, Jennifer and Kahl, Fredrik (2017) 20th Scandinavian Conference on Image Analysis, SCIA 2017. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10270 LNCS, p. 28-40

Abstract: During the last few years most work done on the task of image segmentation has been focused on deep learning and Convolutional Neural Networks (CNNs) in particular. CNNs are powerful for modeling complex connections between input and output data but lack the ability to directly model dependent output structures, for instance, enforcing properties such as smoothness and coherence. This drawback motivates the use of Conditional Random Fields (CRFs), widely applied as a post-processing step in semantic segmentation. In this paper, we propose a learning framework that jointly trains the parameters of a CNN paired with a CRF. For this, we develop theoretical tools making it possible to optimize a max-margin objective with back-propagation. The max-margin loss function gives the model good generalization capabilities. Thus, the method is especially suitable for applications where labelled data is limited, for example, medical applications. This generalization capability is reflected in our results where we are able to show good performance on two relatively small medical datasets. The method is also evaluated on a public benchmark (frequently used for semantic segmentation) yielding results competitive to state-of-the-art. Overall, we demonstrate that end-to-end max-margin training is preferred over piecewise training when combining a CNN with a CRF.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/b59f0d9c-c138-4c55-851a-c17815f958b6

author

Larsson, Måns; Alvén, Jennifer and Kahl, Fredrik

organization

publishing date

2017

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer graphics and computer vision

keywords

Convolutional Neural Networks, Markov random fields, Segmentation

host publication

Image Analysis - 20th Scandinavian Conference, SCIA 2017, Proceedings

series title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

volume

10270 LNCS

pages

13 pages

publisher

Springer

conference name

20th Scandinavian Conference on Image Analysis, SCIA 2017

conference location

Tromsø, Norway

conference dates

2017-06-12 - 2017-06-14

external identifiers

scopus:85020436836

ISSN

16113349

03029743

ISBN

9783319591285

DOI

10.1007/978-3-319-59129-2_3

language

English

LU publication?

yes

id

b59f0d9c-c138-4c55-851a-c17815f958b6

date added to LUP

2017-06-30 09:01:19

date last changed

2025-04-04 15:09:34

@inproceedings{b59f0d9c-c138-4c55-851a-c17815f958b6,
  abstract     = {{During the last few years most work done on the task of image segmentation has been focused on deep learning and Convolutional Neural Networks (CNNs) in particular. CNNs are powerful for modeling complex connections between input and output data but lack the ability to directly model dependent output structures, for instance, enforcing properties such as smoothness and coherence. This drawback motivates the use of Conditional Random Fields (CRFs), widely applied as a post-processing step in semantic segmentation. In this paper, we propose a learning framework that jointly trains the parameters of a CNN paired with a CRF. For this, we develop theoretical tools making it possible to optimize a max-margin objective with back-propagation. The max-margin loss function gives the model good generalization capabilities. Thus, the method is especially suitable for applications where labelled data is limited, for example, medical applications. This generalization capability is reflected in our results where we are able to show good performance on two relatively small medical datasets. The method is also evaluated on a public benchmark (frequently used for semantic segmentation) yielding results competitive to state-of-the-art. Overall, we demonstrate that end-to-end max-margin training is preferred over piecewise training when combining a CNN with a CRF.}},
  author       = {{Larsson, Måns and Alvén, Jennifer and Kahl, Fredrik}},
  booktitle    = {{Image Analysis - 20th Scandinavian Conference, SCIA 2017, Proceedings}},
  isbn         = {{9783319591285}},
  issn         = {{16113349}},
  keywords     = {{Convolutional Neural Networks; Markov random fields; Segmentation}},
  language     = {{eng}},
  pages        = {{28--40}},
  publisher    = {{Springer}},
  series       = {{Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)}},
  title        = {{Max-margin learning of deep structured models for semantic segmentation}},
  url          = {{http://dx.doi.org/10.1007/978-3-319-59129-2_3}},
  doi          = {{10.1007/978-3-319-59129-2_3}},
  volume       = {{10270 LNCS}},
  year         = {{2017}},
}
