Improving the Detection of Relations Between Objects in an Image Using Textual Semantics

Medved, Dennis; Jiang, Fangyuan; Exner, Peter; Oskarsson, Magnus; Nugues, Pierre; Åström, Karl

(2015) 3rd International Conference on Pattern Recognition Applications and Methods (ICPRAM 2014) 9443. p. 133-145

Abstract: In this article, we describe a system that classifies relations between entities extracted from an image. We started from the idea that we could utilize lexical and semantic information from text associated with the image, such as captions or surrounding text, rather than just the geometric and visual characteristics of the entities found in the image. We collected a corpus of images from Wikipedia together with their corresponding articles. In our experimental setup, we extracted two kinds of entities from the images, human beings and horses, and we defined three relations that could exist between them: Ride, Lead, or None. We used geometric features as a baseline to identify the relations between the entities and we describe the improvements brought by the addition of bag-of-words features and predicate–argument structures that we extracted from the text. The best semantic model resulted in a relative error reduction of more than 18% over the baseline.

Please use this URL to cite or link to this publication: https://lup.lub.lu.se/record/8260062

author

Medved, Dennis (LU); Jiang, Fangyuan (LU); Exner, Peter (LU); Oskarsson, Magnus (LU); Nugues, Pierre (LU) and Åström, Karl (LU)

organization

publishing date

2015

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer and Information Sciences

keywords

Semantic parsing, Relation extraction from images, Machine learning

host publication

Pattern Recognition Applications and Methods / Lecture Notes in Computer Science

editor

Fred, Ana ; De Marsico, Maria and Tabbone, Antoine

volume

9443

pages

13 pages

publisher

Springer

conference name

3rd International Conference on Pattern Recognition Applications and Methods (ICPRAM 2014)

conference location

Angers, France

conference dates

2014-03-06 - 2014-03-08

external identifiers

scopus:84951860819
wos:000374104100009

ISBN

978-3-319-25529-3

978-3-319-25530-9

DOI

10.1007/978-3-319-25530-9_9

language

English

LU publication?

yes

id

dfd30702-58ac-4a52-a127-e62bd8093251 (old id 8260062)

date added to LUP

2016-04-04 11:00:11

date last changed

2025-10-14 09:33:35

@inproceedings{dfd30702-58ac-4a52-a127-e62bd8093251,
  abstract     = {{In this article, we describe a system that classifies relations between entities extracted from an image. We started from the idea that we could utilize lexical and semantic information from text associated with the image, such as captions or surrounding text, rather than just the geometric and visual characteristics of the entities found in the image. We collected a corpus of images from Wikipedia together with their corresponding articles. In our experimental setup, we extracted two kinds of entities from the images, human beings and horses, and we defined three relations that could exist between them: Ride, Lead, or None. We used geometric features as a baseline to identify the relations between the entities and we describe the improvements brought by the addition of bag-of-words features and predicate–argument structures that we extracted from the text. The best semantic model resulted in a relative error reduction of more than 18% over the baseline.}},
  author       = {{Medved, Dennis and Jiang, Fangyuan and Exner, Peter and Oskarsson, Magnus and Nugues, Pierre and Åström, Karl}},
  booktitle    = {{Pattern Recognition Applications and Methods}},
  series       = {{Lecture Notes in Computer Science}},
  editor       = {{Fred, Ana and De Marsico, Maria and Tabbone, Antoine}},
  isbn         = {{978-3-319-25529-3}},
  keywords     = {{Semantic parsing; Relation extraction from images; Machine learning}},
  language     = {{eng}},
  pages        = {{133--145}},
  publisher    = {{Springer}},
  title        = {{Improving the Detection of Relations Between Objects in an Image Using Textual Semantics}},
  url          = {{http://dx.doi.org/10.1007/978-3-319-25530-9_9}},
  doi          = {{10.1007/978-3-319-25530-9_9}},
  volume       = {{9443}},
  year         = {{2015}},
}
