Lund University Publications

Ontology-based integration and querying of heterogeneous rare disease data sources — POLVAS perspective

Palacz, Wojciech; Lichołai, Sabina; Musiał, Jacek; Wawrzycka-Adamczyk, Katarzyna; Ślusarczyk, Grażyna; Strug, Barbara; Yaman, Beyza LU; Tesi, Michelangelo LU; Gisslander, Karl LU; O'Sullivan, Declan, et al. (2025) In Computers in Biology and Medicine 185.
Abstract

The integration of rare disease medical databases belonging to different countries is an important problem, as a large number of observations are required for reliable statistical inference of patient data in order to facilitate clinical research. Such integration of national registry data, which requires harmonization of the heterogeneous data sets into a unified view, is facilitated in the European FAIRVASC project by developing a domain-specific ontology. The FAIRVASC project is dedicated to the rare disease of anti-neutrophil cytoplasmic antibody (ANCA) associated vasculitis (AAV). This paper focuses on the practical issues and challenges encountered during the process of integrating the Polish national database POLVAS into the federated database within the FAIRVASC project. It discusses the use of ontology-based methods for data integration and the importance of ensuring patient privacy and data protection. It addresses three problems: missing information in POLVAS, which can be obtained by aggregating other data available within the database; incompatibility of data types and formats; and the mapping of Polish data names into the common vocabulary. Modifications of the mappings used to ‘uplift’ national data into the Resource Description Framework (RDF) triplestore are also proposed. The described methods allow for integrating the Polish national database into the European network over which federated queries are performed.

author
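Palacz, Wojciech; Lichołai, Sabina; Musiał, Jacek; Wawrzycka-Adamczyk, Katarzyna; Ślusarczyk, Grażyna; Strug, Barbara; Yaman, Beyza; Tesi, Michelangelo; Gisslander, Karl; O'Sullivan, Declan; Vaglio, Augusto; Emmi, Giacomo; Little, Mark A. and Wójcik, Krzysztof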
organization
publishing date
type
Contribution to journal
publication status
published
subject
keywords
Data integration, Federated queries, Ontologies, Rare diseases
in
Computers in Biology and Medicine
volume
185
article number
109452
publisher
Elsevier
external identifiers
  • pmid:39626458
  • scopus:85210732485
ISSN
0010-4825
DOI
10.1016/j.compbiomed.2024.109452
language
English
LU publication?
yes
id
8a341170-7613-43d6-961c-29b2b4f44b13
date added to LUP
2025-02-18 16:22:14
date last changed
2025-07-09 04:18:42
@article{8a341170-7613-43d6-961c-29b2b4f44b13,
  abstract     = {{The integration of rare disease medical databases belonging to different countries is an important problem, as a large number of observations are required for reliable statistical inference of patient data in order to facilitate clinical research. Such integration of national registry data, which requires harmonization of the heterogeneous data sets into a unified view, is facilitated in the European FAIRVASC project by developing a domain-specific ontology. The FAIRVASC project is dedicated to the rare disease of anti-neutrophil cytoplasmic antibody (ANCA) associated vasculitis (AAV). This paper focuses on the practical issues and challenges encountered during the process of integrating the Polish national database POLVAS into the federated database within the FAIRVASC project. It discusses the use of ontology-based methods for data integration and the importance of ensuring patient privacy and data protection. It addresses three problems: missing information in POLVAS, which can be obtained by aggregating other data available within the database; incompatibility of data types and formats; and the mapping of Polish data names into the common vocabulary. Modifications of the mappings used to ‘uplift’ national data into the Resource Description Framework (RDF) triplestore are also proposed. The described methods allow for integrating the Polish national database into the European network over which federated queries are performed.}},
  author       = {{Palacz, Wojciech and Lichołai, Sabina and Musiał, Jacek and Wawrzycka-Adamczyk, Katarzyna and Ślusarczyk, Grażyna and Strug, Barbara and Yaman, Beyza and Tesi, Michelangelo and Gisslander, Karl and O'Sullivan, Declan and Vaglio, Augusto and Emmi, Giacomo and Little, Mark A. and Wójcik, Krzysztof}},
  issn         = {{0010-4825}},
  keywords     = {{Data integration; Federated queries; Ontologies; Rare diseases}},
  language     = {{eng}},
  publisher    = {{Elsevier}},
  journal      = {{Computers in Biology and Medicine}},
  title        = {{Ontology-based integration and querying of heterogeneous rare disease data sources — POLVAS perspective}},
  url          = {{http://dx.doi.org/10.1016/j.compbiomed.2024.109452}},
  doi          = {{10.1016/j.compbiomed.2024.109452}},
  volume       = {{185}},
  year         = {{2025}},
}