Lund University Publications

Bayesian network imputation methods applied to multi-omics data identify putative causal relationships in a type 2 diabetes dataset containing incomplete data : An IMI DIRECT Study

Howey, Richard ; Adam, Jonathan ; Adamski, Jerzy ; Atabaki, Natalie N LU ; Brunak, Søren ; Chmura, Piotr Jaroslaw ; De Masi, Federico ; Dermitzakis, Emmanouil T ; Fernandez-Tajes, Juan J LU and Forgie, Ian M, et al. (2025) In PLoS Genetics 21(7).
Abstract

Here we report the results from exploratory analysis using a Bayesian network approach of data originally derived from a large North European study of type 2 diabetes (T2D) conducted by the IMI DIRECT consortium. 3029 individuals (795 with T2D and 2234 without) within 7 different study centres provided data comprising genotypes, proteins, metabolites, gene expression measurements and many different clinical variables. The main aim of the current study was to demonstrate the utility of our previously developed method to fit Bayesian networks by performing exploratory analysis of this dataset to identify possible causal relationships between these variables. The data was analysed using the BayesNetty software package, which can handle mixed discrete/continuous data with missing values. The original dataset consisted of over 16,000 variables, which were filtered down to 260 variables for analysis. Even with this reduction, no individual had complete data for all variables, making it impossible to analyse using standard Bayesian network methodology. However, using the recently proposed novel imputation method implemented in BayesNetty we computed a large average Bayesian network from which we could infer possible associations and causal relationships between variables of interest. Our results confirmed many previous findings in connection with T2D, including possible mediating proteins and genes, some of which have not been widely reported. We also confirmed potential causal relationships with liver fat that were identified in an earlier study that used the IMI DIRECT dataset but was limited to a smaller subset of individuals and variables (namely individuals with complete data at pre-defined variables of interest). In addition to providing valuable confirmation, our analyses thus demonstrate a proof-of-principle of the utility of the method implemented within BayesNetty. 
The full final average Bayesian network generated from our analysis is freely available and can be easily interrogated further to address specific focussed scientific questions of interest.

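author
Howey, Richard ; Adam, Jonathan ; Adamski, Jerzy ; Atabaki, Natalie N ; Brunak, Søren ; Chmura, Piotr Jaroslaw ; De Masi, Federico ; Dermitzakis, Emmanouil T ; Fernandez-Tajes, Juan J ; Forgie, Ian M ; Franks, Paul W ; Giordano, Giuseppe N ; Haid, Mark ; Hansen, Torben ; Hansen, Tue H ; Harms, Peter P ; Hattersley, Andrew T ; Hong, Mun-Gwan ; Jacobsen, Ulrik Plesner ; Jones, Angus G ; Koivula, Robert W ; Kokkola, Tarja ; Mahajan, Anubha ; Mari, Andrea ; McCarthy, Mark I ; McDonald, Timothy J ; Musholt, Petra B ; Pavo, Imre ; Pearson, Ewan R ; Pedersen, Oluf ; Ruetten, Hartmut ; Rutters, Femke ; Schwenk, Jochen M ; Sharma, Sapna ; 't Hart, Leen M ; Vestergaard, Henrik ; Walker, Mark ; Viñuela, Ana and Cordell, Heather J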
type
Contribution to journal
publication status
published
subject
keywords
Diabetes Mellitus, Type 2/genetics, Humans, Bayes Theorem, Male, Female, Polymorphism, Single Nucleotide, Software, Genome-Wide Association Study, Genotype, Middle Aged, Multiomics
in
PLoS Genetics
volume
21
issue
7
article number
e1011776
publisher
Public Library of Science (PLoS)
external identifiers
  • pmid:40663565
ISSN
1553-7404
DOI
10.1371/journal.pgen.1011776
language
English
LU publication?
yes
additional info
Copyright: © 2025 Howey et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
id
d088cc7c-8887-4376-afcf-85085271d5ba
date added to LUP
2025-08-14 13:00:42
date last changed
2025-08-14 14:05:51
@article{d088cc7c-8887-4376-afcf-85085271d5ba,
  abstract     = {{Here we report the results from exploratory analysis using a Bayesian network approach of data originally derived from a large North European study of type 2 diabetes (T2D) conducted by the IMI DIRECT consortium. 3029 individuals (795 with T2D and 2234 without) within 7 different study centres provided data comprising genotypes, proteins, metabolites, gene expression measurements and many different clinical variables. The main aim of the current study was to demonstrate the utility of our previously developed method to fit Bayesian networks by performing exploratory analysis of this dataset to identify possible causal relationships between these variables. The data was analysed using the BayesNetty software package, which can handle mixed discrete/continuous data with missing values. The original dataset consisted of over 16,000 variables, which were filtered down to 260 variables for analysis. Even with this reduction, no individual had complete data for all variables, making it impossible to analyse using standard Bayesian network methodology. However, using the recently proposed novel imputation method implemented in BayesNetty we computed a large average Bayesian network from which we could infer possible associations and causal relationships between variables of interest. Our results confirmed many previous findings in connection with T2D, including possible mediating proteins and genes, some of which have not been widely reported. We also confirmed potential causal relationships with liver fat that were identified in an earlier study that used the IMI DIRECT dataset but was limited to a smaller subset of individuals and variables (namely individuals with complete data at pre-defined variables of interest). In addition to providing valuable confirmation, our analyses thus demonstrate a proof-of-principle of the utility of the method implemented within BayesNetty. The full final average Bayesian network generated from our analysis is freely available and can be easily interrogated further to address specific focussed scientific questions of interest.}},
  author       = {{Howey, Richard and Adam, Jonathan and Adamski, Jerzy and Atabaki, Natalie N and Brunak, Søren and Chmura, Piotr Jaroslaw and De Masi, Federico and Dermitzakis, Emmanouil T and Fernandez-Tajes, Juan J and Forgie, Ian M and Franks, Paul W and Giordano, Giuseppe N and Haid, Mark and Hansen, Torben and Hansen, Tue H and Harms, Peter P and Hattersley, Andrew T and Hong, Mun-Gwan and Jacobsen, Ulrik Plesner and Jones, Angus G and Koivula, Robert W and Kokkola, Tarja and Mahajan, Anubha and Mari, Andrea and McCarthy, Mark I and McDonald, Timothy J and Musholt, Petra B and Pavo, Imre and Pearson, Ewan R and Pedersen, Oluf and Ruetten, Hartmut and Rutters, Femke and Schwenk, Jochen M and Sharma, Sapna and 't Hart, Leen M and Vestergaard, Henrik and Walker, Mark and Viñuela, Ana and Cordell, Heather J}},
  issn         = {{1553-7404}},
  keywords     = {{Diabetes Mellitus, Type 2/genetics; Humans; Bayes Theorem; Male; Female; Polymorphism, Single Nucleotide; Software; Genome-Wide Association Study; Genotype; Middle Aged; Multiomics}},
  language     = {{eng}},
  number       = {{7}},
  publisher    = {{Public Library of Science (PLoS)}},
  journal      = {{PLoS Genetics}},
  title        = {{Bayesian network imputation methods applied to multi-omics data identify putative causal relationships in a type 2 diabetes dataset containing incomplete data : An IMI DIRECT Study}},
  url          = {{http://dx.doi.org/10.1371/journal.pgen.1011776}},
  doi          = {{10.1371/journal.pgen.1011776}},
  volume       = {{21}},
  year         = {{2025}},
}