Harmonising and linking biomedical and clinical data across disparate data archives to enable integrative cross-biobank research.

Spjuth, Ola; Krestyaninova, Maria; Hastings, Janna; Shen, Huei-Yi; Heikkinen, Jani; Waldenberger, Melanie; Langhammer, Arnulf; Ladenvall, Claes; Esko, Tõnu; Persson, Mats-Åke; Heggland, Jon; Dietrich, Joern; Ose, Sandra; Gieger, Christian; Ried, Janina S; Peters, Annette; Fortier, Isabel; de Geus, Eco Jc; Klovins, Janis; Zaharenko, Linda; Willemsen, Gonneke; Hottenga, Jouke-Jan; Litton, Jan-Eric; Karvanen, Juha; Boomsma, Dorret I; Groop, Leif; Rung, Johan; Palmgren, Juni; Pedersen, Nancy L; McCarthy, Mark I; van Duijn, Cornelia M; Hveem, Kristian; Metspalu, Andres; Ripatti, Samuli; Prokopenko, Inga; Harris, Jennifer R

Spjuth, Ola, et al. (2015). In European Journal of Human Genetics.

Abstract: A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: first, created harmonised variables and then annotated and made searchable information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values and show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase.

European Journal of Human Genetics advance online publication, 26 August 2015; doi:10.1038/ejhg.2015.165.
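The two steps the abstract describes (harmonise variable names, then publish only per-category availability counts) can be illustrated with a minimal sketch. This is not the SAIL implementation: the biobank names, variable names, mapping table and the `availability` and `search` helpers are all hypothetical, chosen only to show why exchanging counts instead of measured values avoids handling real values.

```python
from collections import Counter

# Hypothetical mapping from each biobank's local variable names to a shared
# harmonised vocabulary (every name here is invented for illustration).
HARMONISED = {
    "biobank_a": {"bmi_kg_m2": "body_mass_index", "t2d_dx": "type_2_diabetes"},
    "biobank_b": {"BMI": "body_mass_index", "diabetes2": "type_2_diabetes"},
}


def availability(biobank, records):
    """Count samples with a non-missing value per harmonised variable.

    Only these counts would leave the biobank; the values themselves do not.
    """
    mapping = HARMONISED[biobank]
    counts = Counter()
    for record in records:
        for local_name, value in record.items():
            if value is not None and local_name in mapping:
                counts[mapping[local_name]] += 1
    return dict(counts)


def search(index, variable, minimum=1):
    """Return biobanks reporting at least `minimum` samples for a variable."""
    return sorted(
        name for name, counts in index.items()
        if counts.get(variable, 0) >= minimum
    )


# Toy records standing in for locally held sample descriptions.
index = {
    "biobank_a": availability("biobank_a", [
        {"bmi_kg_m2": 27.1, "t2d_dx": True},
        {"bmi_kg_m2": None, "t2d_dx": False},
    ]),
    "biobank_b": availability("biobank_b", [
        {"BMI": 31.4, "diabetes2": None},
    ]),
}
```

With this toy index, a planner could call `search(index, "body_mass_index")` to see which biobanks hold relevant samples before requesting access to any individual-level data.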

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/7834515

author

Spjuth, Ola; Krestyaninova, Maria; Hastings, Janna; Shen, Huei-Yi; Heikkinen, Jani; Waldenberger, Melanie; Langhammer, Arnulf; Ladenvall, Claes ^LU; Esko, Tõnu; Persson, Mats-Åke ^LU; Heggland, Jon; Dietrich, Joern; Ose, Sandra; Gieger, Christian; Ried, Janina S; Peters, Annette; Fortier, Isabel; de Geus, Eco Jc; Klovins, Janis; Zaharenko, Linda; Willemsen, Gonneke; Hottenga, Jouke-Jan; Litton, Jan-Eric; Karvanen, Juha; Boomsma, Dorret I; Groop, Leif ^LU; Rung, Johan; Palmgren, Juni; Pedersen, Nancy L; McCarthy, Mark I; van Duijn, Cornelia M; Hveem, Kristian; Metspalu, Andres; Ripatti, Samuli; Prokopenko, Inga and Harris, Jennifer R

publishing date

2015-08-26

type

Contribution to journal

publication status

published

subject

Endocrinology and Diabetes

in

European Journal of Human Genetics

publisher

Nature Publishing Group

external identifiers

pmid:26306643
scopus:84960377720
wos:000374124800007

ISSN

1476-5438

DOI

10.1038/ejhg.2015.165

language

English

LU publication?

yes

id

bf19a7f3-b72e-4710-96fa-5845af92b7cb (old id 7834515)

alternative location

http://www.ncbi.nlm.nih.gov/pubmed/26306643?dopt=Abstract

date added to LUP

2016-04-01 10:53:44

date last changed

2025-10-14 08:53:53

@article{bf19a7f3-b72e-4710-96fa-5845af92b7cb,
  abstract     = {{A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: first, created harmonised variables and then annotated and made searchable information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values and show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase. European Journal of Human Genetics advance online publication, 26 August 2015; doi:10.1038/ejhg.2015.165.}},
  author       = {{Spjuth, Ola and Krestyaninova, Maria and Hastings, Janna and Shen, Huei-Yi and Heikkinen, Jani and Waldenberger, Melanie and Langhammer, Arnulf and Ladenvall, Claes and Esko, Tõnu and Persson, Mats-Åke and Heggland, Jon and Dietrich, Joern and Ose, Sandra and Gieger, Christian and Ried, Janina S and Peters, Annette and Fortier, Isabel and de Geus, Eco Jc and Klovins, Janis and Zaharenko, Linda and Willemsen, Gonneke and Hottenga, Jouke-Jan and Litton, Jan-Eric and Karvanen, Juha and Boomsma, Dorret I and Groop, Leif and Rung, Johan and Palmgren, Juni and Pedersen, Nancy L and McCarthy, Mark I and van Duijn, Cornelia M and Hveem, Kristian and Metspalu, Andres and Ripatti, Samuli and Prokopenko, Inga and Harris, Jennifer R}},
  issn         = {{1476-5438}},
  language     = {{eng}},
  month        = {{08}},
  publisher    = {{Nature Publishing Group}},
  journal      = {{European Journal of Human Genetics}},
  title        = {{Harmonising and linking biomedical and clinical data across disparate data archives to enable integrative cross-biobank research.}},
  url          = {{https://lup.lub.lu.se/search/files/2214668/8618291}},
  doi          = {{10.1038/ejhg.2015.165}},
  year         = {{2015}},
}

Lund University Publications