VarioML framework for comprehensive variation data representation and exchange

Byrne, Myles; Fokkema, Ivo F. A. C.; Lancaster, Owen; Adamusiak, Tomasz; Ahonen-Bishopp, Anni; Atlan, David; Beroud, Christophe; Cornell, Michael; Dalgleish, Raymond; Devereau, Andrew; Patrinos, George P.; Swertz, Morris A.; Taschner, Peter E. M.; Thorisson, Gudmundur A.; Vihinen, Mauno; Brookes, Anthony J.; Muilu, Juha

Byrne, Myles; Fokkema, Ivo F. A. C.; Lancaster, Owen; Adamusiak, Tomasz; Ahonen-Bishopp, Anni; Atlan, David; Beroud, Christophe; Cornell, Michael; Dalgleish, Raymond and Devereau, Andrew, et al. (2012) In BMC Bioinformatics 13(254).

Abstract:

Background: Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement.

Results: The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs), e.g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easily integrated into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components.

Conclusions: VarioML is a set of tools and practices improving the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information with ease, clarity, and without ambiguity.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/3373343

author

Byrne, Myles; Fokkema, Ivo F. A. C.; Lancaster, Owen; Adamusiak, Tomasz; Ahonen-Bishopp, Anni; Atlan, David; Beroud, Christophe; Cornell, Michael; Dalgleish, Raymond; Devereau, Andrew; Patrinos, George P.; Swertz, Morris A.; Taschner, Peter E. M.; Thorisson, Gudmundur A.; Vihinen, Mauno; Brookes, Anthony J. and Muilu, Juha

organization

Protein Bioinformatics (research group)

publishing date

2012

type

Contribution to journal

publication status

published

subject

Bioinformatics and Computational Biology

keywords

LSDB, Variation database curation, Data collection, Distribution

in

BMC Bioinformatics

volume

13

issue

254

publisher

BioMed Central (BMC)

external identifiers

wos:000311732700001
scopus:84866844093
pmid:23031277

ISSN

1471-2105

DOI

10.1186/1471-2105-13-254

language

English

LU publication?

yes

id

111b1569-8b28-4035-9c1f-54ed5215054b (old id 3373343)

date added to LUP

2016-04-01 14:42:35

date last changed

2025-10-14 12:33:57

@article{111b1569-8b28-4035-9c1f-54ed5215054b,
  abstract     = {{Background: Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement. Results: The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants, with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described, without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs) e. g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easily integrated into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components. Conclusions: VarioML is a set of tools and practices improving the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information with ease, clarity, and without ambiguity.}},
  author       = {{Byrne, Myles and Fokkema, Ivo F. A. C. and Lancaster, Owen and Adamusiak, Tomasz and Ahonen-Bishopp, Anni and Atlan, David and Beroud, Christophe and Cornell, Michael and Dalgleish, Raymond and Devereau, Andrew and Patrinos, George P. and Swertz, Morris A. and Taschner, Peter E. M. and Thorisson, Gudmundur A. and Vihinen, Mauno and Brookes, Anthony J. and Muilu, Juha}},
  issn         = {{1471-2105}},
  keywords     = {{LSDB; Variation database curation; Data collection; Distribution}},
  language     = {{eng}},
  number       = {{254}},
  publisher    = {{BioMed Central (BMC)}},
  journal      = {{BMC Bioinformatics}},
  title        = {{VarioML framework for comprehensive variation data representation and exchange}},
  url          = {{https://lup.lub.lu.se/search/files/4122161/3910815.pdf}},
  doi          = {{10.1186/1471-2105-13-254}},
  volume       = {{13}},
  year         = {{2012}},
}
