Lund University Publications

LUND UNIVERSITY LIBRARIES

Facilitating accessible, rapid, and appropriate processing of ancient metagenomic data with AMDirT

Borry, Maxime; Forsythe, Adrian; Andrades Valtueña, Aida; Hübner, Alexander; Ibrahim, Anan; Quagliariello, Andrea; White, Anna E.; Kocher, Arthur; Vågene, Åshild J. and Bartholdy, Bjørn Peare, et al. (2024) In F1000Research 12.
Abstract

Background: Access to sample-level metadata is important when selecting public metagenomic sequencing datasets for reuse in new biological analyses. The Standards, Precautions, and Advances in Ancient Metagenomics community (SPAAM, https://spaam-community.org) has previously published AncientMetagenomeDir, a collection of curated and standardised sample metadata tables for metagenomic and microbial genome datasets generated from ancient samples. However, while sample-level information is useful for identifying relevant samples for inclusion in new projects, Next Generation Sequencing (NGS) library construction and sequencing metadata are also essential for appropriately reprocessing ancient metagenomic data. Currently, recovering information for downloading and preparing such data is difficult when laboratory and bioinformatic metadata is heterogeneously recorded in prose-based publications.

Methods: Through a series of community-based hackathon events, AncientMetagenomeDir was updated to provide standardised library-level metadata of existing and new ancient metagenomic samples. In tandem, the companion tool 'AMDirT' was developed to facilitate rapid data filtering and downloading of ancient metagenomic data, as well as improving automated metadata curation and validation for AncientMetagenomeDir.

Results: AncientMetagenomeDir was extended to include standardised metadata of over 6000 ancient metagenomic libraries. The companion tool 'AMDirT' provides both graphical- and command-line interface based access to such metadata for users from a wide range of computational backgrounds. We also report on errors with metadata reporting that appear to commonly occur during data upload and provide suggestions on how to improve the quality of data sharing by the community.

Conclusions: Together, both standardised metadata reporting and tooling will help towards easier incorporation and reuse of public ancient metagenomic datasets into future analyses.

type
Contribution to journal
publication status
published
subject
keywords
aDNA, environmental, FAIR data, metadata, metagenomics, microbial, microbiome, palaeogenomics
in
F1000Research
volume
12
article number
926
publisher
F1000 Research Ltd.
external identifiers
  • scopus:85198075493
  • pmid:39262445
ISSN
2046-1402
DOI
10.12688/f1000research.134798.2
language
English
LU publication?
yes
id
c82e8db0-d5e6-48e5-9f4f-6fcb253687c5
date added to LUP
2025-01-16 12:07:01
date last changed
2025-06-05 23:32:34
@article{c82e8db0-d5e6-48e5-9f4f-6fcb253687c5,
  abstract     = {{Background: Access to sample-level metadata is important when selecting public metagenomic sequencing datasets for reuse in new biological analyses. The Standards, Precautions, and Advances in Ancient Metagenomics community (SPAAM, https://spaam-community.org) has previously published AncientMetagenomeDir, a collection of curated and standardised sample metadata tables for metagenomic and microbial genome datasets generated from ancient samples. However, while sample-level information is useful for identifying relevant samples for inclusion in new projects, Next Generation Sequencing (NGS) library construction and sequencing metadata are also essential for appropriately reprocessing ancient metagenomic data. Currently, recovering information for downloading and preparing such data is difficult when laboratory and bioinformatic metadata is heterogeneously recorded in prose-based publications. Methods: Through a series of community-based hackathon events, AncientMetagenomeDir was updated to provide standardised library-level metadata of existing and new ancient metagenomic samples. In tandem, the companion tool 'AMDirT' was developed to facilitate rapid data filtering and downloading of ancient metagenomic data, as well as improving automated metadata curation and validation for AncientMetagenomeDir. Results: AncientMetagenomeDir was extended to include standardised metadata of over 6000 ancient metagenomic libraries. The companion tool 'AMDirT' provides both graphical- and command-line interface based access to such metadata for users from a wide range of computational backgrounds. We also report on errors with metadata reporting that appear to commonly occur during data upload and provide suggestions on how to improve the quality of data sharing by the community.
Conclusions: Together, both standardised metadata reporting and tooling will help towards easier incorporation and reuse of public ancient metagenomic datasets into future analyses.}},
  author       = {{Borry, Maxime and Forsythe, Adrian and Andrades Valtueña, Aida and Hübner, Alexander and Ibrahim, Anan and Quagliariello, Andrea and White, Anna E. and Kocher, Arthur and Vågene, Åshild J. and Bartholdy, Bjørn Peare and Spurīte, Diāna and Ponce-Soto, Gabriel Yaxal and Neumann, Gunnar and Huang, I. Ting and Light, Ian and Velsko, Irina M. and Jackson, Iseult and Frangenberg, Jasmin and Serrano, Javier G. and Fumey, Julien and Özdoğan, Kadir T. and Blevins, Kelly E. and Daly, Kevin G. and Lopopolo, Maria and Moraitou, Markella and Michel, Megan and van Os, Meriam and Bravo-Lopez, Miriam J. and Sarhan, Mohamed S. and Dagtas, Nihan D. and Oskolkov, Nikolay and Smith, Olivia S. and Lebrasseur, Ophélie and Rozwalak, Piotr and Eisenhofer, Raphael and Wasef, Sally and Ramachandran, Shreya L. and Vanghi, Valentina and Warinner, Christina and Fellows Yates, James A.}},
  issn         = {{2046-1402}},
  keywords     = {{aDNA; environmental; FAIR data; metadata; metagenomics; microbial; microbiome; palaeogenomics}},
  language     = {{eng}},
  publisher    = {{F1000 Research Ltd.}},
  series       = {{F1000Research}},
  title        = {{Facilitating accessible, rapid, and appropriate processing of ancient metagenomic data with AMDirT}},
  url          = {{http://dx.doi.org/10.12688/f1000research.134798.2}},
  doi          = {{10.12688/f1000research.134798.2}},
  volume       = {{12}},
  year         = {{2024}},
}