Lund University Publications

The Viral AlphaFold Database of monomers and homodimers reveals conserved protein folds in viruses of bacteria, archaea, and eukaryotes

Odai, Roni LU; Leemann, Michèle; Al-Murad, Tamim; Abdullah, Minhal LU; Shyrokova, Lena LU; Tenson, Tanel; Hauryliuk, Vasili LU; Durairaj, Janani; Pereira, Joana LU and Atkinson, Gemma C LU (2025) In Science Advances 11(40). p.1-14
Abstract

Viruses are the most abundant and genetically diverse entities on Earth, yet the functions and evolution of most viral proteins remain poorly understood. Their rapid evolution often obscures evolutionary relationships, limiting the ability to assign functions using sequence-based methods. Although the conservation of protein fold can reveal deep homologies, viral proteins remain underrepresented in structural databases. We address this by clustering viral sequences from RefSeq and predicting the structures of ~27,000 representative proteins using AlphaFold2 to create the Viral AlphaFold Database (VAD). We uncover conserved folds in diverse viruses infecting bacteria, archaea, and eukaryotes. We predict homodimers and make comparisons to the Protein Data Bank, providing data on oligomerization potential. We reveal considerable functional darkness in the viral protein universe and report the discovery and validation of an uncharacterized toxin-antitoxin system. The VAD provides a foundation for exploring viral structure-function relationships, including ancient folds shaping viral interactions across all life.

author
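Odai, Roni; Leemann, Michèle; Al-Murad, Tamim; Abdullah, Minhal; Shyrokova, Lena; Tenson, Tanel; Hauryliuk, Vasili; Durairaj, Janani; Pereira, Joana and Atkinson, Gemma C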
type
Contribution to journal
publication status
published
keywords
Archaea/virology, Bacteria/virology, Protein Folding, Eukaryota/virology, Viruses/chemistry, Databases, Protein, Viral Proteins/chemistry, Protein Multimerization, Models, Molecular
in
Science Advances
volume
11
issue
40
article number
eadz8560
pages
1 - 14
publisher
American Association for the Advancement of Science (AAAS)
external identifiers
  • pmid:41032608
ISSN
2375-2548
DOI
10.1126/sciadv.adz8560
project
Experimental exploration of bacterial toxin-antitoxin systems
Decoding bacterial toxin-antitoxin systems: from high-throughput discovery to molecular mechanism and biotechnology
Discovering the interplay between antibiotic and phage resistance in bacteria
Interrogating bacteriophages to discover the universal secrets of host-virus and virus-virus warfare
Discovering and exploiting the molecular basis of microbial warfare
language
English
LU publication?
yes
id
bae93652-3b69-4a6b-aa1e-064aa945cbad
date added to LUP
2025-10-07 21:27:38
date last changed
2025-10-10 07:15:03
@article{bae93652-3b69-4a6b-aa1e-064aa945cbad,
  abstract     = {{Viruses are the most abundant and genetically diverse entities on Earth, yet the functions and evolution of most viral proteins remain poorly understood. Their rapid evolution often obscures evolutionary relationships, limiting the ability to assign functions using sequence-based methods. Although the conservation of protein fold can reveal deep homologies, viral proteins remain underrepresented in structural databases. We address this by clustering viral sequences from RefSeq and predicting the structures of ~27,000 representative proteins using AlphaFold2 to create the Viral AlphaFold Database (VAD). We uncover conserved folds in diverse viruses infecting bacteria, archaea, and eukaryotes. We predict homodimers and make comparisons to the Protein Data Bank, providing data on oligomerization potential. We reveal considerable functional darkness in the viral protein universe and report the discovery and validation of an uncharacterized toxin-antitoxin system. The VAD provides a foundation for exploring viral structure-function relationships, including ancient folds shaping viral interactions across all life.}},
  author       = {{Odai, Roni and Leemann, Michèle and Al-Murad, Tamim and Abdullah, Minhal and Shyrokova, Lena and Tenson, Tanel and Hauryliuk, Vasili and Durairaj, Janani and Pereira, Joana and Atkinson, Gemma C}},
  issn         = {{2375-2548}},
  keywords     = {{Archaea/virology; Bacteria/virology; Protein Folding; Eukaryota/virology; Viruses/chemistry; Databases, Protein; Viral Proteins/chemistry; Protein Multimerization; Models, Molecular}},
  language     = {{eng}},
  month        = {{10}},
  number       = {{40}},
  pages        = {{1--14}},
  publisher    = {{American Association for the Advancement of Science (AAAS)}},
  journal      = {{Science Advances}},
  title        = {{The Viral AlphaFold Database of monomers and homodimers reveals conserved protein folds in viruses of bacteria, archaea, and eukaryotes}},
  url          = {{http://dx.doi.org/10.1126/sciadv.adz8560}},
  doi          = {{10.1126/sciadv.adz8560}},
  volume       = {{11}},
  year         = {{2025}},
}