The Viral AlphaFold Database of monomers and homodimers reveals conserved protein folds in viruses of bacteria, archaea, and eukaryotes
(2025) In Science Advances 11(40). p.1-14
- Abstract
Viruses are the most abundant and genetically diverse entities on Earth, yet the functions and evolution of most viral proteins remain poorly understood. Their rapid evolution often obscures evolutionary relationships, limiting the ability to assign functions using sequence-based methods. Although the conservation of protein fold can reveal deep homologies, viral proteins remain underrepresented in structural databases. We address this by clustering viral sequences from RefSeq and predicting the structures of ~27,000 representative proteins using AlphaFold2 to create the Viral AlphaFold Database (VAD). We uncover conserved folds in diverse viruses infecting bacteria, archaea, and eukaryotes. We predict homodimers and make comparisons to the Protein Data Bank, providing data on oligomerization potential. We reveal considerable functional darkness in the viral protein universe and report the discovery and validation of an uncharacterized toxin-antitoxin system. The VAD provides a foundation for exploring viral structure-function relationships, including ancient folds shaping viral interactions across all life.
- author
- Odai, Roni LU ; Leemann, Michèle ; Al-Murad, Tamim ; Abdullah, Minhal LU ; Shyrokova, Lena LU ; Tenson, Tanel ; Hauryliuk, Vasili LU ; Durairaj, Janani ; Pereira, Joana LU and Atkinson, Gemma C LU
- publishing date
- 2025-10-03
- type
- Contribution to journal
- publication status
- published
- keywords
- Archaea/virology, Bacteria/virology, Protein Folding, Eukaryota/virology, Viruses/chemistry, Databases, Protein, Viral Proteins/chemistry, Protein Multimerization, Models, Molecular
- in
- Science Advances
- volume
- 11
- issue
- 40
- article number
- eadz8560
- pages
- 1 - 14
- publisher
- American Association for the Advancement of Science (AAAS)
- external identifiers
- pmid:41032608
- ISSN
- 2375-2548
- DOI
- 10.1126/sciadv.adz8560
- project
- Experimental exploration of bacterial toxin-antitoxin systems
- Decoding bacterial toxin-antitoxin systems: from high-throughput discovery to molecular mechanism and biotechnology
- Discovering the interplay between antibiotic and phage resistance in bacteria
- Interrogating bacteriophages to discover the universal secrets of host-virus and virus-virus warfare
- Discovering and exploiting the molecular basis of microbial warfare
- language
- English
- LU publication?
- yes
- id
- bae93652-3b69-4a6b-aa1e-064aa945cbad
- date added to LUP
- 2025-10-07 21:27:38
- date last changed
- 2025-10-10 07:15:03
@article{bae93652-3b69-4a6b-aa1e-064aa945cbad,
  abstract  = {{Viruses are the most abundant and genetically diverse entities on Earth, yet the functions and evolution of most viral proteins remain poorly understood. Their rapid evolution often obscures evolutionary relationships, limiting the ability to assign functions using sequence-based methods. Although the conservation of protein fold can reveal deep homologies, viral proteins remain underrepresented in structural databases. We address this by clustering viral sequences from RefSeq and predicting the structures of ~27,000 representative proteins using AlphaFold2 to create the Viral AlphaFold Database (VAD). We uncover conserved folds in diverse viruses infecting bacteria, archaea, and eukaryotes. We predict homodimers and make comparisons to the Protein Data Bank, providing data on oligomerization potential. We reveal considerable functional darkness in the viral protein universe and report the discovery and validation of an uncharacterized toxin-antitoxin system. The VAD provides a foundation for exploring viral structure-function relationships, including ancient folds shaping viral interactions across all life.}},
  author    = {Odai, Roni and Leemann, Michèle and Al-Murad, Tamim and Abdullah, Minhal and Shyrokova, Lena and Tenson, Tanel and Hauryliuk, Vasili and Durairaj, Janani and Pereira, Joana and Atkinson, Gemma C},
  issn      = {{2375-2548}},
  keywords  = {{Archaea/virology; Bacteria/virology; Protein Folding; Eukaryota/virology; Viruses/chemistry; Databases, Protein; Viral Proteins/chemistry; Protein Multimerization; Models, Molecular}},
  language  = {{eng}},
  month     = {{10}},
  number    = {{40}},
  pages     = {{1--14}},
  publisher = {{American Association for the Advancement of Science (AAAS)}},
  journal   = {{Science Advances}},
  title     = {{The Viral AlphaFold Database of monomers and homodimers reveals conserved protein folds in viruses of bacteria, archaea, and eukaryotes}},
  url       = {{http://dx.doi.org/10.1126/sciadv.adz8560}},
  doi       = {{10.1126/sciadv.adz8560}},
  volume    = {{11}},
  year      = {{2025}},
}