PON-Del predictor for sequence retaining protein deletions
(2026) In PLoS Computational Biology 22(2). p.1-18
- Abstract
- Protein deletions are frequent among both disease-causing and tolerated variants. Several mechanisms at the DNA, RNA and protein levels can lead to deletions. Many deletions are misclassified in the literature and databases, especially when the mRNA is degraded by the cellular quality-control mechanism. We developed a novel predictor for sequence retaining protein deletions, i.e., variants that do not alter the sequence downstream of the deletion site. We collected an extensive dataset of verified protein deletions, each described by a comprehensive set of context, content, position, and gene-based features. We evaluated both statistical and deep learning algorithms and selected a gradient boosting–based approach to develop the PON-Del predictor for short, 1–10 amino acid, sequence-retaining deletions. Variants are typically classified into two categories: either pathogenic or benign. However, there is always a third class of variants: variants of uncertain significance (VUSs), which have been ignored by all previous methods. PON-Del is the first deletion interpretation method that includes VUSs. It provides two outputs, binary and three-state prediction with VUSs. The performance of PON-Del was superior to that of previous methods. The tool is freely available at https://structure.bmc.lu.se/pon_del/.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/b35f4850-613e-4a6c-836c-e5ce0899e68d
- author
- Zhang, Haoyang (LU); Kabir, Muhammad (LU) and Vihinen, Mauno (LU)
- publishing date
- 2026
- type
- Contribution to journal
- publication status
- published
- in
- PLoS Computational Biology
- volume
- 22
- issue
- 2
- article number
- e1014020
- pages
- 1 - 18
- publisher
- Public Library of Science (PLoS)
- external identifiers
- pmid:41739875
- ISSN
- 1553-7358
- DOI
- 10.1371/journal.pcbi.1014020
- language
- English
- LU publication?
- yes
- id
- b35f4850-613e-4a6c-836c-e5ce0899e68d
- date added to LUP
- 2026-02-26 13:56:11
- date last changed
- 2026-02-27 03:48:35
@article{b35f4850-613e-4a6c-836c-e5ce0899e68d,
abstract = {{Protein deletions are frequent among both disease-causing and tolerated variants. Several mechanisms at the DNA, RNA and protein levels can lead to deletions. Many deletions are misclassified in the literature and databases, especially when the mRNA is degraded by the cellular quality-control mechanism. We developed a novel predictor for sequence retaining protein deletions, i.e., variants that do not alter the sequence downstream of the deletion site. We collected an extensive dataset of verified protein deletions, each described by a comprehensive set of context, content, position, and gene-based features. We evaluated both statistical and deep learning algorithms and selected a gradient boosting–based approach to develop the PON-Del predictor for short, 1–10 amino acid, sequence-retaining deletions. Variants are typically classified into two categories: either pathogenic or benign. However, there is always a third class of variants: variants of uncertain significance (VUSs), which have been ignored by all previous methods. PON-Del is the first deletion interpretation method that includes VUSs. It provides two outputs, binary and three-state prediction with VUSs. The performance of PON-Del was superior to that of previous methods. The tool is freely available at https://structure.bmc.lu.se/pon_del/.}},
author = {{Zhang, Haoyang and Kabir, Muhammad and Vihinen, Mauno}},
issn = {{1553-7358}},
language = {{eng}},
number = {{2}},
pages = {{1--18}},
publisher = {{Public Library of Science (PLoS)}},
journal = {{PLoS Computational Biology}},
title = {{PON-Del predictor for sequence retaining protein deletions}},
url = {{http://dx.doi.org/10.1371/journal.pcbi.1014020}},
doi = {{10.1371/journal.pcbi.1014020}},
volume = {{22}},
year = {{2026}},
}