Lund University Publications

PON-Del predictor for sequence retaining protein deletions

Zhang, Haoyang; Kabir, Muhammad and Vihinen, Mauno (2026) In PLoS Computational Biology 22(2). p.1-18
Abstract
Protein deletions are frequent among both disease-causing and tolerated variants. Several mechanisms at the DNA, RNA and protein levels can lead to deletions. Many deletions are misclassified in the literature and databases, especially when the mRNA is degraded by the cellular quality-control mechanism. We developed a novel predictor for sequence-retaining protein deletions, i.e., variants that do not alter the sequence downstream of the deletion site. We collected an extensive dataset of verified protein deletions, each described by a comprehensive set of context, content, position, and gene-based features. We evaluated both statistical and deep learning algorithms and selected a gradient boosting–based approach to develop the PON-Del predictor for short (1–10 amino acid) sequence-retaining deletions. Variants are typically classified into two categories, pathogenic or benign. However, there is always a third class of variants: variants of uncertain significance (VUSs), which have been ignored by all previous methods. PON-Del is the first deletion interpretation method that includes VUSs. It provides two outputs: a binary prediction and a three-state prediction that includes VUSs. The performance of PON-Del was superior to that of previous methods. The tool is freely available at https://structure.bmc.lu.se/pon_del/.
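The abstract's definition of a sequence-retaining deletion (a contiguous stretch of residues is removed and everything downstream is unchanged) can be illustrated with a short sketch. This is not PON-Del's code or input format; the function name `deleted_segment`, its parameters, and the toy sequences are hypothetical, and only the 1–10 residue range is taken from the abstract.

```python
from typing import Optional, Tuple


def deleted_segment(reference: str, variant: str,
                    max_len: int = 10) -> Optional[Tuple[int, str]]:
    """Return (start, deleted_residues) if `variant` equals `reference` with one
    contiguous stretch of 1..max_len residues removed, otherwise None.

    `start` is the 0-based index of the first deleted residue in `reference`.
    Hypothetical helper for illustration only; not part of PON-Del.
    """
    n_deleted = len(reference) - len(variant)
    if not 1 <= n_deleted <= max_len:
        return None

    # Walk along the shared prefix of the two sequences.
    start = 0
    while start < len(variant) and reference[start] == variant[start]:
        start += 1

    # "Sequence-retaining": the part of the reference after the deleted
    # stretch must reappear unchanged in the variant.
    if reference[start + n_deleted:] != variant[start:]:
        return None
    return start, reference[start:start + n_deleted]


if __name__ == "__main__":
    ref = "MKTAYIAKQR"                           # toy sequence, not real data
    print(deleted_segment(ref, "MKTIAKQR"))      # two residues removed, tail intact
    print(deleted_segment(ref, "MKTIGGQR"))      # tail altered, so not retaining
```

A variant whose downstream sequence changes, as after a frameshift, yields `None` here, which matches the abstract's distinction between sequence-retaining deletions and other deletion types.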
author
Zhang, Haoyang; Kabir, Muhammad and Vihinen, Mauno
type
Contribution to journal
publication status
published
in
PLoS Computational Biology
volume
22
issue
2
article number
e1014020
pages
1 - 18
publisher
Public Library of Science (PLoS)
external identifiers
  • pmid:41739875
ISSN
1553-7358
DOI
10.1371/journal.pcbi.1014020
language
English
LU publication?
yes
id
b35f4850-613e-4a6c-836c-e5ce0899e68d
date added to LUP
2026-02-26 13:56:11
date last changed
2026-02-27 03:48:35
@article{b35f4850-613e-4a6c-836c-e5ce0899e68d,
  abstract     = {{Protein deletions are frequent among both disease-causing and tolerated variants. Several mechanisms at the DNA, RNA and protein levels can lead to deletions. Many deletions are misclassified in the literature and databases, especially when the mRNA is degraded by the cellular quality-control mechanism. We developed a novel predictor for sequence retaining protein deletions, i.e., variants that do not alter the sequence downstream of the deletion site. We collected an extensive dataset of verified protein deletions, each described by a comprehensive set of context, content, position, and gene-based features. We evaluated both statistical and deep learning algorithms and selected a gradient boosting–based approach to develop the PON-Del predictor for short, 1–10 amino acid, sequence-retaining deletions. Variants are typically classified into two categories: either pathogenic or benign. However, there is always a third class of variants: variants of uncertain significance (VUSs), which have been ignored by all previous methods. PON-Del is the first deletion interpretation method that includes VUSs. It provides two outputs, binary and three-state prediction with VUSs. The performance of PON-Del was superior to that of previous methods. The tool is freely available at https://structure.bmc.lu.se/pon_del/.}},
  author       = {{Zhang, Haoyang and Kabir, Muhammad and Vihinen, Mauno}},
  issn         = {{1553-7358}},
  language     = {{eng}},
  number       = {{2}},
  pages        = {{1--18}},
  publisher    = {{Public Library of Science (PLoS)}},
  journal      = {{PLoS Computational Biology}},
  title        = {{PON-Del predictor for sequence retaining protein deletions}},
  url          = {{http://dx.doi.org/10.1371/journal.pcbi.1014020}},
  doi          = {{10.1371/journal.pcbi.1014020}},
  volume       = {{22}},
  year         = {{2026}},
}