
Lund University Publications


Structure-based prediction of the effects of a missense variant on protein stability

Yang, Yang ; Chen, Biao ; Tan, Ge ; Vihinen, Mauno and Shen, Bairong (2013) In Amino Acids 44(3). p.847-855
Abstract
Predicting the effects of amino acid substitutions on protein stability provides invaluable information for protein design, the assignment of biological function, and for understanding disease-associated variations. To understand the effects of substitutions, computational models are preferred to time-consuming and expensive experimental methods. Several methods have been proposed for this task including machine learning-based approaches. However, models trained using limited data have performance problems and many model parameters tend to be over-fitted. To decrease the number of model parameters and to improve the generalization potential, we calculated the amino acid contact energy change for point variations using a structure-based coarse-grained model. Based on the structural properties including contact energy (CE) and further physicochemical properties of the amino acids as input features, we developed two support vector machine classifiers. M47 predicted the stability of variant proteins with an accuracy of 87 % and a Matthews correlation coefficient of 0.68 for a large dataset of 1925 variants, whereas M8 performed better when a relatively small dataset of 388 variants was used for 20-fold cross-validation. The performance of the M47 classifier on all six tested contingency table evaluation parameters is better than that of existing machine learning-based models or energy function-based protein stability classifiers.
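The abstract reports accuracy and the Matthews correlation coefficient (MCC), both of which are derived from a binary contingency table. The following is a minimal sketch of how these two measures are commonly computed; it is not the authors' code, and the function names and the counts in the usage lines are hypothetical values chosen only for illustration, not results from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a 2x2 contingency table.

    Returns 0.0 when any marginal sum is zero (a common convention
    for the otherwise undefined case).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (NOT taken from the paper).
tp, tn, fp, fn = 40, 45, 5, 10
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy = {acc:.2f}, MCC = {mcc:.2f}")
```

Unlike accuracy, the MCC uses all four cells of the table, so it stays informative when the stabilizing and destabilizing classes are unevenly represented.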
author
Yang, Yang ; Chen, Biao ; Tan, Ge ; Vihinen, Mauno and Shen, Bairong
publishing date
2013
type
Contribution to journal
publication status
published
subject
keywords
Amino acid mutation, Physicochemical properties, Residue-residue contact energy, Support vector machine, Protein stability prediction
in
Amino Acids
volume
44
issue
3
pages
847 - 855
publisher
Springer
external identifiers
  • wos:000314760700004
  • scopus:84878362869
  • pmid:23064876
ISSN
0939-4451
DOI
10.1007/s00726-012-1407-7
language
English
LU publication?
yes
id
c1c0e178-9d9b-4c12-a683-e66ebeb5201e (old id 3576931)
date added to LUP
2016-04-01 11:01:25
date last changed
2022-04-28 03:47:41
@article{c1c0e178-9d9b-4c12-a683-e66ebeb5201e,
  abstract     = {{Predicting the effects of amino acid substitutions on protein stability provides invaluable information for protein design, the assignment of biological function, and for understanding disease-associated variations. To understand the effects of substitutions, computational models are preferred to time-consuming and expensive experimental methods. Several methods have been proposed for this task including machine learning-based approaches. However, models trained using limited data have performance problems and many model parameters tend to be over-fitted. To decrease the number of model parameters and to improve the generalization potential, we calculated the amino acid contact energy change for point variations using a structure-based coarse-grained model. Based on the structural properties including contact energy (CE) and further physicochemical properties of the amino acids as input features, we developed two support vector machine classifiers. M47 predicted the stability of variant proteins with an accuracy of 87 % and a Matthews correlation coefficient of 0.68 for a large dataset of 1925 variants, whereas M8 performed better when a relatively small dataset of 388 variants was used for 20-fold cross-validation. The performance of the M47 classifier on all six tested contingency table evaluation parameters is better than that of existing machine learning-based models or energy function-based protein stability classifiers.}},
  author       = {{Yang, Yang and Chen, Biao and Tan, Ge and Vihinen, Mauno and Shen, Bairong}},
  issn         = {{0939-4451}},
  keywords     = {{Amino acid mutation; Physicochemical properties; Residue-residue contact energy; Support vector machine; Protein stability prediction}},
  language     = {{eng}},
  number       = {{3}},
  pages        = {{847--855}},
  publisher    = {{Springer}},
  journal      = {{Amino Acids}},
  title        = {{Structure-based prediction of the effects of a missense variant on protein stability}},
  url          = {{http://dx.doi.org/10.1007/s00726-012-1407-7}},
  doi          = {{10.1007/s00726-012-1407-7}},
  volume       = {{44}},
  year         = {{2013}},
}