Lund University Publications

On preprocessing of protein sequences for neural network prediction of polyproline type II secondary structures

Siermala, M; Juhola, M and Vihinen, Mauno (2001) In Computers in Biology and Medicine 31(5). p.385-398
Abstract
Polyproline type II stretches are somewhat rare on proteins. The backbone of this secondary structural element folds to a triangular form instead of the normal alpha-helix with 3.6 residues per turn. It is a very challenging task to try to detect them computationally from protein sequence. Here, we have studied the preprocessing phase in particular, which is important for any machine learning method. Preprocessing included selection of relevant data from the Protein Data Bank and investigation of learnability properties. These properties show whether the material is suitable for neural network computing. The complexity of algorithms in connection with preprocessing was briefly considered. We found that feedforward perceptron neural networks were appropriate for the prediction of polyproline type II and also relatively efficient in this task. The problem is very difficult because of the great similarity of the two classes present in the classification. Nevertheless, neural networks were able to recognize and predict about 75% of secondary structures. (C) 2001 Elsevier Science Ltd. All rights reserved.
author
Siermala, M; Juhola, M and Vihinen, Mauno
publishing date
2001
type
Contribution to journal
publication status
published
keywords
neural networks, proteins, prediction of polyproline type II secondary structures, polyproline type II structure, PPII
in
Computers in Biology and Medicine
volume
31
issue
5
pages
385 - 398
publisher
Elsevier
external identifiers
  • wos:000170519300006
  • scopus:0034870317
ISSN
1879-0534
DOI
10.1016/S0010-4825(01)00013-0
language
English
LU publication?
no
id
dc527074-9225-4553-b82c-7bea4fed3510 (old id 3851531)
date added to LUP
2016-04-01 11:46:11
date last changed
2022-01-26 17:56:47
@article{dc527074-9225-4553-b82c-7bea4fed3510,
  abstract     = {{Polyproline type II stretches are somewhat rare on proteins. The backbone of this secondary structural element folds to a triangular form instead of the normal alpha-helix with 3.6 residues per turn. It is a very challenging task to try to detect them computationally from protein sequence. Here, we have studied the preprocessing phase in particular, which is important for any machine learning method. Preprocessing included selection of relevant data from the Protein Data Bank and investigation of learnability properties. These properties show whether the material is suitable for neural network computing. The complexity of algorithms in connection with preprocessing was briefly considered. We found that feedforward perceptron neural networks were appropriate for the prediction of polyproline type II and also relatively efficient in this task. The problem is very difficult because of the great similarity of the two classes present in the classification. Nevertheless, neural networks were able to recognize and predict about 75% of secondary structures. (C) 2001 Elsevier Science Ltd. All rights reserved.}},
  author       = {{Siermala, M and Juhola, M and Vihinen, Mauno}},
  issn         = {{1879-0534}},
  keywords     = {{neural networks; proteins; prediction of polyproline type II secondary structures; polyproline type II structure; PPII}},
  language     = {{eng}},
  number       = {{5}},
  pages        = {{385--398}},
  publisher    = {{Elsevier}},
  journal      = {{Computers in Biology and Medicine}},
  title        = {{On preprocessing of protein sequences for neural network prediction of polyproline type II secondary structures}},
  url          = {{http://dx.doi.org/10.1016/S0010-4825(01)00013-0}},
  doi          = {{10.1016/S0010-4825(01)00013-0}},
  volume       = {{31}},
  year         = {{2001}},
}