Lund University Publications

A novel structural position-specific scoring matrix for the prediction of protein secondary structures

Li, Dapeng ; Li, Tonghua ; Cong, Peisheng ; Xiong, Wenwei and Sun, Jiangming (2012) In Bioinformatics 28(1). p.32-39
Abstract

Motivation: The precise prediction of protein secondary structure is of key importance for the prediction of 3D structure and biological function. Although the development of many excellent methods over the last few decades has allowed the achievement of prediction accuracies of up to 80%, progress seems to have reached a bottleneck, and further improvements in accuracy have proven difficult.

Results: We propose for the first time a structural position-specific scoring matrix (SPSSM), and establish an unprecedented database of 9 million sequences and their SPSSMs. This database, when combined with a purpose-designed BLAST tool, provides a novel prediction tool: SPSSMPred. When the SPSSMPred was validated on a large dataset (10 814 entries), the Q3 accuracy of the protein secondary structure prediction was 93.4%. Our approach was tested on the two latest EVA sets; accuracies of 82.7 and 82.0% were achieved, far higher than can be achieved using other predictors. For further evaluation, we tested our approach on newly determined sequences (141 entries), and obtained an accuracy of 89.6%. For a set of low-homology proteins (40 entries), the SPSSMPred still achieved a Q3 value of 84.6%.
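
The Q3 values quoted in the abstract are three-state per-residue accuracies: the percentage of residues whose predicted state (helix H, strand E, or coil C) matches the observed state. The sketch below shows that calculation only. It is not the SPSSMPred method, and the function name `q3_accuracy` and the example strings are illustrative assumptions, not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil), aligned residue by residue. Hypothetical helper for illustration.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Made-up 10-residue example: one residue (position 6) is mispredicted.
predicted = "CHHHHEEECC"
observed = "CHHHHHEECC"
print(q3_accuracy(predicted, observed))  # 90.0
```

For a whole dataset, Q3 is reported either per residue over all chains pooled together or as a per-chain average; the abstract does not say which convention was used.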

author
Li, Dapeng ; Li, Tonghua ; Cong, Peisheng ; Xiong, Wenwei and Sun, Jiangming
publishing date
2012
type
Contribution to journal
publication status
published
keywords
Machine learning
in
Bioinformatics
volume
28
issue
1
pages
32 - 39
publisher
Oxford University Press
external identifiers
  • pmid:22065541
  • scopus:84855170292
ISSN
1367-4803
DOI
10.1093/bioinformatics/btr611
language
English
LU publication?
no
id
0f99a246-ef5e-4bc6-b462-7757014935d5
date added to LUP
2023-04-24 15:37:49
date last changed
2024-01-05 00:45:47
@article{0f99a246-ef5e-4bc6-b462-7757014935d5,
  abstract     = {{Motivation: The precise prediction of protein secondary structure is of key importance for the prediction of 3D structure and biological function. Although the development of many excellent methods over the last few decades has allowed the achievement of prediction accuracies of up to 80%, progress seems to have reached a bottleneck, and further improvements in accuracy have proven difficult. Results: We propose for the first time a structural position-specific scoring matrix (SPSSM), and establish an unprecedented database of 9 million sequences and their SPSSMs. This database, when combined with a purpose-designed BLAST tool, provides a novel prediction tool: SPSSMPred. When the SPSSMPred was validated on a large dataset (10 814 entries), the Q3 accuracy of the protein secondary structure prediction was 93.4%. Our approach was tested on the two latest EVA sets; accuracies of 82.7 and 82.0% were achieved, far higher than can be achieved using other predictors. For further evaluation, we tested our approach on newly determined sequences (141 entries), and obtained an accuracy of 89.6%. For a set of low-homology proteins (40 entries), the SPSSMPred still achieved a Q3 value of 84.6%.}},
  author       = {{Li, Dapeng and Li, Tonghua and Cong, Peisheng and Xiong, Wenwei and Sun, Jiangming}},
  issn         = {{1367-4803}},
  keywords     = {{Machine learning}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{32--39}},
  publisher    = {{Oxford University Press}},
  journal      = {{Bioinformatics}},
  title        = {{A novel structural position-specific scoring matrix for the prediction of protein secondary structures}},
  url          = {{http://dx.doi.org/10.1093/bioinformatics/btr611}},
  doi          = {{10.1093/bioinformatics/btr611}},
  volume       = {{28}},
  year         = {{2012}},
}