
Potential for Dramatic Improvement in Sequence Alignment against Structures of Remote Homologous Proteins by Extracting Structural Information from Multiple Structure Alignment.

Zhang, Ziding (LU); Lindstam, Mats (LU); Unge, Johan (LU); Peterson, Carsten (LU) and Lu, Guoguang (LU) (2003). In Journal of Molecular Biology 332(1), p. 127-142
Abstract
A novel method has been developed for acquiring the correct alignment of a query sequence against remotely homologous proteins by extracting structural information from profiles of multiple structure alignment. A systematic search algorithm combined with a group of score functions based on sequence information and structural information has been introduced in this procedure. A limited number of top solutions (15,000) with high scores were selected as candidates for further examination. On a test-set comprising 301 proteins from 75 protein families with sequence identity less than 30%, the proportion of proteins with completely correct alignment as first candidate was improved to 39.8% by our method, whereas the typical performance of existing sequence-based alignment methods was only between 16.1% and 22.7%. Furthermore, multiple candidates for possible alignment were provided in our approach, which dramatically increased the possibility of finding correct alignment, such that completely correct alignments were found amongst the top-ranked 1000 candidates in 88.3% of the proteins. With the assistance of a sequence database, completely correct alignment solutions were achieved amongst the top 1000 candidates in 94.3% of the proteins. From such a limited number of candidates, it would become possible to identify more correct alignment using a more time-consuming but more powerful method with more detailed structural information, such as side-chain packing and energy minimization, etc. The results indicate that the novel alignment strategy could be helpful for extending the application of highly reliable methods for fold identification and homology modeling to a huge number of homologous proteins of low sequence similarity. Details of the methods, together with the results and implications for future development are presented.
type
Contribution to journal
publication status
published
keywords
homology modeling, sequence alignment, structure alignment, remote homology
in
Journal of Molecular Biology
volume
332
issue
1
pages
127 - 142
publisher
Elsevier
external identifiers
  • wos:000185034400012
  • pmid:12946352
  • scopus:0043237703
ISSN
1089-8638
DOI
10.1016/S0022-2836(03)00858-1
language
English
LU publication?
yes
id
a7e7e3c0-0f77-4986-9192-11e6710f01df (old id 128546)
date added to LUP
2007-07-09 16:39:15
date last changed
2018-01-07 08:51:43
@article{a7e7e3c0-0f77-4986-9192-11e6710f01df,
  abstract     = {A novel method has been developed for acquiring the correct alignment of a query sequence against remotely homologous proteins by extracting structural information from profiles of multiple structure alignment. A systematic search algorithm combined with a group of score functions based on sequence information and structural information has been introduced in this procedure. A limited number of top solutions (15,000) with high scores were selected as candidates for further examination. On a test-set comprising 301 proteins from 75 protein families with sequence identity less than 30%, the proportion of proteins with completely correct alignment as first candidate was improved to 39.8% by our method, whereas the typical performance of existing sequence-based alignment methods was only between 16.1% and 22.7%. Furthermore, multiple candidates for possible alignment were provided in our approach, which dramatically increased the possibility of finding correct alignment, such that completely correct alignments were found amongst the top-ranked 1000 candidates in 88.3% of the proteins. With the assistance of a sequence database, completely correct alignment solutions were achieved amongst the top 1000 candidates in 94.3% of the proteins. From such a limited number of candidates, it would become possible to identify more correct alignment using a more time-consuming but more powerful method with more detailed structural information, such as side-chain packing and energy minimization, etc. The results indicate that the novel alignment strategy could be helpful for extending the application of highly reliable methods for fold identification and homology modeling to a huge number of homologous proteins of low sequence similarity. Details of the methods, together with the results and implications for future development are presented.},
  author       = {Zhang, Ziding and Lindstam, Mats and Unge, Johan and Peterson, Carsten and Lu, Guoguang},
  issn         = {1089-8638},
  keyword      = {homology modeling,sequence alignment,structure alignment,remote homology},
  language     = {eng},
  number       = {1},
  pages        = {127--142},
  publisher    = {Elsevier},
  journal      = {Journal of Molecular Biology},
  title        = {Potential for Dramatic Improvement in Sequence Alignment against Structures of Remote Homologous Proteins by Extracting Structural Information from Multiple Structure Alignment},
  url          = {http://dx.doi.org/10.1016/S0022-2836(03)00858-1},
  volume       = {332},
  year         = {2003},
}