Advanced

Multiple spaced seeds for homology search

Ilie, Lucian and Ilie, Silvana LU (2007) In Bioinformatics 23(22). p.2969-2977
Abstract
Motivation: Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. The introduction of optimal spaced seeds in PatternHunter has increased both the sensitivity and the speed of homology search, and it has been adopted by many alignment programs such as BLAST. With the further improvement provided by multiple spaced seeds in PatternHunterll, Smith-Waterman sensitivity is approached at BLASTn speed. However, computing optimal multiple spaced seeds was proved to be NP-hard and current heuristic algorithms are all very slow (exponential). Results: We give a simple algorithm which computes good multiple seeds in polynomial time. Due to a completely different approach, the difference with... (More)
Motivation: Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. The introduction of optimal spaced seeds in PatternHunter has increased both the sensitivity and the speed of homology search, and it has been adopted by many alignment programs such as BLAST. With the further improvement provided by multiple spaced seeds in PatternHunterll, Smith-Waterman sensitivity is approached at BLASTn speed. However, computing optimal multiple spaced seeds was proved to be NP-hard and current heuristic algorithms are all very slow (exponential). Results: We give a simple algorithm which computes good multiple seeds in polynomial time. Due to a completely different approach, the difference with respect to the previous methods is dramatic. The multiple spaced seed of PatternHunterII, with 16 weight 11 seeds, was computed in 12 days. It takes us 17 s to find a better one. Our approach changes the way of looking at multiple spaced seeds. Contact: ilie@csd.uwo.ca. (Less)
Please use this url to cite or link to this publication:
author
organization
publishing date
type
Contribution to journal
publication status
published
subject
in
Bioinformatics
volume
23
issue
22
pages
2969 - 2977
publisher
Oxford University Press
external identifiers
  • wos:000251197800002
  • scopus:36448966762
ISSN
1367-4803
DOI
10.1093/bioinformatics/btm422
language
English
LU publication?
yes
id
708b0af2-81df-40ea-a987-fea469efc566 (old id 968953)
date added to LUP
2008-01-29 15:35:39
date last changed
2017-02-26 03:30:58
@article{708b0af2-81df-40ea-a987-fea469efc566,
  abstract     = {Motivation: Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. The introduction of optimal spaced seeds in PatternHunter has increased both the sensitivity and the speed of homology search, and it has been adopted by many alignment programs such as BLAST. With the further improvement provided by multiple spaced seeds in PatternHunterll, Smith-Waterman sensitivity is approached at BLASTn speed. However, computing optimal multiple spaced seeds was proved to be NP-hard and current heuristic algorithms are all very slow (exponential). Results: We give a simple algorithm which computes good multiple seeds in polynomial time. Due to a completely different approach, the difference with respect to the previous methods is dramatic. The multiple spaced seed of PatternHunterII, with 16 weight 11 seeds, was computed in 12 days. It takes us 17 s to find a better one. Our approach changes the way of looking at multiple spaced seeds. Contact: ilie@csd.uwo.ca.},
  author       = {Ilie, Lucian and Ilie, Silvana},
  issn         = {1367-4803},
  language     = {eng},
  number       = {22},
  pages        = {2969--2977},
  publisher    = {Oxford University Press},
  series       = {Bioinformatics},
  title        = {Multiple spaced seeds for homology search},
  url          = {http://dx.doi.org/10.1093/bioinformatics/btm422},
  volume       = {23},
  year         = {2007},
}