Incidence of "quasi-ditags" in catalogs generated by Serial Analysis of Gene Expression (SAGE)

Anisimov, Sergey (LU) and Sharov, AA (2004). In BMC Bioinformatics 5.
Abstract
Background: Serial Analysis of Gene Expression (SAGE) is a functional genomic technique that quantitatively analyzes the cellular transcriptome. The analysis of SAGE libraries relies on the identification of ditags from sequencing files; however, the software used to examine SAGE libraries cannot distinguish between authentic versus false ditags ("quasi-ditags"). Results: We provide examples of quasi-ditags that originate from cloning and sequencing artifacts (i.e. genomic contamination or random combinations of nucleotides) that are included in SAGE libraries. We have employed a mathematical model to predict the frequency of quasi-ditags in random nucleotide sequences, and our data show that clones containing less than or equal to 2 ditags (which include chromosomal cloning artifacts) should be excluded from the analysis of SAGE catalogs. Conclusions: Cloning and sequencing artifacts contaminating SAGE libraries could be eliminated using simple pre-screening procedure to increase the reliability of the data.
type
Contribution to journal
publication status
published
in
BMC Bioinformatics
volume
5
publisher
BioMed Central
external identifiers
  • wos:000225770000001
  • pmid:15491492
  • scopus:13244262713
ISSN
1471-2105
DOI
10.1186/1471-2105-5-152
language
English
LU publication?
yes
id
eee91a31-b1b8-486f-b264-e99f4b40bd7a (old id 912279)
date added to LUP
2008-01-10 11:00:07
date last changed
2017-01-01 06:54:21
@article{eee91a31-b1b8-486f-b264-e99f4b40bd7a,
  abstract     = {Background: Serial Analysis of Gene Expression (SAGE) is a functional genomic technique that quantitatively analyzes the cellular transcriptome. The analysis of SAGE libraries relies on the identification of ditags from sequencing files; however, the software used to examine SAGE libraries cannot distinguish between authentic versus false ditags ("quasi-ditags"). Results: We provide examples of quasi-ditags that originate from cloning and sequencing artifacts (i.e. genomic contamination or random combinations of nucleotides) that are included in SAGE libraries. We have employed a mathematical model to predict the frequency of quasi-ditags in random nucleotide sequences, and our data show that clones containing less than or equal to 2 ditags (which include chromosomal cloning artifacts) should be excluded from the analysis of SAGE catalogs. Conclusions: Cloning and sequencing artifacts contaminating SAGE libraries could be eliminated using simple pre-screening procedure to increase the reliability of the data.},
  author       = {Anisimov, Sergey and Sharov, AA},
  doi          = {10.1186/1471-2105-5-152},
  issn         = {1471-2105},
  language     = {eng},
  publisher    = {BioMed Central},
  journal      = {BMC Bioinformatics},
  title        = {Incidence of "quasi-ditags" in catalogs generated by Serial Analysis of Gene Expression (SAGE)},
  url          = {http://dx.doi.org/10.1186/1471-2105-5-152},
  volume       = {5},
  year         = {2004},
}