Compression algorithm for pre-simulated Monte Carlo p-value functions: Application to the ontological analysis of microarray studies

Nilsson, Björn (2008). In Pattern Recognition Letters 29(6), p. 768-772
Abstract
Monte Carlo simulation is frequently employed to compute p-values for test statistics with unknown null distributions. However, the computations can be exceedingly time-consuming, and, in such cases, the use of pre-computed simulations can be considered to increase speed. This approach is attractive in principle, but complicated in practice because the size of the pre-computed data can be prohibitively large. We developed an algorithm for computing size-reduced representations of Monte Carlo p-value functions. We show that, in typical settings, this algorithm reduces the size of the pre-computed data by several orders of magnitude, while bounding provably the approximation error at an explicitly controllable level. The algorithm is data-independent, fully non-parametric, and easy to implement. We exemplify its practical utility by applying it to the threshold-free ontological analysis of microarray data. The presented algorithm simplifies the use of pre-computed Monte Carlo p-value functions in software, including specialized bioinformatics applications.
type
Contribution to journal
publication status
published
keywords
ontological analysis, microarrays, biomedical pattern recognition, bioinformatics, data compression
in
Pattern Recognition Letters
volume
29
issue
6
pages
768 - 772
publisher
Elsevier
external identifiers
  • wos:000255129600007
  • scopus:39949083941
ISSN
0167-8655
DOI
10.1016/j.patrec.2007.12.007
language
English
LU publication?
yes
id
dcfabf40-1eac-461e-91c9-a4d138d8a459 (old id 1206210)
date added to LUP
2008-09-19 11:46:05
date last changed
2017-01-01 06:12:18
@article{dcfabf40-1eac-461e-91c9-a4d138d8a459,
  abstract     = {Monte Carlo simulation is frequently employed to compute p-values for test statistics with unknown null distributions. However, the computations can be exceedingly time-consuming, and, in such cases, the use of pre-computed simulations can be considered to increase speed. This approach is attractive in principle, but complicated in practice because the size of the pre-computed data can be prohibitively large. We developed an algorithm for computing size-reduced representations of Monte Carlo p-value functions. We show that, in typical settings, this algorithm reduces the size of the pre-computed data by several orders of magnitude, while bounding provably the approximation error at an explicitly controllable level. The algorithm is data-independent, fully non-parametric, and easy to implement. We exemplify its practical utility by applying it to the threshold-free ontological analysis of microarray data. The presented algorithm simplifies the use of pre-computed Monte Carlo p-value functions in software, including specialized bioinformatics applications.},
  author       = {Nilsson, Björn},
  issn         = {0167-8655},
  keywords     = {ontological analysis, microarrays, biomedical pattern recognition, bioinformatics, data compression},
  language     = {eng},
  number       = {6},
  pages        = {768--772},
  publisher    = {Elsevier},
  journal      = {Pattern Recognition Letters},
  title        = {Compression algorithm for pre-simulated Monte Carlo p-value functions: Application to the ontological analysis of microarray studies},
  doi          = {10.1016/j.patrec.2007.12.007},
  url          = {http://dx.doi.org/10.1016/j.patrec.2007.12.007},
  volume       = {29},
  year         = {2008},
}