Lund University Publications

Non-negative matrix factorization for the analysis of complex gene expression data : Identification of clinically relevant tumor subtypes

Frigyesi, Attila LU and Höglund, Mattias LU (2008) In Cancer Informatics 6. p.275-292
Abstract

Non-negative matrix factorization (NMF) is a relatively new approach to analyzing gene expression data that models data by additive combinations of non-negative basis vectors (metagenes). The non-negativity constraint makes sense biologically as genes may either be expressed or not, but never show negative expression. We applied NMF to five different microarray data sets. We estimated the appropriate number of metagenes by comparing the residual error of NMF reconstruction of data to that of NMF reconstruction of permuted data, thus finding when a given solution contained more information than noise. This analysis also revealed that NMF could not factorize one of the data sets in a meaningful way. We used GO categories and predefined gene sets to evaluate the biological significance of the obtained metagenes. By analyses of metagenes specific for the same GO categories we could show that individual metagenes activated different aspects of the same biological processes. Several of the obtained metagenes correlated with tumor subtypes and tumors with characteristic chromosomal translocations, indicating that metagenes may correspond to specific disease entities. Hence, NMF extracts biologically relevant structures of microarray expression data and may thus contribute to a deeper understanding of tumor behavior.
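
The rank-selection idea in the abstract (compare the NMF residual error on the data with the residual error on permuted data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the multiplicative-update solver, the Frobenius-norm residual, the per-gene shuffling scheme, the synthetic three-subtype matrix, and the helper names `nmf`, `residual` and `permute_rows`.

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (genes x samples) into W (genes x k) and H (k x samples).

    Uses multiplicative updates for the Frobenius-norm objective
    (an assumed solver; the paper's exact algorithm is not given here).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative because it only
        # multiplies by ratios of non-negative quantities.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def residual(V, k):
    """Frobenius norm of the reconstruction error V - WH at rank k."""
    W, H = nmf(V, k)
    return np.linalg.norm(V - W @ H)

def permute_rows(V, rng):
    """Shuffle each gene's values across samples independently (assumed scheme)."""
    return np.array([rng.permutation(row) for row in V])

# Synthetic stand-in for an expression matrix: 100 genes, 30 samples,
# three sample groups, each driven by one hypothetical metagene.
rng = np.random.default_rng(42)
labels = np.arange(30) % 3
W_true = rng.random((100, 3))
H_true = np.eye(3)[:, labels]
V = W_true @ H_true + 0.01 * rng.random((100, 30))
V_perm = permute_rows(V, rng)

W, H = nmf(V, 3)

ranks = range(1, 7)
err_data = [residual(V, k) for k in ranks]
err_perm = [residual(V_perm, k) for k in ranks]
for k, e_d, e_p in zip(ranks, err_data, err_perm):
    print(f"k={k}  data residual={e_d:.3f}  permuted residual={e_p:.3f}")
```

One way to read the printed table is to look for the rank beyond which adding a metagene reduces the residual on the real data no faster than on the permuted data; past that point the extra component is plausibly fitting noise rather than structure.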

author
Frigyesi, Attila and Höglund, Mattias
publishing date
2008
type
Contribution to journal
publication status
published
subject
keywords
Gene expression, Metagenes, NMF, Tumor classification
in
Cancer Informatics
volume
6
pages
275 - 292
publisher
Libertas Academica
external identifiers
  • scopus:49649102048
ISSN
1176-9351
DOI
10.4137/cin.s606
language
English
LU publication?
yes
additional info
Funding Information: We thank Anders Kvist for producing gene sets based on GO categories. This work was supported by the Swedish Cancer Society, the Swedish Research Council, the Nilsson foundation, and the Crafoord foundation.
id
6cb65ca9-b812-455f-970c-a1e5dbe1628b
date added to LUP
2023-06-02 10:15:05
date last changed
2023-06-07 14:17:25
@article{6cb65ca9-b812-455f-970c-a1e5dbe1628b,
  abstract     = {{Non-negative matrix factorization (NMF) is a relatively new approach to analyzing gene expression data that models data by additive combinations of non-negative basis vectors (metagenes). The non-negativity constraint makes sense biologically as genes may either be expressed or not, but never show negative expression. We applied NMF to five different microarray data sets. We estimated the appropriate number of metagenes by comparing the residual error of NMF reconstruction of data to that of NMF reconstruction of permuted data, thus finding when a given solution contained more information than noise. This analysis also revealed that NMF could not factorize one of the data sets in a meaningful way. We used GO categories and predefined gene sets to evaluate the biological significance of the obtained metagenes. By analyses of metagenes specific for the same GO categories we could show that individual metagenes activated different aspects of the same biological processes. Several of the obtained metagenes correlated with tumor subtypes and tumors with characteristic chromosomal translocations, indicating that metagenes may correspond to specific disease entities. Hence, NMF extracts biologically relevant structures of microarray expression data and may thus contribute to a deeper understanding of tumor behavior.}},
  author       = {{Frigyesi, Attila and Höglund, Mattias}},
  issn         = {{1176-9351}},
  keywords     = {{Gene expression; Metagenes; NMF; Tumor classification}},
  language     = {{eng}},
  pages        = {{275--292}},
  publisher    = {{Libertas Academica}},
  journal      = {{Cancer Informatics}},
  title        = {{Non-negative matrix factorization for the analysis of complex gene expression data : Identification of clinically relevant tumor subtypes}},
  url          = {{http://dx.doi.org/10.4137/cin.s606}},
  doi          = {{10.4137/cin.s606}},
  volume       = {{6}},
  year         = {{2008}},
}