Lund University Publications

Approximate geodesic distances reveal biologically relevant structures in microarray data

Nilsson, Jens; Fioretos, Thoas; Höglund, Mattias and Fontes, Magnus (2004) In Bioinformatics 20(6). p.874-880
Abstract
Motivation: Genome-wide gene expression measurements, as currently determined by the microarray technology, can be represented mathematically as points in a high-dimensional gene expression space. Genes interact with each other in regulatory networks, restricting the cellular gene expression profiles to a certain manifold, or surface, in gene expression space. To obtain knowledge about this manifold, various dimensionality reduction methods and distance metrics are used. For data points distributed on curved manifolds, a sensible distance measure would be the geodesic distance along the manifold. In this work, we examine whether an approximate geodesic distance measure captures biological similarities better than the traditionally used Euclidean distance.
Results: We computed approximate geodesic distances, determined by the Isomap algorithm, for one set of lymphoma and one set of lung cancer microarray samples. Compared with the ordinary Euclidean distance metric, this distance measure produced more instructive, biologically relevant, visualizations when applying multidimensional scaling. This suggests the Isomap algorithm as a promising tool for the interpretation of microarray data. Furthermore, the results demonstrate the benefit and importance of taking nonlinearities in gene expression data into account.
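To illustrate the distance measure the abstract refers to, the following is a minimal sketch of an Isomap-style approximate geodesic distance: connect each point to its nearest neighbours, then take shortest-path lengths through that graph. It is not the authors' implementation. The function names (`euclidean`, `knn_graph`, `geodesic_distances`), the neighbourhood size `k`, and the half-circle toy data are all assumptions made for illustration, and the multidimensional scaling step that would normally follow is omitted.

```python
import heapq
import math


def euclidean(a, b):
    """Ordinary straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (euclidean(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(points, k):
    """All-pairs shortest-path lengths through the neighbourhood graph.

    Runs Dijkstra's algorithm from every point. Pairs in disconnected
    parts of the graph keep a distance of math.inf.
    """
    graph = knn_graph(points, k)
    n = len(points)
    all_dist = []
    for source in range(n):
        dist = [math.inf] * n
        dist[source] = 0.0
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in graph[u].items():
                nd = d + w
                if nd < dist[v]:
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        all_dist.append(dist)
    return all_dist


# Hypothetical toy data: 21 points on a half circle of radius 1,
# standing in for samples that lie on a curved manifold.
n_points = 21
points = [
    (math.cos(math.pi * t / (n_points - 1)), math.sin(math.pi * t / (n_points - 1)))
    for t in range(n_points)
]

geo = geodesic_distances(points, k=2)
straight = euclidean(points[0], points[-1])
along_curve = geo[0][-1]
print(f"Euclidean: {straight:.3f}, approximate geodesic: {along_curve:.3f}")
```

In this toy case the Euclidean distance between the two end points is the diameter of the circle, while the graph distance follows the arc and comes out close to its length, pi. A distance matrix of this kind is what would then be passed to multidimensional scaling in place of the Euclidean one.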
author
Nilsson, Jens; Fioretos, Thoas; Höglund, Mattias and Fontes, Magnus
publishing date
2004
type
Contribution to journal
publication status
published
keywords
Lymphoma, Microarray, Gene expression, Manifold learning, Lung cancer, Isomap, Nonlinear dimensionality reduction
in
Bioinformatics
volume
20
issue
6
pages
874 - 880
publisher
Oxford University Press
external identifiers
  • wos:000220895100008
  • pmid:14752004
  • scopus:2342431962
ISSN
1367-4803
DOI
10.1093/bioinformatics/btg496
language
English
LU publication?
yes
id
d9656b93-0e65-4006-b92d-e47304f62e9f (old id 281707)
date added to LUP
2016-04-01 12:16:10
date last changed
2022-01-27 01:17:29
@article{d9656b93-0e65-4006-b92d-e47304f62e9f,
  abstract     = {{Motivation: Genome-wide gene expression measurements, as currently determined by the microarray technology, can be represented mathematically as points in a high-dimensional gene expression space. Genes interact with each other in regulatory networks, restricting the cellular gene expression profiles to a certain manifold, or surface, in gene expression space. To obtain knowledge about this manifold, various dimensionality reduction methods and distance metrics are used. For data points distributed on curved manifolds, a sensible distance measure would be the geodesic distance along the manifold. In this work, we examine whether an approximate geodesic distance measure captures biological similarities better than the traditionally used Euclidean distance. Results: We computed approximate geodesic distances, determined by the Isomap algorithm, for one set of lymphoma and one set of lung cancer microarray samples. Compared with the ordinary Euclidean distance metric, this distance measure produced more instructive, biologically relevant, visualizations when applying multidimensional scaling. This suggests the Isomap algorithm as a promising tool for the interpretation of microarray data. Furthermore, the results demonstrate the benefit and importance of taking nonlinearities in gene expression data into account.}},
  author       = {{Nilsson, Jens and Fioretos, Thoas and Höglund, Mattias and Fontes, Magnus}},
  issn         = {{1367-4803}},
  keywords     = {{Lymphoma; Microarray; Gene expression; Manifold learning; Lung cancer; Isomap; Nonlinear dimensionality reduction}},
  language     = {{eng}},
  number       = {{6}},
  pages        = {{874--880}},
  publisher    = {{Oxford University Press}},
  journal      = {{Bioinformatics}},
  title        = {{Approximate geodesic distances reveal biologically relevant structures in microarray data}},
  url          = {{http://dx.doi.org/10.1093/bioinformatics/btg496}},
  doi          = {{10.1093/bioinformatics/btg496}},
  volume       = {{20}},
  year         = {{2004}},
}