Manifold Learning in Computational Biology
(2008)
- Abstract
- This thesis deals with manifold learning techniques and their application in gene expression data analysis. Manifold learning is the study of methods that aim to infer geometrical structure from data sampled from manifolds, enabling nonlinear solutions to various machine learning tasks. Gene expression data analysis is the analysis of measurements of the abundance of gene products from a set of genes in the cell, which, by the use of microarray technology, can include the whole genome. Since the expression of one gene is dynamically linked to the expression of others, it is reasonable to assume that such expression data exhibits nonlinear structure, which makes it natural to approach its analysis using nonlinear methods, such as manifold learning.
Within the methodological development of manifold learning this thesis presents a method for robust estimation of geodesic distances (paper I), and a method for supervised manifold learning based on kernel dimension reduction (paper II). An extension of the latter algorithm to partitioned data is also presented. Further, a method for variable importance assessment in manifold learning is proposed (paper IV).
Within gene expression data analysis, results are presented that demonstrate better performance of manifold learning methods compared to linear methods in visualization of microarray samples (paper III). It is also demonstrated how genes can be ranked according to their influence on the observed structure in such nonlinear representations (paper IV). Finally, it is shown how biologically relevant gene/gene similarity measures can be obtained using unsupervised and supervised manifold learning (paper V).
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/1034593
- author
- Nilsson, Jens LU
- supervisor
- opponent
- Professor Tegnér, Jesper, Karolinska Institutet, Stockholm
- organization
- publishing date
- 2008
- type
- Thesis
- publication status
- published
- subject
- keywords
- Nonlinear Dimensionality Reduction, Gene Expression Data, Computational Biology, Machine Learning, Manifold Learning
- pages
- 180 pages
- publisher
- Centre for Mathematical Sciences, Lund University
- defense location
- Lecture room MH:B, Centre for Math. Sciences, Sölvegatan 18, Lund
- defense date
- 2008-03-14 13:15:00
- ISBN
- 978-91-628-7407-0
- language
- English
- LU publication?
- yes
- id
- 9d52eb24-e1f4-4687-b1f4-6a7cf139a115 (old id 1034593)
- date added to LUP
- 2016-04-01 13:36:25
- date last changed
- 2018-11-21 20:17:55
@phdthesis{9d52eb24-e1f4-4687-b1f4-6a7cf139a115,
  abstract = {{This thesis deals with manifold learning techniques and their application in gene expression data analysis. Manifold learning is the study of methods that aim to infer geometrical structure from data sampled from manifolds, enabling nonlinear solutions to various machine learning tasks. Gene expression data analysis is the analysis of measurements of the abundance of gene products from a set of genes in the cell, which, by the use of microarray technology, can include the whole genome. Since the expression of one gene is dynamically linked to the expression of others, it is reasonable to assume that such expression data exhibits nonlinear structure, which makes it natural to approach its analysis using nonlinear methods, such as manifold learning.
    Within the methodological development of manifold learning this thesis presents a method for robust estimation of geodesic distances (paper I), and a method for supervised manifold learning based on kernel dimension reduction (paper II). An extension of the latter algorithm to partitioned data is also presented. Further, a method for variable importance assessment in manifold learning is proposed (paper IV).
    Within gene expression data analysis, results are presented that demonstrate better performance of manifold learning methods compared to linear methods in visualization of microarray samples (paper III). It is also demonstrated how genes can be ranked according to their influence on the observed structure in such nonlinear representations (paper IV). Finally, it is shown how biologically relevant gene/gene similarity measures can be obtained using unsupervised and supervised manifold learning (paper V).}},
  author = {{Nilsson, Jens}},
  isbn = {{978-91-628-7407-0}},
  keywords = {{Nonlinear Dimensionality Reduction; Gene Expression Data; Computational Biology; Machine Learning; Manifold Learning}},
  language = {{eng}},
  publisher = {{Centre for Mathematical Sciences, Lund University}},
  school = {{Lund University}},
  title = {{Manifold Learning in Computational Biology}},
  url = {{https://lup.lub.lu.se/search/files/3475579/1034608.pdf}},
  year = {{2008}},
}