
SMSSVD : SubMatrix Selection Singular Value Decomposition

Henningsson, Rasmus and Fontes, Magnus (2019). In Bioinformatics (Oxford, England) 35(3), p. 478-486
Abstract

Motivation: High-throughput biomedical measurements normally capture multiple overlaid biologically relevant signals and often also signals representing different types of technical artefacts, such as batch effects. Signal identification and decomposition are accordingly main objectives in statistical biomedical modeling and data analysis. Existing methods for signal reconstruction and deconvolution are, in general, either supervised, contain parameters that need to be estimated, or present other types of ad hoc features. We here introduce SubMatrix Selection Singular Value Decomposition (SMSSVD), a parameter-free unsupervised signal decomposition and dimension reduction method, designed to reduce noise adaptively for each low-rank signal in a given data matrix, and to represent the signals in the data in a way that enables unbiased exploratory analysis and reconstruction of multiple overlaid signals, including identifying groups of variables that drive different signals.

Results: The SMSSVD method produces a denoised signal decomposition from a given data matrix. It also guarantees orthogonality between signal components in a straightforward manner, and it is designed to make automation possible. We illustrate SMSSVD by applying it to several real and synthetic datasets and compare its performance to gold standard methods like PCA (Principal Component Analysis) and SPC (Sparse Principal Components, using Lasso constraints). SMSSVD is computationally efficient and, despite being a parameter-free method, in general outperforms existing statistical learning methods.

Availability and implementation: A Julia implementation of SMSSVD is openly available on GitHub (https://github.com/rasmushenningsson/SubMatrixSelectionSVD.jl).

Supplementary information: Supplementary data are available at Bioinformatics online.

type
Contribution to journal
publication status
published
in
Bioinformatics (Oxford, England)
volume
35
issue
3
pages
9 pages
publisher
Oxford University Press
external identifiers
  • scopus:85061118872
ISSN
1367-4803
DOI
10.1093/bioinformatics/bty566
language
English
LU publication?
yes
id
1af440f1-7da6-4bfd-afa2-3695f4f2c376
date added to LUP
2019-02-15 08:28:54
date last changed
2019-03-12 04:21:21
@article{1af440f1-7da6-4bfd-afa2-3695f4f2c376,
  abstract     = {Motivation: High-throughput biomedical measurements normally capture multiple overlaid biologically relevant signals and often also signals representing different types of technical artefacts, such as batch effects. Signal identification and decomposition are accordingly main objectives in statistical biomedical modeling and data analysis. Existing methods for signal reconstruction and deconvolution are, in general, either supervised, contain parameters that need to be estimated, or present other types of ad hoc features. We here introduce SubMatrix Selection Singular Value Decomposition (SMSSVD), a parameter-free unsupervised signal decomposition and dimension reduction method, designed to reduce noise adaptively for each low-rank signal in a given data matrix, and to represent the signals in the data in a way that enables unbiased exploratory analysis and reconstruction of multiple overlaid signals, including identifying groups of variables that drive different signals. Results: The SMSSVD method produces a denoised signal decomposition from a given data matrix. It also guarantees orthogonality between signal components in a straightforward manner, and it is designed to make automation possible. We illustrate SMSSVD by applying it to several real and synthetic datasets and compare its performance to gold standard methods like PCA (Principal Component Analysis) and SPC (Sparse Principal Components, using Lasso constraints). SMSSVD is computationally efficient and, despite being a parameter-free method, in general outperforms existing statistical learning methods. Availability and implementation: A Julia implementation of SMSSVD is openly available on GitHub (https://github.com/rasmushenningsson/SubMatrixSelectionSVD.jl). Supplementary information: Supplementary data are available at Bioinformatics online.},
  author       = {Henningsson, Rasmus and Fontes, Magnus},
  issn         = {1367-4803},
  language     = {eng},
  number       = {3},
  pages        = {478--486},
  publisher    = {Oxford University Press},
  journal      = {Bioinformatics (Oxford, England)},
  title        = {SMSSVD : SubMatrix Selection Singular Value Decomposition},
  url          = {http://dx.doi.org/10.1093/bioinformatics/bty566},
  volume       = {35},
  year         = {2019},
}