Regression analysis and modelling of data acquisition for SELDI-TOF mass spectrometry.

Sköld, Martin; Rydén, Tobias; Samuelsson, Viktoria; Welinder, Charlotte; Ekblad, Lars; Olsson, Håkan; Baldetorp, Bo

Sköld, Martin (LU); Rydén, Tobias (LU); Samuelsson, Viktoria; Welinder, Charlotte (LU); Ekblad, Lars (LU); Olsson, Håkan (LU) and Baldetorp, Bo (LU) (2007) In Bioinformatics 23(11). p.1401-1409

Abstract: Motivation: Pre-processing of SELDI-TOF mass spectrometry data is currently performed on a largely ad hoc basis. This makes comparison of results from independent analyses troublesome and does not provide a framework for distinguishing different sources of variation in data. Results: In this article, we consider the task of pooling a large number of single-shot spectra, a task commonly performed automatically by the instrument software. By viewing the underlying statistical problem as one of heteroscedastic linear regression, we provide a framework for introducing robust methods and for dealing with missing data resulting from a limited span of recordable intensity values provided by the instrument. Our framework provides an interpretation of currently used methods as a maximum-likelihood estimator and allows theoretical derivation of its variance. We observe that this variance depends crucially on the total number of ionic species, which can vary considerably between different pooled spectra. This variation in variance can potentially invalidate the results from naive methods of discrimination/classification, and we outline appropriate data transformations.
Introducing methods from robust statistics did not improve the standard errors of the pooled samples. Imputing missing values using the EM algorithm, however, had a notable effect on the result; for our data, the pooled height of peaks which were frequently truncated increased by up to 30%.
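
The effect of truncation on a pooled peak height can be illustrated with a small sketch. This is not the authors' implementation: the article works with a heteroscedastic regression model, whereas the code below assumes, purely for illustration, that single-shot intensities at one m/z position are independent normal draws recorded only up to a known ceiling. The function names `em_truncated_mean` and `_hazard` and the parameter `ceiling` are hypothetical.

```python
import math


def _hazard(alpha):
    """Standard normal density at alpha divided by P(Z > alpha)."""
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(alpha / math.sqrt(2.0))
    return pdf / tail


def em_truncated_mean(values, ceiling, n_iter=200):
    """EM estimate of (mu, sigma) for normal data clipped at `ceiling`.

    Assumption (not from the article): every value >= ceiling is a
    truncated recording of a larger, unobserved intensity.
    """
    n = len(values)
    observed = [v for v in values if v < ceiling]
    n_clipped = n - len(observed)

    # Start from the naive estimates that ignore truncation.
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n) or 1.0
    if n_clipped == 0:
        return mu, sigma

    sum_obs = sum(observed)
    sumsq_obs = sum(v * v for v in observed)
    for _ in range(n_iter):
        # E-step: conditional moments of an intensity known to exceed the ceiling.
        alpha = (ceiling - mu) / sigma
        lam = _hazard(alpha)
        m1 = mu + sigma * lam
        var = sigma * sigma * (1.0 + alpha * lam - lam * lam)
        m2 = var + m1 * m1
        # M-step: update the parameters from the completed sufficient statistics.
        mu = (sum_obs + n_clipped * m1) / n
        sigma = math.sqrt(max((sumsq_obs + n_clipped * m2) / n - mu * mu, 1e-12))
    return mu, sigma
```

Under these assumptions the plain average of the clipped values underestimates the true mean, and the EM estimate corrects for it, which is the direction of the effect the abstract reports for frequently truncated peaks.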

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/166222

author

Sköld, Martin (LU); Rydén, Tobias (LU); Samuelsson, Viktoria; Welinder, Charlotte (LU); Ekblad, Lars (LU); Olsson, Håkan (LU) and Baldetorp, Bo (LU)

publishing date

2007

type

Contribution to journal

publication status

published

subject

Bioinformatics and Systems Biology

in

Bioinformatics

volume

23

issue

11

pages

1401 - 1409

publisher

Oxford University Press

external identifiers

wos:000247781300013
scopus:34447310845

ISSN

1367-4803

DOI

10.1093/bioinformatics/btm104

language

English

LU publication?

yes

id

78f285a9-4bb0-40e5-9664-edbbf6527e83 (old id 166222)

date added to LUP

2016-04-01 16:45:09

date last changed

2022-01-28 21:50:32

@article{78f285a9-4bb0-40e5-9664-edbbf6527e83,
  abstract     = {{Motivation: Pre-processing of SELDI-TOF mass spectrometry data is currently performed on a largely ad hoc basis. This makes comparison of results from independent analyses troublesome and does not provide a framework for distinguishing different sources of variation in data. Results: In this article, we consider the task of pooling a large number of single-shot spectra, a task commonly performed automatically by the instrument software. By viewing the underlying statistical problem as one of heteroscedastic linear regression, we provide a framework for introducing robust methods and for dealing with missing data resulting from a limited span of recordable intensity values provided by the instrument. Our framework provides an interpretation of currently used methods as a maximum-likelihood estimator and allows theoretical derivation of its variance. We observe that this variance depends crucially on the total number of ionic species, which can vary considerably between different pooled spectra. This variation in variance can potentially invalidate the results from naive methods of discrimination/classification, and we outline appropriate data transformations. Introducing methods from robust statistics did not improve the standard errors of the pooled samples. Imputing missing values using the EM algorithm, however, had a notable effect on the result; for our data, the pooled height of peaks which were frequently truncated increased by up to 30%.}},
  author       = {{Sköld, Martin and Rydén, Tobias and Samuelsson, Viktoria and Welinder, Charlotte and Ekblad, Lars and Olsson, Håkan and Baldetorp, Bo}},
  issn         = {{1367-4803}},
  language     = {{eng}},
  number       = {{11}},
  pages        = {{1401--1409}},
  publisher    = {{Oxford University Press}},
  journal      = {{Bioinformatics}},
  title        = {{Regression analysis and modelling of data acquisition for SELDI-TOF mass spectrometry.}},
  url          = {{http://dx.doi.org/10.1093/bioinformatics/btm104}},
  doi          = {{10.1093/bioinformatics/btm104}},
  volume       = {{23}},
  year         = {{2007}},
}
