Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method

Bengtsson, Henrik; Hössjer, Ola

Bengtsson, Henrik and Hössjer, Ola (2006) In BMC Bioinformatics 7.

Abstract:

Background: Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general.

Results: A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are revisited in the light of the affine model and their strengths and weaknesses are investigated in this context. As a direct result from this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method.

Conclusion: We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, which is a platform-independent package for R.
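The claim that an affine model can produce non-linear, intensity-dependent log-ratios can be illustrated with a small sketch. It is not the authors' method and does not use the aroma package: the offsets and scales below are hypothetical values chosen for illustration, the helper names `affine` and `log_ratio` are invented here, and the correction step assumes the affine parameters are already known, whereas the paper proposes estimating them robustly from the data.

```python
import math

def affine(x, offset, scale):
    """Observed signal under an affine model: offset + scale * true signal."""
    return offset + scale * x

def log_ratio(red, green):
    """Return (M, A): the log2-ratio and the mean log2-intensity of two channels."""
    m = math.log2(red / green)
    a = 0.5 * math.log2(red * green)
    return m, a

# Hypothetical channel parameters (illustration only, not taken from the paper).
OFFSET_R, SCALE_R = 200.0, 1.0
OFFSET_G, SCALE_G = 20.0, 1.0

# Non-differentially expressed genes: the true signal is the same in both channels.
true_signals = [2.0 ** k for k in range(4, 17, 2)]

# Observed log-ratios: the unequal offsets bend M as a function of A.
observed = [
    log_ratio(affine(x, OFFSET_R, SCALE_R), affine(x, OFFSET_G, SCALE_G))
    for x in true_signals
]

# Undoing the (here, known) affine transformation removes the curvature.
corrected = [
    log_ratio(
        (affine(x, OFFSET_R, SCALE_R) - OFFSET_R) / SCALE_R,
        (affine(x, OFFSET_G, SCALE_G) - OFFSET_G) / SCALE_G,
    )
    for x in true_signals
]

for (m_obs, a_obs), (m_cor, _) in zip(observed, corrected):
    print(f"A = {a_obs:6.2f}   observed M = {m_obs:6.3f}   corrected M = {m_cor:6.3f}")
```

With these values the observed log-ratio is large at low intensities, where the offsets dominate, and shrinks toward zero at high intensities, even though no gene is differentially expressed. Once the offsets are removed, the log-ratios are zero at every intensity.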

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/908525

author

Bengtsson, Henrik and Hössjer, Ola

organization

Mathematical Statistics

publishing date

2006

type

Contribution to journal

publication status

published

subject

Bioinformatics and Computational Biology

in

BMC Bioinformatics

volume

7

publisher

BioMed Central (BMC)

external identifiers

wos:000239735500001
pmid:16509971
scopus:33746977174

ISSN

1471-2105

DOI

10.1186/1471-2105-7-100

language

English

LU publication?

yes

id

94b23bf5-525d-4c4d-8d6e-5122b2be3579 (old id 908525)

date added to LUP

2016-04-01 16:23:45

date last changed

2025-10-14 12:01:27

@article{94b23bf5-525d-4c4d-8d6e-5122b2be3579,
  abstract     = {{Background: Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general. Results: A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are revisited in the light of the affine model and their strengths and weaknesses are investigated in this context. As a direct result from this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method. Conclusion: We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. 
All methods are made available in the aroma package, which is a platform-independent package for R.}},
  author       = {{Bengtsson, Henrik and Hössjer, Ola}},
  issn         = {{1471-2105}},
  language     = {{eng}},
  publisher    = {{BioMed Central (BMC)}},
  journal      = {{BMC Bioinformatics}},
  title        = {{Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method}},
  url          = {{http://dx.doi.org/10.1186/1471-2105-7-100}},
  doi          = {{10.1186/1471-2105-7-100}},
  volume       = {{7}},
  year         = {{2006}},
}

Lund University Publications