Improving missing value imputation of microarray data by using spot quality weights

Johansson, Peter; Häkkinen, Jari

(2006) In BMC Bioinformatics 7.

Abstract: Background

Microarray technology has become popular for gene expression profiling, and many analysis tools have been developed for data interpretation. Most of these tools require complete data, but measurement values are often missing. One way to overcome the problem of incomplete data is to impute the missing values before analysis. Many imputation methods have been suggested, some naïve and others more sophisticated, taking correlations in the data into account. However, these methods are binary in the sense that each spot is considered either missing or present. Hence, they depend on a cutoff separating poor spots from good spots. We suggest a different approach in which a continuous spot quality weight is built into the imputation methods, allowing for smooth imputation of all spots to a greater or lesser degree.

Results

We assessed several imputation methods on three data sets containing replicate measurements, and found that weighted methods performed better than non-weighted methods. Of the compared methods, the best performance and robustness were achieved with the weighted nearest neighbours method (WeNNI), in which both spot quality and correlations between genes were included in the imputation.

Conclusion

Including a measure of spot quality improves the accuracy of missing value imputation. WeNNI, the proposed method, is more accurate and less sensitive to parameters than the widely used kNNimpute and LSimpute algorithms.
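
The abstract does not give the WeNNI formulas, so the following is only an illustrative sketch of the general idea it describes: a continuous quality weight in [0, 1] per spot enters both the gene-to-gene distance and the final value, so every spot is imputed to a greater or lesser degree. The function names, the choice of distance, the neighbour weighting, and the blending rule are all assumptions made for this example and should not be read as the authors' implementation.

```python
import math


def weighted_distance(x, wx, y, wy):
    """Quality-weighted root-mean-square difference between two gene profiles.

    Assumption for this sketch: each sample counts in proportion to the
    product of the two spot weights, so poor spots barely influence the
    distance. Returns infinity when no sample carries any weight.
    """
    num = 0.0
    den = 0.0
    for a, wa, b, wb in zip(x, wx, y, wy):
        w = wa * wb
        num += w * (a - b) ** 2
        den += w
    return math.sqrt(num / den) if den > 0 else math.inf


def weighted_nn_impute(values, weights, k=2):
    """Hypothetical weighted nearest-neighbour imputation.

    values[g][s]  : expression value of gene g in sample s. Missing spots
                    must hold a finite placeholder (e.g. 0.0) and weight 0.
    weights[g][s] : spot quality weight in [0, 1].
    Returns a new matrix; the inputs are left unchanged.
    """
    n_genes = len(values)
    result = []
    for g in range(n_genes):
        # Rank all other genes by weighted distance and keep the k closest.
        ranked = sorted(
            (weighted_distance(values[g], weights[g], values[h], weights[h]), h)
            for h in range(n_genes)
            if h != g
        )
        neighbours = [(d, h) for d, h in ranked[:k] if math.isfinite(d)]
        row = []
        for s, (x, w) in enumerate(zip(values[g], weights[g])):
            num = 0.0
            den = 0.0
            for d, h in neighbours:
                # Closer neighbours with better spots contribute more.
                c = weights[h][s] / (d + 1e-9)
                num += c * values[h][s]
                den += c
            if den > 0:
                estimate = num / den
                # Smooth blend: a good spot keeps its own value, a poor
                # spot is pulled towards the neighbour estimate.
                row.append(w * x + (1.0 - w) * estimate)
            else:
                row.append(x)  # no usable neighbour for this sample
        result.append(row)
    return result


if __name__ == "__main__":
    expr = [[1.0, 2.0, 3.0], [1.0, 2.0, 0.0], [5.0, 5.0, 5.0]]
    qual = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
    print(weighted_nn_impute(expr, qual, k=1))
```

With binary weights (0 or 1) this sketch reduces to an ordinary nearest-neighbour imputation of the missing spots only; intermediate weights are what make the imputation continuous rather than cutoff-based.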

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/796425

author

Johansson, Peter (LU) and Häkkinen, Jari (LU)

organization

Computational Biology and Biological Physics

publishing date

2006

type

Contribution to journal

publication status

published

subject

Bioinformatics and Computational Biology

in

BMC Bioinformatics

volume

7

article number

306

publisher

BioMed Central (BMC)

external identifiers

pmid:16780582
wos:000239518800002
scopus:33746974277

ISSN

1471-2105

DOI

10.1186/1471-2105-7-306

language

English

LU publication?

yes

id

19c685d1-3da8-4f71-b6f1-aa4fef779639 (old id 796425)

date added to LUP

2016-04-04 09:12:05

date last changed

2025-10-14 09:09:38

@article{19c685d1-3da8-4f71-b6f1-aa4fef779639,
  abstract     = {{Background: Microarray technology has become popular for gene expression profiling, and many analysis tools have been developed for data interpretation. Most of these tools require complete data, but measurement values are often missing. One way to overcome the problem of incomplete data is to impute the missing values before analysis. Many imputation methods have been suggested, some naïve and others more sophisticated, taking correlations in the data into account. However, these methods are binary in the sense that each spot is considered either missing or present. Hence, they depend on a cutoff separating poor spots from good spots. We suggest a different approach in which a continuous spot quality weight is built into the imputation methods, allowing for smooth imputation of all spots to a greater or lesser degree.
Results: We assessed several imputation methods on three data sets containing replicate measurements, and found that weighted methods performed better than non-weighted methods. Of the compared methods, the best performance and robustness were achieved with the weighted nearest neighbours method (WeNNI), in which both spot quality and correlations between genes were included in the imputation.
Conclusion: Including a measure of spot quality improves the accuracy of missing value imputation. WeNNI, the proposed method, is more accurate and less sensitive to parameters than the widely used kNNimpute and LSimpute algorithms.}},
  author       = {{Johansson, Peter and Häkkinen, Jari}},
  issn         = {{1471-2105}},
  language     = {{eng}},
  publisher    = {{BioMed Central (BMC)}},
  series       = {{BMC Bioinformatics}},
  title        = {{Improving missing value imputation of microarray data by using spot quality weights}},
  url          = {{http://dx.doi.org/10.1186/1471-2105-7-306}},
  doi          = {{10.1186/1471-2105-7-306}},
  volume       = {{7}},
  year         = {{2006}},
}

Lund University Publications