Batch adjustment by reference alignment (BARA): Improved prediction performance in biological test sets with batch effects

Gradin, Robin; Lindstedt, Malin and Johansson, Henrik (2019) In PLoS ONE 14(2).
Abstract

Many biological data acquisition platforms suffer from inadvertent inclusion of biologically irrelevant variance in analyzed data, collectively termed batch effects. Batch effects can lead to difficulties in downstream analysis by lowering the power to detect biologically interesting differences and can in certain instances lead to false discoveries. They are especially troublesome in predictive modelling where samples in training sets and test sets are often completely correlated with batches. In this article, we present BARA, a normalization method for adjusting batch effects in predictive modelling. BARA utilizes a few reference samples to adjust for batch effects in a compressed data space spanned by the training set. We evaluate BARA using a collection of publicly available datasets and three different prediction models, and compare its performance to already existing methods developed for similar purposes. The results show that data normalized with BARA generates high and consistent prediction performances. Further, they suggest that BARA produces reliable performances independent of the examined classifiers. We therefore conclude that BARA has great potential to facilitate the development of predictive assays where test sets and training sets are correlated with batch.

type
Contribution to journal
publication status
published
in
PLoS ONE
volume
14
issue
2
publisher
Public Library of Science
external identifiers
  • scopus:85061975258
ISSN
1932-6203
DOI
10.1371/journal.pone.0212669
language
English
LU publication?
yes
id
3f673bb9-450b-4808-9b68-16da6a9fa06b
date added to LUP
2019-03-06 13:00:26
date last changed
2019-06-06 03:00:14
@article{3f673bb9-450b-4808-9b68-16da6a9fa06b,
  abstract     = {Many biological data acquisition platforms suffer from inadvertent inclusion of biologically irrelevant variance in analyzed data, collectively termed batch effects. Batch effects can lead to difficulties in downstream analysis by lowering the power to detect biologically interesting differences and can in certain instances lead to false discoveries. They are especially troublesome in predictive modelling where samples in training sets and test sets are often completely correlated with batches. In this article, we present BARA, a normalization method for adjusting batch effects in predictive modelling. BARA utilizes a few reference samples to adjust for batch effects in a compressed data space spanned by the training set. We evaluate BARA using a collection of publicly available datasets and three different prediction models, and compare its performance to already existing methods developed for similar purposes. The results show that data normalized with BARA generates high and consistent prediction performances. Further, they suggest that BARA produces reliable performances independent of the examined classifiers. We therefore conclude that BARA has great potential to facilitate the development of predictive assays where test sets and training sets are correlated with batch.},
  articleno    = {e0212669},
  author       = {Gradin, Robin and Lindstedt, Malin and Johansson, Henrik},
  issn         = {1932-6203},
  language     = {eng},
  month        = {02},
  number       = {2},
  publisher    = {Public Library of Science},
  journal      = {PLoS ONE},
  title        = {Batch adjustment by reference alignment (BARA): Improved prediction performance in biological test sets with batch effects},
  url          = {http://dx.doi.org/10.1371/journal.pone.0212669},
  volume       = {14},
  year         = {2019},
}