Lund University Publications

Novel risk genes for systemic lupus erythematosus predicted by random forest classification

Almlöf, Jonas Carlsson ; Alexsson, Andrei ; Imgenberg-Kreuz, Juliana ; Sylwan, Lina ; Bäcklin, Christofer ; Leonard, Dag ; Nordmark, Gunnel ; Tandre, Karolina ; Eloranta, Maija-Leena ; Padyukov, Leonid, et al. (2017). In Scientific Reports 7(1).
Abstract

Genome-wide association studies have identified risk loci for SLE, but a large proportion of the genetic contribution to SLE still remains unexplained. To detect novel risk genes, and to predict an individual's SLE risk we designed a random forest classifier using SNP genotype data generated on the "Immunochip" from 1,160 patients with SLE and 2,711 controls. Using gene importance scores defined by the random forest classifier, we identified 15 potential novel risk genes for SLE. Of them 12 are associated with other autoimmune diseases than SLE, whereas three genes (ZNF804A, CDK1, and MANF) have not previously been associated with autoimmunity. Random forest classification also allowed prediction of patients at risk for lupus nephritis with an area under the curve of 0.94. By allele-specific gene expression analysis we detected cis-regulatory SNPs that affect the expression levels of six of the top 40 genes designed by the random forest analysis, indicating a regulatory role for the identified risk variants. The 40 top genes from the prediction were overrepresented for differential expression in B and T cells according to RNA-sequencing of samples from five healthy donors, with more frequent over-expression in B cells compared to T cells.

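author
Almlöf, Jonas Carlsson ; Alexsson, Andrei ; Imgenberg-Kreuz, Juliana ; Sylwan, Lina ; Bäcklin, Christofer ; Leonard, Dag ; Nordmark, Gunnel ; Tandre, Karolina ; Eloranta, Maija-Leena ; Padyukov, Leonid ; Bengtsson, Christine ; Jönsen, Andreas ; Dahlqvist, Solbritt Rantapää ; Sjöwall, Christopher ; Bengtsson, Anders A. ; Gunnarsson, Iva ; Svenungsson, Elisabet ; Rönnblom, Lars ; Sandling, Johanna K. and Syvänen, Ann-Christine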
publishing date
2017
type
Contribution to journal
publication status
published
in
Scientific Reports
volume
7
issue
1
article number
6236
publisher
Nature Publishing Group
external identifiers
  • pmid:28740209
  • wos:000406260100040
  • scopus:85025823732
ISSN
2045-2322
DOI
10.1038/s41598-017-06516-1
language
English
LU publication?
yes
id
bd995694-7e1b-4159-a59e-0b517a30be71
date added to LUP
2017-08-02 07:19:42
date last changed
2024-04-14 15:50:37
@article{bd995694-7e1b-4159-a59e-0b517a30be71,
  abstract     = {{Genome-wide association studies have identified risk loci for SLE, but a large proportion of the genetic contribution to SLE still remains unexplained. To detect novel risk genes, and to predict an individual's SLE risk we designed a random forest classifier using SNP genotype data generated on the "Immunochip" from 1,160 patients with SLE and 2,711 controls. Using gene importance scores defined by the random forest classifier, we identified 15 potential novel risk genes for SLE. Of them 12 are associated with other autoimmune diseases than SLE, whereas three genes (ZNF804A, CDK1, and MANF) have not previously been associated with autoimmunity. Random forest classification also allowed prediction of patients at risk for lupus nephritis with an area under the curve of 0.94. By allele-specific gene expression analysis we detected cis-regulatory SNPs that affect the expression levels of six of the top 40 genes designed by the random forest analysis, indicating a regulatory role for the identified risk variants. The 40 top genes from the prediction were overrepresented for differential expression in B and T cells according to RNA-sequencing of samples from five healthy donors, with more frequent over-expression in B cells compared to T cells.}},
  author       = {{Almlöf, Jonas Carlsson and Alexsson, Andrei and Imgenberg-Kreuz, Juliana and Sylwan, Lina and Bäcklin, Christofer and Leonard, Dag and Nordmark, Gunnel and Tandre, Karolina and Eloranta, Maija-Leena and Padyukov, Leonid and Bengtsson, Christine and Jönsen, Andreas and Dahlqvist, Solbritt Rantapää and Sjöwall, Christopher and Bengtsson, Anders A. and Gunnarsson, Iva and Svenungsson, Elisabet and Rönnblom, Lars and Sandling, Johanna K. and Syvänen, Ann-Christine}},
  issn         = {{2045-2322}},
  language     = {{eng}},
  month        = {{12}},
  number       = {{1}},
  publisher    = {{Nature Publishing Group}},
  journal      = {{Scientific Reports}},
  title        = {{Novel risk genes for systemic lupus erythematosus predicted by random forest classification}},
  url          = {{http://dx.doi.org/10.1038/s41598-017-06516-1}},
  doi          = {{10.1038/s41598-017-06516-1}},
  volume       = {{7}},
  year         = {{2017}},
}