Lund University Publications

Computational Inference, Validation, and Analysis of 5’UTR-Leader Sequences of Alleles of Immunoglobulin Heavy Chain Variable Genes

Huang, Yixun; Thörnqvist, Linnea (LU) and Ohlin, Mats (LU) (2021) In Frontiers in Immunology 12.
Abstract

Upstream and downstream sequences of immunoglobulin genes may affect the expression of such genes. However, these sequences are rarely studied or characterized in most studies of immunoglobulin repertoires. Inference from large, rearranged immunoglobulin transcriptome data sets offers an opportunity to define the upstream regions (5’-untranslated regions and leader sequences). We have now established a new data pre-processing procedure to eliminate artifacts caused by a 5’-RACE library generation process, reanalyzed a previously studied data set defining human immunoglobulin heavy chain genes, and identified novel upstream regions, as well as previously identified upstream regions that may have been identified in error. Upstream sequences were also identified for a set of previously uncharacterized germline gene alleles. Several novel upstream region variants were validated, for instance by their segregation to a single haplotype in heterozygotic subjects. SNPs representing several sequence variants were identified from population data. Finally, based on the outcomes of the analysis, we define a set of testable hypotheses with respect to the placement of particular alleles in complex IGHV locus haplotypes, and discuss the evolutionary relatedness of particular heavy chain variable genes based on sequences of their upstream regions.

author
Huang, Yixun; Thörnqvist, Linnea and Ohlin, Mats
publishing date
2021-10
type
Contribution to journal
publication status
published
keywords
5’-untranslated region, adaptive immune receptor repertoire (AIRR), germline gene inference, immunoglobulin germline gene, immunoglobulin heavy chain variable domain, leader sequence
in
Frontiers in Immunology
volume
12
article number
730105
publisher
Frontiers Media S. A.
external identifiers
  • pmid:34671351
  • scopus:85117358263
ISSN
1664-3224
DOI
10.3389/fimmu.2021.730105
language
English
LU publication?
yes
id
8d073b56-39b8-4498-99eb-2833c594ed1c
date added to LUP
2022-03-21 17:11:09
date last changed
2024-08-14 14:35:42
@article{8d073b56-39b8-4498-99eb-2833c594ed1c,
  abstract     = {{Upstream and downstream sequences of immunoglobulin genes may affect the expression of such genes. However, these sequences are rarely studied or characterized in most studies of immunoglobulin repertoires. Inference from large, rearranged immunoglobulin transcriptome data sets offers an opportunity to define the upstream regions (5’-untranslated regions and leader sequences). We have now established a new data pre-processing procedure to eliminate artifacts caused by a 5’-RACE library generation process, reanalyzed a previously studied data set defining human immunoglobulin heavy chain genes, and identified novel upstream regions, as well as previously identified upstream regions that may have been identified in error. Upstream sequences were also identified for a set of previously uncharacterized germline gene alleles. Several novel upstream region variants were validated, for instance by their segregation to a single haplotype in heterozygotic subjects. SNPs representing several sequence variants were identified from population data. Finally, based on the outcomes of the analysis, we define a set of testable hypotheses with respect to the placement of particular alleles in complex IGHV locus haplotypes, and discuss the evolutionary relatedness of particular heavy chain variable genes based on sequences of their upstream regions.}},
  author       = {{Huang, Yixun and Thörnqvist, Linnea and Ohlin, Mats}},
  issn         = {{1664-3224}},
  keywords     = {{5’-untranslated region; adaptive immune receptor repertoire (AIRR); germline gene inference; immunoglobulin germline gene; immunoglobulin heavy chain variable domain; leader sequence}},
  language     = {{eng}},
  month        = {{10}},
  publisher    = {{Frontiers Media S. A.}},
  series       = {{Frontiers in Immunology}},
  title        = {{Computational Inference, Validation, and Analysis of 5’UTR-Leader Sequences of Alleles of Immunoglobulin Heavy Chain Variable Genes}},
  url          = {{http://dx.doi.org/10.3389/fimmu.2021.730105}},
  doi          = {{10.3389/fimmu.2021.730105}},
  volume       = {{12}},
  year         = {{2021}},
}