LUP Student Papers

LUND UNIVERSITY LIBRARIES

Computational inference and analysis of 5’UTR-leader sequence of alleles of immunoglobulin heavy chain genes

Huang, Yixun (2020) BINP50 20201
Degree Projects in Bioinformatics
Popular Abstract
What do the IGHV upstream sequences look like?

An antibody, also called an immunoglobulin (Ig), can identify and bind specifically to antigens such as bacteria, viruses, and other disease-causing organisms. Antibody diversity is achieved by selecting and recombining gene segments from the germline genome. Each antibody contains two identical heavy chains and two identical light chains. In this study I investigated the 5′ untranslated region (5’UTR) and leader sequences located upstream of the immunoglobulin heavy chain variable (IGHV) genes. These sequences do not play a direct role in an antibody's specificity for its antigen, but their diversity may affect the level at which the antibody is produced.
Knowledge of them is incomplete because they are less commonly examined.

So how can a dataset of 5’UTR-leader sequences be determined? Here, I developed a new pipeline for immunoglobulin 5’UTR-leader sequence analysis. First, I optimized the pre-processing approach to improve the accuracy of the 5’UTR-leader germline dataset. Next, I inferred the 5’UTR-leader sequences of each gene and compiled them into a dataset. The evidence from my study calls into question some sequences in the global reference database. Moreover, I tested the quality of the inferred 5’UTR-leader sequences for two example alleles using two analyses: haplotype inference and complementarity-determining region 3 (CDR3) length distribution analysis. Both analyses have been commonly used in previous studies to explore immunoglobulin genes. The results of both supported the genetic variation of the inferred 5’UTR-leader sequences.

Given the importance of antibodies for our survival in a hostile environment populated by microbes and viruses, it is important to understand how antibody production is regulated and controlled. This study provides a validated view of the diversity of sequences outside the genes that encode the final antibodies. Because such regions are important for antibody production, this work will allow us to develop our understanding of this genetic diversity and its role in shaping the set of antibodies that protect us all from disease.

Master’s Degree Project in Bioinformatics 30 credits 2020
Department of Biology, Lund University

Advisor: Mats Ohlin
Department of Immunotechnology, Lund University
author
Huang, Yixun
supervisor
organization
course
BINP50 20201
year
type
H2 - Master's Degree (Two Years)
subject
language
English
id
9039919
date added to LUP
2021-02-08 16:15:47
date last changed
2021-02-08 16:15:47
@misc{9039919,
  author       = {{Huang, Yixun}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Computational inference and analysis of 5’UTR-leader sequence of alleles of immunoglobulin heavy chain genes}},
  year         = {{2020}},
}