Automated inference of cell type hierarchies for single-cell RNA-seq data
(2020) BINP50 20201
Degree Projects in Bioinformatics
- Abstract
- One of the oldest unanswered questions in single-cell RNA sequencing (scRNA-seq) is how to define a cell type transcriptomically, and it is becoming more relevant as the amount of data grows. It has been recommended that cells be grouped hierarchically rather than as mutually exclusive cell types, so that the grouping can reflect cell developmental trajectories, show within-cell-type variation in gene expression, and facilitate cell type comparison across species. Here, I have developed an algorithm for inferring a cell type hierarchy directly from scRNA-seq data. In brief, the algorithm applies Leiden clustering to split the data recursively, constructing the hierarchy in a “top-down” style. To decide the depth and breadth of the hierarchy, the self-projection confusion rate is used as a key parameter for measuring cluster separability at each layer. The algorithm has been tested on a dataset with a manually annotated cell type hierarchy and shows reasonable performance.
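The top-down procedure described in the abstract can be sketched as a short recursion. This is only an illustrative sketch under stated assumptions, not the thesis implementation: the names `confusion_rate`, `build_hierarchy`, `cluster_fn`, `project_fn`, `max_confusion` and `min_cells` are hypothetical, the confusion rate is assumed to be the fraction of cells whose projected label disagrees with their cluster label, a split at the mean stands in for Leiden clustering, and a leave-one-out nearest-neighbour lookup stands in for self projection. Cells are reduced to single integers to keep the toy data easy to follow.

```python
from typing import Callable, Dict, List, Sequence

ClusterFn = Callable[[List[int]], List[int]]
ProjectFn = Callable[[List[int], List[int]], List[int]]


def confusion_rate(cluster_labels: Sequence[int], projected_labels: Sequence[int]) -> float:
    """Assumed definition: fraction of cells whose projected label differs from the cluster label."""
    if len(cluster_labels) != len(projected_labels):
        raise ValueError("label sequences must have the same length")
    if not cluster_labels:
        return 0.0
    mismatches = sum(c != p for c, p in zip(cluster_labels, projected_labels))
    return mismatches / len(cluster_labels)


def build_hierarchy(cells: List[int], cluster_fn: ClusterFn, project_fn: ProjectFn,
                    max_confusion: float = 0.2, min_cells: int = 4) -> Dict[str, list]:
    """Split `cells` recursively; a node stays a leaf when its split is not separable."""
    node: Dict[str, list] = {"cells": list(cells), "children": []}
    if len(cells) < min_cells:
        return node  # too few cells to attempt a split
    labels = cluster_fn(cells)
    if len(set(labels)) < 2:
        return node  # clustering found a single group
    projected = project_fn(cells, labels)
    if confusion_rate(labels, projected) > max_confusion:
        return node  # clusters are too easily confused, so reject the split
    for label in sorted(set(labels)):
        subset = [c for c, l in zip(cells, labels) if l == label]
        node["children"].append(
            build_hierarchy(subset, cluster_fn, project_fn, max_confusion, min_cells))
    return node


def split_at_mean(cells: List[int]) -> List[int]:
    """Toy stand-in for Leiden clustering: two groups, below and above the mean."""
    mean = sum(cells) / len(cells)
    return [0 if c < mean else 1 for c in cells]


def nearest_neighbour_projection(cells: List[int], labels: List[int]) -> List[int]:
    """Toy stand-in for self projection: each cell takes the label of its nearest other cell."""
    projected = []
    for i, cell in enumerate(cells):
        others = [k for k in range(len(cells)) if k != i]
        nearest = min(others, key=lambda k: abs(cells[k] - cell))
        projected.append(labels[nearest])
    return projected


# Two well-separated groups, each internally homogeneous.
toy_cells = [0, 3, 7, 8, 12, 15, 100, 103, 107, 108, 112, 115]
tree = build_hierarchy(toy_cells, split_at_mean, nearest_neighbour_projection)
print([child["cells"] for child in tree["children"]])
```

On this toy input the first split is accepted because no cell is confused, while each of the two resulting groups remains a leaf: splitting either one again would confuse 2 of its 6 cells, which exceeds the assumed threshold of 0.2. Passing the clustering and projection steps in as functions keeps the stopping rule separate from whichever clustering and classification methods are actually used.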
- Popular Abstract
- Cell, tell me who you really are?
The cell is the building block of life, and a multicellular organism such as a human contains an enormous number of cells. Scientists have never stopped trying to understand these building blocks. Perhaps the first step in studying cells is to sort out “who” they are, that is, what cell types they should be called.
A cell type commonly refers to cells with shared characteristics and is traditionally identified by morphology, location, function and so on. These traditional approaches to cell type identification are, however, low-throughput and laborious. Fast-developing single-cell RNA-sequencing (scRNA-seq) technology has greatly sped up this process, because large numbers of single cells can now easily be profiled transcriptomically and classified into groups (i.e. transcriptomic cell types) on the basis of their genome-wide gene expression profiles.
It has been recommended that cells be grouped hierarchically rather than as mutually exclusive cell types. There are several good reasons for doing so. First, a hierarchy can simultaneously include cell types from both coarse and fine divisions, which may mirror different stages of cell development. Second, within-cell-type variation in gene expression can be presented as subgroups in the hierarchy, possibly reflecting different physiological conditions. Last but not least, when cell types are compared across species, their similarities and differences can be shown at different levels of the hierarchy at the same time.
Automated inference of cell type hierarchies from scRNA-seq data
Here, in my master’s degree project, I have developed an algorithm for inferring a cell type hierarchy directly from scRNA-seq data. In brief, the algorithm applies a high-performance graph clustering method to recursively split a cell population into a series of nested transcriptomic cell types, which can be arranged into a hierarchy. The algorithm is completely automated and can run largely de novo (i.e. it does not require any prior knowledge about the studied cells). It has been tested on a dataset with an expert-annotated cell type hierarchy and showed reasonable performance.
Master’s Degree Project in Bioinformatics 30 credits 2020
Department of Biology, Lund University
Advisor: Konstantin Khodosevich
Advisors Biotech Research & Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9029666
@misc{9029666,
  abstract = {{One of the oldest, but still not answered questions in single-cell RNA-sequencing (scRNA-seq) is how to define a cell type transcriptomically, and this has been becoming more and more relevant as the amount of data grows. It has been recommended that cells should be grouped hierarchically rather than as cell types that are exclusively related to each other, so that it may be able to reflect cell developmental trajectory, show within-cell type gene expression variation, and facilitate cell type comparison across species. Here, I have developed an algorithm for inferring cell type hierarchy directly from scRNA-seq data. In brief, the algorithm applies Leiden clustering to split the data recursively for constructing a hierarchy in a “top-down” style. In order to decide the depth and breadth of the hierarchy, self projection confusion rate is used as a key parameter for measuring cluster separability at each layer. The developed algorithm has been tested on a dataset with manually annotated cell type hierarchy and shows reasonable performance.}},
  author = {{Li, Yuan}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Automated inference of cell type hierarchies for single-cell RNA-seq data}},
  year = {{2020}},
}