
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Automated inference of cell type hierarchies for single-cell RNA-seq data

Li, Yuan (2020) BINP50 20201
Degree Projects in Bioinformatics
Abstract
One of the oldest yet still unanswered questions in single-cell RNA sequencing (scRNA-seq) is how to define a cell type transcriptomically, and the question becomes more relevant as the amount of data grows. It has been recommended that cells be grouped hierarchically rather than as mutually exclusive cell types, so that the grouping can reflect cell developmental trajectories, show within-cell-type gene expression variation, and facilitate cell type comparison across species. Here, I have developed an algorithm for inferring a cell type hierarchy directly from scRNA-seq data. In brief, the algorithm applies Leiden clustering to split the data recursively, constructing a hierarchy in a “top-down” style. To decide the depth and breadth of the hierarchy, the self-projection confusion rate is used as a key parameter for measuring cluster separability at each layer. The algorithm has been tested on a dataset with a manually annotated cell type hierarchy and shows reasonable performance.
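The top-down procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation and rests on loud assumptions: Leiden clustering is swapped for a toy cut at the widest gap in 1-D values, the self-projection confusion rate is approximated as the hold-out error of a nearest-centroid classifier, and the names `split_at_widest_gap`, `confusion_rate`, `build_hierarchy`, `max_confusion` and `min_size` are all hypothetical.

```python
from statistics import fmean


def split_at_widest_gap(values):
    """Toy stand-in for Leiden clustering (assumption): cut sorted 1-D values at the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return [ordered[:cut], ordered[cut:]]


def confusion_rate(clusters):
    """Toy self-projection (assumption): fit one centroid per cluster on the
    even-indexed cells, classify the odd-indexed cells by nearest centroid,
    and return the fraction assigned to the wrong cluster."""
    centroids = [fmean(cluster[0::2]) for cluster in clusters]
    wrong = total = 0
    for k, cluster in enumerate(clusters):
        for value in cluster[1::2]:
            predicted = min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))
            wrong += int(predicted != k)
            total += 1
    return wrong / total if total else 1.0


def build_hierarchy(values, max_confusion=0.1, min_size=4):
    """Recursively split a population; keep a split only if its clusters are separable."""
    node = {"cells": sorted(values), "children": []}
    if len(values) < max(min_size, 2):
        return node  # too few cells to attempt a split
    clusters = split_at_widest_gap(values)
    if confusion_rate(clusters) > max_confusion:
        return node  # clusters are confused with each other: stay a leaf
    node["children"] = [build_hierarchy(c, max_confusion, min_size) for c in clusters]
    return node


def leaf_groups(node):
    """Collect the cell groups at the leaves, left to right."""
    if not node["children"]:
        return [node["cells"]]
    return [group for child in node["children"] for group in leaf_groups(child)]


# Three synthetic populations: one far from the rest, two closer to each other.
cells = list(range(0, 10)) + list(range(100, 110)) + list(range(150, 160))
tree = build_hierarchy(cells)
print([len(group) for group in leaf_groups(tree)])  # prints [10, 10, 10]
```

In this toy run the first split separates the distant population, the second split divides the remaining two, and every further split is rejected because its confusion rate (0.25) exceeds the hypothetical threshold of 0.1, so the threshold controls both the depth and the breadth of the resulting tree.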
Popular Abstract
Cell, tell me who you really are?

The cell is the building block of life, and a multicellular organism such as a human contains an enormous number of cells. Scientists have never stopped trying to understand these building blocks. Perhaps the first step in studying cells is to sort out “who” they are, that is, what cell types they should be called.

A cell type commonly refers to a group of cells with shared characteristics and is traditionally identified by morphology, location, function and so on. These traditional methods of cell type identification are, however, low-throughput and laborious. Fast-developing single-cell RNA-sequencing (scRNA-seq) technology has greatly sped up this process, because a large number of single cells can now be easily profiled transcriptomically and classified into groups (i.e. transcriptomic cell types) on the basis of their genome-wide gene expression profiles.

It has been recommended that cells be grouped hierarchically rather than as mutually exclusive cell types. There are several good reasons for doing so. First, a hierarchy can simultaneously include cell types from both coarse and fine divisions, which may mirror different stages of cell development. Second, within-cell-type gene expression variation can be presented as subgroups in the hierarchy, possibly reflecting different physiological conditions. Last but not least, when cell types are compared across species, their similarities and differences can be shown simultaneously at different levels of the hierarchy.
Automated inference of cell type hierarchies from scRNA-seq data
Here, in my master’s degree project, I have developed an algorithm for inferring a cell type hierarchy directly from scRNA-seq data. In brief, the algorithm applies a high-performance graph clustering method to recursively split a cell population into a series of nested transcriptomic cell types, which can be arranged into a hierarchy. The algorithm is completely automated and can operate largely de novo (i.e. without any prior knowledge about the studied cells). It has been tested on a dataset with an expert-annotated cell type hierarchy and showed reasonable performance.



Master’s Degree Project in Bioinformatics 30 credits 2020
Department of Biology, Lund University

Advisor: Konstantin Khodosevich
Advisor's affiliation: Biotech Research & Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
author: Li, Yuan
course: BINP50 20201
year: 2020
type: H2 - Master's Degree (Two Years)
language: English
id: 9029666
date added to LUP: 2020-09-21 13:52:30
date last changed: 2020-09-22 09:18:57
@misc{9029666,
  abstract     = {{One of the oldest yet still unanswered questions in single-cell RNA sequencing (scRNA-seq) is how to define a cell type transcriptomically, and the question becomes more relevant as the amount of data grows. It has been recommended that cells be grouped hierarchically rather than as mutually exclusive cell types, so that the grouping can reflect cell developmental trajectories, show within-cell-type gene expression variation, and facilitate cell type comparison across species. Here, I have developed an algorithm for inferring a cell type hierarchy directly from scRNA-seq data. In brief, the algorithm applies Leiden clustering to split the data recursively, constructing a hierarchy in a “top-down” style. To decide the depth and breadth of the hierarchy, the self-projection confusion rate is used as a key parameter for measuring cluster separability at each layer. The algorithm has been tested on a dataset with a manually annotated cell type hierarchy and shows reasonable performance.}},
  author       = {{Li, Yuan}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Automated inference of cell type hierarchies for single-cell RNA-seq data}},
  year         = {{2020}},
}