LUP Student Papers

LUND UNIVERSITY LIBRARIES

Cancer Driver Gene Detection using Deep Convolutional Neural Networks on H3K4me3 Enrichment Profiles

Pielies Avellí, Marc LU (2022) FYTM03 20212
Department of Astronomy and Theoretical Physics - Undergoing reorganization
Abstract
Despite our current knowledge of the biology underlying cancer genesis, reliable methods for discovering cancer driver (CD) genes are still greatly needed. The relatively recent incorporation of epigenetic markers into the cancer paradigm has nevertheless opened the door to new computational approaches to the problem.

This work aims to study the enrichment of certain genome regions with the histone post-translational modification (PTM) H3K4me3. This epigenetic marker can be used to distinguish cancer driver genes from neutral genes (NGs). To this end, a convolutional neural network (CNN) comparing H3K4me3 enrichment profiles for matched healthy and cancer samples is proposed and evaluated. The results obtained for OriGENE, the presented model, show promise in both pan-cancer and tissue-specific cancer driver gene detection.
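
As a rough illustration of what comparing paired enrichment profiles with a convolutional layer can look like, the following minimal NumPy sketch stacks a healthy and a cancer profile as two input channels and slides a single filter over them. It is not the OriGENE architecture: the profile values, the filter weights, the pooling step and the function names `conv1d_valid` and `score_gene` are all assumptions made for illustration only.

```python
import numpy as np

def conv1d_valid(x, w, b=0.0):
    """Valid-mode 1D cross-correlation of a multi-channel signal with one filter.

    x: array of shape (channels, length), e.g. stacked enrichment profiles
    w: array of shape (channels, kernel_size), the filter weights
    Returns an array of shape (length - kernel_size + 1,).
    """
    channels, length = x.shape
    w_channels, k = w.shape
    if channels != w_channels:
        raise ValueError("x and w must have the same number of channels")
    # Slide the filter along the profile and sum over channels and positions.
    return np.array([np.sum(x[:, i:i + k] * w) for i in range(length - k + 1)]) + b

def score_gene(healthy, cancer, w, b=0.0):
    """Toy score in (0, 1) for one gene (hypothetical, untrained)."""
    x = np.stack([healthy, cancer])                        # shape (2, length)
    feature_map = np.maximum(conv1d_valid(x, w, b), 0.0)   # ReLU activation
    pooled = feature_map.max()                             # global max pooling
    return 1.0 / (1.0 + np.exp(-pooled))                   # sigmoid

# Made-up profiles: the cancer sample shows a stronger peak than the healthy one.
healthy = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
cancer = np.array([0.0, 3.0, 6.0, 3.0, 0.0, 0.0])

# Hand-picked filter that responds where the cancer signal exceeds the healthy one.
w = np.array([[-1.0, -1.0, -1.0],
              [1.0, 1.0, 1.0]])

print(conv1d_valid(np.stack([healthy, cancer]), w))
print(score_gene(healthy, cancer, w))
```

In a trained network the filter weights would be learned from labelled CD genes and NGs rather than chosen by hand, and several convolutional layers would typically be stacked; the sketch only shows the data flow for a single filter.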
Popular Abstract
According to the World Health Organization (WHO), nearly 10 million people died of cancer worldwide in 2020. Cancers are known to emerge when groups of cells divide and grow at higher rates than usual. Therefore, to understand the genesis of the disease, it is essential to discover the core elements regulating these processes.

These fundamental elements are the genes, discrete pieces of DNA encoding the information about “how to create and keep alive” almost any living being that one can imagine. This information is encoded using four basic molecules named with the letters A, T, G and C. Our genes can then be thought of as the words making up a book that could be called The Human Genome. Since the book is genuinely large and has had a huge impact on our understanding of human nature, we’ll baptize it the Genome Bible.

Let’s assume that our body works like the Theology faculty at a certain university. Even though the textbook is the same in all the classes (which in the analogy are the cells), the parts of the book that a professor reads and the sections that are skipped are in the end as important as the actual text. Moreover, not only what is read matters: how the text is read probably plays one of the most important roles. The interpretation and enthusiasm with which the theology professor transmits the message to the students will deeply shape their minds and their future behavior. These factors, which are not intrinsic to a message that has undergone only slight mutations over the past centuries, have led to outcomes as different as the Crusades and the creation of some of the most altruistic and useful NGOs.

The same picture arises in the field of genetics, where it is even more explicitly a matter of life and death. Enthusiasm, interpretation and book chapter selection map in our case onto gene expression levels, gene roles within tissues, and gene silencing and activation. All these features and processes are regulated by the so-called epigenetic factors, elements external to the DNA that control how the encoded information is expressed and read.

In our work, we try to pinpoint which genes are likely to be related to cancer genesis. To do so, we study a specific type of epigenetic factor that induces changes in the proteins around which the DNA is wrapped. Our focus is therefore on the behavior of the teacher and the lecture itself rather than on the book. The genes are characterized by means of machine learning techniques, which allows us to find the patterns in the epigenetic factors that lead genes to be expressed in aberrant ways. The results obtained in the project shed light on the role of epigenetics in tumorigenesis and could have implications for drug and therapy development.
author: Pielies Avellí, Marc LU
course: FYTM03 20212
year: 2022
type: H2 - Master's Degree (Two Years)
keywords: Cancer Driver (CD) Gene, Histone Post-Translational Modifications (PTMs), Convolutional Neural Networks (CNNs)
language: English
id: 9091284
date added to LUP: 2022-07-11 09:15:57
date last changed: 2022-07-11 09:15:57
@misc{9091284,
  abstract     = {{In spite of our current knowledge regarding the biology underlying cancer genesis, reliable methods for the discovery of cancer driver (CD) genes are still in great need. The rather recent incorporation of epigenetic markers to the cancer paradigm has nevertheless opened the door for the development of new computational approaches to the problem. 

This work is aimed to study the enrichment of certain genome regions with the histone post-translational modification (PTM) H3K4me3. This epigenetic marker can be used to distinguish cancer driver genes from neutral genes (NGs). To this end, a convolutional neural network (CNN) comparing H3K4me3 enrichment profiles for matching healthy and cancer samples is proposed and evaluated. The obtained results for OriGENE, the presented model, show promise in pan-cancer but also tissue-specific cancer driver gene detection.}},
  author       = {{Pielies Avellí, Marc}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Cancer Driver Gene Detection using Deep Convolutional Neural Networks on H3K4me3 Enrichment Profiles}},
  year         = {{2022}},
}