
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Integrative Analysis of Cell Line and Tissue Proteomics Data for Suptyping Ovarian Cancer

Skorda, Eleni Theofania (2024) BINP52 20232
Degree Projects in Bioinformatics
Abstract
Ovarian cancer remains a significant challenge in oncology: current treatments have a high recurrence rate, which makes further investigation into treatment targets and the heterogeneity of cancers necessary. As findings from cell line models are hard to transfer to in vivo cancers, we wanted to explore ways to better integrate cell line and tissue omics data. To do so, this study leveraged proteomics data from The Cancer Genome Atlas (TCGA) as well as local cell line data. We first performed a principal component analysis (PCA) to cluster TCGA phosphoproteomics tissue samples by health status. We then developed a novel approach for integrating TCGA proteomics and phosphoproteomics data with local cell line data to enhance our understanding of ovarian cancer at a molecular level. This data integration allowed for a more comprehensive analysis, providing insights that individual datasets might not reveal. Subsequent pathway analysis of the integrated proteomics data identified key pathways in each of the identified ovarian cancer cell lines. Differential gene expression analysis was conducted on the genes involved in these shared pathways, yielding information about the specific genes and pathways involved in the disease. The integrated approach not only enhances our understanding of the molecular mechanisms driving ovarian cancer but also sets the stage for future research based on proteomics and phosphoproteomics. This holds promise for identifying novel biomarkers and potential therapeutic targets, which could enable advances in precision medicine for ovarian cancer.
Popular Abstract
Understanding Ovarian Cancer: Combining Data Sets to Find New Clues

Ovarian cancer is one of the most difficult cancers to treat, mostly because the tumour often recurs after current treatment. Researchers therefore keep looking for new treatment targets and better ways to understand the complexity of the disease. To do so, they investigate the unique features of ovarian cancer at the molecular level. The aim of our project is to show differences between tumour cells and healthy cells by analysing proteomic data from several sources, such as The Cancer Genome Atlas and local cell line data.

We first reprocessed a large dataset from a repository called CPTAC, focusing on proteins and their phosphorylated forms. We were interested in phosphorylation because it is a chemical modification that can influence protein function. We used a method called Principal Component Analysis (PCA), which can project thousands of analytes into a few dimensions and so makes patterns easier to identify. Using this method, we sorted the data into separate groups for tumour tissue and healthy tissue.
To see whether the groups made sense, we looked at three genes that are already known to be associated with ovarian cancer: SND1, MTDH and MKI67. The expression levels of these genes were mostly what we expected; MTDH, however, showed the opposite trend, likely due to a phosphorylation.
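
The projection step described above can be illustrated with a minimal sketch. It is not the analysis code of the thesis: the data are simulated, the group sizes, the number of analytes and the helper name pca_project are assumptions made for illustration, and PCA is computed here through a plain singular value decomposition in NumPy.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project samples (rows of X) onto the top principal components.

    Hypothetical helper for illustration; X is samples x analytes.
    """
    Xc = X - X.mean(axis=0)  # centre every analyte (column) at zero
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return scores, explained[:n_components]

# Simulated data (assumption): 20 "healthy" and 20 "tumour" samples,
# 1000 analytes, with the first 100 analytes shifted in the tumour group.
rng = np.random.default_rng(0)
n_per_group, n_analytes = 20, 1000
healthy = rng.normal(0.0, 1.0, size=(n_per_group, n_analytes))
tumour = rng.normal(0.0, 1.0, size=(n_per_group, n_analytes))
tumour[:, :100] += 2.0

X = np.vstack([healthy, tumour])
scores, explained = pca_project(X)

# Each sample is now described by 2 numbers instead of 1000.
print(scores.shape)
print("variance explained by PC1 and PC2:", explained)
```

In this simulated case the two groups end up on opposite sides of the first component. Real tissue data are noisier, so the groups seen in a PCA plot usually need to be checked against known markers, as was done here with SND1, MTDH and MKI67.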

This analysis was followed by a differential protein abundance analysis, a technique that compares the amount of a specific protein between groups of samples to understand which proteins are switched “on” or “off” in healthy or tumour tissue. We wanted to show the difference in gene activity between the groups identified in the PCA. The analysis confirmed our groups, but it also revealed samples that had been misclassified. This is likely due to variability in the tissue samples.
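
As a rough illustration of what such a comparison computes, the sketch below scores every protein by its log2 fold change and a Welch t statistic between two groups. It is a simplified stand-in, not the method used in the thesis: the data are simulated, the function name differential_abundance is hypothetical, and p-values and multiple-testing correction are left out.

```python
import numpy as np

def differential_abundance(tumour, healthy):
    """Per-protein log2 fold change and Welch t statistic.

    Hypothetical helper for illustration. Both inputs are
    samples x proteins arrays of log2-transformed intensities.
    """
    # On log2 data, the difference of group means is the log2 fold change.
    log2_fc = tumour.mean(axis=0) - healthy.mean(axis=0)
    # Welch standard error: the groups may have unequal variances.
    se = np.sqrt(tumour.var(axis=0, ddof=1) / tumour.shape[0]
                 + healthy.var(axis=0, ddof=1) / healthy.shape[0])
    return log2_fc, log2_fc / se

# Simulated data (assumption): 15 samples per group, 200 proteins,
# proteins 0-4 made more abundant in the tumour group.
rng = np.random.default_rng(1)
healthy = rng.normal(20.0, 0.5, size=(15, 200))
tumour = rng.normal(20.0, 0.5, size=(15, 200))
tumour[:, :5] += 3.0

log2_fc, t_stat = differential_abundance(tumour, healthy)

# Rank proteins by the size of the t statistic, largest first.
top = np.argsort(-np.abs(t_stat))[:5]
print("top-ranked proteins:", sorted(top.tolist()))
```

A sample whose values for the top-ranked proteins resemble the other group is the kind of case that would show up as misclassified in such an analysis.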

After exploring the CPTAC data, we combined it with local cell line data. Through this integration, we could assign CPTAC tissue samples to specific cell lines, which helps us understand how the cell lines used in cancer research are actually connected to tumour tissue. The integration showed that tumour tissue clustered with high-grade serous ovarian cancer cell lines, and healthy tissue with other cell lines representing normal cells. This result held when the approach was applied to a new dataset, which supports its validity. The differential abundance analysis of the integrated dataset then showed patterns of gene activity indicating that genes behave differently in different cell line clusters.

Finally, we conducted a pathway analysis to get an idea of how different proteins interact in the cells. This analysis identified several pathways (groups of proteins that work together) that were more or less active in ovarian cancer cells than in healthy cells. Some of these pathways could be potential targets for therapy or new biomarkers.

Overall, the integrative approach of this project allowed a better understanding of the molecular mechanisms of ovarian cancer. This contributes to the ongoing fight against ovarian cancer by laying the groundwork for future research that could lead to more precise treatments tailored to the molecular features of an individual patient's tumour.

Master’s Degree Project in Bioinformatics 60 credits 2024
Department of Biology, Lund University

Advisor: Fredrik Levander
Department of Immunotechnology, LTH
author
Skorda, Eleni Theofania
supervisor
organization
course
BINP52 20232
year
type
H2 - Master's Degree (Two Years)
subject
language
English
id
9175559
date added to LUP
2024-09-27 11:31:35
date last changed
2024-09-27 11:31:35
@misc{9175559,
  abstract     = {{Ovarian cancer remains a significant challenge in oncology, with current treatments having a high recurrence rate making further investigations into treatment targets and the heterogeneity of cancers necessary. As findings from cell lines models are hard to transfer to in vivo cancers, we wanted to explore possibilities to better integrate cell line and tissue omics data. To do so, this study leveraged proteomics data from The Cancer Genome Atlas (TCGA) as well as local cell line data. We first performed a principal component analysis (PCA) to cluster TCGA phosphoproteomics tissue samples by health status. We then developed a novel approach for integration of TCGA proteomics and phosphoproteomics data with local cell line data to enhance our understanding of ovarian cancer at a molecular level. This data integration allowed for a more comprehensive analysis, providing insights that individual datasets might not reveal. Subsequent pathway analysis of the integrated proteomics data identified key pathways in each of the identified ovarian cancer cell lines. Differential gene expression analysis was conducted on the genes involved in these shared pathways, yielding valuable information about the specific genes and pathways involved in the disease. The integrated approach not only enhances our understanding of the molecular mechanisms driving ovarian cancer but also sets the stage for future research based on proteomics and phosphoproteomics. This holds promise for identifying novel biomarkers and potential therapeutic targets, which allow for advancements in precision medicine for ovarian cancer.}},
  author       = {{Skorda, Eleni Theofania}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Integrative Analysis of Cell Line and Tissue Proteomics Data for Suptyping Ovarian Cancer}},
  year         = {{2024}},
}