Lund University Publications

Computational Prediction Models for Proteolytic Cleavage and Epitope Identification

You, Liwen LU (2007)
Abstract
The biological functions of proteins depend on their physical interactions with other molecules, such as proteins and peptides. Modeling these protein-ligand interactions is therefore important for understanding protein functions in different biological processes. We have focused on the cleavage specificities of HIV-1 protease, HCV NS3 protease and caspases on short oligopeptides or in native proteins; the binding affinity of MHC molecules for short oligopeptides; and the identification of T cell epitopes. We expect that our findings on HIV-1 protease, HCV NS3 protease and caspases generalize to other proteases.



In this thesis, we have analyzed these interactions from different perspectives: we have extended existing substrate data sets and collected new ones; used and compared different prediction methods (e.g. linear support vector machines, neural networks, the OSRE method, rough set theory and Gaussian processes) to understand the underlying interaction problems; suggested new methods (i.e. a hierarchical method and Gaussian processes with a test reject method) to improve predictions; and extracted cleavage rules for protease cleavage specificities.
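The abstract does not define the test reject rule, so the following is only a minimal sketch of one common form of reject option, not the method used in the thesis. It assumes a probabilistic model (such as a Gaussian process) that returns a predictive mean and standard deviation for each candidate peptide; the function names, the threshold of 0.5 and the interval width `z` are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple


def classify_with_reject(
    mean: float, std: float, threshold: float = 0.5, z: float = 1.96
) -> Optional[bool]:
    """Return True or False when confident, None when the prediction is rejected.

    mean, std: hypothetical predictive mean and standard deviation for one
    candidate peptide. The prediction is rejected whenever the interval
    mean +/- z * std contains the decision threshold.
    """
    lower, upper = mean - z * std, mean + z * std
    if lower > threshold:
        return True   # confidently above the threshold
    if upper < threshold:
        return False  # confidently below the threshold
    return None       # too uncertain: reject instead of guessing


def confident_positives(predictions: List[Tuple[str, float, float]]) -> List[str]:
    """Keep only candidates that are predicted positive without being rejected."""
    return [
        name
        for name, mean, std in predictions
        if classify_with_reject(mean, std) is True
    ]
```

Rejecting uncertain predictions trades coverage for precision: fewer candidates are passed on to experiments, but those that remain are less likely to be false positives.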



From our studies, we have extended oligopeptide substrate data sets and collected native protein substrates for HIV-1 protease, and collected a new oligopeptide substrate data set for HCV protease. We have shown that all current HIV-1 protease oligopeptide substrate data sets and our HCV data set are linearly separable; that, for HIV-1 protease, size and hydrophobicity are two important physicochemical properties in the recognition of short oligopeptide substrates by the protease; and that the linear support vector machine is the state of the art for this protease cleavage prediction problem. Our hierarchical method, which combines protein secondary structure information with experimental short oligopeptide cleavage information, can improve the prediction of HIV-1 protease cleavage sites in native proteins. Our rule extraction method provides simple and accurate cleavage rules with high fidelity for HIV-1 and HCV proteases. For MHC molecules, we showed that high binding affinities are not necessarily correlated with immunogenicity for HLA-restricted peptides. Our test reject method combined with Gaussian processes can simplify experimental design by reducing false positives when detecting potential epitopes in large pathogen genomes.
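As an illustration of what linear separability means for a peptide data set, the sketch below encodes each peptide with an orthogonal (one-hot) representation and runs a perceptron, which reaches zero training errors only if a separating hyperplane exists. The encoding, the perceptron (used here in place of the linear support vector machines of the thesis) and the helper names are assumptions made for this example, and the peptides in the usage note are made-up toy strings, not real substrate data.

```python
from typing import List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(peptide: str) -> List[int]:
    """Orthogonal (one-hot) encoding: 20 indicator bits per residue position."""
    vec: List[int] = []
    for residue in peptide:
        bits = [0] * len(AMINO_ACIDS)
        bits[AMINO_ACIDS.index(residue)] = 1
        vec.extend(bits)
    return vec


def perceptron_separates(
    samples: Sequence[Tuple[str, bool]], max_epochs: int = 1000
) -> bool:
    """Return True if a perceptron reaches zero training errors.

    Convergence shows the labelled peptides are linearly separable in the
    encoded space; reaching max_epochs without converging is inconclusive.
    """
    # Append a constant 1 to every input so the last weight acts as a bias.
    data = [(encode(p) + [1], 1 if cleaved else -1) for p, cleaved in samples]
    weights = [0] * len(data[0][0])
    for _ in range(max_epochs):
        errors = 0
        for x, y in data:
            activation = sum(w * xi for w, xi in zip(weights, x))
            if y * activation <= 0:  # misclassified or on the boundary
                weights = [w + y * xi for w, xi in zip(weights, x)]
                errors += 1
        if errors == 0:
            return True
    return False
```

For example, `perceptron_separates([("AAAAAAAA", True), ("GGGGGGGG", False)])` converges, whereas a set that contains the same peptide with both labels can never be separated.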
opponent
  • Professor Higgs, Paul, McMaster University, Canada
type
Thesis
publication status
published
keywords
medical informatics, biomathematics biometrics, Bioinformatics, SVM, sequence analysis, rule extraction, protease-peptide interaction, OSRE, MHC, immunology, HIV, hierarchical method, HCV, Gaussian process, false positive, epitope, cleavage specificity, cleavage prediction, binding affinity, caspase, Bioinformatik, medicinsk informatik, biomatematik, Computer science, numerical analysis, systems, control, Datalogi, numerisk analys, system, kontroll
pages
211 pages
publisher
Department of Theoretical Physics, Lund University
defense location
Lecture Hall F of the Department of Theoretical Physics
defense date
2007-11-26 13:30:00
ISBN
978-91-628-7218-2
language
English
LU publication?
yes
id
9105f988-e876-4886-bc86-ceeab02c4c24 (old id 599135)
date added to LUP
2016-04-04 10:42:10
date last changed
2018-11-21 21:00:18
@phdthesis{9105f988-e876-4886-bc86-ceeab02c4c24,
  abstract     = {{The biological functions of proteins depend on their physical interactions with other molecules, such as proteins and peptides. Modeling these protein-ligand interactions is therefore important for understanding protein functions in different biological processes. We have focused on the cleavage specificities of HIV-1 protease, HCV NS3 protease and caspases on short oligopeptides or in native proteins; the binding affinity of MHC molecules for short oligopeptides; and the identification of T cell epitopes. We expect that our findings on HIV-1 protease, HCV NS3 protease and caspases generalize to other proteases.

In this thesis, we have analyzed these interactions from different perspectives: we have extended existing substrate data sets and collected new ones; used and compared different prediction methods (e.g. linear support vector machines, neural networks, the OSRE method, rough set theory and Gaussian processes) to understand the underlying interaction problems; suggested new methods (i.e. a hierarchical method and Gaussian processes with a test reject method) to improve predictions; and extracted cleavage rules for protease cleavage specificities.

From our studies, we have extended oligopeptide substrate data sets and collected native protein substrates for HIV-1 protease, and collected a new oligopeptide substrate data set for HCV protease. We have shown that all current HIV-1 protease oligopeptide substrate data sets and our HCV data set are linearly separable; that, for HIV-1 protease, size and hydrophobicity are two important physicochemical properties in the recognition of short oligopeptide substrates by the protease; and that the linear support vector machine is the state of the art for this protease cleavage prediction problem. Our hierarchical method, which combines protein secondary structure information with experimental short oligopeptide cleavage information, can improve the prediction of HIV-1 protease cleavage sites in native proteins. Our rule extraction method provides simple and accurate cleavage rules with high fidelity for HIV-1 and HCV proteases. For MHC molecules, we showed that high binding affinities are not necessarily correlated with immunogenicity for HLA-restricted peptides. Our test reject method combined with Gaussian processes can simplify experimental design by reducing false positives when detecting potential epitopes in large pathogen genomes.}},
  author       = {{You, Liwen}},
  isbn         = {{978-91-628-7218-2}},
  keywords     = {{medical informatics; biomathematics biometrics; Bioinformatics; SVM; sequence analysis; rule extraction; protease-peptide interaction; OSRE; MHC; immunology; HIV; hierarchical method; HCV; Gaussian process; false positive; epitope; cleavage specificity; cleavage prediction; binding affinity; caspase; Bioinformatik; medicinsk informatik; biomatematik; Computer science; numerical analysis; systems; control; Datalogi; numerisk analys; system; kontroll}},
  language     = {{eng}},
  publisher    = {{Department of Theoretical Physics, Lund University}},
  school       = {{Lund University}},
  title        = {{Computational Prediction Models for Proteolytic Cleavage and Epitope Identification}},
  year         = {{2007}},
}