Protein Quantification Based on Quantitative Shotgun Proteomics Data

Petri, Hannes LU (2016) KIM820 20161
Department of Immunotechnology
Educational programmes, LTH
Abstract
Quantitative proteomics is concerned with measuring the abundances of different proteins in biological samples. In the shotgun approach, proteins are digested by an endoprotease into several smaller peptides, which are then separated by liquid chromatography (LC) and subsequently detected and quantified by a mass spectrometer. In this work, a method for determining the relative abundances of proteins in a set of biological samples using peptide quantities is presented. The method consists of a series of largely independent steps, each aimed at enriching and analyzing the available data. First, missing values are imputed using peptides that are registered in multiple charge states. Then, a distance measure is applied to each set of peptides originating from the same protein, in order to exclude outliers and determine a single peptide, the medoid, deemed representative of the true abundance profile. Lastly, missing values in each medoid peptide are imputed using similar peptides. For proteins that give rise only to shared peptides, an additive model is employed to de-convolute the unknown quantity. For evaluation, an algorithm for simulating measured peptide quantities in an LC-MS/MS experiment was devised. Using data derived from simulations, results can be compared to known protein quantities. In addition to simulated data, a data set with known protein quantities derived from bacteria is used to measure the performance of the various procedures. The described method for determining the abundance profile of a combination of proteins using their constituent peptides shows generally favorable results, but more work is required to investigate the benefits and limitations of the method. The method for de-convoluting shared peptides is potentially useful but requires data with very low levels of noise in order to be applicable.
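The medoid step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which distance measure the thesis uses, so the correlation-based distance below is an assumption chosen for illustration, as are the function names `correlation_distance` and `find_medoid`, the minimum overlap of three samples, and the peptide names and values in the example data. Missing measurements are represented as `None`.

```python
import math


def correlation_distance(a, b):
    """Return 1 - Pearson correlation, using only samples where both
    profiles have a value. Returns None when fewer than three samples
    overlap or when either profile is constant (assumed rule)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 3:
        return None
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return 1.0 - sxy / math.sqrt(sxx * syy)


def find_medoid(profiles):
    """Return the name of the peptide whose mean distance to all other
    peptides of the same protein is smallest, or None if no peptide
    can be compared with any other."""
    best_name, best_score = None, math.inf
    for name, profile in profiles.items():
        distances = []
        for other_name, other_profile in profiles.items():
            if other_name == name:
                continue
            d = correlation_distance(profile, other_profile)
            if d is not None:
                distances.append(d)
        if not distances:
            continue
        score = sum(distances) / len(distances)
        if score < best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical quantities for four peptides of one protein over four samples.
example_profiles = {
    "PEPA": [1.0, 2.0, 3.0, 4.0],
    "PEPB": [1.1, 2.1, 2.9, 4.2],
    "PEPC": [2.0, 4.1, 6.0, 7.9],
    "PEPD": [4.0, 1.0, 3.5, 0.5],  # deliberately discordant profile
}

if __name__ == "__main__":
    print(find_medoid(example_profiles))
```

In this toy data the discordant peptide has a large mean distance to the others, so it is not selected as the medoid. A correlation-based distance compares the shape of abundance profiles rather than their absolute level, which seems a natural fit for relative quantification, but other distance measures would slot into `find_medoid` unchanged.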
Popular Abstract
Counting proteins in living beings

Ingress: Determining the abundance of proteins in biological samples is a key tool in medical research. However, simultaneously measuring the levels of many different kinds of proteins is a difficult task, which requires new and improved methods.

Proteins occur in all cells and take part in most processes in every living being. Mushrooms, bacteria, insects, birds and humans are all machines whose cogwheels are, to a large extent, proteins. Muscle tissue, which enables animals to move, is made up of motor proteins. The enzymes that help digest the food that we eat are a kind of protein. And the food itself, by the way, is also composed of proteins, which is one of the reasons that we eat it.

Due to their important role, proteins are a highly interesting study object if you want to understand how the human body works – and why it sometimes does not work. Perhaps a disease is detectable at an early stage by examining the protein levels in a blood sample. Or maybe discovering that patients with a certain disease have a higher abundance of some proteins than healthy people can give an important clue as to what causes the disease.

These inquiries all require methods for quantifying proteins in biological samples. Unfortunately, this is far from an easy task, since proteins are a heterogeneous class of molecules that are hard to detect and measure, unless you know specifically what you are looking for. A successful approach has been to first split the proteins into smaller parts, called peptides, which are more easily measured. However, this introduces a new problem, since the quantities that you measure need to be “assembled” into proteins again.

This thesis presents a new method for drawing conclusions about protein quantities based on the measured peptide quantities. The method is largely based on computing similarities between several peptides, and determining a single peptide that represents the protein. Evaluating the performance of the method turned out to be tricky, but it seems like the method performs rather well.
author: Petri, Hannes LU
course: KIM820 20161
year: 2016
type: H2 - Master's Degree (Two Years)
language: English
id: 8882653
date added to LUP: 2016-06-20 11:31:59
date last changed: 2016-06-20 11:31:59
@misc{8882653,
  abstract     = {Quantitative proteomics is concerned with measuring the abundances of different proteins in biological samples. In the shotgun approach, proteins are digested by an endoprotease into several smaller peptides, which are then separated by liquid chromatography (LC) and subsequently detected and quantified by a mass spectrometer. In this work, a method for determining the relative abundances of proteins in a set of biological samples using peptide quantities is presented and described. The method consists of a series of largely independent steps, each aimed at enriching and analyzing the available data. First, missing values are imputed using peptides that are registered in multiple charge states. Then, a distance measure is employed to each set of peptides originating in the same protein, in order to exclude outliers and determine a single peptide – the medoid – deemed representative of the true abundance profile. Lastly, missing values in each medoid peptide are imputed using similar peptides. For proteins only giving rise to shared peptides, an additive model is employed to de-convolute the unknown quantity. For evaluation, an algorithm for simulating measured peptide quantities in an LC-MS/MS experiment was devised. Using data derived from simulations, results can be compared to known protein quantities. In addition to simulated data, a data set with known protein quantities derived from bacteria is used to measure the performance of the various procedures. The described method for determining the abundance profile of a combination of proteins using their constituent peptides shows generally favorable results, but more work is required to investigate the benefits and limitations of the method. The method for de-convoluting shared peptides is potentially useful but requires data with very low levels of noise in order to be applicable.},
  author       = {Petri, Hannes},
  language     = {eng},
  note         = {Student Paper},
  title        = {Protein Quantification Based on Quantitative Shotgun Proteomics Data},
  year         = {2016},
}