
Noise filtering and deconvolution of DNA barcodes

Torche Pedreschi, Paola Carolina LU (2015) FYTK02 20151
Computational Biology and Biological Physics
Abstract
Optical DNA mapping is a technique for quickly obtaining sequence-dependent DNA barcodes by observing stained DNA with a microscope while it is held stretched inside a nanochannel. Noise plays an important role in optical maps and can significantly change the features of a barcode. One way to reduce random noise is to average measurements over many time frames. This method is effective, but it takes time, requires storing more data, and involves a computationally demanding alignment process. For this reason, certain groups opt to measure only a single time frame. This project is divided into two parts. The first studies the possible implementation of a low-pass filter to reduce noise in single-time-frame barcodes. A windowed-sinc filter and the Wiener-Kolmogorov filter are applied to 11600 barcodes, and both are found to reduce noise effectively. The second part seeks to improve the current method of stretching DNA barcodes via interpolation. Stretching is needed to compare DNA barcodes obtained under different experimental conditions (different salt concentrations, channel diameters, etc.). The proposed approach deconvolves the signal using the Wiener-Helstrom filter and interpolates the deconvolved intensity instead of the measured one, in order to preserve the original width of the point spread function of the system. As a result, the stretched barcode is less blurred and has more resolvable peaks when it is deconvolved before interpolation.
Popular Abstract
All organisms store their physiological instructions in DNA molecules. DNA consists of two intertwined strands, each formed by a succession of building blocks called nucleotides. The number of base pairs (bp) in DNA ranges from 10^3 bp (in viruses) to 10^11 bp (in some animal species). The nucleotides are organic molecules of four types: adenine, thymine, guanine and cytosine (A, T, G and C). The particular sequence of A, T, G and C along the strands defines the information stored in them, the instructions, and consequently all behaviour of the organism. Bacteria have two types of DNA: chromosomal DNA and plasmids. Chromosomal DNA contains the information essential for life; without it, the bacterium cannot exist. Plasmids, on the other hand, carry additional information; for instance, the genes responsible for antibiotic resistance can be stored there. Plasmids can replicate independently and transfer between bacteria, spreading the resistance. This is a threat to global health and to modern medicine (which relies on antimicrobial treatments).

Characterising DNA helps us understand its nature. One way of doing so is to label the nucleotides with fluorescent molecules and observe the DNA with a microscope. One option is to add the DNA to a mixture of fluorescent molecules (YOYO-1 dye) and non-fluorescent molecules (netropsin). YOYO-1 stains all nucleotides without preference for any of the four types, but netropsin binds to AT-rich regions, preventing YOYO-1 from binding there. As a consequence, AT-rich parts of the DNA appear darker in the microscope than GC-rich regions, which makes it possible to measure an intensity profile (a barcode) that reflects the underlying sequence. Since a DNA molecule is coiled in solution, it is often introduced into a nanochannel to stretch it, which allows the barcode to be visualised with higher resolution. This technique for obtaining DNA barcodes by direct observation of stained DNA with a microscope is called optical mapping. It is faster and less expensive than traditional DNA sequencing. It has lower resolution, but it can detect important structural variations (such as repeats) larger than 1 kbp that are not accessible to traditional DNA sequencing methods.

Noise and fluctuations are inherently present in DNA barcodes, representing 15% of the total signal intensity. One way to reduce the random component of the noise in a barcode is to take several images of the stained DNA in a nanochannel over a few seconds and average them. This method is effective; however, it requires measuring and saving more data, as well as aligning the time-frame images. To remedy this drawback of the present methodology, this thesis proposes, in two parts, methods for reducing noise and increasing resolution in DNA barcodes.
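
The averaging of time frames described above can be illustrated with a minimal Python sketch. It assumes the frames are already aligned and that the noise is independent Gaussian noise with a standard deviation of 0.15 on a signal of order one; the toy barcode, the count of 25 frames and all function names are illustrative assumptions, not taken from the thesis.

```python
import random
import statistics

def average_frames(frames):
    """Average already-aligned frames pixel by pixel."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]

def residual_std(barcode, truth):
    """Standard deviation of the difference between a barcode and the noise-free one."""
    return statistics.pstdev([b - t for b, t in zip(barcode, truth)])

random.seed(0)  # fixed seed so the toy example is reproducible

# Hypothetical noise-free barcode: alternating bright and dark blocks.
truth = [1.0 + 0.5 * ((i // 10) % 2) for i in range(200)]

# 25 simulated frames of the same barcode with independent Gaussian noise.
frames = [[t + random.gauss(0.0, 0.15) for t in truth] for _ in range(25)]

averaged = average_frames(frames)
single_noise = residual_std(frames[0], truth)
averaged_noise = residual_std(averaged, truth)
```

For independent noise, the residual of the average is expected to shrink roughly as one over the square root of the number of frames, which is why averaging works but costs extra acquisition, storage and alignment.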

The first part of this thesis studies the implementation of low-pass filters on single-time-frame images in order to reduce the noise in them without the need for averaging several images. The method is tested on 11600 barcodes from plasmids from a nosocomial outbreak at a neonatal ward at Sahlgrenska University Hospital, provided by Westerlund's Lab, Chalmers University. The results show that the filters effectively reduce the noise and fluctuations in the barcodes.
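
As a rough illustration of low-pass filtering a single-frame barcode, the following Python sketch builds a windowed-sinc kernel and convolves it with the signal. The Hamming window, the cutoff of 0.1 cycles per pixel, the kernel half-width and the edge padding are assumptions chosen for the example; the thesis may use different choices.

```python
import math

def windowed_sinc_kernel(cutoff, half_width):
    """Low-pass FIR kernel: ideal sinc response tapered by a Hamming window.

    cutoff is in cycles per pixel (0 < cutoff < 0.5).
    """
    n = 2 * half_width + 1
    taps = []
    for k in range(n):
        x = k - half_width
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unit gain at zero frequency

def low_pass(signal, kernel):
    """Convolve the signal with a symmetric kernel, repeating edge values as padding."""
    half = len(kernel) // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
            for i in range(len(signal))]

kernel = windowed_sinc_kernel(cutoff=0.1, half_width=20)

# A flat signal should pass unchanged; a pixel-to-pixel alternation should be removed.
flat = low_pass([2.0] * 50, kernel)
alternating = low_pass([(-1.0) ** i for i in range(100)], kernel)
```

The cutoff sets which spatial frequencies count as noise; in practice it would be chosen relative to the resolution limit of the optical system.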

The second part of the thesis aims at increasing the resolution of the measured barcodes by reversing the effect of diffraction, in a process called deconvolution. The resolution of the barcodes is 1 kbp (10^3 bp), limited mainly by diffraction. Diffraction results from light interacting with the finite-size lenses of the microscope. As a consequence, a point-like source of light is observed as a disk or, in general, as a distribution called the point spread function of the system. The method used in this project arises in the context of super-resolution techniques, in which knowledge of the point spread function is used to estimate, from a measured image, what the image was before interacting with the system (deconvolution). We study its application to stretching barcodes by interpolating the deconvolved intensity instead of the measured one. Stretching DNA barcodes is needed because the conditions under which they are measured (the size of the channel, the ionic strength of the solution) determine their length. Stretching them is a way to compensate for these differences. The results for this application are not conclusive. While the deconvolved intensities show more resolved peaks, it is not guaranteed that these peaks are a real reflection of the DNA sequence. Further study is needed to reach more conclusive results, and also to examine a direct application of deconvolution, namely working with the higher-resolution deconvolved barcodes instead of the measured ones.
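
The idea of deconvolving first and interpolating afterwards can be sketched in Python as follows. This is only a toy illustration under stated assumptions: a Gaussian point spread function of known width, a constant noise-to-signal ratio in the Wiener-type filter G = conj(H) / (|H|^2 + NSR), periodic boundaries, and linear interpolation for the stretching step. All names and parameter values are hypothetical and not taken from the thesis.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short toy signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * f * k / n) for k in range(n))
           for f in range(n)]
    return [v / n for v in out] if inverse else out

def gaussian_psf(n, sigma):
    """Unit-area Gaussian PSF centred on index 0 and wrapped periodically."""
    raw = [math.exp(-0.5 * (min(k, n - k) / sigma) ** 2) for k in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def wiener_deconvolve(measured, psf, noise_to_signal):
    """Apply G = conj(H) / (|H|^2 + NSR) in the Fourier domain (constant NSR assumed)."""
    H = dft(psf)
    Y = dft(measured)
    X = [y * h.conjugate() / (abs(h) ** 2 + noise_to_signal) for y, h in zip(Y, H)]
    return [v.real for v in dft(X, inverse=True)]

def stretch(signal, factor):
    """Resample to round(len(signal) * factor) points by linear interpolation."""
    n_out = max(2, round(len(signal) * factor))
    out = []
    for i in range(n_out):
        pos = i * (len(signal) - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        out.append(signal[lo] + (pos - lo) * (signal[hi] - signal[lo]))
    return out

# Toy barcode: two point-like features 8 pixels apart, blurred by the PSF.
truth = [0.0] * 64
truth[28] = truth[36] = 1.0
psf = gaussian_psf(64, sigma=3.0)
blurred = [v.real for v in dft([a * b for a, b in zip(dft(truth), dft(psf))], inverse=True)]

restored = wiener_deconvolve(blurred, psf, noise_to_signal=1e-3)
stretched = stretch(restored, 1.25)  # interpolate the deconvolved intensity
```

In this noise-free toy case the dip between the two features is deeper in `restored` than in `blurred`. With real, noisy data the extra peaks need not correspond to true sequence features, which is the caveat raised above.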
author: Torche Pedreschi, Paola Carolina LU
course: FYTK02 20151
year: 2015
type: M2 - Bachelor Degree
keywords: Wiener filter DNA deconvolution noise
language: English
id: 7853451
date added to LUP: 2015-09-06 20:38:41
date last changed: 2017-10-06 16:21:20
@misc{7853451,
  abstract     = {Optical DNA mapping is a technique for quickly obtaining sequence-dependent DNA barcodes by observing stained DNA with a microscope while it is held stretched inside a nanochannel. Noise plays an important role in optical maps and can significantly change the features of a barcode. One way to reduce random noise is to average measurements over many time frames. This method is effective, but it takes time, requires storing more data, and involves a computationally demanding alignment process. For this reason, certain groups opt to measure only a single time frame. This project is divided into two parts. The first studies the possible implementation of a low-pass filter to reduce noise in single-time-frame barcodes. A windowed-sinc filter and the Wiener-Kolmogorov filter are applied to 11600 barcodes, and both are found to reduce noise effectively. The second part seeks to improve the current method of stretching DNA barcodes via interpolation. Stretching is needed to compare DNA barcodes obtained under different experimental conditions (different salt concentrations, channel diameters, etc.). The proposed approach deconvolves the signal using the Wiener-Helstrom filter and interpolates the deconvolved intensity instead of the measured one, in order to preserve the original width of the point spread function of the system. As a result, the stretched barcode is less blurred and has more resolvable peaks when it is deconvolved before interpolation.},
  author       = {Torche Pedreschi, Paola Carolina},
  keyword      = {Wiener filter DNA deconvolution noise},
  language     = {eng},
  note         = {Student Paper},
  title        = {Noise filtering and deconvolution of DNA barcodes},
  year         = {2015},
}