Contig assembly and plasmid analysis using DNA barcodes
(2016) FYTM02 20152
Computational Biology and Biological Physics - Undergoing reorganization
- Abstract
- Two methods of computational analysis of DNA barcodes are presented. A DNA barcode is formed by making GC-rich regions of a DNA molecule fluoresce while AT-rich regions remain dark. When the molecule is stretched in nanochannels and viewed under a microscope, it resembles a barcode with black and white stripes. Because of the point-spread function and pixelation, the resolution is roughly one data point per 200 nm (roughly 700 bp). This resolution is typically enough to distinguish between two different DNA molecules.
First, DNA barcodes are used to analyze an antibiotic resistance outbreak in which antibiotic-resistant bacteria infected newborn children at Sahlgrenska University Hospital. The bacteria were of different strains, and it was suspected that bacteria carrying the antibiotic resistance gene passed it to bacteria lacking it through the exchange of plasmids. A plasmid is a short circular DNA molecule, typically between 2 kbp and 1 Mbp long (bp = base pairs), which bacteria use to store genes that benefit survival (such as antibiotic resistance genes).
The second method matches short pieces of DNA sequence, called contigs, to a long intact barcode (from the same molecule as the contigs) in order to determine the order of the pieces. To match a sequence to a barcode, the sequence first has to be converted into a theoretical barcode. It is then compared with the long barcode to find the optimal placement. Contigs are not supposed to overlap, an assumption used in the methods presented in section 7.
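As an illustration of the two steps just described, converting a sequence into a theoretical barcode and sliding it along the long barcode, here is a minimal Python sketch. It is not the thesis's implementation: the function names, the Gaussian blur standing in for the point-spread function, the default of 700 bp per bin (taken from the resolution quoted in the abstract), and the use of the Pearson correlation as the match score are all assumptions made for this example.

```python
import numpy as np


def theoretical_barcode(seq, bp_per_bin=700, psf_sigma_bins=1.0):
    """Hypothetical helper: GC fraction per bin, blurred by a Gaussian.

    The Gaussian is an assumed stand-in for the optical point-spread
    function; bp_per_bin and psf_sigma_bins are illustrative defaults.
    """
    n_bins = len(seq) // bp_per_bin
    if n_bins < 1:
        raise ValueError("sequence is shorter than one bin")
    bases = np.array(list(seq.upper()[: n_bins * bp_per_bin]))
    is_gc = np.isin(bases, ["G", "C"])
    gc = is_gc.reshape(n_bins, bp_per_bin).mean(axis=1)

    # Normalized Gaussian kernel covering +/- 3 sigma.
    half = int(np.ceil(3 * psf_sigma_bins))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / psf_sigma_bins) ** 2)
    kernel /= kernel.sum()

    # Pad with edge values so the output has exactly n_bins points.
    padded = np.pad(gc, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def pearson(a, b):
    """Pearson correlation of two equal-length 1-D arrays (0.0 if flat)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def best_placement(contig_barcode, long_barcode):
    """Return (offset, score) of the best-scoring position of the short
    barcode along the long one, scanning every offset."""
    m = len(contig_barcode)
    if m > len(long_barcode):
        raise ValueError("contig barcode is longer than the long barcode")
    scores = np.array([
        pearson(contig_barcode, long_barcode[i:i + m])
        for i in range(len(long_barcode) - m + 1)
    ])
    best = int(np.argmax(scores))
    return best, float(scores[best])
```

The sketch scores a single orientation only. Since a contig could also match reversed, one would probably score `contig_barcode[::-1]` as well and keep the better of the two.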
The matching in both methods is supported by our new statistical tools, which reduce the number of false positives in the matching process. The results for the plasmid tracing method show that it can be used to trace plasmid spread. The results for the contig assembly, on the other hand, show that the method has potential, but so far it has not succeeded in assembling real contigs into a full, correct sequence.
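The abstract does not describe the statistical tools themselves, but the keywords list phase randomization and p-values. As a generic illustration of how such a test can limit false positives, the sketch below builds surrogate barcodes with the same power spectrum as the contig barcode but randomized Fourier phases, and reports how often a surrogate matches the long barcode at least as well as the real one. Everything here is an assumption for the example: the function names, the choice to randomize the contig barcode rather than the long one, the Pearson score, and the surrogate count. The thesis's own procedure may differ.

```python
import numpy as np


def phase_randomize(signal, rng):
    """Surrogate with the same amplitude spectrum but random phases."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    phases[0] = 0.0              # leave the mean (DC term) untouched
    if n % 2 == 0:
        phases[-1] = 0.0         # Nyquist term must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)


def max_sliding_correlation(short, long):
    """Highest Pearson correlation of `short` over all offsets in `long`."""
    m = len(short)
    if m > len(long):
        raise ValueError("short barcode is longer than the long barcode")
    s = short - short.mean()
    s_norm = np.sqrt((s * s).sum())
    best = -1.0
    for i in range(len(long) - m + 1):
        w = long[i:i + m] - long[i:i + m].mean()
        denom = s_norm * np.sqrt((w * w).sum())
        if denom > 0:
            best = max(best, float((s * w).sum() / denom))
    return best


def placement_p_value(contig_barcode, long_barcode, n_surrogates=1000, seed=0):
    """Empirical p-value: fraction of phase-randomized surrogates whose
    best score reaches the observed best score (with a +1 correction so
    the estimate is never exactly zero)."""
    rng = np.random.default_rng(seed)
    observed = max_sliding_correlation(contig_barcode, long_barcode)
    null_scores = np.array([
        max_sliding_correlation(phase_randomize(contig_barcode, rng), long_barcode)
        for _ in range(n_surrogates)
    ])
    return (1 + int(np.sum(null_scores >= observed))) / (n_surrogates + 1)
```

A placement would then be accepted only if its p-value falls below a chosen threshold. Because the surrogates keep the power spectrum, they also keep the autocorrelation that smooth barcodes have, which makes the null scores more realistic than those from shuffled data.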
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8727977
- author
- Pichler, Christoffer LU
- supervisor
- organization
- course
- FYTM02 20152
- year
- 2016
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- DNA Barcode, Contig, Plasmid, Phase Randomization, Tree, Free Energy, P-value
- language
- English
- id
- 8727977
- date added to LUP
- 2016-05-13 09:32:17
- date last changed
- 2017-10-06 16:10:55
@misc{8727977,
  abstract = {{Two methods of computational analysis of DNA barcodes are presented. A DNA barcode is formed by making GC-rich regions of a DNA molecule fluoresce while AT-rich regions remain dark, thus when stretched using nano-channels and viewed in a microscope, the DNA molecule will resemble a barcode with black and white stripes. Because of point-spread functions and pixellation the resolution will be roughly one data point per 200nm (or roughly 700 bp). This resolution is typically enough to distinguish between two different DNA molecules. First DNA barcodes are used for analyzing an antibiotic resistance outbreak. In the outbreak, antibiotic resistant bacteria infected newborn children at Sahlgrenska University Hospital. The bacteria were of different strains and it was suspected that the bacteria shared the antibiotic resistant gene with bacteria not containing it through the exchange of plasmids. A plasmid is a short circular DNA molecule, typical length between 2 kbp to 1 Mbp (base pairs), which bacteria use to store genes that benefit survival (such as antibiotic resistance genes). The second method is about matching short pieces of DNA sequence, called contigs, to a long intact barcode (from the same molecule as the contigs) to figure out the order of the pieces of sequence. In order to match a sequence to a barcode, the sequence has to be converted into a theoretical barcode first. After that it is compared to the long barcode, to find the optimal placement. Contigs are not supposed to overlap, and that is an assumption used in the methods presented in section 7. The matching in both methods is facilitated by the use of our new statistical tools in order to reduce the number of false positives in the matching process. The results for the plasmid tracing method show that the method can be used to trace plasmid spread. On the other hand, the results for the contig assembly show that the method has potential to be useful, but at the moment it has been unsuccessful at assembling real contigs into a full, correct, sequence.}},
  author = {{Pichler, Christoffer}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Contig assembly and plasmid analysis using DNA barcodes}},
  year = {{2016}},
}