Lund University Publications

DOGMA: de novo assembly of densely labelled optical DNA maps using a matrix profile approach

Dvirnas, Albertas (LU); Leal-Garza, Luis Mario; Abbaspour, Zahra; Fröbrant, Erik; Frykholm, Karolin; Wrande, Marie; Sandegren, Linus; Westerlund, Fredrik and Ambjörnsson, Tobias (LU) (2025). In PLOS ONE 20(12 December).
Abstract

In optical genome mapping (OGM), large numbers of individual DNA maps (sequence-specific data series along single DNA molecules) are produced. These individual maps have to be stitched together, in a process called de novo OGM assembly, to create consensus OGM maps for the corresponding regions along the chromosomes. While there are several types of experimental OGM assays, not all of them have de novo OGM assembly tools available. In particular, no such tools exist for densely labelled OGM. Here, we present and evaluate DOGMA, a de novo OGM assembly algorithm for densely labelled OGM data that uses matrix profiles. The matrix profile has transformed how data mining problems are approached in time series analysis, yet it has not been widely explored outside the time series community; here we use it for de novo OGM assembly for the first time. Further novelties in our algorithm are the introduction of two scores for each individual alignment, the use of p-values, a visual representation as barcode islands, and a method for generating consensus barcodes using amplitude adjustment. Using p-values helps mitigate the risk of assembly errors caused by false positives. We demonstrate our algorithm by applying it to de novo OGM assembly of synthetic datasets and of an experimental dataset from an Escherichia coli genome. We validate the assemblies using the corresponding reference genomes and investigate the strengths and limitations of the algorithm. De novo OGM assembly of dense optical DNA maps shows promise as a complement or an alternative to current OGM techniques for other types of genome mapping assays. The code is available at: https://github.com/dnadevcode/dogma.
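To illustrate the general idea of a matrix profile mentioned in the abstract, the following is a minimal sketch of a naive "AB-join" between two intensity traces: for every window of one trace it records the z-normalised Euclidean distance to the best-matching window of the other. This is not the DOGMA implementation. The function names (`znorm`, `ab_join_matrix_profile`), the window length `m`, and the synthetic traces are assumptions made purely for illustration, and the brute-force search stands in for the much faster algorithms used in practice.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def znorm(x):
    """Z-normalise a 1-D window; a constant window maps to all zeros."""
    x = np.asarray(x, dtype=float)
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)


def ab_join_matrix_profile(a, b, m):
    """Naive AB-join matrix profile (hypothetical helper, O(len(a)*len(b)*m)).

    For each length-m window of `a`, return the distance to, and the start
    index of, its nearest length-m window in `b`.
    """
    wa = sliding_window_view(np.asarray(a, dtype=float), m)
    wb = sliding_window_view(np.asarray(b, dtype=float), m)
    za = np.array([znorm(w) for w in wa])
    zb = np.array([znorm(w) for w in wb])
    # Pairwise Euclidean distances between all z-normalised windows.
    d = np.sqrt(((za[:, None, :] - zb[None, :, :]) ** 2).sum(axis=2))
    idx = d.argmin(axis=1)
    return d[np.arange(len(za)), idx], idx


# Synthetic example: two "maps" cut from the same random trace with a
# 50-point overlap; map_b also has a different amplitude and baseline.
rng = np.random.default_rng(0)
trace = rng.normal(size=300)
map_a = trace[50:150]
map_b = 2.0 * trace[100:200] + 1.0

profile, index = ab_join_matrix_profile(map_a, map_b, m=40)
best = int(profile.argmin())            # window of map_a with the best match
offset = best - int(index[best])        # estimated shift of map_b relative to map_a
print(offset)
```

Because each window is z-normalised before comparison, the match is insensitive to the amplitude and baseline difference between the two traces, so the sketch should recover the offset of 50 points at which `map_b` starts within `map_a`. Deciding whether such a match is statistically significant, which the abstract addresses with p-values, is outside the scope of this sketch.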

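author
Dvirnas, Albertas; Leal-Garza, Luis Mario; Abbaspour, Zahra; Fröbrant, Erik; Frykholm, Karolin; Wrande, Marie; Sandegren, Linus; Westerlund, Fredrik and Ambjörnsson, Tobias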
type
Contribution to journal
publication status
published
in
PLOS ONE
volume
20
issue
12 December
article number
e0335633
publisher
Public Library of Science (PLoS)
external identifiers
  • pmid:41325363
  • scopus:105023281685
ISSN
1932-6203
DOI
10.1371/journal.pone.0335633
language
English
LU publication?
yes
id
da59ba83-075f-4674-bc3e-57bc4c8ddba4
date added to LUP
2026-01-14 13:18:17
date last changed
2026-01-28 14:40:52
@article{da59ba83-075f-4674-bc3e-57bc4c8ddba4,
  abstract     = {{In optical genome mapping (OGM), large numbers of individual DNA maps (sequence-specific data series along single DNA molecules) are produced. These individual maps have to be stitched together, in a process called de novo OGM assembly, to create consensus OGM maps for the corresponding regions along the chromosomes. While there are several types of experimental OGM assays, not all of them have de novo OGM assembly tools available. In particular, no such tools exist for densely labelled OGM. Here, we present and evaluate DOGMA, a de novo OGM assembly algorithm for densely labelled OGM data that uses matrix profiles. The matrix profile has transformed how data mining problems are approached in time series analysis, yet it has not been widely explored outside the time series community; here we use it for de novo OGM assembly for the first time. Further novelties in our algorithm are the introduction of two scores for each individual alignment, the use of p-values, a visual representation as barcode islands, and a method for generating consensus barcodes using amplitude adjustment. Using p-values helps mitigate the risk of assembly errors caused by false positives. We demonstrate our algorithm by applying it to de novo OGM assembly of synthetic datasets and of an experimental dataset from an Escherichia coli genome. We validate the assemblies using the corresponding reference genomes and investigate the strengths and limitations of the algorithm. De novo OGM assembly of dense optical DNA maps shows promise as a complement or an alternative to current OGM techniques for other types of genome mapping assays. The code is available at: https://github.com/dnadevcode/dogma.}},
  author       = {{Dvirnas, Albertas and Leal-Garza, Luis Mario and Abbaspour, Zahra and Fröbrant, Erik and Frykholm, Karolin and Wrande, Marie and Sandegren, Linus and Westerlund, Fredrik and Ambjörnsson, Tobias}},
  issn         = {{1932-6203}},
  language     = {{eng}},
  number       = {{12 December}},
  publisher    = {{Public Library of Science (PLoS)}},
  series       = {{PLOS ONE}},
  title        = {{DOGMA : de novo assembly of densely labelled optical DNA maps using a matrix profile approach}},
  url          = {{http://dx.doi.org/10.1371/journal.pone.0335633}},
  doi          = {{10.1371/journal.pone.0335633}},
  volume       = {{20}},
  year         = {{2025}},
}