
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Whole-Genome Sequencing of two Swedish Individuals on PromethION

Fatima, Nazeefa (2019) BINP50 20182
Degree Projects in Bioinformatics
Abstract
Background: Chromosomes can undergo various changes such as deletions, inversions, insertions, and/or translocations, resulting in structural variation between individuals. Structural variants (SVs) are a common source of variability in the human genome and are known to be associated with common diseases such as autism and cancer, as well as with rare human diseases [1, 2]. However, they have not yet been studied extensively at high resolution. SVs are complex genomic components, in part because they tend to arise in repetitive regions [3]. Alignment of short reads to repetitive regions can be ambiguous, which has made SV detection challenging in the past. New approaches for SV detection have been enabled by recent improvements in sequencing technologies. In particular, the new long-read single-molecule sequencing instruments provided by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) produce a high yield in a short period while keeping the cost of library preparation low. These instruments make it possible to generate high-quality representations of whole genomes and enable reliable structural variant calling in human individuals [4, 5].
Objectives: A recent study that applied PacBio’s Single-Molecule Real-Time sequencing to two Swedish human genomes, Swe1 (male) and Swe2 (female), as part of the SweGen 1000 Genomes project (https://swefreq.nbis.se), uncovered over 17,000 SVs per individual, as well as various other genomic components [6] that are otherwise not detectable in short reads. As a follow-up study, we have now generated data for the same two Swedish individuals on ONT’s PromethION system, a new nanopore-based sequencing instrument known for its higher throughput compared to PacBio.
Results and Conclusion: We present a pilot study that evaluates nanopore data derived from whole-genome sequencing (WGS) on PromethION in comparison to the Single-Molecule Real-Time (SMRT) reads obtained from the PacBio RS II platform. We performed comparative analyses of the two single-molecule long-read technologies in the context of mappability and SV detection, which resulted in an average of 17k and 24k variants across nanopore and SMRT datasets, respectively. The results will be useful for the large-scale SweGen project in the context of validation and comparison of SVs in Swedish individuals. In addition, the study serves as a bioinformatics pipeline for future long-read data analyses and sets a basis for what to consider when designing future PromethION experiments.
Popular Abstract
Whole-Genome Sequencing of two Swedish Individuals on Oxford Nanopore PromethION

Chromosomes can undergo various changes such as large deletions and/or insertions, resulting in structural variation between individuals. Structural variants (SVs) are a common source of variability in the human genome and are known to be associated with several diseases. SVs often involve complex genomic rearrangements that are difficult to resolve using short-read sequencing technologies. New approaches enabled by the latest generation of long-read single-molecule sequencing instruments, provided by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), can produce a sufficient amount of data to enable SV detection across entire human genomes at a reasonable cost.
Previously, we performed PacBio sequencing of two Swedish human genomes as part of the SweGen 1000 Genomes project (https://swefreq.nbis.se) and uncovered over 17,000 SVs per individual (Ameur et al., 2018). A majority of these SVs were not detectable in short reads. As a follow-up, we have now generated data for the same individuals on ONT’s PromethION system, a new nanopore-based platform known for its higher throughput compared to PacBio.
We present a pilot study that evaluates nanopore data derived from whole-genome sequencing (WGS) on PromethION in comparison to the Single-Molecule Real-Time (SMRT) reads obtained from the PacBio RS II platform. We performed comparative analyses of the two single-molecule technologies in the context of mappability and SV detection, which resulted in an average of 17k and 24k variants across nanopore and SMRT datasets, respectively. The results will be useful for the large-scale SweGen project, while the study serves as a bioinformatics pipeline for future long-read data analyses and sets a basis for what to consider when designing future PromethION experiments.

Master’s Degree Project in Bioinformatics 30 credits 2019
Department of Biology, Lund University

Advisor: Adam Ameur
National Genomics Infrastructure, Science for Life Laboratory, Uppsala, Sweden
author
Fatima, Nazeefa
supervisor
organization
course
BINP50 20182
year
type
H2 - Master's Degree (Two Years)
subject
language
English
id
8978951
date added to LUP
2019-06-03 12:16:16
date last changed
2019-06-03 12:16:16
@misc{8978951,
  abstract     = {{Background: Chromosomes can undergo various changes such as deletions, inversions, insertions, and/or translocations resulting in structural variation differences between individuals. Structural variants are a common source of variability in the human genome and have been known to be associated with common diseases such as autism, cancer, and rare human diseases [1, 2]. However, they have not yet been extensively studied at the higher resolution. SVs are complex genomic components partially due to being known to emerge in repetitive regions [3]. Alignment of short reads to repetitive regions can cause ambiguity and has, therefore, posed challenges in the past to detect SVs. New approaches for SV detection have been enabled by the recent improvements in sequencing technologies. In particular, the new long-read single-molecule sequencing instruments provided by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) produce a high yield in a short period while keeping a low cost for a library preparation. These instruments make it possible to generate high quality representations of whole genomes and enable reliable structural variant calling in human individuals [4, 5].
Objectives: A recent study performed on PacBio’s Single-Molecule Real-Time sequencing of two Swedish human genomes, Swe1 (male) and Swe2 (female), as part of the SweGen 1000 Genomes project (https://swefreq.nbis.se), uncovered over 17K SVs per individual as well as various other genomic components [6] that are otherwise not detectable in short reads. As a follow-up study, we have now generated data for the same two Swedish individuals on the ONT’s PromethION system, a new nanopore based sequencing instrument, that is known for its higher throughput as compared to the PacBio. 
Results and Conclusion: We present a pilot study that evaluates nanopore data derived from whole-genome sequencing (WGS) on PromethION in comparison to the Single-Molecule Real-Time (SMRT) reads obtained from the PacBio RS II platform. We performed comparative analyses of single-molecule long-read technologies in a context of mappability, and SV detection that resulted in an average of 17k and 24k variants across nanopore and SMRT datasets, respectively. The results will be useful for the large-scale SweGen project in a context of validation and comparison of SVs in Swedish individuals. In addition, the study serves as a bioinformatics pipeline for future long-read data analyses and sets a basis for what to consider when designing future PromethION experiments.}},
  author       = {{Fatima, Nazeefa}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Whole-Genome Sequencing of two Swedish Individuals on PromethION}},
  year         = {{2019}},
}