Lund University Publications

Whole-genome sequencing and genome regions of special interest : Lessons from major histocompatibility complex, sex determination, and plant self-incompatibility

Vekemans, Xavier ; Castric, Vincent ; Hipperson, Helen ; Müller, Niels A. ; Westerdahl, Helena and Cronk, Quentin (2021) In Molecular Ecology 30(23). p.6072-6086
Abstract

Whole-genome sequencing of non-model organisms is now widely accessible and has allowed a range of questions in the field of molecular ecology to be investigated with greater power. However, some genomic regions that are of high biological interest remain problematic for assembly and data-handling. Three such regions are the major histocompatibility complex (MHC), sex-determining regions (SDRs) and the plant self-incompatibility locus (S-locus). Using these as examples, we illustrate the challenges of both assembling and resequencing these highly polymorphic regions and how bioinformatic and technological developments are enabling new approaches to their study. Mapping short-read sequences against multiple alternative references improves genotyping comprehensiveness at the S-locus thereby contributing to more accurate assessments of allelic frequencies. Long-read sequencing, producing reads of several tens to hundreds of kilobase pairs in length, facilitates the assembly of such regions as single sequences can span the multiple duplicated gene copies of the MHC region, and sequence through repetitive stretches and translocations in SDRs and S-locus haplotypes. These advances are adding value to short-read genome resequencing approaches by allowing, for example, more accurate haplotype phasing across longer regions. Finally, we assessed further technical improvements, such as nanopore adaptive sequencing and bioinformatic tools using pangenomes, which have the potential to further expand our knowledge of a number of genomic regions that remain challenging to study with classical resequencing approaches.

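author
Vekemans, Xavier ; Castric, Vincent ; Hipperson, Helen ; Müller, Niels A. ; Westerdahl, Helena and Cronk, Quentin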
organization
publishing date
type
Contribution to journal
publication status
published
subject
keywords
long-read sequencing, major histocompatibility complex, self-incompatibility locus, sex-determining region, whole-genome sequencing
in
Molecular Ecology
volume
30
issue
23
pages
15 pages
publisher
Wiley-Blackwell
external identifiers
  • pmid:34137092
  • scopus:85120153425
ISSN
0962-1083
DOI
10.1111/mec.16020
language
English
LU publication?
yes
additional info
Publisher Copyright: © 2021 The Authors. Molecular Ecology published by John Wiley & Sons Ltd.
id
852cda6e-87f3-43b0-b93b-ceb51d6bcb12
date added to LUP
2022-01-21 11:46:46
date last changed
2024-04-20 19:08:37
@article{852cda6e-87f3-43b0-b93b-ceb51d6bcb12,
  abstract     = {{Whole-genome sequencing of non-model organisms is now widely accessible and has allowed a range of questions in the field of molecular ecology to be investigated with greater power. However, some genomic regions that are of high biological interest remain problematic for assembly and data-handling. Three such regions are the major histocompatibility complex (MHC), sex-determining regions (SDRs) and the plant self-incompatibility locus (S-locus). Using these as examples, we illustrate the challenges of both assembling and resequencing these highly polymorphic regions and how bioinformatic and technological developments are enabling new approaches to their study. Mapping short-read sequences against multiple alternative references improves genotyping comprehensiveness at the S-locus thereby contributing to more accurate assessments of allelic frequencies. Long-read sequencing, producing reads of several tens to hundreds of kilobase pairs in length, facilitates the assembly of such regions as single sequences can span the multiple duplicated gene copies of the MHC region, and sequence through repetitive stretches and translocations in SDRs and S-locus haplotypes. These advances are adding value to short-read genome resequencing approaches by allowing, for example, more accurate haplotype phasing across longer regions. Finally, we assessed further technical improvements, such as nanopore adaptive sequencing and bioinformatic tools using pangenomes, which have the potential to further expand our knowledge of a number of genomic regions that remain challenging to study with classical resequencing approaches.}},
  author       = {{Vekemans, Xavier and Castric, Vincent and Hipperson, Helen and Müller, Niels A. and Westerdahl, Helena and Cronk, Quentin}},
  issn         = {{0962-1083}},
  keywords     = {{long-read sequencing; major histocompatibility complex; self-incompatibility locus; sex-determining region; whole-genome sequencing}},
  language     = {{eng}},
  month        = {{12}},
  number       = {{23}},
  pages        = {{6072--6086}},
  publisher    = {{Wiley-Blackwell}},
  journal      = {{Molecular Ecology}},
  title        = {{Whole-genome sequencing and genome regions of special interest : Lessons from major histocompatibility complex, sex determination, and plant self-incompatibility}},
  url          = {{http://dx.doi.org/10.1111/mec.16020}},
  doi          = {{10.1111/mec.16020}},
  volume       = {{30}},
  year         = {{2021}},
}