Testing genotyping strategies for ultra-deep sequencing of a co-amplifying gene family : MHC class I in a passerine bird

Biedrzycka, Aleksandra; Sebastian, Alvaro; Migalska, Magdalena; Westerdahl, Helena and Radwan, Jacek (2016) In Molecular Ecology Resources
Abstract

Characterization of highly duplicated genes, such as genes of the major histocompatibility complex (MHC), where multiple loci often co-amplify, has until recently been hindered by insufficient read depths per amplicon. Here, we used ultra-deep Illumina sequencing to resolve genotypes at exon 3 of MHC class I genes in the sedge warbler (Acrocephalus schoenobaenus). We sequenced 24 individuals in two replicates and used this data, as well as a simulated data set, to test the effect of amplicon coverage (range: 500-20 000 reads per amplicon) on the repeatability of genotyping using four different genotyping approaches. A third replicate employed unique barcoding to assess the extent of tag jumping, that is swapping of individual tag identifiers, which may confound genotyping. The reliability of MHC genotyping increased with coverage and approached or exceeded 90% within-method repeatability of allele calling at coverages of >5000 reads per amplicon. We found generally high agreement between genotyping methods, especially at high coverages. High reliability of the tested genotyping approaches was further supported by our analysis of the simulated data set, although the genotyping approach relying primarily on replication of variants in independent amplicons proved sensitive to repeatable errors. According to the most repeatable genotyping method, the number of co-amplifying variants per individual ranged from 19 to 42. Tag jumping was detectable, but at such low frequencies that it did not affect the reliability of genotyping. We thus demonstrate that gene families with many co-amplifying genes can be reliably genotyped using HTS, provided that there is sufficient per amplicon coverage.

type
Contribution to journal
publication status
epub
keywords
Amplicon sequencing, Bioinformatics, Copy number variation, Next-generation sequencing, Passerine MHC
in
Molecular Ecology Resources
publisher
Wiley-Blackwell
external identifiers
  • scopus:85003888983
  • wos:000403258900007
ISSN
1755-098X
DOI
10.1111/1755-0998.12612
language
English
LU publication?
yes
id
9ce2196a-605b-4282-8736-2258b0687aa3
date added to LUP
2017-02-21 14:46:06
date last changed
2017-09-18 11:34:09
@article{9ce2196a-605b-4282-8736-2258b0687aa3,
  abstract     = {Characterization of highly duplicated genes, such as genes of the major histocompatibility complex (MHC), where multiple loci often co-amplify, has until recently been hindered by insufficient read depths per amplicon. Here, we used ultra-deep Illumina sequencing to resolve genotypes at exon 3 of MHC class I genes in the sedge warbler (Acrocephalus schoenobaenus). We sequenced 24 individuals in two replicates and used this data, as well as a simulated data set, to test the effect of amplicon coverage (range: 500-20 000 reads per amplicon) on the repeatability of genotyping using four different genotyping approaches. A third replicate employed unique barcoding to assess the extent of tag jumping, that is swapping of individual tag identifiers, which may confound genotyping. The reliability of MHC genotyping increased with coverage and approached or exceeded 90% within-method repeatability of allele calling at coverages of >5000 reads per amplicon. We found generally high agreement between genotyping methods, especially at high coverages. High reliability of the tested genotyping approaches was further supported by our analysis of the simulated data set, although the genotyping approach relying primarily on replication of variants in independent amplicons proved sensitive to repeatable errors. According to the most repeatable genotyping method, the number of co-amplifying variants per individual ranged from 19 to 42. Tag jumping was detectable, but at such low frequencies that it did not affect the reliability of genotyping. We thus demonstrate that gene families with many co-amplifying genes can be reliably genotyped using HTS, provided that there is sufficient per amplicon coverage.},
  author       = {Biedrzycka, Aleksandra and Sebastian, Alvaro and Migalska, Magdalena and Westerdahl, Helena and Radwan, Jacek},
  issn         = {1755-098X},
  keywords     = {Amplicon sequencing, Bioinformatics, Copy number variation, Next-generation sequencing, Passerine MHC},
  language     = {eng},
  month        = {11},
  publisher    = {Wiley-Blackwell},
  journal      = {Molecular Ecology Resources},
  title        = {Testing genotyping strategies for ultra-deep sequencing of a co-amplifying gene family : MHC class I in a passerine bird},
  url          = {http://dx.doi.org/10.1111/1755-0998.12612},
  year         = {2016},
}