A genomic mutational constraint map using variation in 76,156 human genomes
(2024) In Nature 625(7993). p.92-100
- Abstract
- The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders 1–4, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD)—the largest public open-access human genome allele frequency reference dataset—and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/5805be33-cb32-4cb2-90b6-d452ba9c5735
- author
- Chen, S. ; Groop, L. LU ; Melander, O. LU ; Nilsson, P.M. LU ; Smith, J. Gustav LU ; MacArthur, D.G. and Karczewski, K.J.
- organization
- Translational Muscle Research (research group)
- Cardiovascular Research - Hypertension (research group)
- EpiHealth: Epidemiology for Health
- EXODIAB: Excellence of Diabetes Research in Sweden
- MultiPark: Multidisciplinary research focused on Parkinson's disease
- Internal Medicine - Epidemiology (research group)
- WCMM-Wallenberg Centre for Molecular Medicine
- Heart Failure and Mechanical Support (research group)
- Cardiovascular Epigenetics (research group)
- Cardiology
- Molecular Epidemiology and Cardiology (research group)
- publishing date
- 2024
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Chromosome Mapping, Gene Frequency, Genetic Variation, Genome, Human, Genomics, Humans, Mutation, allele, DNA, gene expression, genome, mutation, protein, triangulation, article, gene frequency, genetic variation, human, human genome, natural selection, open reading frame, chromosomal mapping, genetics, genomics
- in
- Nature
- volume
- 625
- issue
- 7993
- pages
- 9 pages
- publisher
- Nature Publishing Group
- external identifiers
- scopus:85180828283
- pmid:38057664
- ISSN
- 0028-0836
- DOI
- 10.1038/s41586-023-06045-0
- language
- English
- LU publication?
- yes
- id
- 5805be33-cb32-4cb2-90b6-d452ba9c5735
- date added to LUP
- 2024-03-04 15:00:27
- date last changed
- 2024-04-03 03:32:28
@article{5805be33-cb32-4cb2-90b6-d452ba9c5735,
  abstract = {{The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders 1–4, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD)—the largest public open-access human genome allele frequency reference dataset—and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.}},
  author = {{Chen, S. and Groop, L. and Melander, O. and Nilsson, P.M. and Smith, J. Gustav and MacArthur, D.G. and Karczewski, K.J.}},
  issn = {{0028-0836}},
  keywords = {{Chromosome Mapping; Gene Frequency; Genetic Variation; Genome, Human; Genomics; Humans; Mutation; allele; DNA; gene expression; genome; mutation; protein; triangulation; article; gene frequency; genetic variation; human; human genome; natural selection; open reading frame; chromosomal mapping; genetics; genomics}},
  language = {{eng}},
  number = {{7993}},
  pages = {{92--100}},
  publisher = {{Nature Publishing Group}},
  series = {{Nature}},
  title = {{A genomic mutational constraint map using variation in 76,156 human genomes}},
  url = {{http://dx.doi.org/10.1038/s41586-023-06045-0}},
  doi = {{10.1038/s41586-023-06045-0}},
  volume = {{625}},
  year = {{2024}},
}