Lund University Publications

Empirical distributions of F(ST) from large-scale human polymorphism data

Elhaik, Eran (2012) In PLoS ONE 7(11).
Abstract

Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright's F(ST), which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high-F(ST) may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical F(ST) analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations, and an even smaller fraction (1%) is found between intra-continental populations. Second, the global F(ST) distribution closely follows an exponential distribution. Third, although the overall F(ST) distribution is similarly shaped (inverse J), F(ST) distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean-F(ST) of these groups is linear in allele frequency. These results suggest that investigating the extremes of the F(ST) distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.
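
As a minimal illustration of the quantities the abstract refers to, the sketch below computes Wright's F(ST) for a single biallelic SNP as the variance of allele frequencies across populations, standardized by p(1 - p) at the mean frequency p, and then groups SNPs into non-overlapping allele-frequency bins. The function names, the equal weighting of populations, and the five-bin default are assumptions made for illustration; they are not the estimator or the binning scheme used in the paper.

```python
from statistics import mean


def wright_fst(freqs):
    """F(ST) for one biallelic SNP from per-population allele frequencies.

    Assumes equal population weights (an illustrative simplification):
    F(ST) = Var(p) / (p_bar * (1 - p_bar)).
    """
    p_bar = mean(freqs)
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0  # monomorphic site: no differentiation to measure
    var_p = mean([(p - p_bar) ** 2 for p in freqs])
    return var_p / (p_bar * (1.0 - p_bar))


def bin_by_frequency(snps, n_bins=5):
    """Group per-SNP F(ST) values into non-overlapping bins of mean allele frequency.

    `snps` is a list of per-population frequency lists, one list per SNP.
    Returns a dict mapping bin index -> list of F(ST) values.
    """
    bins = {}
    for freqs in snps:
        p_bar = mean(freqs)
        # Frequencies of exactly 1.0 fall into the last bin.
        idx = min(int(p_bar * n_bins), n_bins - 1)
        bins.setdefault(idx, []).append(wright_fst(freqs))
    return bins
```

With per-bin lists in hand, the extremes of each bin's F(ST) distribution (for example, its upper tail) can be inspected separately, which is the kind of per-frequency-group comparison the abstract describes.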
author
Elhaik, Eran
publishing date
2012
type
Contribution to journal
publication status
published
keywords
Alleles, Asian Continental Ancestry Group/genetics, European Continental Ancestry Group/genetics, Gene Frequency, Genetic Drift, Genetics, Population, HapMap Project, Haplotypes, Humans, Linkage Disequilibrium, Models, Genetic, Polymorphism, Single Nucleotide/genetics, Population Groups, Selection, Genetic
in
PLoS ONE
volume
7
issue
11
article number
e49837
pages
12 pages
publisher
Public Library of Science (PLoS)
external identifiers
  • pmid:23185452
  • scopus:84869783128
ISSN
1932-6203
DOI
10.1371/journal.pone.0049837
language
English
LU publication?
no
id
5e68893b-995a-4787-ac91-8f5f00ed6713
date added to LUP
2019-11-10 16:52:43
date last changed
2025-10-17 12:58:13
@article{5e68893b-995a-4787-ac91-8f5f00ed6713,
  abstract     = {{Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright's F(ST), which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high-F(ST) may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical F(ST) analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations, and an even smaller fraction (1%) is found between intra-continental populations. Second, the global F(ST) distribution closely follows an exponential distribution. Third, although the overall F(ST) distribution is similarly shaped (inverse J), F(ST) distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean-F(ST) of these groups is linear in allele frequency. These results suggest that investigating the extremes of the F(ST) distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.}},
  author       = {{Elhaik, Eran}},
  issn         = {{1932-6203}},
  keywords     = {{Alleles; Asian Continental Ancestry Group/genetics; European Continental Ancestry Group/genetics; Gene Frequency; Genetic Drift; Genetics, Population; HapMap Project; Haplotypes; Humans; Linkage Disequilibrium; Models, Genetic; Polymorphism, Single Nucleotide/genetics; Population Groups; Selection, Genetic}},
  language     = {{eng}},
  number       = {{11}},
  publisher    = {{Public Library of Science (PLoS)}},
  journal      = {{PLoS ONE}},
  title        = {{Empirical distributions of F(ST) from large-scale human polymorphism data}},
  url          = {{http://dx.doi.org/10.1371/journal.pone.0049837}},
  doi          = {{10.1371/journal.pone.0049837}},
  volume       = {{7}},
  year         = {{2012}},
}