Lund University Publications

Geographic population structure analysis of worldwide human populations infers their biogeographical origins

Elhaik, Eran (LU); Tatarinova, Tatiana; Chebotarev, Dmitri; Piras, Ignazio S; Maria Calò, Carla; De Montis, Antonella; Atzori, Manuela; Marini, Monica; Tofanelli, Sergio; Francalacci, Paolo, et al. (2014) In Nature Communications 5.
Abstract

The search for a method that utilizes biological information to predict humans' place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an accuracy of 700 km in Europe, they were inaccurate elsewhere. Here we describe the Geographic Population Structure (GPS) algorithm and demonstrate its accuracy with three data sets using 40,000-130,000 SNPs. GPS placed 83% of worldwide individuals in their country of origin. Applied to over 200 Sardinian villagers, GPS placed a quarter of them in their villages and most of the rest within 50 km of their villages. GPS's accuracy and power to infer the biogeography of worldwide individuals down to their country or, in some cases, village of origin underscore the promise of admixture-based methods for biogeography and have ramifications for genetic ancestry testing.

@article{1d63b61d-e68e-455a-b344-c868334e0cea,
  abstract     = {{The search for a method that utilizes biological information to predict humans' place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an accuracy of 700 km in Europe, they were inaccurate elsewhere. Here we describe the Geographic Population Structure (GPS) algorithm and demonstrate its accuracy with three data sets using 40,000-130,000 SNPs. GPS placed 83% of worldwide individuals in their country of origin. Applied to over 200 Sardinian villagers, GPS placed a quarter of them in their villages and most of the rest within 50 km of their villages. GPS's accuracy and power to infer the biogeography of worldwide individuals down to their country or, in some cases, village of origin underscore the promise of admixture-based methods for biogeography and have ramifications for genetic ancestry testing.}},
  author       = {{Elhaik, Eran and Tatarinova, Tatiana and Chebotarev, Dmitri and Piras, Ignazio S and Maria Calò, Carla and De Montis, Antonella and Atzori, Manuela and Marini, Monica and Tofanelli, Sergio and Francalacci, Paolo and Pagani, Luca and Tyler-Smith, Chris and Xue, Yali and Cucca, Francesco and Schurr, Theodore G and Gaieski, Jill B and Melendez, Carlalynne and Vilar, Miguel G and Owings, Amanda C and Gómez, Rocío and Fujita, Ricardo and Santos, Fabrício R and Comas, David and Balanovsky, Oleg and Balanovska, Elena and Zalloua, Pierre and Soodyall, Himla and Pitchappan, Ramasamy and Ganeshprasad, Arunkumar and Hammer, Michael and Matisoo-Smith, Lisa and Wells, R Spencer}},
  issn         = {{2041-1723}},
  keywords     = {{Algorithms; Europe; Genetics, Population/methods; Genome, Human/genetics; Humans; Polymorphism, Single Nucleotide/genetics}},
  language     = {{eng}},
  month        = {{04}},
  publisher    = {{Nature Publishing Group}},
  journal      = {{Nature Communications}},
  title        = {{Geographic population structure analysis of worldwide human populations infers their biogeographical origins}},
  url          = {{http://dx.doi.org/10.1038/ncomms4513}},
  doi          = {{10.1038/ncomms4513}},
  volume       = {{5}},
  year         = {{2014}},
}