Improving the calculation of statistical significance in genome-wide scans
(2005) In Biostatistics 6(4). p.520-538
Abstract
 Calculations of the significance of results from linkage analysis can be performed by simulation or by theoretical approximation, with or without the assumption of perfect marker information. Here we concentrate on theoretical approximation. Our starting point is the asymptotic approximation formula presented by Lander and Kruglyak (1995, Nature Genetics, 11, 241-247), incorporating the effect of finite marker spacing as suggested by Feingold et al. (1993, American Journal of Human Genetics, 53, 234-251). We consider two distinct ways in which this formula can be improved. Firstly, we present a formula for calculating the crossover rate rho for a pedigree of general structure. For a pedigree set, these values may then be weighted into an overall crossover rate which can be used as input to the original approximation formula. Secondly, the unadjusted p-value formula is based on the assumption of a Normally distributed nonparametric linkage (NPL) score. This leads to conservative or anticonservative p-values of varying magnitude depending on the pedigree set structure. We adjust for non-Normality by calculating the marginal distribution of the NPL score under the null hypothesis of no linkage with an arbitrarily small error. The NPL score is then transformed to have a marginal standard Normal distribution and the transformed maximal NPL score, together with a slightly corrected value of the overall crossover rate, is inserted into the original formula in order to calculate the p-value. We use pedigrees of seven different structures to compare the performance of our suggested approximation formula to the original approximation formula, with and without skewness correction, and to results found by simulation. We also apply the suggested formula to two real pedigree set structure examples. Our method generally seems to provide improved behavior, especially for pedigree sets which show clear departure from Normality, in relation to the competing approximations.
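As a rough illustration of the starting point named in the abstract, the sketch below evaluates the dense-marker form of the Lander and Kruglyak approximation for a Normally distributed score: the expected number of regions exceeding a threshold x is taken as mu(x) = (C + 2 rho G x^2)(1 - Phi(x)), and the genome-wide p-value as 1 - exp(-mu(x)). This is not the adjusted formula proposed in the paper. The finite marker spacing correction, the pedigree-specific crossover rate and the transformation for non-Normality are all left out. The function names, parameter names and default values (23 chromosomes, a genome length of 33 Morgans) are assumptions made for the example only.

```python
import math


def pointwise_p(x: float) -> float:
    """Upper-tail probability 1 - Phi(x) of a standard Normal score."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def genomewide_p(x: float, rho: float, n_chrom: int = 23,
                 genome_morgans: float = 33.0) -> float:
    """Dense-marker approximation of the genome-wide p-value.

    x              maximal (Normal-scale) linkage score observed in the scan
    rho            crossover rate of the pedigree set (assumed known here)
    n_chrom        number of chromosomes C (illustrative default)
    genome_morgans genome length G in Morgans (illustrative default)
    """
    # Expected number of regions in which the score process exceeds x.
    mu = (n_chrom + 2.0 * rho * genome_morgans * x * x) * pointwise_p(x)
    # Poisson-type approximation: probability of at least one such region.
    return 1.0 - math.exp(-mu)


if __name__ == "__main__":
    # rho = 2.0 is used only as an example input.
    for x in (3.0, 3.5, 4.0, 4.5):
        print(f"x = {x:.1f}  pointwise = {pointwise_p(x):.2e}  "
              f"genome-wide = {genomewide_p(x, rho=2.0):.3f}")
```

The approximation is asymptotic, so it is only meaningful for fairly large positive x. A larger crossover rate rho increases the expected number of exceedances and therefore the genome-wide p-value for the same observed score.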
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/223491
 author
 Ängquist, Lars ^{LU} and Hossjer, O
 organization
 publishing date
 2005
 type
 Contribution to journal
 publication status
 published
 subject
 keywords
 allele sharing, adjusted approximation formula, approximation of distributions, crossover rate, deviation from Normality, extreme value formulas, Hermite polynomials, genome-wide significance, marker density, nonparametric linkage analysis
 in
 Biostatistics
 volume
 6
 issue
 4
 pages
 520 - 538
 publisher
 Oxford University Press
 external identifiers

 wos:000232102700002
 pmid:15831574
 scopus:18944363606
 ISSN
 1468-4357
 DOI
 10.1093/biostatistics/kxi025
 language
 English
 LU publication?
 yes
 id
 17be39a8f771437e9be628f60d6f9593 (old id 223491)
 date added to LUP
 2016-04-01 11:38:57
 date last changed
 2022-01-26 08:11:09
@article{17be39a8f771437e9be628f60d6f9593,
  abstract     = {{Calculations of the significance of results from linkage analysis can be performed by simulation or by theoretical approximation, with or without the assumption of perfect marker information. Here we concentrate on theoretical approximation. Our starting point is the asymptotic approximation formula presented by Lander and Kruglyak (1995, Nature Genetics, 11, 241-247), incorporating the effect of finite marker spacing as suggested by Feingold et al. (1993, American Journal of Human Genetics, 53, 234-251). We consider two distinct ways in which this formula can be improved. Firstly, we present a formula for calculating the crossover rate rho for a pedigree of general structure. For a pedigree set, these values may then be weighted into an overall crossover rate which can be used as input to the original approximation formula. Secondly, the unadjusted p-value formula is based on the assumption of a Normally distributed nonparametric linkage (NPL) score. This leads to conservative or anticonservative p-values of varying magnitude depending on the pedigree set structure. We adjust for non-Normality by calculating the marginal distribution of the NPL score under the null hypothesis of no linkage with an arbitrarily small error. The NPL score is then transformed to have a marginal standard Normal distribution and the transformed maximal NPL score, together with a slightly corrected value of the overall crossover rate, is inserted into the original formula in order to calculate the p-value. We use pedigrees of seven different structures to compare the performance of our suggested approximation formula to the original approximation formula, with and without skewness correction, and to results found by simulation. We also apply the suggested formula to two real pedigree set structure examples. Our method generally seems to provide improved behavior, especially for pedigree sets which show clear departure from Normality, in relation to the competing approximations.}},
  author       = {{Ängquist, Lars and Hossjer, O}},
  issn         = {{1468-4357}},
  keywords     = {{allele sharing; adjusted approximation formula; approximation of distributions; crossover rate; deviation from Normality; extreme value formulas; Hermite polynomials; genome-wide significance; marker density; nonparametric linkage analysis}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{520--538}},
  publisher    = {{Oxford University Press}},
  series       = {{Biostatistics}},
  title        = {{Improving the calculation of statistical significance in genome-wide scans}},
  url          = {{http://dx.doi.org/10.1093/biostatistics/kxi025}},
  doi          = {{10.1093/biostatistics/kxi025}},
  volume       = {{6}},
  year         = {{2005}},
}