Using Importance Sampling to Improve Simulation in Linkage Analysis

Lars Ängquist and Ola Hössjer

Centre for Mathematical Sciences
Mathematical Statistics
Lund Institute of Technology,
Lund University

ISSN 1403-9338
In this article we describe and discuss the implementation of a weighted simulation procedure, importance sampling, in the context of nonparametric linkage analysis. The objective is to estimate genome-wide p-values, i.e. the probability that the maximal linkage score exceeds a given threshold under the null hypothesis of no linkage. To reduce the variance of the p-value estimate for large thresholds, we simulate linkage scores under a distribution that differs from the null, with an artificial disease locus positioned somewhere along the genome. To compensate for simulating under the wrong distribution, the simulated scores are reweighted using a certain likelihood ratio. If the design parameters of the sampling distribution are chosen appropriately, the variance of the final significance value estimate is reduced. This yields more accurate genome-wide p-value estimates for large thresholds, based on a substantially smaller number of simulations than is needed with traditional unweighted simulation.
We illustrate the performance of the method for several pedigree examples, discuss implementation, including the choice of sampling parameters, and describe some possible generalizations.
Key words:
Nonparametric linkage analysis, importance sampling, change of probability measure, exponential tilting, marker information, genome-wide significance
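
The reweighting idea summarized in the abstract can be illustrated with a toy example that is much simpler than the linkage setting. The sketch below is not the procedure of this article: it assumes a single standard normal score instead of a maximal linkage score over the genome, and it uses a mean-shifted normal as the sampling distribution (exponential tilting of the normal). The function names and the choice of setting the shift equal to the threshold are assumptions made for illustration only.

```python
import math
import random


def exact_tail(threshold):
    """Exact P(Z > threshold) for a standard normal Z, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def tail_prob_naive(threshold, n, rng):
    """Unweighted simulation: fraction of null draws above the threshold."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n


def tail_prob_importance(threshold, n, rng, shift=None):
    """Importance sampling estimate of P(Z > threshold).

    Draws come from N(shift, 1) rather than the null N(0, 1). Each draw
    above the threshold is weighted by the likelihood ratio
    phi(x) / phi(x - shift) = exp(-shift * x + shift**2 / 2).
    """
    if shift is None:
        shift = threshold  # illustrative choice, not a recommendation
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    t, n = 4.0, 20000
    print("exact     :", exact_tail(t))
    print("naive     :", tail_prob_naive(t, n, rng))
    print("importance:", tail_prob_importance(t, n, rng))
```

With a threshold this far in the tail, the unweighted estimate is typically based on very few or no exceedances, whereas under the shifted distribution roughly half of the draws exceed the threshold and the likelihood-ratio weights correct for the shift.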