Azra Kurbasic & Lars Ängquist

Topics on significance in linkage analysis

Abstracts:

(Lars; based on technical report 2003:3 by Ängquist/Hössjer) This article deals with some topics regarding linkage analysis and significance. The significance of a maximal NPL score may be calculated by simulation or by theoretical approximation. Here we concentrate on the latter, under the further assumption of fully informative data. Our starting point is the asymptotic approximation formula presented by Lander and Kruglyak (1995), which is based on extreme value theory for Gaussian processes. We suggest two distinct improvements to this formula. Firstly, we present a formula for calculating the crossover rate for a pedigree of general structure. For a pedigree set, the corresponding values may be weighted into the overall crossover rate needed in the original formula. Secondly, the existing approximations are based on the assumption of a normally distributed NPL score. Depending on the pedigree structure, this may lead to either conservative or anticonservative results. To adjust for non-normality, we first calculate, with an arbitrarily small error, the marginal distribution of the NPL score under the null hypothesis of no linkage. Then a one-to-one function between the standard normal distribution and a continuous version of the marginal distribution is created. Finally, we replace the corresponding quantities in the Lander and Kruglyak formula with the output of this function for the maximal NPL score and an updated crossover rate. We have applied this method to several different pedigree sets, and our suggested improvements generally seem to improve the accuracy of the approximation, especially for pedigree sets whose NPL score distributions are severely non-normal.
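As a rough illustration of the starting point mentioned above, the following sketch evaluates the Lander and Kruglyak (1995) approximation in its commonly quoted form for a normally distributed score, $\mu(T) = [C + 2\rho G T^{2}]\,(1-\Phi(T))$, with the genome-wide p-value taken as $1 - e^{-\mu(T)}$. The function names, the default values (23 chromosomes, 33 Morgans), and the crossover rate in the usage lines are illustrative assumptions and are not taken from the technical report. Neither the pedigree-specific crossover rate nor the non-normality adjustment proposed in the abstract is implemented here.

```python
import math


def normal_tail(t: float) -> float:
    """Upper tail probability 1 - Phi(t) of the standard normal distribution."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def lk_genomewide_pvalue(t: float, rho: float,
                         n_chrom: int = 23,
                         genome_morgans: float = 33.0) -> float:
    """Approximate P(max NPL score >= t) over the whole genome.

    t              -- observed maximal score on the standard normal scale
    rho            -- crossover rate (assumed known; pedigree dependent)
    n_chrom        -- number of chromosomes C (assumed default)
    genome_morgans -- genome length G in Morgans (assumed default)
    """
    # Expected number of regions where the score exceeds t.
    mu = (n_chrom + 2.0 * rho * genome_morgans * t * t) * normal_tail(t)
    # Poisson approximation for the probability of at least one exceedance.
    return 1.0 - math.exp(-mu)


if __name__ == "__main__":
    # Hypothetical crossover rate, for illustration only.
    for score in (3.0, 3.5, 4.0, 4.5):
        print(score, lk_genomewide_pvalue(score, rho=2.0))
```

Under these assumptions the p-value decreases in the score and increases in the crossover rate, which is why an accurate crossover rate matters for the approximation.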
(Azra; based on technical report 2003:30 by Kurbasic/Hössjer) Parametric linkage analysis is usually used to find chromosomal regions linked to a disease (phenotype) that is described by a specific genetic model. This is done by investigating the relations between the disease and genetic markers, that is, well-characterized loci of known position with a clear Mendelian mode of inheritance. Assume we have found an interesting region on a chromosome that we suspect is linked to the disease. We then want to test the null hypothesis of no linkage against the alternative of linkage. As a test statistic we use the maximal lod score $Z_{\mbox{\scriptsize max}}$. It is well known that, under the null hypothesis of no linkage, the maximal lod score asymptotically has a $(2 \ln 10)^{-1}\times (\frac{1}{2}\chi^{2}(0)+\frac{1}{2}\chi^{2}(1))$ distribution when only one point (one marker) on the chromosome is studied. In this paper we show, both by simulations and by theoretical arguments, that the null distribution of $Z_{\mbox{\scriptsize max}}$ has no simple form when more than one marker is used (multipoint analysis). In fact, the distribution of $Z_{\mbox{\scriptsize max}}$ depends on the number of families, their structure, the assumed genetic model, marker density, and marker informativeness. This means that a constant critical limit for $Z_{\mbox{\scriptsize max}}$ leads to tests with different significance levels. Because of these problems, a $p$-value is, from a statistical point of view, a more desirable measure of significance than the maximal lod score.
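The single-marker asymptotic distribution quoted above can be turned into a pointwise p-value. A minimal sketch, assuming the mixture $(2 \ln 10)^{-1}\times (\frac{1}{2}\chi^{2}(0)+\frac{1}{2}\chi^{2}(1))$ holds exactly: for $z > 0$, $P(Z_{\mbox{\scriptsize max}} \ge z) = \frac{1}{2} P(\chi^{2}(1) \ge 2 z \ln 10)$, and the tail of $\chi^{2}(1)$ at $x$ equals $\mathrm{erfc}(\sqrt{x/2})$. The function name is hypothetical, and the result does not apply to the multipoint case, where the abstract notes that no such simple form exists.

```python
import math


def pointwise_lod_pvalue(z_max: float) -> float:
    """Asymptotic single-marker p-value for an observed maximal lod score.

    Uses the null distribution (2 ln 10)^(-1) * (1/2 chi2(0) + 1/2 chi2(1)),
    where chi2(0) denotes a point mass at zero.
    """
    if z_max <= 0.0:
        # The maximal lod score is non-negative, so P(Z_max >= z) = 1 here.
        return 1.0
    # P(chi2(1) >= x) = erfc(sqrt(x / 2)) with x = 2 * ln(10) * z_max,
    # so sqrt(x / 2) = sqrt(z_max * ln(10)).
    return 0.5 * math.erfc(math.sqrt(z_max * math.log(10.0)))


if __name__ == "__main__":
    for lod in (1.0, 2.0, 3.0, 3.6):
        print(lod, pointwise_lod_pvalue(lod))
```

For a lod score of 3 this gives a pointwise p-value of roughly $10^{-4}$, which shows how a fixed lod threshold corresponds to one specific significance level only under the single-marker assumption.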