Modelling of DNA Copy Number Variations using Continuous-index Hidden Markov
Models
Susann Stjernqvist
Centre for Mathematical Sciences
Mathematical Statistics
Lund University,
2008
ISSN 1404-028X

Abstract:

The number of copies of DNA has been shown to differ between cancer tumour
cells and healthy cells. These aberrations can be deletions as well as
amplifications. Sometimes entire chromosomes are affected, but in other cases
only one or several short segments are. This thesis shows how to model
the copy numbers and thereby find the deviant regions.


One method to measure DNA copy number variations is array Comparative Genomic
Hybridisation (aCGH), which is a kind of microarray technique. The method
yields the ratio between the number of copies of the DNA of a test sample
and that of a given reference sample. Each spot on the array corresponds to
a short sequence of base pairs in the genome.
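
As a small illustration of the measured quantity, the sketch below converts
a copy number ratio to the log2 scale that is commonly used for aCGH data.
The function name log2_ratio and the assumption of a reference sample with
two copies are illustrative choices, not taken from the thesis.

```python
import math

def log2_ratio(test_copies, reference_copies=2):
    """Log2 of the test-to-reference copy number ratio.

    The default of two reference copies is an assumption for illustration.
    A test copy number of zero is not handled and raises ValueError.
    """
    return math.log2(test_copies / reference_copies)

# One copy (deletion), two copies (normal), three and four copies (amplification).
for copies in (1, 2, 3, 4):
    print(copies, round(log2_ratio(copies), 3))
```

Under these assumptions a normal region sits at zero, deletions are negative
and amplifications are positive.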


There are several different methods for modelling aCGH data, and the methods
in this thesis belong to the group that uses hidden Markov models. A hidden
Markov model can be described as a Markov chain observed in noise. Since
the clones are of different lengths, are unevenly spread over the genome,
and may overlap, we introduce a continuous-index method.
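
The idea of a Markov chain observed in noise at unevenly spaced positions can
be sketched as follows. This is a minimal simulation under stated assumptions,
not the model of the thesis: the state values, the jump rate, the uniform
choice of the next state, the Gaussian noise, and the reduction of each clone
to a single midpoint (ignoring clone length and overlap) are all hypothetical
simplifications, as are the function names.

```python
import random

def simulate_jump_path(length, rate, states, rng):
    """Simulate a continuous-index Markov chain on [0, length].

    Distances between jumps are exponential with the given rate. At each
    jump the chain moves to a uniformly chosen different state (an
    illustrative choice). Returns a list of (start_position, state) pairs.
    """
    position = 0.0
    state = rng.choice(states)
    path = [(position, state)]
    while True:
        position += rng.expovariate(rate)
        if position >= length:
            return path
        state = rng.choice([s for s in states if s != state])
        path.append((position, state))

def state_at(path, position):
    """Return the state of the piecewise-constant path at a position."""
    current = path[0][1]
    for start, state in path:
        if start > position:
            break
        current = state
    return current

def observe(path, positions, noise_sd, rng):
    """Observe the hidden path at the given positions with Gaussian noise."""
    return [state_at(path, x) + rng.gauss(0.0, noise_sd) for x in positions]

rng = random.Random(1)
states = [-1.0, 0.0, 0.58]  # hypothetical mean log2 ratios: loss, normal, gain
path = simulate_jump_path(length=100.0, rate=0.05, states=states, rng=rng)
# Unevenly spaced clone midpoints along the genome (hypothetical units).
positions = sorted(rng.uniform(0.0, 100.0) for _ in range(20))
observations = observe(path, positions, noise_sd=0.1, rng=rng)
```

Because the hidden chain is indexed by position rather than by clone number,
the dependence between two neighbouring observations weakens as the distance
between the clones grows, which a discrete-index model cannot express.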


In the first paper a continuous-index hidden Markov model with a fixed number
of states is introduced. The parameters are estimated using Monte Carlo EM,
and the Markov chains are simulated by MCMC. In the second paper we further
develop the model to make it more realistic and less complex by introducing
a latent continuous Markov jump process. The process then has a continuous
state space. A Bayesian approach is adopted and we continue using MCMC for
the simulations.






