Introduction

My research concerns inference for stochastic dynamical systems. That is, given a mathematical model of the time evolution of a system subject to random noise, I am interested in estimating the unknown model parameters from the available data. Most of my research focuses on inference for stochastic differential equation (SDE) models with applications to biomedicine. More specifically, some of my interests are: mixed-effects modelling via SDEs; likelihood-based and Bayesian inference for SDEs; approximate Bayesian computation (ABC); stochastic modelling for glycemia-insulinemia dynamics. Finally, because complex high-dimensional stochastic models call for modern (but demanding!) inferential methods, I am particularly fascinated by computationally challenging probabilistic methods (such as Markov chain Monte Carlo and sequential Monte Carlo) and their efficient computer implementations.

Below you will find short descriptions of some of the research areas and applications I have been involved with, together with links to the relevant publications (note that most of my preprints are freely available on my publications page):


Stochastic models for glycemia dynamics

The picture above illustrates the fit of an SDE model to glycemia data. For details go here.

In my early work, together with Andrea De Gaetano (Rome) and Susanne Ditlevsen (Copenhagen), I considered the problem of formulating models able to accommodate stochastic variability in glycemia dynamics. Previous attempts in the literature focussed on deterministic models (ODE- and DDE-based), which are intrinsically unable to represent randomness in the modelled (physiological) system; the only random variability that could be contemplated therefore had to be interpreted as measurement error. With stochastic differential equations this is no longer the case. In a 2006 paper we carried out likelihood-based inference via computer-intensive Monte Carlo simulation and separated the intrinsic stochasticity in glycemia dynamics from measurement error variability. To ease the inference, a more computationally feasible model was proposed in a 2008 paper, where the likelihood function is approximated in closed form, but this time measurement error is not modelled.
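To make the distinction between intrinsic stochasticity and measurement error concrete, here is a minimal sketch. It does not use the glycemia models from the 2006 or 2008 papers: as an assumption for illustration only, the latent state follows an Ornstein-Uhlenbeck SDE, dX = theta (mu - X) dt + sigma dW, simulated with the Euler-Maruyama scheme, and the observations add independent Gaussian noise. The function names and all parameter values are hypothetical.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, t_end, n_steps, rng):
    """Euler-Maruyama path of dX = theta*(mu - X) dt + sigma dW (illustrative model)."""
    dt = t_end / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))  # Brownian increment over one step
        x[k + 1] = x[k] + theta * (mu - x[k]) * dt + sigma * dw
    return x

def observe(x, obs_sd, rng):
    """Add independent Gaussian measurement error to a latent path."""
    return x + rng.normal(0.0, obs_sd, size=x.shape)

rng = np.random.default_rng(0)
# sigma drives the intrinsic (system) noise; obs_sd drives the measurement error.
latent = simulate_ou(theta=0.5, mu=5.0, sigma=0.3, x0=8.0,
                     t_end=10.0, n_steps=1000, rng=rng)
observed = observe(latent, obs_sd=0.2, rng=rng)
```

In a deterministic (ODE) version of this sketch, sigma would be zero and every departure of `observed` from the smooth curve would have to be attributed to `obs_sd`.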

Relevant papers: JMB 2006; MMB 2008.



Mixed-effects models defined via SDEs

The picture above illustrates the fit of a mixed-effects SDE model to simulated data of five orange trees. For details go here.

Experiments often involve repeated measurements, particularly in biomedicine, where the replicates might be measurements of the same experiment performed on different subjects or animals. By introducing random parameters, mixed-effects models assist in modelling variability when it is of interest to capture the overall behavior of the entire "population" of subjects, that is, to make simultaneous inference for the collective dynamics of all subjects rather than for the individual (subject-specific) behavior. This allows for a more precise estimation of population parameters. Since the early 1980s mixed-effects dynamical models have had a deterministic flavour, i.e. they were based on ODEs. More recently support for SDEs has been introduced, thus allowing the simultaneous representation of within-subject stochastic variability in addition to collective (between-subjects) variation. Together with Susanne Ditlevsen (Copenhagen) and Andrea De Gaetano (Rome) I have considered likelihood-based inferential methods for mixed-effects models defined via SDEs; see a methodological 2010 paper and a more computational paper from 2011. See also an application to neuronal models.
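The two sources of variability can be sketched in a few lines. This is an illustration under assumed choices, not the model from the papers cited here: each subject follows an Ornstein-Uhlenbeck SDE whose long-run level mu_i is a random effect drawn from a Gaussian population distribution with mean `mu_pop` and standard deviation `omega` (between-subjects variation), while `sigma` controls the within-subject stochastic variability. All names and values are hypothetical.

```python
import numpy as np

def simulate_population(n_subjects, mu_pop, omega, theta, sigma,
                        x0, t_end, n_steps, rng):
    """Simulate one Euler-Maruyama path per subject for
    dX_i = theta*(mu_i - X_i) dt + sigma dW_i, with mu_i ~ N(mu_pop, omega^2)."""
    dt = t_end / n_steps
    # Random effects: one subject-specific parameter per subject.
    mu_i = mu_pop + rng.normal(0.0, omega, size=n_subjects)
    x = np.empty((n_subjects, n_steps + 1))
    x[:, 0] = x0
    for k in range(n_steps):
        # Independent Brownian increments, one per subject.
        dw = rng.normal(0.0, np.sqrt(dt), size=n_subjects)
        x[:, k + 1] = x[:, k] + theta * (mu_i - x[:, k]) * dt + sigma * dw
    return mu_i, x

rng = np.random.default_rng(1)
mu_i, paths = simulate_population(n_subjects=5, mu_pop=5.0, omega=1.0,
                                  theta=0.5, sigma=0.3, x0=8.0,
                                  t_end=10.0, n_steps=1000, rng=rng)
```

Setting `sigma` to zero recovers a deterministic (ODE-type) mixed-effects model in which subjects differ only through their random effects; setting `omega` to zero removes the between-subjects variation instead.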

Relevant papers: SJS 2010; CSDA 2011; NECO 2008.



Approximate Bayesian computation for stochastic dynamical models

© Christian P. Robert

Approximate Bayesian computation (ABC) is a likelihood-free methodology that is enjoying increasing popularity, as it provides a practical approach to inference for models that, because of the intractability of the likelihood function, would otherwise be computationally too challenging to consider (see a review by Sisson and Fan, 2010). Ideally we wish to make inference about unknown parameters using Bayesian methods, i.e. given some data we wish to simulate from the posterior density. However, for many complex models not only does a closed-form expression for the posterior turn out to be unavailable, but very general Markov chain Monte Carlo (MCMC) methods such as Metropolis-Hastings may also fail for a number of reasons, including difficulties in exploring the parameter space, multimodality in the posterior surface, and difficulties in constructing adequate proposal densities. Since simulated trajectories of SDE models are by nature highly erratic, their distance from the observed data might turn out to be unacceptably high for an MCMC algorithm based on rejections, even when the parameter starting values are located in a region close to the bulk of the posterior density. ABC circumvents the evaluation of the intractable likelihood function while still targeting the posterior density or an approximation thereof. ABC methodology is fascinating and extremely flexible. In a preprint from 2012 I considered ABC to allow inference for a very general and typically difficult to treat class of models, namely stochastic differential equation models observed with measurement error. The case of partially observed systems is also considered. Finally, a simulation study considering a multidimensional model for stochastic chemical reactions is presented.
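The basic idea can be sketched with the simplest ABC variant, plain rejection sampling. This is not necessarily the algorithm used in the 2012 preprint; it is a toy example under assumed choices: an Ornstein-Uhlenbeck SDE observed with Gaussian measurement error, a uniform prior on the single unknown parameter `mu`, and a hypothetical summary statistic (the mean of the second half of the observed series). A parameter draw is accepted when the summary of its simulated data falls within a tolerance `eps` of the summary of the observed data, so the likelihood is never evaluated.

```python
import numpy as np

def simulate_observations(mu, theta, sigma, obs_sd, x0, t_end, n_steps, rng):
    """Euler-Maruyama paths of dX = theta*(mu - X) dt + sigma dW, one row per
    entry of mu, returned with Gaussian measurement error added."""
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    dt = t_end / n_steps
    x = np.empty((mu.size, n_steps + 1))
    x[:, 0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=mu.size)
        x[:, k + 1] = x[:, k] + theta * (mu - x[:, k]) * dt + sigma * dw
    return x + rng.normal(0.0, obs_sd, size=x.shape)

def summary(y):
    """Hypothetical summary statistic: mean of the second half of each series."""
    return y[:, y.shape[1] // 2:].mean(axis=1)

def abc_rejection(y_obs, n_draws, eps, prior_low, prior_high, rng, **model):
    """Keep the prior draws whose simulated summary lies within eps of the observed one."""
    mu_prior = rng.uniform(prior_low, prior_high, size=n_draws)
    y_sim = simulate_observations(mu_prior, rng=rng, **model)
    dist = np.abs(summary(y_sim) - summary(y_obs))
    return mu_prior[dist < eps]

rng = np.random.default_rng(2)
model = dict(theta=0.5, sigma=0.3, obs_sd=0.2, x0=8.0, t_end=20.0, n_steps=400)
# Synthetic "observed" data generated with mu = 5 (an assumed true value).
y_obs = simulate_observations(5.0, rng=rng, **model)
posterior_draws = abc_rejection(y_obs, n_draws=5000, eps=0.1,
                                prior_low=0.0, prior_high=10.0, rng=rng, **model)
```

The accepted values in `posterior_draws` are samples from an approximation to the posterior of `mu`; shrinking `eps` tightens the approximation at the cost of a lower acceptance rate, which is what motivates more refined schemes such as ABC within MCMC.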

Relevant papers: preprint 2012.



MCMC and sequential Monte Carlo for (non-Markovian) dynamics in protein-folding models


This is ongoing research with Julie Lyng Forman (Copenhagen) and Michael Sørensen (Copenhagen), and at the moment there are no results ready to be posted. The problem concerns the estimation of folding rates for a protein having a coordinate that switches between the folded and unfolded states, as can be seen in the picture above. The project involves fitting a non-Markovian model, expressed as a sum of two diffusions, to a (large) data set; this is both theoretically and computationally challenging. Additional information will be posted once encouraging results become available.