Umberto Picchini's Research

My research is about inference for stochastic dynamical systems. That is, given a mathematical model
of the time-evolution of a system subject to random *noise*, I am particularly interested in estimating the relevant unknown model parameters from
available data. Most of my research is focused on inference for state-space models and stochastic differential equation (SDE) models with applications to biomedicine. More specifically, some of my interests are: mixed-effects modelling via SDEs; likelihood-based and Bayesian inference for SDEs; approximate Bayesian computation (ABC); stochastic modelling for glycemia-insulinemia dynamics.
Finally, as a consequence of the need to apply modern (but demanding!) inferential methods for complex high-dimensional stochastic models, I am particularly fascinated by computationally challenging probabilistic methods (such as Markov chain Monte Carlo and Sequential Monte Carlo) and their efficient computer implementations.

I have recently received a research grant from the Swedish Research Council (project id 2013-5167) for the interdisciplinary project "Statistical Inference and Stochastic Modelling of Protein Folding" (here is an accessible description) for which I am the
principal investigator, in collaboration with Kresten Lindorff-Larsen (Dept. Biology, Copenhagen University) and Julie Lyng Forman (Dept. Biostatistics, Copenhagen University).

**Selected talks:** slides from recent and not-so-recent talks are available at my SlideShare account.

Below you can find short descriptions of some of the research areas and applications I have been involved with, together with links to the relevant publications (note
that most of my preprints are freely available on my publications page):

- Approximate Bayesian computation (ABC) and other likelihood-free methods
- Modelling of protein-folding data
- Stochastic models for glycemia dynamics
- Mixed-effects models defined via SDEs

Approximate Bayesian computation (ABC) is a *likelihood-free* methodology that
is enjoying increasing popularity, as it provides a practical approach to inference
for models that, because their likelihood function is intractable, would otherwise be
computationally too challenging to consider (see the reviews by Sisson and Fan (2010) and Marin et al. (2011)). Ideally we wish to make inference about unknown parameters using Bayesian methods, i.e.
given some data we want to simulate from the posterior distribution on the parameter space. However, for many complex models not only is a closed-form expression
for the posterior unavailable, but even very general Markov chain Monte Carlo (MCMC)
methods such as Metropolis-Hastings may fail for a number of reasons, including
difficulties in exploring the parameter space, multimodality
of the posterior surface, and difficulties in constructing adequate proposal densities.
For the stochastic models I am interested in (e.g. stochastic differential equation models), simulated trajectories are by nature highly erratic, and therefore
the distance between simulated and observed data may be unacceptably large for an MCMC algorithm, even when the parameters lie in the bulk of the posterior distribution. ABC circumvents
the evaluation of the intractable likelihood function while still targeting the posterior distribution
or an approximation thereof.

ABC methodology is fascinating and extremely flexible. Here I considered ABC for stochastic differential equation models observed with error.
The case of partially observed systems is also considered. Simulation studies for pharmacokinetics/pharmacodynamics and for stochastic chemical reactions
are presented. The `abc-sde` MATLAB package implementing the methodology is freely available.
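To make the idea concrete, here is a minimal sketch of the simplest ABC scheme (rejection sampling) for an SDE observed with error. It is not the algorithm implemented in `abc-sde`: the Ornstein-Uhlenbeck model, the uniform prior, the summary statistics, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_ou(theta, x0, dt, n_steps, sigma_eps, rng):
    """Euler-Maruyama path of dX = -theta*X dt + dW, observed with Gaussian error."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)  # Brownian increments
    for i in range(n_steps):
        x[i + 1] = x[i] - theta * x[i] * dt + dw[i]
    # Add measurement error to every observation (illustrative error model).
    return x + rng.normal(0.0, sigma_eps, size=n_steps + 1)

def summaries(y):
    """Hypothetical summary statistics: variance and lag-1 autocovariance."""
    yc = y - y.mean()
    return np.array([yc.var(), np.mean(yc[1:] * yc[:-1])])

def abc_rejection(y_obs, n_draws, tol, rng, dt=0.1, sigma_eps=0.1):
    """Keep prior draws whose simulated summaries fall within tol of the observed ones."""
    s_obs = summaries(y_obs)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.1, 5.0)  # draw from an assumed uniform prior
        y_sim = simulate_ou(theta, y_obs[0], dt, len(y_obs) - 1, sigma_eps, rng)
        if np.linalg.norm(summaries(y_sim) - s_obs) < tol:
            accepted.append(theta)  # the likelihood is never evaluated
    return np.array(accepted)
```

The accepted draws form a sample from an approximation to the posterior; a smaller `tol` gives a better approximation at the cost of a lower acceptance rate, which is the trade-off that more refined samplers such as ABC-MCMC try to manage.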

In another paper with Julie Lyng Forman (Biostatistics dept., Copenhagen) we used ABC for inference on a (relatively) large dataset and a computationally challenging sum-of-diffusions model for protein folding data.

In a joint work with Rachele Anderson we consider parameter estimation for a general class of models using a hybrid MLE-Bayesian strategy, ultimately leading to a maximum likelihood estimator
while making use of an ABC-MCMC sampler: this work is based on the strategy popularly known as "data cloning".

In a joint work with Adeline Samson we embed ABC within SAEM (stochastic approximation EM) for maximum likelihood estimation in state-space models (also known
as hidden Markov models).

In a single-authored work I have enabled SAEM for complex intractable models, using the concept of *synthetic likelihood*, whereas previously SAEM could be considered only for analytically tractable models.
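As a rough illustration of the synthetic likelihood idea alone (not of its embedding within SAEM), the sketch below simulates summary statistics at a given parameter value, fits a Gaussian to them, and evaluates the observed summaries under that Gaussian. The function name and the `simulate` and `summarize` callables are hypothetical placeholders supplied by the user.

```python
import numpy as np

def synthetic_loglik(theta, s_obs, simulate, summarize, n_sim, rng):
    """Gaussian synthetic log-likelihood of the observed summaries s_obs at theta."""
    # Simulate n_sim datasets at theta and reduce each one to summary statistics.
    sims = np.array([summarize(simulate(theta, rng)) for _ in range(n_sim)])
    mu = sims.mean(axis=0)
    cov = np.atleast_2d(np.cov(sims, rowvar=False))  # summaries are in columns
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    # Log-density of a multivariate normal N(mu, cov) evaluated at s_obs.
    return -0.5 * (len(s_obs) * np.log(2.0 * np.pi) + logdet + quad)

# Toy usage: data are i.i.d. N(theta, 1); summaries are the sample mean and variance.
rng = np.random.default_rng(1)
simulate = lambda theta, r: r.normal(theta, 1.0, size=100)
summarize = lambda y: np.array([y.mean(), y.var()])
s_obs = summarize(simulate(0.0, rng))
ll_near = synthetic_loglik(0.0, s_obs, simulate, summarize, 200, rng)
ll_far = synthetic_loglik(5.0, s_obs, simulate, summarize, 200, rng)
```

Only forward simulation from the model is required, so the resulting function can stand in for an intractable likelihood inside an optimiser or sampler.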

**Relevant papers:** paper #1, paper #2, paper #3, paper #4, paper #5

**Software:** my `abc-sde` package

I have received a research grant from the Swedish Research Council for the interdisciplinary project "Statistical Inference and Stochastic Modelling of Protein Folding" (here is an accessible description) for which I am the
principal investigator, in collaboration with Kresten Lindorff-Larsen (Dept. Biology, Copenhagen University) and Julie Lyng Forman (Dept. Biostatistics, Copenhagen University).

In a joint work with Julie Forman we have considered the problem of estimating folding rates for a protein with a coordinate that switches between the *folded* and *unfolded* states, as can be seen in the picture above.
The so-called "protein-folding problem" has been referred to as *"the Holy Grail of biochemistry and biophysics"*, and therefore we do not claim to be solving this problem (!).
However, some contribution can be made from the inference point of view: we have proposed a new dynamical model (expressed as a sum of two diffusions) and a fairly fast computational strategy based on approximate Bayesian computation (ABC, see above) that seems to work well and could be used in place of exact Bayesian inference
when large datasets make the latter impractical.

**Relevant papers:** paper 2014

In my early works, together with Andrea De Gaetano (Rome)
and Susanne Ditlevsen (Copenhagen), I considered the problem of formulating models able to accommodate stochastic variability in glycemia dynamics. Previous attempts in the literature focussed on deterministic modelling (ODE- and DDE-based), which is intrinsically
unable to represent randomness in the modelled (physiological) system; the only random variability that could be contemplated therefore had to be interpreted as measurement error.
With stochastic differential equations this is no longer the case. In a 2006 paper we were able to perform likelihood-based inference via computationally intensive Monte Carlo simulation and to separate intrinsic stochasticity in glycemia dynamics from
measurement error variability. To ease the inference, a more computationally feasible model was proposed in a 2008 paper, where the likelihood function is approximated in closed form,
but this time measurement error is not modelled.

**Relevant papers:** JMB 2006, MMB 2008

It is often the case that a given experiment involves repeated measurements, particularly in biomedicine, where the replicates might be
measurements from the same experiment performed on different subjects or animals. By introducing random parameters, mixed-effects models assist in modelling variability when it is of interest
to capture the overall behaviour of the entire "population" of subjects, that is, to make simultaneous inference for the collective dynamics of all subjects rather than for individual (subject-specific) behaviour. This
allows for a more precise estimation of *population parameters*.
Since the early 1980s mixed-effects dynamical models have had a deterministic flavour, i.e. they were based on ODEs. More recently support for SDEs has been introduced, thus allowing
the simultaneous representation of *within-subject* stochastic variability in addition to collective (*between-subjects*) variation. Together with Susanne Ditlevsen
(Copenhagen) and Andrea De Gaetano (Rome) I have considered likelihood-based inferential methods for mixed-effects models defined via SDEs; see a methodological 2010 paper and a more
computational paper from 2011. See also an application to neuronal models.
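In generic notation (the symbols below are illustrative assumptions rather than the exact formulation of the papers linked above), an SDE mixed-effects model for $M$ subjects can be written as

$$
dX^i_t = \mu(X^i_t, \theta, b^i)\,dt + \sigma(X^i_t, \theta, b^i)\,dW^i_t, \qquad b^i \sim N(0, \Psi), \qquad i = 1, \dots, M,
$$

where $\theta$ collects the fixed (population) parameters shared by all subjects, the random effects $b^i$ account for between-subjects variation, and the independent Brownian motions $W^i_t$ account for within-subject stochastic variability. Setting $\sigma \equiv 0$ recovers the deterministic, ODE-based mixed-effects model.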

With Julie Lyng Forman I have considered a likelihood-free methodology named *synthetic likelihood* to estimate a model of tumor growth in mice.

**Relevant papers:** SJS 2010, CSDA 2011, NECO 2008, tumor 2016