Umberto Picchini's Master's projects for students

2015: *"Synthetic likelihoods for statistical parameter estimation"*. Feel free to contact me if interested.

Thanks to a rapid increase in computational power for commonly available personal computers,
statisticians and mathematical modellers now have a number of affordable possibilities to perform
computationally intensive statistical inference for relatively complex mathematical models. A typical scenario
involves considering some data (these could be the result of measurements taken during an experiment or be artificial/simulated "data")
and trying to fit a mathematical model to them. A common problem is then to determine plausible values for the unknown quantities
entering the model definition: for example, the constant parameters appearing in the model are usually unknown
or fixed to some convenient value. The statistician's task is to give a reasonable estimate of these parameters using appropriate
statistical methods.
Historically, the main tools to perform statistical inference on unknown parameters or other unknown quantities are:
(1) maximum likelihood; (2) Bayesian methods. Both classes have appealing theoretical properties and, depending on the model at hand,
the statistician might decide to pick a statistical method belonging to (1) or (2). Here the student should consider a relatively recent methodology named
"synthetic likelihoods" (see reference below).
Synthetic likelihoods is a Monte Carlo methodology for approximating the likelihood function of the parameters; it can be applied in both a Bayesian and a maximum likelihood framework, although the former is more intuitive. One of the challenges of the methodology is that
the observed data are not used directly: instead, a (rather arbitrary) set of summary statistics of the data must be constructed, and an approximate likelihood function is then obtained from these using Monte Carlo simulations.
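
As a rough illustration of the idea, here is a minimal Python sketch in the spirit of the reference below: for a given parameter value, simulate many datasets from the model, reduce each to its summary statistics, fit a Gaussian to those simulated summaries, and evaluate its log-density at the observed summaries. The toy model (i.i.d. draws from a normal distribution with unknown mean), the choice of summaries, the grid search and all function names are assumptions made purely for illustration; they are not part of the project specification.

```python
import numpy as np

def simulate(theta, n_obs, rng):
    """Hypothetical toy model: n_obs i.i.d. draws from N(theta, 1)."""
    return rng.normal(loc=theta, scale=1.0, size=n_obs)

def summaries(y):
    """Hypothetical summary statistics: sample mean and log sample variance."""
    return np.array([np.mean(y), np.log(np.var(y, ddof=1))])

def synthetic_loglik(theta, s_obs, n_obs, n_sim, rng):
    """Gaussian synthetic log-likelihood of theta given observed summaries."""
    # Simulate n_sim datasets at theta and reduce each one to its summaries.
    S = np.array([summaries(simulate(theta, n_obs, rng)) for _ in range(n_sim)])
    # Estimate the mean and covariance of the simulated summaries.
    mu = S.mean(axis=0)
    Sigma = np.atleast_2d(np.cov(S, rowvar=False))
    # Evaluate the fitted multivariate normal log-density at s_obs.
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = diff @ np.linalg.solve(Sigma, diff)
    d = s_obs.size
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)

rng = np.random.default_rng(1)
y_obs = simulate(2.0, 100, rng)      # artificial "data" with true theta = 2
s_obs = summaries(y_obs)

# Crude maximisation over a grid of candidate parameter values.
grid = np.linspace(0.0, 4.0, 21)
values = [synthetic_loglik(t, s_obs, 100, 200, rng) for t in grid]
theta_hat = grid[int(np.argmax(values))]
print(theta_hat)
```

Because each evaluation relies on fresh simulations, the approximate log-likelihood is itself noisy; the number of simulations per parameter value (here 200) is one of the tuning choices that would need to be explored for a given problem.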

The thesis work consists of experimenting with this idea under different models, using some real or simulated data to be decided with the supervisor. Therefore this is a
project in: (i) computational statistics and inference; (ii) model exploration; (iii) software coding. Some open questions are: how to optimally "tune"
the algorithm, which is always a problem-specific task; how to determine when the asymptotics appear to be met for a given problem;
and how the method compares with alternative methodologies.

**Necessary pre-requisites are:** some background in Monte Carlo methods, for example having taken this course or a similar one; interest in statistical inference; ability to code in some programming language.

Writing a fair amount of software from scratch is an important part of this project, as we are not going to use pre-existing/pre-packaged software solutions.
This is because writing the necessary software is the ultimate test of the knowledge-building and understanding process.

**Reference:**

Simon N. Wood, "Statistical inference for noisy nonlinear ecological dynamic systems", Nature, 466, 1102-1104, 2010. http://www.maths.lth.se/matstat/staff/umberto/SyntheticLikelihoods_MSCproject/wood_2010.pdf (Notice this is for information only: the student is **not required**
to be familiar with this article before starting the project).

2016: David Zenkert, PNR-based no-show forecast, Lund University, Sweden.

2015: Danial Ali Akbari, Maximum likelihood estimation using Bayesian Monte Carlo methods, Lund University, Sweden.

2013: Oskar Nilsson, Likelihood-free inference and approximate Bayesian computation for stochastic modelling, Lund University, Sweden.

2012: Angela Ciliberti, Parametric inference for stochastic differential equations, Lund University, Sweden.

2011: Alexander Powne, "Diagnostic measures for generalized linear models", Durham University, UK.