Forward Inclusion in Multiple Linear Regression
Mattis Gottlow and Zahra Sadeghi

Centre for Mathematical Sciences
Mathematical Statistics
Lund University

Model selection in linear regression often uses forward inclusion: starting from a model with no predictors, one predictor is added at a time until the model no longer improves significantly. Whether to add a predictor is decided by a test at each step, and there are numerous ways of performing this test. However, each step typically ignores that both the model being tested and the hypothesis being set up depend on the data. Moreover, the actual significance level of the test is unknown.
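As an illustration of the forward inclusion scheme described above, the following is a minimal sketch in plain Python. It assumes a partial F-test with a fixed "F-to-enter" threshold of 4.0 as the stopping rule; that threshold, the helper names (`dot`, `center`, `forward_inclusion`) and the simulated data are assumptions made for this example and are not taken from the thesis. Because the best candidate at each step is picked by looking at the data, the nominal level of the F-test does not apply, which is the problem the abstract points to.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def forward_inclusion(columns, y, f_to_enter=4.0):
    """Greedy forward inclusion for least squares with an intercept.

    columns    : list of predictor columns (each a list of floats)
    y          : list of response values
    f_to_enter : illustrative fixed threshold for the partial F statistic
    Returns the indices of the included predictors, in order of inclusion.
    """
    n = len(y)
    r = center(y)  # residual of the intercept-only model
    cand = {j: center(c) for j, c in enumerate(columns)}
    selected = []
    rss = dot(r, r)
    while cand:
        # Drop in RSS obtained by adding each remaining candidate.
        gains = {j: dot(x, r) ** 2 / dot(x, x)
                 for j, x in cand.items() if dot(x, x) > 1e-12}
        df = n - len(selected) - 2  # residual df after adding one predictor
        if not gains or df <= 0:
            break
        best = max(gains, key=gains.get)
        new_rss = rss - gains[best]
        f_stat = gains[best] / (new_rss / df) if new_rss > 0 else float("inf")
        if f_stat < f_to_enter:
            break  # no significant improvement: stop
        x = cand.pop(best)
        xx = dot(x, x)
        # Remove the new predictor's direction from the residual and from
        # the remaining candidates, so later gains are conditional on it.
        b = dot(x, r) / xx
        r = [ri - b * xi for ri, xi in zip(r, x)]
        cand = {j: [ci - (dot(x, c) / xx) * xi for ci, xi in zip(c, x)]
                for j, c in cand.items()}
        selected.append(best)
        rss = new_rss
    return selected


# Simulated example: only predictors 0 and 2 enter the true model.
rng = random.Random(0)
n = 200
cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(5)]
y = [1 + 3 * cols[0][i] - 2 * cols[2][i] + rng.gauss(0, 0.5) for i in range(n)]
selected = forward_inclusion(cols, y)
print(selected)
```

In this simulated setting the two true predictors enter first. Any pure-noise predictor that follows them is an example of the selection effect that a fixed threshold does not account for.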

Alternatively, one can take the sequence of models generated by the forward inclusion scheme and select the model that minimizes an estimate of the prediction error. Traditional criteria (such as Mallows' Cp) ignore that the models under consideration depend on the data.
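To make the traditional criterion concrete, the sketch below evaluates Mallows' Cp in its usual form, Cp = RSS_p / sigma^2 - n + 2p, along a nested path of models and picks the minimizer. The RSS values are hypothetical numbers chosen only for illustration, and estimating sigma^2 from the largest model on the path is a common convention assumed here, not something stated in the thesis.

```python
def mallows_cp(rss_path, n, sigma2):
    """Mallows' Cp for each model on a nested path.

    rss_path[k] is the residual sum of squares of the model with k
    predictors plus an intercept, i.e. p = k + 1 parameters.
    """
    return [rss / sigma2 - n + 2 * (k + 1) for k, rss in enumerate(rss_path)]


# Hypothetical RSS values for models with 0, 1, ..., 5 predictors.
rss_path = [520.0, 210.0, 98.0, 96.5, 96.0, 95.8]
n = 100

# Error variance estimated from the largest model, which has
# len(rss_path) parameters (5 predictors and an intercept).
sigma2 = rss_path[-1] / (n - len(rss_path))

cp = mallows_cp(rss_path, n, sigma2)
best_k = min(range(len(cp)), key=cp.__getitem__)
print(best_k, [round(c, 2) for c in cp])
```

The formula treats each model on the path as if it had been fixed in advance. When the path itself comes from forward inclusion, the RSS values are optimistically small, which is the dependence on data that the criterion ignores.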

In this thesis we present three Monte Carlo-based model selection methods that, to varying extents, take these facts into account. The performance of these methods is investigated in a simulation study, and the results show that a better choice of model can be made by accounting for the dependence of the selected model on the data.