News
 Calendar for the oral exam
 Calendar for project 3 presentations
 List of questions for the oral exam
 Welcome letter
 There will be several computer labs during the course, some related to project work and others to non-project work. Compulsory computer labs are held on Wednesday 22 March 2017, Wednesday 29 March 2017 and Wednesday 5 April 2017.
 This course is taught jointly with FMSN40: please check http://www.maths.lth.se/matstat/kurser/fmsn40/
R
We will use the statistical program R, which can be downloaded from http://ftp.acc.umu.se/mirror/CRAN/ free of charge for all major platforms. It is a good idea to install it on your own computer, if you have one. It is also good practice to use an appropriate editor for writing and executing R programs; I have therefore set up a page for RStudio (for Windows/Linux/MacOS). Notice that this course is about Statistics and is not an in-depth course about R. We will discuss the commands needed to produce the desired output and answer the relevant statistical questions. However, we will not consider tips and tricks, good programming practice or any advanced use of this powerful language. R has a large and friendly user community, and you will be able to find plenty of good guides, tutorials and answered questions with a simple Google search. Here follow some of the many guides freely available on the web:
 R.pdf A (small) R Tutorial
 RTutorial.pdf A Short R Tutorial
 Rintro.pdf An Introduction to R (the most up-to-date version can be found in R's Help menu)
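To give a flavour of the kind of commands discussed in the course, here is a minimal sketch of a simple linear regression in R. It is not taken from the course material: the data set cars ships with R and is used only for illustration, and the object names fit and new_speeds are arbitrary choices.

```r
# Illustration only: 'cars' is a data set shipped with R, not a course file.
data(cars)

# Least squares fit of stopping distance on speed.
fit <- lm(dist ~ speed, data = cars)

# Estimates, standard errors, t-tests and R-squared.
print(summary(fit))

# 95% confidence intervals for intercept and slope.
print(confint(fit, level = 0.95))

# Prediction intervals for two hypothetical new speeds.
new_speeds <- data.frame(speed = c(10, 20))
print(predict(fit, newdata = new_speeds, interval = "prediction"))
```

The same pattern (fit with lm, inspect with summary and confint, predict with predict) carries over to multiple regression by adding terms to the right-hand side of the formula.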
Computer Labs
You will have the chance to book specific computer lab sessions. That is, you do not have to attend all the labs reported in the schedule below, only the ones you book. Special attention should be devoted to the mandatory labs, denoted in RED: you MUST attend one of those each week for the first three weeks.
Schedule
IMPORTANT: attendance at labs for non-project work is COMPULSORY: these are held on Wednesday 22 March 2017 (10.15-12.00 OR 13.15-15.00), Wednesday 29 March 2017 (10.15-12.00 OR 13.15-15.00) and Wednesday 5 April 2017 (10.15-12.00 OR 13.15-15.00).
Week  Contents  Additional Support  Files  

w12  Mon 20/3, 13.15-15 in MH:A (=MH:Riesz)  Introduction; review of simple linear regression: linear relationships; linear models and basic assumptions (normality, homoscedasticity, linearity, independence); least squares estimation; basic properties of expectation, variance and covariance; mean and variance of least squares estimators  Rawlings, Ch. 1 
f1.pdf
f1.R 
Wed 22/3, 8.15-10 in MH:B (=MH:Gårding)  Continuation of simple linear regression: distribution of least squares estimators; prediction; confidence intervals; hypothesis testing, p-values, quantiles  Rawlings, Ch. 1 
f2.pdf
f2.R 

Wed 22/3, 10.15-12 in MH:230-231 or 13.15-15 in MH:230-231  compulsory computer lab 
lab1
repair1.txt 

Fri 24/3, 8.15-10 in MH:230-231 or 10.15-12 in MH:230-231  computer lab: work on project 1 
Project 1 studentmath.txt  
w13  Mon 27/3, 13.15-15 in MH:A  Multiple regression: matrix notation; properties of least squares estimators for multiple regression; confidence intervals for multiple regression; critical requirements: ill-ranked design matrices, lack of invertibility.  Rawlings, Ch. 3, 4, 6.5 
lab1 solutions See the updated f2.pdf f3.pdf multipleregression.R 
Wed 29/3, 8.15-10 in MH:B  Analysis of variance: variability decomposition. Global F-test. ANOVA tables.  Rawlings, Ch. 4. 
f3.pdf multicollinear.R global+partial_Ftests.R 

Wed 29/3, 10.15-12 in MH:230 or 13.15-15  compulsory computer lab 
lab2 catheter.txt  
Thur 30/3,  computer lab: work on project 1. Enroll here  
w14  Mon 3/4, 13-15 in MH:A  Partial F-test. Factors/categorical variables: modelling with categorical predictors and interaction terms. R-squared  Rawlings, Ch. 9 for class variables, Ch. 7 for variable selection 
lab2 solutions f3.pdf f4.pdf f5.pdf f4.R f4.dat 
Wed 5/4, 8.15-10 in MH:B  Adjusted R-squared. AIC & BIC, automatic selection methods. Problem areas in least squares.  Rawlings, Ch. 10-11; intro to Akaike's AIC 
f5.pdf rsquaredAICstep.R  
Wed 5/4, 10.15-12 in MH:230 or 13.15-15 in MH:230  compulsory computer lab 
lab3 lab3 solutions student.txt 

Thur 6/4,  computer lab, work on project 1  
w15  BREAK  
w16  BREAK  
w17  Mon 24/4, 13.15-15 in MH:A  13.15-14:00: Peer assessment, project 1. 14:15-15: Regression diagnostics: outliers w.r.t. X (leverage), distribution of residuals, standardised and studentised residuals; graphical tools for residual analysis. Influential observations (Cook's distance, DFBETAS) 
Rawlings, Ch. 10-11 
f6.pdf f6.R f6data.txt 
Wed 26/4, 8.15-10 in MH:B  Binary data, Bernoulli and binomial distributions, odds ratios; introduction to logistic regression  Agresti: ch. 1, sec 1.2.1, sec 2.3 
f7.pdf f7.R f7a.R 

Wed 26/4,  Computer lab: keep working on project 1 or start project 2 (if we managed to introduce enough material at the lecture) 
project2_17.pdf hilda.Rdata 

Fri 28/4  Project 1 final deadline at 16:00: MASM22/FMSN30 students email the report to FMSN30@matstat.lu.se Subject field: Project1 by studid1 and studid2 

w18  Tue 2/5, 10.15-12 in E:C  Asymptotic distribution for parameter estimates and odds ratios from logistic regression; standard errors; maximum likelihood estimation for the parameter of Bernoulli experiments; maximum likelihood for regression parameters; Newton-Raphson method 
Agresti: 1.3.1, 1.4.1, 2.3.1-2.3.3; several topics scattered in chapter 4, particularly sections 4.1-4.2. Check here section 3 on Newton-Raphson. 
see again files uploaded on 26/4, and in addition f8.pdf f8.R f8a.R Section 3 on Newton-Raphson 
Wed 3/5, 8.15-10 in E:C  Deviance and likelihood ratio test. Residuals and model validation in logistic regression. Comparing and fitting observed proportions with predicted proportions. Brief introduction to Generalised Linear Models and exponential families. 
f9.pdf f10.pdf f9.R f9.txt f9_2.txt  
Wed 3/5, 10.15-12 in MH:230  Computer exercise, work on project 2  
Thur 4/5,  Computer exercise, work on project 2  
w19  Mon 8/5, 13-15 in MH:A  Poisson distribution and Poisson regression; negative binomial regression  Agresti: several sections in Chapter 3. Also see the example discussed here 
f10.pdf f10.R f10.txt f10b.txt poissregrawards.R poisson_sim.csv 
Wed 10/5, 8.15-10 in MH:B  8.15-9: Peer assessment, project 2; 9.15-10: Quantile regression 
material on quantile regression:  intro article  quantile regression with R 
f11.pdf f11.R f11a.R 

Wed 10/5, 10.15-12 in E:1147  Computer exercise, work on project 2  
Thur 11/5,  Computer exercise, work on project 2  
Fri 12/5  Project 3 dataCDI.txt data.Rda 

w20  Monday 15/5 at 10.00  Final deadline for project 2: FMSN30/MASM22 students email the report to FMSN30@matstat.lu.se Subject field: Project2 by studid1 and studid2  
Wed 17/5, 10.15-12 in MH:230  Computer lab: work on project 3  
Thur 18/5,  Computer lab: work on project 3  
w21  Project 3 oral presentations  
w22  Oral exam 
Level
Advanced level.
Aim
Regression analysis deals with modelling how one characteristic (height, weight, price, concentration, etc) varies with one or several other characteristics (sex, living area, expenditures, temperature, etc). Linear regression is introduced in the basic course in mathematical statistics but here we expand with, e.g., "how do I check that the model fits the data", "what should I do if it doesn't fit", "how uncertain is it", and "how do I use it to draw conclusions about reality".
When performing a survey where people can answer "yes/no" or "little/just fine/much", or "car/bicycle/bus" or some other categorical alternative, you cannot use linear regression. Then you need logistic regression instead. This is the topic in the second half of the course.
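As a rough illustration of the yes/no case, the sketch below fits a logistic regression in R and reads off an odds ratio. It is not part of the course material: it uses the mtcars data set shipped with R, treating the 0/1 variable am as a stand-in for a yes/no answer and wt as an arbitrary predictor, and the names fit and p_hat are chosen only for this example.

```r
# Illustration only: 'mtcars' ships with R; 'am' (0/1) plays the role
# of a yes/no response and 'wt' is an arbitrary numeric predictor.
data(mtcars)

# Logistic regression: models the log-odds of am = 1 as linear in wt.
fit <- glm(am ~ wt, data = mtcars, family = binomial)
print(summary(fit))

# exp(coefficient) is the odds ratio for a one-unit increase in wt.
print(exp(coef(fit)))

# Estimated probability of am = 1 at a hypothetical value wt = 3.
p_hat <- predict(fit, newdata = data.frame(wt = 3), type = "response")
print(p_hat)
```

Unlike a straight-line fit to the 0/1 values, the estimated probability from this model always lies between 0 and 1.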
Contents
Least squares and the maximum-likelihood method; odds ratios; multiple linear and logistic regression; matrix formulation; methods for model validation, residuals, outliers, influential observations, multicollinearity, change of variables; choice of regressors, F-test, likelihood-ratio test; confidence intervals and prediction. Introduction to: correlated errors, Poisson regression, as well as multinomial and ordinal logistic regression.
Prerequisites
At least 60 ECTS at university level including an introductory course in mathematical statistics, e.g. MASA01 Mathematical statistics, basic course, 15 hp, or MASB02 Mathematical statistics (for chemists), 7.5 hp, or MASB03 Mathematical statistics (for physicists), 9 hp, or MASB11 Biostatistics, basic course, 7.5 hp, or equivalent.
Teaching and examination
The teaching consists of lectures, exercises, computer exercises and project work. Among the several computer labs given, attendance at three is compulsory, namely those held on Wednesday 22 March, Wednesday 29 March and Wednesday 5 April 2017 (10.15-12 or 13.15-15 each day). The examination is written and oral, in the form of project reports, written and oral opposition, and an individual oral examination.
Literature
 Rawlings, J.O., Pantula, S.G., Dickey, D.A.: Applied Regression Analysis: A Research Tool, 2nd ed., Springer, available as e-book.
 Agresti, A.: An Introduction to Categorical Data Analysis, 2nd ed., Wiley, 2007, available as e-book.
 Need to refresh matrix theory? Check a minimal introduction and the very useful Matrix Cookbook.
Lecturer
Umberto Picchini, tel 046 222 9270, office MH:321, email address.
Teaching Assistants
Rachele Anderson, tel 046 222 4580, office MH:323, rachele@maths.lth.se
Vladimir Pastukhov, tel 462227974, office MH:324, pastuhov@maths.lth.se
Learning outcomes
Knowledge and understanding
For a passing grade the student must
 Describe the differences between continuous and discrete data, and the resulting consequences for the choice of statistical model,
 Give an account of the principles behind different estimation methods,
 Describe the statistical properties of such estimates as appear in regression analysis,
 Interpret regression relations in terms of conditional distributions,
 Explain the concepts of odds and odds ratio, and describe their relation to probabilities and to logistic regression.
Skills and abilities
For a passing grade the student must
 Formulate a multiple linear regression model for a concrete problem,
 Formulate a multiple logistic regression model for a concrete problem,
 Estimate the parameters in the regression model and interpret them,
 Examine the validity of the model and make suitable modifications of the model,
 Use the resulting model for prediction,
 Use some statistical computer program for analysis of regression data, and interpret the results,
 Present the analysis and conclusions of a practical problem in a written report and an oral presentation.
Judgement and approach
For a passing grade the student must
 Always check the prerequisites before stating a regression model,
 Evaluate the plausibility of a performed study,
 Reflect on the limitations of the chosen model and estimation method, as well as on alternative solutions.