FMSN30/MASM22: Linear and Logistic Regression, 7.5 ECTS credits

News

R

We will use the statistical program R, which can be downloaded free of charge for all major platforms from http://ftp.acc.umu.se/mirror/CRAN/. It is a good idea to install it on your own computer, if you have one. It is also worth choosing a suitable editor for writing and running R programs; for this purpose I have set up a page on RStudio (available for Windows, Linux and macOS).

Note that this course is about statistics and is not an in-depth course on R. We will discuss the commands needed to produce the desired output and answer the relevant statistical questions. However, we will not cover tips and tricks, good programming practice or any advanced use of such a powerful language. R has a large and friendly user community, and a simple Google search will turn up plenty of good guides, tutorials and answered questions. Here follow some of the many guides freely available on the web:

Computer Labs

You will have the chance to book specific computer lab sessions. That is, you do not have to attend all the labs listed in the schedule below, only the ones you book. Pay special attention to the mandatory labs, marked in RED: you MUST attend one of those each week for the first three weeks.

Schedule

IMPORTANT: attendance at the labs for non-project work is COMPULSORY. These are held on Wednesday 22 March 2017 (10.15-12.00 OR 13.15-15.00), Wednesday 29 March 2017 (10.15-12.00 OR 13.15-15.00) and Wednesday 5 April 2017 (10.15-12.00 OR 13.15-15.00).
| Week | Date, time, room | Contents | Additional reading | Support files |
|---|---|---|---|---|
| w12 | Mon 20/3, 13.15-15 in MH:A (=MH:Riesz) | Introduction; review of simple linear regression: linear relationships; linear models and basic assumptions (normality, homoscedasticity, linearity, independence); least squares estimation; basic properties of expectation, variance and covariance; mean and variance of least squares estimators | Rawlings, Ch. 1 | f1.pdf, f1.R |
| | Wed 22/3, 8.15-10 in MH:B (=MH:Gårding) | Continuation of simple linear regression: distribution of least squares estimators; prediction; confidence intervals; hypothesis testing, p-values, quantiles | Rawlings, Ch. 1 | f2.pdf, f2.R |
| | Wed 22/3, 10.15-12 or 13.15-15 in MH:230-231 | Compulsory computer lab | | lab1, repair1.txt |
| | Fri 24/3, 8.15-10 or 10.15-12 in MH:230-231 | Computer lab: work on project 1 | | Project 1, studentmath.txt |
| w13 | Mon 27/3, 13.15-15 in MH:A | Multiple regression: matrix notation; properties of least squares estimators for multiple regression; confidence intervals for multiple regression; critical requirements: ill-ranked design matrices, lack of invertibility | Rawlings, Ch. 3, 4, 6.5 | lab1 solutions; see the updated f2.pdf; f3.pdf; multipleregression.R |
| | Wed 29/3, 8.15-10 in MH:B | Analysis of variance: variability decomposition; global F-test; ANOVA tables | Rawlings, Ch. 4 | f3.pdf, multicollinear.R, global+partial_Ftests.R |
| | Wed 29/3, 10.15-12 in MH:230 or 13.15-15 | Compulsory computer lab | | lab2, catheter.txt |
| | Thur 30/3, 13.15-15 or 15.15-17 in MH:230 | Computer lab: work on project 1. Enroll here | | |
| w14 | Mon 3/4, 13-15 in MH:A | Partial F-test. Factors/categorical variables: modelling with categorical predictors and interaction terms. R-squared | Rawlings, Ch. 9 for class variables, Ch. 7 for variable selection | lab2 solutions, f3.pdf, f4.pdf, f5.pdf, f4.R, f4.dat |
| | Wed 5/4, 8.15-10 in MH:B | Adjusted R-squared. AIC and BIC, automatic selection methods. Problem areas in least squares | Rawlings, Ch. 10-11; intro to Akaike's AIC | f5.pdf, rsquaredAICstep.R |
| | Wed 5/4, 10.15-12 or 13.15-15 in MH:230 | Compulsory computer lab | | lab3, lab3 solutions, student.txt |
| | Thur 6/4, 13.15-15 or 15.15-17 in MH:230 | Computer lab: work on project 1 | | |
| w15 | | BREAK | | |
| w16 | | BREAK | | |
| w17 | Mon 24/4, 13.15-15 in MH:A | 13.15-14.00: peer assessment, project 1. 14.15-15: regression diagnostics: outliers w.r.t. X (leverage), distribution of residuals, standardised and studentised residuals; graphical tools for residual analysis. Influential observations (Cook's distance, DFBETAS) | Rawlings, Ch. 10-11 | f6.pdf, f6.R, f6data.txt |
| | Wed 26/4, 8.15-10 in MH:B | Binary data, Bernoulli and binomial distributions, odds ratios; introduction to logistic regression | Agresti: Ch. 1, Sec. 1.2.1, Sec. 2.3 | f7.pdf, f7.R, f7a.R |
| | Wed 26/4, 10.15-12 or 13.15-15 in MH:230 | Computer lab: keep working on project 1, or start project 2 (if enough material has been introduced at the lecture) | | project2_17.pdf, hilda.Rdata |
| | Fri 28/4 | Project 1 final deadline at 16:00: MASM22/FMSN30 students email the report to FMSN30@matstat.lu.se. Subject field: Project1 by studid1 and studid2 | | |
| w18 | Tue 2/5, 10.15-12 in E:C | Asymptotic distribution for parameter estimates and odds ratios from logistic regression; standard errors; maximum likelihood estimation for the parameter of Bernoulli experiments; maximum likelihood for regression parameters; Newton-Raphson method | Agresti: 1.3.1, 1.4.1, 2.3.1-2.3.3; several topics scattered in Chapter 4, particularly Sections 4.1-4.2. Check here section 3 on Newton-Raphson. | See again the files uploaded on 26/4, and in addition: f8.pdf, f8.R, f8a.R, Section 3 on Newton-Raphson |
| | Wed 3/5, 8.15-10 in E:C | Deviance and likelihood ratio test. Residuals and model validation in logistic regression. Comparing and fitting observed proportions with predicted proportions. Brief introduction to generalised linear models and exponential families | | f9.pdf, f10.pdf, f9.R, f9.txt, f9_2.txt |
| | Wed 3/5, 10.15-12 or 13.15-15 in MH:230 | Computer exercise: work on project 2 | | |
| | Thur 4/5, 13.15-15 or 15.15-17 in MH:230 | Computer exercise: work on project 2 | | |
| w19 | Mon 8/5, 13-15 in MH:A | Poisson distribution and Poisson regression; negative binomial regression | Agresti: several sections in Chapter 3. Also see the example discussed here | f10.pdf, f10.R, f10.txt, f10b.txt, poissregr-awards.R, poisson_sim.csv |
| | Wed 10/5, 8.15-10 in MH:B | 8.15-9: peer assessment, project 2. 9.15-10: quantile regression | Material on quantile regression: intro article; quantile regression with R | f11.pdf, f11.R, f11a.R |
| | Wed 10/5, 10.15-12 in E:1147 or 13.15-15 in MH:230 | Computer exercise: work on project 2 | | |
| | Thur 11/5, 13.15-15 in MH:230 or 15.15-17 in E:1145 | Computer exercise: work on project 2 | | |
| | Fri 12/5 | | | Project 3, dataCDI.txt, data.Rda |
| w20 | Mon 15/5 at 10.00 | Final deadline for project 2: FMSN30/MASM22 students email the report to FMSN30@matstat.lu.se. Subject field: Project2 by studid1 and studid2 | | |
| | Wed 17/5, 10.15-12 or 13.15-15 in MH:230 | Computer lab: work on project 3 | | |
| | Thur 18/5, 13.15-15 or 15.15-17 in MH:230 | Computer lab: work on project 3 | | |
| w21 | | Project 3 oral presentations | | |
| w22 | | Oral exam | | |

Level

Advanced level.


Aim

Regression analysis deals with modelling how one characteristic (height, weight, price, concentration, etc.) varies with one or several other characteristics (sex, living area, expenditures, temperature, etc.). Linear regression is introduced in the basic course in mathematical statistics; here we go further and ask questions such as "how do I check that the model fits the data", "what should I do if it doesn't fit", "how uncertain is it", and "how do I use it to draw conclusions about reality".

When you perform a survey where people can answer "yes/no", "little/just fine/much", "car/bicycle/bus" or some other categorical alternative, you cannot use linear regression; you need logistic regression instead. This is the topic of the second half of the course.
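
To make the point about yes/no answers concrete, here is a minimal sketch. The course itself uses R; this example is written in plain Python so that it has no dependencies. The data are made up for illustration, and the function names (`fit_line`, `fit_logistic`) are hypothetical helpers, not part of any course file. It fits a straight line and a logistic curve to the same 0/1 answers and compares what each predicts outside the observed range.

```python
import math

# Made-up illustration data (an assumption, not course data):
# x is a numeric predictor, y is a yes/no answer coded as 1/0.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]


def fit_line(x, y):
    """Least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    return ybar - b * xbar, b


def logistic(eta):
    """Inverse logit: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(x, y, iterations=25):
    """Maximum likelihood fit of P(y=1) = logistic(b0 + b1*x)
    using Newton-Raphson steps, starting from b0 = b1 = 0."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [logistic(b0 + b1 * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), 2 x 2.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton-Raphson update: beta <- beta + information^{-1} * score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


a, b = fit_line(x, y)
b0, b1 = fit_logistic(x, y)

for x_new in (0, 4, 12):
    linear_pred = a + b * x_new
    logistic_pred = logistic(b0 + b1 * x_new)
    print(f"x = {x_new:2d}  linear: {linear_pred:6.3f}  logistic: {logistic_pred:6.3f}")
```

With these data the straight line gives a "probability" below 0 at x = 0 and above 1 at x = 12, whereas the logistic predictions always stay strictly between 0 and 1.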


Contents

Least squares and the maximum likelihood method; odds ratios; multiple linear and logistic regression; matrix formulation; methods for model validation, residuals, outliers, influential observations, multicollinearity, change of variables; choice of regressors, F-test, likelihood ratio test; confidence intervals and prediction. Introduction to: correlated errors, Poisson regression, and multinomial and ordinal logistic regression.


Prerequisites

At least 60 ECTS at university level, including an introductory course in mathematical statistics, e.g. MASA01 Mathematical statistics, basic course, 15hp, or MASB02 Mathematical statistics (for chemists), 7.5hp, or MASB03 Mathematical statistics (for physicists), 9hp, or MASB11 Biostatistics, basic course, 7.5hp, or equivalent.


Teaching and examination

The teaching consists of lectures, exercises, computer exercises and project work. Of the computer labs offered, attendance at three is compulsory: those held on Wednesdays 22/3, 29/3 and 5/4 (10.15-12 or 13.15-15 each day). The examination is written and oral in the form of project reports, written and oral opposition, and individual oral examination.


Literature


Lecturer

Umberto Picchini, tel 046 222 9270, office MH:321.


Teaching Assistants

Rachele Anderson, tel 046 222 4580, office MH:323, rachele@maths.lth.se
Vladimir Pastukhov, tel 462227974, office MH:324, pastuhov@maths.lth.se


Learning outcomes

Knowledge and understanding

For a passing grade the student must

Skills and abilities

For a passing grade the student must

Judgement and approach

For a passing grade the student must