FMSN30/MASM22: Linear and Logistic Regression, 7.5 ECTS credits

News

R

We will use the statistical program R, which can be downloaded free of charge for all major platforms from http://ftp.acc.umu.se/mirror/CRAN/. It is a good idea to install it on your own computer, if you have one. It is also good practice to use a suitable editor for writing and running R programs, so I have set up a page for RStudio (for Windows/Linux/macOS).

Note that this course is about statistics and is not an in-depth course on R. We will discuss the commands needed to produce the desired output and answer the relevant statistical questions. However, we will not cover tips and tricks, good programming practice, or advanced use of this powerful language. R has a large and friendly user community, and a simple Google search will turn up plenty of good guides, tutorials and answered questions. Here are some of the many guides freely available on the web:

Course specific help:

Computer Labs

You can book specific computer lab sessions. That is, you do not have to attend all the labs listed in the schedule below, only the ones you book. Pay special attention to the mandatory labs, shown in bold: you MUST attend one of those each week for the first three weeks.

Literature

Schedule spring 2019

Week Place Contents Material
w13 Mon 25/3, 13.15-15, MH:Riesz Lecture 1: Introduction; Review of simple linear regression, linear relationships, linear models and basic assumptions (normality, homoscedasticity, linearity, independence), least squares estimation, basic properties of expectation, variance and covariance; mean and variance of least squares estimators. Rawlings Ch.1;
lecture1_vt19.pdf;
lecture1_vt19.R
Wed 27/3, 8.15-10, MH:Riesz Lecture 2: Continuation of simple linear regression; distribution of least squares estimators; prediction; confidence intervals; hypothesis testing, p-values, quantiles. Rawlings Ch.1
Wed 27/3, 10.15-12 MH:230+231 or 13.15-15, MH:230 Compulsory computer lab 1 Sign-up;
lab1_vt19.pdf;
Thu 28/3, 8.15-10, MH:230+231 Work on project 1
w14 Mon 1/4, 13.15-15, MH:Riesz Lecture 3. Multiple Regression: matrix notation, properties of least squares estimators for multiple regression; confidence intervals for multiple regression; critical requirements; rank-deficient design matrices, lack of invertibility. Rawlings Ch.3, 4, 6.5
Wed 3/4, 8.15-10, MH:Riesz Lecture 4. Categorical variables. Analysis of variance: variability decomposition. Global F-test. ANOVA tables. Partial F-test. Rawlings Ch.4, 9.
Wed 3/4, 10.15-12 MH:230+231 or 13.15-15, MH:230 Compulsory computer lab 2 Sign-up;
Thu 4/4, 8.15-10, MH:230+231 Work on project 1
w15 Mon 8/4, 13.15-15, MH:Riesz Lecture 5. R-squared, Adjusted-R-squared. AIC & BIC, automatic selection methods Rawlings Ch.7
Wed 10/4, 8.15-10, MH:Riesz Lecture 6. Problem areas in least squares; Regression diagnostics: outliers w.r.t. X (leverage), distribution of residuals, standardised and studentised residuals; graphical tools for residual analysis. Influential observations (Cook's distance, DFBETAS) Rawlings Ch.10-11
Wed 10/4, 10.15-12 MH:230+231 or 13.15-15, MH:230 Compulsory computer lab 3 Sign-up;
Thu 11/4, 8.15-10, MH:230+231 Work on project 1
w16 Mon 15/4, 13.15-15, MH:Riesz 13.15-14.00: Peer assessment, project 1
14.15-15: Wrapping up linear regression
Tue 16/4, 8.15-10, MH:Riesz Lecture 7. Binary data, Bernoulli and binomial distributions, odds ratios, and an introduction to logistic regression. Agresti Ch. 1, sec 1.2.1, sec 2.3
Wed 17/4, 13.15-15, MH:230 Finish project 1 and start on project 2
Wed 17/4, 16.00 Project 1 final deadline at 16:00. MASM22/FMSN30 students email the report to FMSN30@matstat.lu.se. Subject field: Project1 by studid1 and studid2
w17 EASTER BREAK and RE-EXAM PERIOD
w18
w19 Mon 6/5, 13.15-15, MH:Riesz 8. Maximum likelihood estimation, Newton-Raphson, properties, deviance and likelihood ratio tests. Agresti: 1.3.1, 1.4.1, 2.3.1-2.3.3; several topics scattered in chapter 4, particularly sections 4.1-4.2.
Wed 8/5, 8.15-10, MH:Riesz 9. Akaike (again), Pseudo-R2, residuals and model validation in logistic regression.
Wed 8/5, 10.15-12, MH:230 Work on project 2
Thu 9/5, 8.15-10, MH:230 Work on project 2
w20 Mon 13/5, 13.15-15, MH:Riesz 10. Poisson distribution and Poisson regression; Negative binomial regression Agresti: several sections in Chapter 3.
Wed 15/5, 8.15-10, MH:Riesz 8.15-9.00: Peer assessment, project 2
9.15-10: 11. Summary of logistic regression.
Wed 15/5, 10.15-12, MH:230 Work on project 2 and start on project 3
Thu 16/5, 8.15-10, MH:230 Work on project 2 and/or start on project 3
Fri 17/5, 16.00 Project 2 final deadline at 16.00. MASM22/FMSN30 students email the report to FMSN30@matstat.lu.se. Subject field: Project2 by studid1 and studid2
w21 Wed 22/5, 13.15-15, MH:230 Work on project 3
Thu 23/5, 8.15-10, MH:230 Work on project 3
w22 Mon 27/5, 8.15-10, MH:Sigma Project 3 oral presentations
Mon 27/5, 10.15-12, MH:Sigma Project 3 oral presentations
Mon 27/5, 15.15-17, MH:Sigma Project 3 oral presentations
Tue 28/5, 10.15-12, MH:Sigma Project 3 oral presentations
Tue 28/5, 15.15-17, MH:Sigma Project 3 oral presentations
Wed 29/5, 8.15-10, MH:Sigma Project 3 oral presentations
Wed 29/5, 13.15-15, MH:Sigma Project 3 oral presentations
Wed 29/5, 15.15-17, MH:Sigma Project 3 oral presentations
w23 Mon 3/6, Tue 4/6, Wed 5/6, 8.15-17.00 Oral exams
w24 Mon 10/6, Tue 11/6, Wed 12/6, Fri 14/6, 8.15-17.00 Oral exams
w25 Mon 17/6, Tue 18/6, Wed 19/6, Thu 20/6, 8.15-17.00 Oral exams

Level

Advanced level.

Aim

Regression analysis deals with modelling how one characteristic (height, weight, price, concentration, etc.) varies with one or several other characteristics (sex, living area, expenditures, temperature, etc.). Linear regression is introduced in the basic course in mathematical statistics; here we go further and ask questions such as "how do I check that the model fits the data", "what should I do if it doesn't fit", "how uncertain is it", and "how do I use it to draw conclusions about reality".

When the response is categorical, as in a survey where people can answer "yes/no", "little/just fine/much", "car/bicycle/bus" or some other categorical alternative, linear regression is not appropriate and you need logistic regression instead. This is the topic of the second half of the course.
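
As a rough illustration of why linear regression is unsuitable for a yes/no answer, the following R sketch fits both models to a small made-up data set. The data and the names x, y, linear_fit, logistic_fit and new_x are assumptions invented for this example and are not course material; lm(), glm() and predict() are standard R functions.

```r
# Hypothetical toy data: x is a numeric predictor, y a yes/no answer coded 1/0.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(0, 0, 0, 1, 0, 1, 1, 1, 1, 1)

# Linear regression treats y as continuous, so its fitted values
# are not restricted to the interval [0, 1].
linear_fit <- lm(y ~ x)

# Logistic regression models the log-odds of y = 1 as linear in x.
logistic_fit <- glm(y ~ x, family = binomial)

# Predict at a few x values, including some outside the observed range.
new_x <- data.frame(x = c(-5, 0, 15))
p_linear <- predict(linear_fit, newdata = new_x)
p_logistic <- predict(logistic_fit, newdata = new_x, type = "response")

print(range(p_linear))    # smallest and largest linear prediction
print(range(p_logistic))  # smallest and largest predicted probability
```

With these made-up data the straight line gives values below 0 and above 1 at the extreme x values, which cannot be read as probabilities, whereas the logistic predictions stay strictly between 0 and 1.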

Contents

Least squares and the maximum likelihood method; odds ratios; Multiple linear and logistic regression; Matrix formulation; Methods for model validation, residuals, outliers, influential observations, multicollinearity, change of variables; Choice of regressors, F-test, likelihood ratio test; Confidence intervals and prediction. Introduction to: Correlated errors, Poisson regression as well as multinomial and ordinal logistic regression.

Prerequisites

At least 60 ECTS at university level including an introductory course in mathematical statistics, e.g. MASA01 Mathematical statistics, basic course, 15hp, or MASB02 Mathematical statistics (for chemists) 7.5hp, or MASB03 Mathematical statistics (for physicists) 9hp, or MASB11 Biostatistics, basic course 7.5hp, or equivalent.

Teaching and examination

The teaching consists of lectures, computer exercises and project work. Attendance at the three compulsory computer exercises is required. The examination is both written and oral: written reports for projects 1 and 2, an oral presentation of project 3, and an individual oral examination.

Lecturer

Anna Lindgren, tel 046-2224276, office MH:136, Matematikcentrum, anna@maths.lth.se.

Teaching Assistants

Learning outcomes

Knowledge and understanding

For a passing grade the student must

Skills and abilities

For a passing grade the student must

Judgement and approach

For a passing grade the student must