FMSN30/MASM22: Linear and Logistic Regression, 7.5 ECTS credits

News

R

We will use the statistical program R, which can be downloaded free of charge for all major platforms from http://ftp.sunet.se/pub/lang/CRAN/. It is a good idea to install it on your own computer, if you have one. It is also good practice to use an appropriate editor for writing R programs; I have therefore set up a page for the Tinn-R editor (Tinn-R is for Windows only, but the page contains alternative suggestions for other operating systems) and one for RStudio (for Windows/Linux/macOS).

Note that this is a course about statistics, not an in-depth course on R. We will discuss the commands needed to produce the desired output and answer the relevant statistical questions. However, we will not cover tips and tricks, good programming practice, or any advanced use of this powerful language. R has a large and friendly user community, and you will be able to find plenty of good guides, tutorials and answered questions with a simple Google search. Here follow some of the many guides freely available on the web:

Schedule

Preliminary plan.

Week Contents Chapter Notes
v3 Mon 21/1 in E:3308 Introduction; Review of simple linear regression Rawlings, Ch. 1 lecture_1.pdf (updated! only the third-to-last slide has been changed)
Wed 23/1 in E:C Multiple regression Rawlings, Ch. 2-3 lecture_2.pdf (updated! only slides 8 and 11 have been modified)
The Matrix Cookbook (<-- might be useful)
R code for lectures 1-2 (can be opened with R or any text editor).
Wed 23/1 in E:1145 Exercises: 1.4, 1.9 (*), 1.14, 1.15, 1.16, 1.19, 1.21 (*); 3.5, 3.6, 3.9, 3.10, 3.12, 3.14
(*): skip the "analysis of variance" part.
Exercises 1.x, Exercises 3.x Answers to exercises 1.x and 3.x
R code for exercise 1.9
Fri 25/1 in E:Hacke and E:Panter Computer exercise, work on project 1 project 1 manual
project1_data.Rdata (R data-file)
project1_data.csv (semicolon-separated ascii data-file)
v4 Mon 28/1 in E:3308 Analysis of variance Rawlings, Ch. 1.4, 1.6, 4.1-5 lecture_3.pdf
some R code for multiple regression
some R code for t and F tests
Wed 30/1 in E:C Class variables and variable selection Rawlings, Ch. 9 and 7 respectively lecture_4.pdf
R code for lecture 4
data for lecture 4 (right-click the link and choose "Save link as")
Wed 30/1 in E:1145 Exercises: older ones + 4.6(a)(i)+(b)(ii)+(c)(i), 4.10(a)+(b)+(c)+(e) Exercises 4.x Answers to exercises 4.x
Fri 1/2 in E:Falk and E:Val Computer exercise, work on project 1
v5 Mon 4/2 in E:3308 Problem areas in least squares Rawlings, Ch. 10 lecture_5.pdf
Wed 6/2 in E:1406 Regression diagnostics Rawlings, Ch. 11 lecture_6.pdf
R code for lecture 6
data for lecture 6
Wed 6/2 in E:1145 Exercises: old ones + 10.1-10.2, 11.1-11.2 answers Ch. 10-11
Fri 8/2 in E:Falk and E:Val Computer exercise, work on project 1
v6 Mon 11/2 in E:3308 Peer assessment, project 1. Summary of linear regression. If time allows I will start talking about binary data and odds ratios.
Wed 13/2 in E:C Binary data and odds ratios. Logistic regression Christensen, Ch. 1, 2.1-3, 2.6; also check sec. 1.4.1-1.4.2 in Agresti lecture_7.pdf
R code for lecture 7 (as shown at lecture)
R code for lecture 7 (richer version)
Wed 13/2 in E:1145 Exercise: ex. on binary data (but skip (d)) Solution for exercise on binary data
Fri 15/2 in E:Falk and E:Val Computer exercise, final deadline project 1, work on project 2 project2.pdf (manual)
v7 Mon 18/2 in E:3308 Maximum likelihood and likelihood ratio tests Agresti, sec. 3.4.4; check here section 3 on Newton-Raphson lecture 8
R code for lecture 8
Wed 20/2 in E:C Residuals and model validation in logistic regression lecture_9.pdf
R code for lecture 9
made up data
more made up data
Wed 20/2 in E:Falk and E:Val Computer exercise, work on project 2
Fri 22/2 in E:Falk and E:Val Computer exercise, work on project 2
v8 Mon 25/2 in E:3308 Poisson, Negative binomial Christensen, Ch. 1.5, 9 lecture_10.pdf
R code
Poisson regr data
Neg bin regr data
Wed 27/2 in E:1406 10-11: Peer assessment, project 2;
11-12: Quantile regression
Quantile Regression
An R "vignette"
lecture_11.pdf
f11a.R (simple version, as shown at lecture)
f11.R (applies QR to several situations)
Wed 27/2 in E:Falk and E:Val (Computer) exercise
Fri 1/3 in E:Falk and E:Val Computer exercise, Final deadline project 2, work on project 3 project3.pdf
v9 Mon 4/3 in E:Varg and E:Val Work on project 3
Wed 6/3 in E:C CANCELLED Lecture: Summary
Wed 6/3 in E:Falk and E:Val Work on project 3
Fri 8/3 in E:Falk and E:Val Work on project 3
v10-11 Presentation of project 3. See the calendar for the project 3 presentations (updated)
v12--... Individual oral exams. See the exam calendar

Level

Advanced level.

Aim

Regression analysis deals with modelling how one characteristic (height, weight, price, concentration, etc.) varies with one or several other characteristics (sex, living area, expenditures, temperature, etc.). Linear regression is introduced in the basic course in mathematical statistics; here we go further and ask questions such as "how do I check that the model fits the data", "what should I do if it doesn't fit", "how uncertain is it", and "how do I use it to draw conclusions about reality".

When you perform a survey where people answer "yes/no", "little/just fine/much", "car/bicycle/bus", or choose among some other set of categorical alternatives, you cannot use linear regression; you need logistic regression instead. This is the topic of the second half of the course.
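
To illustrate the point about yes/no answers, here is a minimal sketch. It is written in Python rather than R and is not part of the course material; the data, the helper names fit_line and logistic, and the coefficients a0 and a1 are all made up for illustration. It fits a straight line to 0/1 answers by least squares and compares the fitted values with those of a logistic curve.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def logistic(eta):
    """Map a linear predictor eta to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

# Made-up data: x is some covariate, y is a yes/no answer coded 1/0.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Straight line fitted to the 0/1 answers by least squares.
b0, b1 = fit_line(x, y)
linear_pred = [b0 + b1 * xi for xi in x]

# Hypothetical logistic coefficients, picked by hand (not estimated from the data).
a0, a1 = -9.0, 2.0
logistic_pred = [logistic(a0 + a1 * xi) for xi in x]

for xi, lp, gp in zip(x, linear_pred, logistic_pred):
    print(f"x={xi}  linear={lp:6.3f}  logistic={gp:6.3f}")

# In a logistic model, exp(a1) is the odds ratio for a one-unit increase in x.
print("odds ratio per unit of x:", math.exp(a1))
```

On this made-up data the fitted straight line gives a value below 0 at x = 1 and above 1 at x = 8, so its fitted values cannot be read as probabilities, while the logistic curve stays between 0 and 1 for every x.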

Contents

Least squares and the maximum likelihood method; odds ratios; multiple linear and logistic regression; matrix formulation; methods for model validation, residuals, outliers, influential observations, multicollinearity, change of variables; choice of regressors, F-test, likelihood ratio test; confidence intervals and prediction. Introduction to: correlated errors, Poisson regression, and multinomial and ordinal logistic regression.

Prerequisites

At least 60 ECTS at university level, including an introductory course in mathematical statistics, e.g. MASA01 Mathematical statistics, basic course, 15hp, or MASB02 Mathematical statistics (for chemists) 7.5hp, or MASB03 Mathematical statistics (for physicists) 9hp, or MASB11 Biostatistics, basic course 7.5hp, or equivalent.

Teaching and examination

The teaching consists of lectures, exercises, computer exercises and project work. The examination is written and oral in the form of project reports, written and oral opposition, and individual oral examination.

Literature (NEWS of 10th January: download links have been fixed and you can now download the pdf for each chapter!)

Lecturer

Umberto Picchini, tel 046 222 9270, office MH:125a, email address.

Teaching Assistants

Behnaz Pirzamanbin, tel 046 222 4623, office MH:326, behnaz@maths.lth.se
Elham Pirnia, elham@maths.lth.se

Learning outcomes

Knowledge and understanding

For a passing grade the student must

Skills and abilities

For a passing grade the student must

Judgement and approach

For a passing grade the student must