FMSN30/MASM22: Linear and Logistic Regression, 7.5 ECTS credits

News

R

We will use the statistical program R, which can be downloaded free of charge for all major platforms from http://ftp.acc.umu.se/mirror/CRAN/. It is a good idea to install it on your own computer, if you have one. It is also good practice to use an appropriate editor for writing and executing R programs; I have therefore set up a page for RStudio (for Windows/Linux/MacOS).

Note that this course is about statistics and is not an in-depth course on R. We will discuss the commands needed to produce the desired output and answer the relevant statistical questions. However, we will not cover tips and tricks, good programming practice or any advanced use of this powerful language. R has a large and friendly user community, and you will be able to find plenty of good guides, tutorials and answered questions with a simple Google search. Here are some of the many guides freely available on the web:

Computer Labs

You will have the chance to book specific computer lab sessions. That is, you do not have to attend all the labs listed in the schedule below, only the ones you book. Pay special attention to the mandatory labs, denoted in bold: you MUST attend one of those each week for the first three weeks.

Literature

Schedule spring 2018

Week Place Contents Additional Support Files
w12 Mon 19/3, 13.15-15, MH:Riesz 1.Introduction; Review of simple linear regression, linear relationships, linear models and basic assumptions (normality, homoscedasticity, linearity, independence), least squares estimation, basic properties of expectation, variance and covariance; mean and variance of least squares estimators Rawlings, Ch. 1 f1_vt18.pdf;
f1_vt18.R (bugfix 21/3)
Wed 22/3, 8.15-10, MH:Riesz 2. Continuation of simple linear regression; distribution of least squares estimators; prediction; confidence intervals; hypothesis testing, p-values, quantiles Rawlings, Ch. 1 f2_vt18.pdf;
f2_vt18.R
Wed 22/3, 10.15-12 or 13.15-15, MH:230 Compulsory computer lab 1 Lab 0;
Lab 1;
Lab 1-solutions
Thu 22/3, 8.15-10, MH:230 or
Fri 23/3, 13.15-15, MH:231
work on Project 1 Project 1 (updated 27/3);
plasma.txt
w13 Mon 26/3, 13.15-15, MH:Riesz 3. Multiple Regression: matrix notation, properties of least squares estimators for multiple regression; confidence intervals for multiple regression; critical requirements; ill-ranked design matrices, lack of invertibility. Rawlings, Ch. 3, 4, 6.5 f3_vt18.pdf;
f3_vt18.R;
f3_matriser.R
Tue 27/3, 8.15-10, MH:Riesz 4. Categorical variables. Analysis of variance: variability decomposition. Global F-test. ANOVA tables. Partial F-test. Rawlings, Ch. 4, 9. f4_vt18.pdf (plotfix 27/3);
f4_vt18.R (bugfix 27/3)
Tue 27/3, 10.15-12 or 13.15-15, MH:230 Compulsory computer lab 2 Lab 2;
sleep.txt;
Lab2-solutions
Wed 28/3, 13.15-15, MH:230 or MH:231 Work on project 1
w14 EASTER BREAK and RE-EXAM PERIOD
w15
w16 Mon 16/4, 13.15-15, MH:Riesz 5. R-squared, Adjusted-R-squared. AIC & BIC, automatic selection methods Rawlings, Ch. 7 f5_vt18.pdf (error on p.4 fixed);
f5_vt18.R
Wed 18/4, 8.15-10, MH:Riesz 6. Problem areas in least squares; Regression diagnostics: outliers w.r.t. X (leverage), distribution of residuals, standardised and studentised residuals; graphical tools for residual analysis. Influential observations (Cook's distance, DFBETAS) Rawlings, Ch. 10-11 f6_vt18.pdf;
f6_vt18.R;
f6data.txt;
f6_residvar.pdf;
Wed 18/4, 10.15-12 or 13.15-15, MH:230 Compulsory computer lab 3 Lab 3
CDI.txt;
Lab3-solutions
Thu 19/4, 13.15-15, MH:230 or MH:231 Work on project 1
w17 Mon 23/4, 13.15-15, MH:Riesz 13.15-14.00: Peer assessment, project 1.
14:15-15: Wrapping up linear regression
Wed 25/4, 8.15-10, MH:Riesz 7. Binary data, Bernoulli and binomial distributions, odds ratios, and an introduction to logistic regression Agresti: ch. 1, sec 1.2.1, sec 2.3 f7_vt18.pdf;
f7_vt18.R (bugfix 25/4)
Wed 25/4, 10.15-12 or 13.15-15, MH:230 Work on project 1 and start on project 2 Project 2;
pm10.txt
Thu 26/4, 16.00 Project 1 final deadline at 16:00. MASM22/FMSN30 students email the report to FMSN30@matstat.lu.se. Subject field: Project1 by studid1 and studid2
w18 Wed 2/5, 8.15-12, MH:Riesz 8. Maximum likelihood estimation, Newton-Raphson, properties, deviance and likelihood ratio tests. Agresti: 1.3.1, 1.4.1, 2.3.1-2.3.3; several topics scattered in chapter 4, particularly sections 4.1-4.2. f8_vt18.pdf;
f8_vt18.R
Thu 3/5, 8.15-10, MH:Riesz 9. Akaike (again), Pseudo-R2, residuals and model validation in logistic regression. f9_vt18.pdf;
f9_vt18.R;
f9_data.txt
w19 Tue 8/5, 10.15-12 or 13.15-15, MH:230 Work on project 2
Wed 9/5, 13.15-15, MH:230 or MH:231 Work on project 2
w20 Tue 15/5, 15.15-17, MH:Riesz 10. Poisson distribution and Poisson regression; Negative binomial regression Agresti: several sections in Chapter 3. f10_vt18.pdf;
f10_vt18.R (bugfix 15/5);
poisson_sim.csv;
f10b.txt
Wed 16/5, 8.15-10, MH:Riesz 8.15-9.00: Peer assessment project 2:
9.15-10: 11. Quantile regression. Summary of logistic regression.
Wed 16/5, 13.15-15, MH:230 or MH:231 Work on project 2 and start on project 3 Project 3;
cardio.txt
Thu 17/5, 13.15-15, MH:230 or MH:231 Work on project 2 and/or start on project 3
Thu 17/5, 16.00 Project 2 final deadline at 16.00. MASM22/FMSN30 students email the report to FMSN30@matstat.lu.se. Subject field: Project2 by studid1 and studid2
w21 Tue 22/5, 13.15-15, MH:230 or MH:231 Work on project 3
Wed 23/5, 8.15-10, MH:230 or MH:231 Work on project 3
Thu 24/5, 9.15-10, MH:Sigma Project 3 oral presentations: Per Niklas+Lampros, Karolina
Thu 24/5, 13.15-15, MH:Sigma Project 3 oral presentations: Rita+Elisabeth, Rickard+Gabriella, Johan+Martin, Amanda, Carl+Amanda
Thu 24/5, 15.15-16, MH:Sigma Project 3 oral presentations: Kevin+Nikolaos, Martin
Fri 25/5, 10.15-12, MH:Sigma Project 3 oral presentations: Yen+Dongni, Carl+Jesper
w22 Mon 28/5, 9.15-10, MH:Sigma Project 3 oral presentations: Rasmus+Nathaniel, Mara
Mon 28/5, 10.15-11, MH:Sigma Project 3 oral presentations: Adrian+Henrik, Björn
Mon 28/5, 13.15-14, MH:Sigma Project 3 oral presentations: Zongguo, Oskar
Mon 28/5, 15.15-17, MH:Sigma Project 3 oral presentations: Marcus+Jan, Justinas+Niklas
Wed 30/5, 10.15-11, MH:227 Project 3 oral presentations: Emmy+Evelina, Juan Pablo+Amanda, Christ-Roi, Kasper
Wed 30/5, Thu 31/5, Fri 1/6 Oral exams Choose time;
Questions updated 16/5.
w23 Mon 4/6, Tue 5/6, Thu 7/6, Fri 8/6 Oral exams
w24 Mon 11/6, Tue 12/6, Wed 13/6, Thu 14/6, Fri 15/6 Oral exams
w25 Mon 18/6, Tue 19/6, Thu 21/6 Oral exams

Level

Advanced level.

Aim

Regression analysis deals with modelling how one characteristic (height, weight, price, concentration, etc.) varies with one or several other characteristics (sex, living area, expenditures, temperature, etc.). Linear regression is introduced in the basic course in mathematical statistics, but here we expand on it with questions such as "how do I check that the model fits the data", "what should I do if it doesn't fit", "how uncertain is it", and "how do I use it to draw conclusions about reality".

When performing a survey where people can answer "yes/no" or "little/just fine/much", or "car/bicycle/bus" or some other categorical alternative, you cannot use linear regression. Then you need logistic regression instead. This is the topic in the second half of the course.
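As a rough illustration of the two halves of the course, the following R sketch fits a linear regression to a continuous response and a logistic regression to a yes/no response. The data are simulated and all variable names (`x`, `y`, `answer`) and coefficient values are hypothetical, chosen only for the example; it is not taken from the course material.

```r
# Simulated data; all names and true coefficients are made up for illustration.
set.seed(1)
n <- 200
x <- runif(n, 0, 10)

# Continuous response: linear regression fitted by least squares.
y <- 2 + 0.5 * x + rnorm(n, sd = 1)
fit_lm <- lm(y ~ x)
confint(fit_lm)                      # 95% confidence intervals for intercept and slope

# Binary (yes/no) response: logistic regression fitted by maximum likelihood.
p <- plogis(-3 + 0.6 * x)            # true probability of answering "yes"
answer <- rbinom(n, size = 1, prob = p)
fit_glm <- glm(answer ~ x, family = binomial)
exp(coef(fit_glm)["x"])              # estimated odds ratio per unit increase in x

# Fitted probabilities from the logistic model stay between 0 and 1.
prob_hat <- predict(fit_glm, type = "response")
range(prob_hat)

# A straight line fitted to the same 0/1 data is not restricted to [0, 1].
line_hat <- predict(lm(answer ~ x), newdata = data.frame(x = c(-5, 15)))
line_hat
```

The last two steps hint at why a straight line is a poor model for a yes/no answer: its predictions are unbounded, while the logistic model always returns valid probabilities.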

Contents

Least squares and maximum likelihood methods; odds ratios; Multiple linear and logistic regression; Matrix formulation; Methods for model validation, residuals, outliers, influential observations, multicollinearity, change of variables; Choice of regressors, F-test, likelihood ratio test; Confidence intervals and prediction. Introduction to: Correlated errors, Poisson regression as well as multinomial and ordinal logistic regression.

Prerequisites

At least 60 ECTS at university level including an introductory course in mathematical statistics, e.g. MASA01 Mathematical statistics, basic course, 15hp, or MASB02 Mathematical statistics (for chemists) 7.5hp, or MASB03 Mathematical statistics (for physicists) 9hp, or MASB11 Biostatistics, basic course 7.5hp, or equivalent.

Teaching and examination

The teaching consists of lectures, computer exercises and project work. Attendance at the three compulsory computer exercises is required. The examination is both written and oral: written reports for projects 1 and 2, an oral presentation of project 3, and an individual oral examination.

Lecturer

Anna Lindgren, tel 046-2224276, office MH:136, Matematikcentrum, anna@maths.lth.se.

Teaching Assistants

Rachele Anderson, tel 046 2224580, office MH:323, rachele@maths.lth.se
Vladimir Pastukhov, tel 046 2227974, office MH:324, pastuhov@maths.lth.se

Learning outcomes

Knowledge and understanding

For a passing grade the student must

Skills and abilities

For a passing grade the student must

Judgement and approach

For a passing grade the student must