MASM22/FMSN30(40) Linear and Logistic Regression (with Data Gathering)

R and RStudio

Installing on your own computer

Running R through RStudio

  • Start RStudio, not R.

After a few seconds the screen will look something like this:

At the bottom left we have the console or command window, where you can run commands directly by writing after the prompt > and pressing Enter. This is also where the output of the commands will appear, unless the result is stored in a new variable, in which case the variable shows up in the environment window instead.
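As a small illustration of the difference between printed output and stored variables, you could type the following at the prompt. The numbers and the variable name x are only examples and are not part of the course material.

```r
# An expression that is not stored is printed in the console
3 * 7

# An assignment prints nothing; x appears under the environment tab instead
x <- 3 * 7

# Typing the name of a variable prints its value in the console
x
```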

At the top right we find the environment/history window. Under the environment tab you can see all the objects that R has in its memory. Under the history tab you have a list of all the commands you have given.
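The contents of the environment tab can also be inspected from the console. The sketch below uses two made-up objects, heights and n_obs, purely for illustration.

```r
# Create two example objects (the names are hypothetical)
heights <- c(1.62, 1.75, 1.80)
n_obs <- length(heights)

ls()             # lists the names of the objects R has in its memory
rm(n_obs)        # removes n_obs from memory and from the environment tab
exists("n_obs")  # FALSE, since n_obs has been removed
```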

At the bottom right is the files/plots/packages/help/viewer window. Here you can browse among your files and see the plots you have made. You can also manage your R packages, i.e., extensions to R. Here you will also find the help function.
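Help and packages can be reached from the console as well as from this window. A minimal sketch, using the function mean() and the package stats only as examples:

```r
# Open the help page for a function (mean() is only an example)
help("mean")

# Load a package so that its functions can be used;
# stats ships with R, so nothing needs to be installed first
library(stats)

# List the packages currently attached to the session
search()
```

A package that is not yet installed can be fetched with install.packages(), giving the package name in quotes.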

At the top left is the editor window or script window. This is where you edit your programs or scripts. It may be hidden under a large console.

  • Start a new script by choosing File -- New File -- R Script

The simplest form of script is just a sequence of R commands. You should save these so that you can reproduce your calculations later. You run the commands here by placing the cursor on a line and pressing Ctrl-R, one line at a time. We will use this extensively in the projects.

  • Write 2 + 4 in the script and press Ctrl-R.

In contrast to spreadsheet-based programs like Excel, we will not see the datasheet unless we ask for it. The focus here is on the commands and the results. Data is, as always, entered so that each row represents one individual and each column represents one variable, e.g., age, HbA1c, etc. If you have a small dataset you can enter the data by hand, but larger datasets are easier to import from some other program, e.g., Excel or a database program. There are many similarities between R and Matlab, but R is much better at handling datasets, including missing data and string variables. The functions for statistical tests and different types of regression are also much more user-friendly in R.
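To make the row and column layout concrete, here is a sketch of a small dataset entered by hand. The object name patients, the variable names and all the values are invented for illustration.

```r
# Hypothetical example data: one row per individual, one column per variable
patients <- data.frame(
  name  = c("Anna", "Bertil", "Cecilia"),  # string variable
  age   = c(54, 61, NA),                   # NA marks a missing value
  HbA1c = c(48, 53, 61)
)

patients                          # print the data in the console
str(patients)                     # structure: variable names and types
mean(patients$age)                # NA, since one age is missing
mean(patients$age, na.rm = TRUE)  # 57.5, the mean of the two observed ages
```

In RStudio, View(patients) opens the data in a spreadsheet-like tab if you do want to see the datasheet.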

Some useful options

RStudio can check your code for some errors.

  • Go to Tools -- Global Options..., choose Code and then Diagnostics. Check all the boxes and press OK. This will give you notices on code problems in the left-hand margin of your script window.