# MASM22/FMSN30(40) Linear and Logistic Regression (with Data Gathering)

This is a quick introduction to R. You will note that it has many similarities with Matlab, as well as some confusing differences.

R is made for computing things. If you want to find the result of \(2 + 4\) you simply write

`2 + 4`

and R will answer

```
2 + 4
#> [1] 6
```

The notation `[1] 6` means that the first value (in this case the only value) of the answer is 6. If you want to do a multiplication you write

```
2 * 4
#> [1] 8
```

All common mathematical functions are available. In order to calculate \(4^2\), \(\sqrt{4}\), \(\ln(4)\) and \(e^4\) you write

```
4^2
#> [1] 16
sqrt(4)
#> [1] 2
log(4)
#> [1] 1.386294
exp(4)
#> [1] 54.59815
```
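Note that `log()` is the natural logarithm. As a small sketch of the other common bases (the numbers are chosen only for illustration), R also provides `log10()`, `log2()` and a `base` argument to `log()`:

```r
# log() is the natural logarithm, so it undoes exp()
log(exp(4))
#> [1] 4
# Base-10 and base-2 logarithms have their own functions
log10(1000)
#> [1] 3
log2(8)
#> [1] 3
# Any other base can be given with the base argument
log(81, base = 3)
#> [1] 4
```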

Note that R uses a decimal point, never a decimal comma.

If you want to save the result of a calculation you have to give the result a name. This is done using the notation `<-` (less-than immediately followed by a dash). If we want to calculate \(2+4\) and save the result under the name `myresult` you give the command

`myresult <- 2 + 4`

A variable named `myresult` should now be listed in the Environment window. You can also see that it contains the value 6. You can ask R to print the answer by writing

```
myresult
#> [1] 6
```

We can use this variable in other functions. For example, we can write

```
sqrt(myresult)
#> [1] 2.44949
```

to get the square root of 6. The expression `sqrt()` is a function. All function calls in R end in parentheses, even if they have no arguments, e.g., `q()`.

You can collect several values into one variable, a vector, using the function `c()` (*c* for combine or collect):

`x <- c(3, 5, 7, 11, 13)`

You can then perform the same calculation as before but on all the values at the same time:

```
x + 3
#> [1] 6 8 10 14 16
sqrt(x)
#> [1] 1.732051 2.236068 2.645751 3.316625 3.605551
```
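The same element-by-element principle applies when both operands are vectors. The following is a minimal sketch, where the second vector `c(1, 2, 3, 4, 5)` is made up for the illustration and has the same length as `x`:

```r
x <- c(3, 5, 7, 11, 13)
# Arithmetic between two vectors of the same length works element by element
x + c(1, 2, 3, 4, 5)
#> [1]  4  7 10 15 18
x * c(1, 2, 3, 4, 5)
#> [1]  3 10 21 44 65
# Some functions instead reduce the whole vector to a single number
length(x)
#> [1] 5
sum(x)
#> [1] 39
```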

You can also combine several variables into one longer variable:

```
y <- c(17, 19, 23, 29, 31)
z <- c(x, y)
z
#> [1] 3 5 7 11 13 17 19 23 29 31
```

Sometimes you will want to create structured data, e.g., series or repeated sequences. There are two commands for this: `seq()` and `rep()`. In addition you can use the colon sign `:`. Try out the following commands and try to understand what they do:

```
seq(1, 100, 9)
seq(to = 100, from = 1, by = 9)
seq(f = 1, t = 100, length.out = 10)
1:3
3:1
rep(c(1, 2, 3), times = 3)
rep(1:3, each = 4)
rep(1:3, t = 3, e = 4)
rep(1:3, length.out = 20)
```
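The abbreviated argument names `f`, `t` and `e` in these commands work because R matches argument names partially, as long as the abbreviation is unambiguous. A short sketch that checks this, using `identical()` to compare the abbreviated calls with the fully spelled-out ones:

```r
# f and t are unambiguous abbreviations of from and to in seq()
identical(seq(f = 1, t = 100, length.out = 10),
          seq(from = 1, to = 100, length.out = 10))
#> [1] TRUE
# t and e are matched to times and each in rep()
identical(rep(1:3, t = 3, e = 4),
          rep(1:3, times = 3, each = 4))
#> [1] TRUE
```

Writing the full argument names is usually easier to read, so the abbreviations are best kept for quick interactive work.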

If you need help on a particular function you can use the help function by writing `help(seq)` or `?seq`. You can also use the `Help` window in RStudio. The colon sign is an operator rather than an ordinary named function, so you have to write `help(":")` using quotes.

Sometimes you only want some of the values in a variable. We can choose values using square brackets `[]`:

```
myvalues <- 21:30
myvalues
#> [1] 21 22 23 24 25 26 27 28 29 30
myvalues[1]
#> [1] 21
myvalues[c(1, 3, 5)]
#> [1] 21 23 25
myvalues[1:3]
#> [1] 21 22 23
```

You can also choose to exclude values:

```
myvalues[-1]
#> [1] 22 23 24 25 26 27 28 29 30
myvalues[-c(1, 3, 5)]
#> [1] 22 24 26 27 28 29 30
myvalues[-(2:4)]
#> [1] 21 25 26 27 28 29 30
```
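Besides positions, the square brackets also accept a logical condition, which is often more convenient when selecting by value. A minimal sketch, where the cut-off 25 is an arbitrary choice for the illustration:

```r
myvalues <- 21:30
# A comparison returns one TRUE/FALSE per element
myvalues > 25
#>  [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
# Used inside [], it keeps the elements where the condition is TRUE
myvalues[myvalues > 25]
#> [1] 26 27 28 29 30
# which() gives the positions instead of the values
which(myvalues > 25)
#> [1]  6  7  8  9 10
```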

There is a large number of functions in R. Here are some examples of basic statistical functions. The first one creates 100 random numbers from a standard normal distribution. Run help on the others to find out what they do.

```
x <- rnorm(100)
x
mean(x)
var(x)
sd(x)
median(x)
boxplot(x)
boxplot(x, horizontal = TRUE)
hist(x)
```
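As a check on how two of these functions relate, the sketch below compares `var()` and `sd()` with the textbook formulas. The call to `set.seed()` is an addition, not part of the commands above. It only makes the random numbers reproducible, and the seed value 1 is arbitrary.

```r
set.seed(1)        # arbitrary seed, so that the same numbers are drawn each time
x <- rnorm(100)
# The sample variance divides the sum of squared deviations by n - 1
all.equal(var(x), sum((x - mean(x))^2) / (length(x) - 1))
#> [1] TRUE
# The standard deviation is the square root of the variance
all.equal(sd(x), sqrt(var(x)))
#> [1] TRUE
```

`all.equal()` is used instead of `==` because it tolerates the tiny rounding differences that arise in floating-point arithmetic.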

All variables in R are objects. You can see the objects you have created in the Environment window. You can also list them using the command `ls()`. If you want to remove an object you use the command `remove()` or, shorter, `rm()`:

```
rubbish <- c(1, 19, 23.4)
ls()
#> [1] "myresult" "myvalues" "rubbish" "x" "y" "z"
remove(rubbish)
ls()
#> [1] "myresult" "myvalues" "x" "y" "z"
```
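Several objects can be removed in one call. A small sketch, where `scratch1` and `scratch2` are throwaway names invented for this illustration:

```r
# Two throwaway objects (hypothetical names, used only for this illustration)
scratch1 <- 1:3
scratch2 <- "temporary"
# rm() accepts several object names at once
rm(scratch1, scratch2)
# exists() reports whether an object with the given name can be found
exists("scratch1")
#> [1] FALSE
exists("scratch2")
#> [1] FALSE
```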

If you want to remove all objects you can combine the two commands into

```
remove(list = ls())
ls()
#> character(0)
```

**Be careful! R will NOT warn you that you are removing anything.** It assumes you know what you are doing.

Now we have a nice empty environment. Save your script file and close RStudio. You can answer `No` when asked to save the workspace. Since you saved your script file, you can rerun the commands and recreate the objects the next time you start R.