R basics
Installing R packages
An R package is like a toolbox, except that instead of containing tools, it contains functions for performing specific tasks such as filtering data or fitting a statistical model. Most of the R packages you will need for these tutorials are freely available from CRAN (The Comprehensive R Archive Network) or GitHub. You can install CRAN R packages using install.packages().
#Install an R package to import data from Excel files
install.packages("readxl")
If you want to install an R package stored in a GitHub repository, use install_github() from the devtools R package.
#Install tbi R package to calculate Tea Bag Index parameters
devtools::install_github("BenjaminDelory/tbi/tbi")
Commenting
You can add comments to your R script using the hash symbol (#). Every line that starts with # will be ignored by R and will not be executed.
#This line of code is a comment and will be ignored by R when running the code
We strongly advise you to add comments to your R code. At the very least, these comments should indicate why you have written each section of your code. It can also be useful to add information about the what and how. These comments can save a lot of time if you need to go back over your code after a while or, even more difficult, if someone else needs to go through your code and understand what you’ve done.
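As a small illustration of this advice, the sketch below uses comments to record why a calculation is done, not only what it does. The object names and values are made up for this example. It also shows that a comment can follow code on the same line, because R ignores everything that comes after the # symbol.

```r
#Leaf dry mass was recorded in milligrams, but later steps expect grams
#(hypothetical values, for illustration only)
leaf_mass_mg <- c(120, 95, 143)
leaf_mass_g <- leaf_mass_mg / 1000 #Convert mg to g
#Show the converted values
leaf_mass_g
```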
Creating R objects
New R objects are created using the assignment operator: <-
You can think of this assignment operator as an arrow that puts what’s on its right side into an R object located on its left side.
For instance, let’s create an object called x that contains the value 2.
#Create object
x <- 2
#Show content of R object
x
[1] 2
In RStudio, press Alt and the minus sign on your keyboard (Alt+-) to quickly write the assignment operator.
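Once an object exists, it can be reused in later calculations, and assigning a new value to the same name replaces the old content. The short sketch below illustrates both points. The object y is introduced only for this example.

```r
#Create an object and reuse it in a calculation
x <- 2
y <- x * 3
y
#Assigning a new value to x overwrites the old one
x <- 10
x
#y was computed before x changed, so it still contains 6
y
```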
Scalars
A scalar is a quantity that can only hold one value at a time. Here are the most common types of scalars in R:
- Numeric: numbers with a decimal value (e.g., 17.8)
- Integer: numbers without a decimal value (e.g., 18)
- Character: a letter or a combination of letters. Character strings must be enclosed by quotes in your R code.
- Factor: data type for categorical variables, used in statistical modelling to specify the factors in a model
- Logical: a logical variable can be either TRUE or FALSE
You can check the data type of an R object using the class() function.
x <- 2
class(x)
[1] "numeric"
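To make the list of scalar types more concrete, the sketch below applies class() to one example value per type. The values are arbitrary. Note that a whole number typed without the L suffix is stored as numeric, which is why class(2) returns "numeric" rather than "integer".

```r
#Check the class of one example value per scalar type
class(17.8)               #"numeric"
class(18)                 #"numeric" (without L, whole numbers are stored as numeric)
class(18L)                #"integer" (the L suffix requests an integer)
class("control")          #"character"
class(factor("control"))  #"factor"
class(TRUE)               #"logical"
```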
Vectors
A vector is a sequence of data elements of the same type. Vectors can be created using the c() function.
#Numeric vector
x1 <- c(1,2,3,4,5)
x1 <- c(1:5)
#Character vector
x <- c("control", "treatment")
#Logical vector
x <- c(TRUE, TRUE, FALSE)
You can check how many elements there are in a vector using the length() function.
length(x1)
[1] 5
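Because all elements of a vector must have the same type, R silently converts the elements to a common type when you mix types inside c(). The sketch below shows two such cases. The object names v1 and v2 are chosen only for this example.

```r
#Mixing numbers, text, and logicals: everything becomes character
v1 <- c(1, "a", TRUE)
class(v1)   #"character"
#Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0
v2 <- c(1.5, TRUE, FALSE)
class(v2)   #"numeric"
v2
```

If a vector that should contain numbers turns out to be of class character, it is worth checking whether a non-numeric value slipped in.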
Matrices
A matrix is an ensemble of data elements of the same type arranged in a 2D layout (i.e., like a table). Matrices can be created using the matrix() function.
#Generate 25 random numbers between 0 and 1 from a uniform distribution
x2 <- runif(25)
#Arrange these random numbers into a matrix with 5 rows and 5 columns
x2 <- matrix(x2,
             ncol = 5,
             nrow = 5)
#View matrix
x2
          [,1]       [,2]      [,3]      [,4]       [,5]
[1,] 0.6445135 0.03635731 0.3081061 0.4967868 0.71427079
[2,] 0.5644063 0.50678207 0.2357141 0.9245398 0.72489181
[3,] 0.4417767 0.94662147 0.5808922 0.6914159 0.03381332
[4,] 0.3135167 0.22928658 0.4377485 0.4733986 0.32219592
[5,] 0.8875961 0.60270605 0.8314840 0.5604592 0.35591664
You can check the size of a matrix using the dim() function. The first element of the output is the number of rows. The second element of the output is the number of columns.
dim(x2)
[1] 5 5
You can also extract the number of rows and columns using nrow() and ncol(), respectively.
nrow(x2)
[1] 5
ncol(x2)
[1] 5
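It can help to know in which order matrix() places the values it receives. By default, the matrix is filled column by column. Setting the byrow argument to TRUE fills it row by row instead. The sketch below uses the numbers 1 to 6 so that the filling order is easy to see. The object names are chosen only for this example.

```r
#Default behaviour: values are filled column by column
m_col <- matrix(1:6, nrow = 2, ncol = 3)
m_col
#With byrow = TRUE: values are filled row by row
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m_row
#Both matrices have the same size
dim(m_col)
dim(m_row)
```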
Data frames
A data frame is an ensemble of data elements arranged in a 2D layout (i.e., like a table). Different columns of a data frame can contain different types of data (character, logical, numeric, etc.). It is probably the most common data structure used when analysing ecological data. Data frames can be created using the data.frame() function.
#Create data frame
x3 <- data.frame(Var1=c(1:6),
                 Var2=c("R", "i", "s", "f", "u", "n"),
                 Var3=c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE))
#View data frame
x3
  Var1 Var2  Var3
1    1    R  TRUE
2    2    i  TRUE
3    3    s FALSE
4    4    f FALSE
5    5    u  TRUE
6    6    n FALSE
The functions dim(), ncol(), and nrow() can also be used on data frames.
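To see that each column of a data frame can hold a different data type, you can look at the class of every column. The sketch below builds a small data frame with made-up column names and values, then inspects it with sapply() and str(). Note that in R versions before 4.0.0, character columns were converted to factors by default, so the class reported for the treatment column depends on your R version.

```r
#Create a small data frame (hypothetical data, for illustration only)
df <- data.frame(plot = 1:3,
                 treatment = c("control", "warming", "control"),
                 flowering = c(TRUE, FALSE, TRUE))
#Class of each column
sapply(df, class)
#Compact overview of the structure of the data frame
str(df)
```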
Lists
A list is a vector containing other objects (vectors, matrices, data frames, other lists, etc.). It can contain elements of various data types. Lists can be created using the list() function.
#Create a list
x4 <- list(x1, x2, x3)
#View list
x4
[[1]]
[1] 1 2 3 4 5
[[2]]
          [,1]       [,2]      [,3]      [,4]       [,5]
[1,] 0.6445135 0.03635731 0.3081061 0.4967868 0.71427079
[2,] 0.5644063 0.50678207 0.2357141 0.9245398 0.72489181
[3,] 0.4417767 0.94662147 0.5808922 0.6914159 0.03381332
[4,] 0.3135167 0.22928658 0.4377485 0.4733986 0.32219592
[5,] 0.8875961 0.60270605 0.8314840 0.5604592 0.35591664
[[3]]
  Var1 Var2  Var3
1    1    R  TRUE
2    2    i  TRUE
3    3    s FALSE
4    4    f FALSE
5    5    u  TRUE
6    6    n FALSE
The length() function can be used to check how many data elements there are in a list.
length(x4)
[1] 3
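Note that length() counts the top-level elements of a list, not the individual values stored inside them. The sketch below illustrates this with a small list whose content is made up for this example, and uses sapply() to show that each element keeps its own data type.

```r
#Create a list with four elements of different types
lst <- list(1:5,
            "text",
            c(TRUE, FALSE),
            data.frame(a = 1:2))
#Four top-level elements, even though the first one contains five values
length(lst)
#Class of each element
sapply(lst, class)
```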
Indexing
One of the main advantages of R is that it is very easy to extract any given value from a data set. This is called indexing. Let’s have a look at a few examples.
Vectors
To extract the ith value of a vector object called x, you should write x[i].
#Extract the third value of the x1 object
#x1 is a vector
x1[3]
[1] 3
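The same square bracket notation also accepts several indices at once, and negative indices to leave values out. The sketch below shows these variants on a small vector created for this example.

```r
#Create a numeric vector
v <- c(10, 20, 30, 40, 50)
#Extract the first and third values
v[c(1, 3)]   #10 30
#Extract the second to fourth values
v[2:4]       #20 30 40
#Extract everything except the first value
v[-1]        #20 30 40 50
```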
Matrices and data frames
To extract the value located at the intersection between the ith row and jth column of a matrix or a data frame object called x, you should write x[i,j].
#Extract the value at the intersection of row 2 and column 3 in the x2 object
#x2 is a matrix
x2[2,3]
[1] 0.2357141
With a data frame, there are a couple of other options to extract data from specific columns. One option is to use the dollar symbol ($) followed by the column name.
#Extract all the values stored in the second column of the x3 object
#x3 is a data frame
x3$Var2
[1] "R" "i" "s" "f" "u" "n"
Note that the following code would also work and would produce the same result. To extract all the values from a specific column, leave the space before the comma empty inside the square brackets and give the name of the column (in quotes) after the comma. If you leave out the column name as well, you will simply extract all the values from your data frame.
#Extract all the values stored in the second column of the x3 object
#x3 is a data frame
x3[,"Var2"]
[1] "R" "i" "s" "f" "u" "n"
If you want to subset a matrix or a data frame called x (i.e., selecting only specific rows and columns), you should write:
x[rows to select, columns to select]
#Extract only the values located between rows 2 and 4
#in the second column of the x3 object
#x3 is a data frame
x3[2:4, 2]
[1] "i" "s" "f"
Note that writing 2:4 means “from index 2 to index 4”. For indexing, it is equivalent to writing c(2,3,4).
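The same logic applies to rows: leaving the column position empty returns all the columns of the selected rows. Rows and columns can also be selected together using vectors of indices or column names. The sketch below uses a small data frame created for this example.

```r
#Create a small data frame (hypothetical data, for illustration only)
df <- data.frame(Var1 = 1:3,
                 Var2 = c("a", "b", "c"),
                 Var3 = c(TRUE, FALSE, TRUE))
#Extract all the columns of row 2 (the result is a data frame with one row)
row2 <- df[2, ]
row2
#Extract rows 1 and 3 of the columns Var1 and Var3
sub <- df[c(1, 3), c("Var1", "Var3")]
sub
```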
Lists
To extract the ith element of a list object called x, you should write x[[i]].
#Extract the second element of the x4 object
#x4 is a list
x4[[2]]
          [,1]       [,2]      [,3]      [,4]       [,5]
[1,] 0.6445135 0.03635731 0.3081061 0.4967868 0.71427079
[2,] 0.5644063 0.50678207 0.2357141 0.9245398 0.72489181
[3,] 0.4417767 0.94662147 0.5808922 0.6914159 0.03381332
[4,] 0.3135167 0.22928658 0.4377485 0.4733986 0.32219592
[5,] 0.8875961 0.60270605 0.8314840 0.5604592 0.35591664
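The double square brackets matter here. With a list, x[[i]] returns the element itself, whereas single square brackets, x[i], return a shorter list that still wraps that element. The sketch below shows the difference on a small list created for this example.

```r
#Create a list with two elements
lst <- list(1:5, c("a", "b"))
#Double brackets return the element itself (here, an integer vector)
class(lst[[1]])   #"integer"
#Single brackets return a list of length 1 containing that element
class(lst[1])     #"list"
length(lst[1])    #1
```

In practice, use double brackets when you want to work with the content of a list element, for instance to compute on a vector or matrix stored in the list.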