All course materials at tinyurl.com/introRDatasite
What part of your education are you in? (Bachelor, PhD, prof…)
What is your faculty/background? (Economics, Medicine, Biology…)
What is your motivation for learning R?
What is your experience with R?
9:30 Introductions
10:00 Base R + Exercises 1-6
11:25 Recap & questions
11:30 Coffee break
11:45 Programming + Exercises 7-9
12:45 Lunch break
13:30 Reconvene for afternoon program
Part 1: Basics of R
Download the course materials.
Store them in a local (i.e. not on a mounted drive), accessible location.
Unzip the download to create a single folder. What animal is displayed on animal.png?
Double-click the course-materials.Rproj file. Or: Go to File > Open Project > select course-materials.Rproj > Open
From the ‘Files’ menu (bottom right), click baseR_exercises.Rmd.
When you start programming for yourself:
.Rproj file will be created
You can execute the Exercise 0 chunk as a whole with the green triangle:
You can assign both numbers and text to a variable:
You will see your variable (R object) appear in your Environment (top right panel).
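As a minimal sketch of assignment (the names `my_number` and `my_text` are made up for illustration and are not part of the course materials):

```r
# Store a number and a piece of text; the assignment itself prints nothing
my_number <- 42
my_text <- "hello"

# Typing the name (or calling print()) asks R to return the stored value
my_number
print(my_text)
```

After running these lines, both objects should be listed in the Environment panel.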
See the cheatsheets folder. Or download it.
Saving information as an R object:
Asking for information to be returned:
Note the difference in syntax:
<- operator: storing information = no immediate ‘answer’
Functions: code that performs a specific task based on the arguments provided.
Examples:
mean(x): calculate the mean of x
mean(x, na.rm = TRUE): calculate the mean of x by leaving out the NAs
getwd(): print the working directory to the screen (requires no arguments)
You can perform math with your variables:
and store the results as new variables:
Check “Maths Functions” on the Base R cheatsheet:
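A short sketch of doing math with variables and storing the result; the variable names `a`, `b` and `total` are illustrative assumptions:

```r
a <- 10
b <- 4

# Results are printed directly when they are not assigned
a + b   # 14
a / b   # 2.5

# Store a result as a new variable and reuse it in a maths function
total <- a * b
sqrt(total)
round(a / 3, digits = 2)
```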
A logical is TRUE or FALSE, and can also be written as T or F.
Logicals are mostly used as tests:
| operator | meaning |
|---|---|
| == | is equal to |
| != | is not equal to |
| >= | larger than or equal to |
| < | smaller than |
For example:
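The operators in the table can be tried out like this (a sketch; `x` and `is_small` are hypothetical names):

```r
x <- 5

x == 5    # TRUE
x != 5    # FALSE
x >= 10   # FALSE
x < 10    # TRUE

# The outcome of a test is itself a logical and can be stored
is_small <- x < 10
is_small
```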
Vectors are created with the function c()
A numeric vector:
What is this vector?
Yep, a character vector!
Vector type defaults to the “lowest common denominator”: everything can be a character, but not everything can be a number or a logical.
Order:
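Coercion runs from logical, through numeric, to character. A small sketch using `class()` to inspect the resulting type (the vector contents are arbitrary examples):

```r
class(c(TRUE, FALSE))        # "logical"
class(c(TRUE, 2))            # "numeric": TRUE is converted to 1
class(c(TRUE, 2, "three"))   # "character": everything becomes text

# Mixing types silently converts every element to the most general type
mixed <- c(1, "two", TRUE)
mixed                        # "1" "two" "TRUE"
```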
Vectors can be used in mathematical operations
Operations with multiple vectors are performed by aligning the index
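A sketch of both cases, with two made-up vectors `v1` and `v2`:

```r
v1 <- c(1, 2, 3)
v2 <- c(10, 20, 30)

# An operation with a single number is applied to every element
v1 * 2    # 2 4 6

# With two vectors, elements at the same index are combined
v1 + v2   # 11 22 33
```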
c() function.
We have two vectors: name and age
How do we combine them?
How about combining name and age in a two-dimensional table structure?
Or: in a multi-dimensional list.
| | number of dimensions | function |
|---|---|---|
| vector | 1 | c() | 
| data frame | 2 | data.frame() | 
| list | any number | list() | 
NB: dataframes and lists appear under Data in the Environment (top right panel in RStudio), vectors under Values.
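The three ways of combining `name` and `age` might look as follows. The values are taken from the data frame printed later in this section; `combined` and `my_list` are hypothetical names.

```r
name <- c("Ann", "Bob", "Chloe", "Dan")
age <- c(35, 22, 50, 51)

# c() flattens both vectors into one (character) vector
combined <- c(name, age)
combined

# data.frame() puts each vector in its own column
df <- data.frame(name, age)
str(df)

# list() keeps each vector as a separate element
my_list <- list(name = name, age = age)
str(my_list)
```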
Special type of vector, defined by levels. Usually as categorical variable in a data frame.
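As an illustration, a factor can be created from a character vector with `factor()`; the values below are invented for this sketch:

```r
# Hypothetical categorical data with a repeated value
country <- c("UK", "US", "NL", "BE", "NL")
country_factor <- factor(country)

# The levels are the distinct categories
levels(country_factor)

# table() counts how often each level occurs
table(country_factor)
```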
age.
By position:
By position:
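A sketch of indexing with square brackets, assuming the `name`, `age` and `country` values shown in the data frame output in this section:

```r
name <- c("Ann", "Bob", "Chloe", "Dan")
df <- data.frame(
  name,
  age = c(35, 22, 50, 51),
  country = c("UK", "US", "NL", "BE")
)

# Vectors have one dimension: [position]
name[2]         # "Bob"
name[c(1, 3)]   # "Ann" "Chloe"

# Data frames have two dimensions: [row, column]
df[1, 2]        # row 1, column 2: 35
df[2, ]         # all columns of row 2
df[, 1]         # all rows of column 1

# A logical test can be used instead of row positions
df[df$age > 40, ]
```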
df, return all columns for everyone living in a country of your choice.
df under 40.
Let’s add a column to our data:
   name age country  pet
1   Ann  35      UK  cat
2   Bob  22      US none
3 Chloe  50      NL     
4   Dan  51      BE <NA>
Notice that:
[1] TRUE
[1] NA
[1] NA
[1] TRUE
So: want to test if a value is NA? Use is.na()!
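A sketch of why `==` does not work for missing values, and how `is.na()` and `na.rm = TRUE` help. The vector `x` is a made-up example.

```r
# Comparing with NA never gives TRUE or FALSE, only NA
NA == NA

x <- c(35, NA, 50)
x == NA      # NA NA NA

# is.na() tests each element for missingness
is.na(x)     # FALSE TRUE FALSE

# Many functions return NA when any value is missing, unless told to skip NAs
mean(x)                 # NA
mean(x, na.rm = TRUE)   # 42.5
```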
Do we know about our participants’ jobs?
| value | meaning |
|---|---|
| NA | Information is Not Available |
| NULL | Information does not exist |
| none or 0 | Data entry specifying content of 0 |
| "" | Empty character value |
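The different kinds of missing information can be told apart in code. The sketch below rebuilds the data frame printed above; the `job` column is deliberately absent.

```r
df <- data.frame(
  name = c("Ann", "Bob", "Chloe", "Dan"),
  age = c(35, 22, 50, 51),
  country = c("UK", "US", "NL", "BE"),
  pet = c("cat", "none", "", NA)
)

# NA: the value is not available
is.na(df$pet)     # FALSE FALSE FALSE TRUE

# "": an empty character value (comparing NA with "" still gives NA)
df$pet == ""      # FALSE FALSE TRUE NA

# NULL: the column does not exist at all
is.null(df$job)   # TRUE
```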
An if statement tests if a condition is TRUE or FALSE and executes code depending on the outcome of that test.
To build an if-statement, start with the function if():
Within the {}, insert the code that should be executed if the condition is met:
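A minimal sketch of a complete if statement; `temperature` and `weather` are hypothetical names chosen for this example:

```r
temperature <- 25

# The condition goes between ( ), the code to execute between { }
if (temperature > 20) {
  weather <- "warm"
  print(weather)
}
```

If the condition were FALSE, the code between the braces would be skipped and `weather` would not be created.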
Make an if statement that tests if a number is 18 or larger. Assign the result to the variable age_category.
number <- 8
if(number >= 18){
  age_category <- "adult"
} else{
  age_category <- "minor"
}
print(age_category)
[1] "minor"
Bonus exercise: expand the if-else statement to assign “toddler” if number is smaller than 2:
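One possible solution, sketched with `else if`; the order of the tests is a design choice, not necessarily the one used in the course solutions:

```r
number <- 1

if (number < 2) {
  age_category <- "toddler"
} else if (number >= 18) {
  age_category <- "adult"
} else {
  age_category <- "minor"
}

print(age_category)
```

The conditions are checked from top to bottom, and only the first branch whose condition is TRUE is executed.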
A function consists of one or more instructions that form a cohesive unit.
To make a function yourself, use the function function():
The sequence of operations is in the body of the function (between { }):
First, run the code with the function itself. It will appear in your environment:
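A sketch of defining and then calling a function; `add_and_double` and its arguments are invented for illustration:

```r
# Define the function: arguments between ( ), body between { }
add_and_double <- function(x, y) {
  total <- x + y
  result <- total * 2
  return(result)
}

# Call the function with values for its arguments
add_and_double(3, 4)   # 14
```

Running the definition only stores the function; nothing is calculated until it is called.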
Functions are the bread and butter of programming!
A good script will consist mostly of functions, with a minimal amount of code that applies the functions.
Note that:
return (not print).
Turn the if-statement from the last exercise into a function. Let the user provide the value for number, and return the age_category.
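One possible solution, as a sketch. The name `test_age` matches the function used in the loop exercise further on; the body simply wraps the earlier if-else statement.

```r
test_age <- function(number) {
  if (number >= 18) {
    age_category <- "adult"
  } else {
    age_category <- "minor"
  }
  # Return the value so it can be stored or reused by the caller
  return(age_category)
}

test_age(8)    # "minor"
test_age(35)   # "adult"
```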
A loop starts with the iterable object (in this case the vector 1:5), and the temporary name for each item (in this case a_number):
Within { }, you place the instructions:
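Note that a_number is 1 in the first iteration of the loop, 2 in the second, etc. It is only a temporary name: you do not need to create it before the loop, although R keeps its last value (here 5) after the loop has finished.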
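Put together, a complete for loop might look like this sketch; the `doubled` vector is a hypothetical addition to show how results can be collected:

```r
doubled <- c()

for (a_number in 1:5) {
  # Instructions between { } run once for every item in 1:5
  print(a_number)
  doubled <- c(doubled, a_number * 2)
}

doubled   # 2 4 6 8 10
```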
Go over the age column in your dataframe df, and for each age: print() the age category using the test_age function from the previous exercise.
Bonus question: add age category as a new column in df.
# We first set an index that will be increased every time the for-loop runs
i <- 1
for(nr in df$age){
  # Add the age category to a new column in df
  df$age_category[i] <- test_age(nr)
  # Increase the index with 1 after running the code in this for-loop iteration
  i <- i + 1
}
df
   name age country  pet age_category
1   Ann  35      UK  cat        adult
2   Bob  22      US none        adult
3 Chloe  50      NL             adult
4   Dan  51      BE <NA>        adult
[ ]
(  )
{  }
What data types have you encountered so far?
logical
numeric
character
How can data be missing?
NA (not available)
NULL (non-existent)
"" (empty)
What data structures have you encountered?
vector (one dimension)
data frame (two dimensions)
list (any number of dimensions)
What functions have you encountered so far?
c()
data.frame()
is.na()
mean()
summary()
Programming basics
How does a function work? Type in your console:
?mean
Use a search engine (often useful: Stack Overflow)
(Generative AI)