$x
[1] 1 2 3 4 5
$y
[1] "a" "b"
2024-01-18
class 2: Made our own histograms by binning data, understand mean, median and comparing 2 distributions
class 1: installed R, Rstudio and tidyverse
package
If you already had these installed, check that they are the latest versions or re-install
version
= 4.3.2 ; Rstudio (2022 or 2023 versions) ; tidyverse (just update it if you haven’t installed last week with install.packages('tidyverse')
Refresh: directory structure, .Rproj
ect, R
script
Introduce R datatypes: {tip: check datatype with class()}
Simple: numeric
, character
. Other simple: factor
, logical
Compound: combination of simple datatypes: vector = c(x, y)
, list(x, y)
, data.frame()
; [won’t cover: matrix()
, array()
]
Good practices: Commenting code for documentation. # this code does x (if I get it to work)
Subset compound datatypes
Control statements: decision making: if()
and repeat steps in a loop: for()
Key reference : base R cheatsheet.PDF
If you are well versed with R basics and are looking for some challenge, this slide is for you
Download the code we used to generate the data for the class2 activity
plot the individual data points from this code
Highlight the 30 points that your team got within this plot
This will take you ~2 lectures to finish so keep working on it
If you got any questions, you can ask me when I come around ; or the TAs if they are free from helping others learn R basics
.Rproj
ect file: works from the current directory, saves history of R commands, and all scripts that were open last time.
.R
script: A place where you save lines of R code and can run them all from top to bottom with one click or a command : source('..R')
Directory structure
bash/terminal commands
mkdir
: make directory
cd
: change directory
ls
: list files and directories
touch x.txt
: make a empty files with any extension, ex: x.txt
Take simple datatypes: 1
; "apple"
, TRUE
Combine into complex data types: c(1, 3, 5, 7)
/ c('apples', 'oranges')
; list( numbers, fruits..)
Make decisions: if(x > 3) "apples are too sweet"
Repeat actions with only slight changes: for(i in 1:5) do something with each i
Make concise code by reusing parts as functions: do_magic <- function(x) {"x + magic here"}
Use the script: class3_R-basics.R
from Canvas/files/ to follow along
Missing elements marked as NA or NaN
Making these datatypes in R ; assigning to a variable
numeric: x <- 35
character: y <- 'apples'
logical: a <- TRUE
; b <- F
; c <- T
factor:
combination of simple datatypes: vector = c(x, y)
, list(x, y)
, data.frame(x1 = x, x2 = y)
When printed, the list looks like this
$x
[1] 1 2 3 4 5
$y
[1] "a" "b"
A vector where entries are ordered and stored as “levels”
Use cases
Order of colours assigned when plotting stuff
Analyzing categorical questionnaires
Named vector: c('a' = 1, 'b' = 2)
List: list(x = 1:5, y = c('a', 'b'))
Refresh: directory structure, .Rproj
ect, R
script, directory structure
Introduce R datatypes: {tip: check datatype with class()}
Simple: numeric
, character
. Other simple: factor
, logical
Compound: combination of simple datatypes: vector = c(x, y)
, list(x, y)
, data.frame()
Good practices: Commenting code for documentation. # this code does x (if I get it to work)
Subset/indexing compound datatypes: x[3]
, x[x>0]
, dataframe[1,5]
Control statements: decision making: if()
and repeat steps in a loop: for()
or vectorize using functions
first 20 minutes of lecture 4 ; Date: 23/1/24
Let us quickly finish talking about
Vector subsetting / indexing
Programing : if..else
; for
and function()
Note: Download the class3_R-basics.R
file again from the website (I updated it.. You can rename last weeks file with the _old suffix … _old.R
)
vectors: single element: x[3]
, multiple elements: x[x>0]
, by name x['a']
?
Lists and dataframes:
whole row/column list$x1
or dataframe$x1
or dataframe[ , 2]
Single entry: dataframe[1,5]
if()
and repeat steps in a loop: for()
if..else
:
for()
loops