Appendix A: R Code

Here is a list of the R code we use in this class.

A.1 R Code : Creating Variables in R

A.2 Numeric Variables in R

Code Description

variable <- c(#, #, #, #, etc.)

tired <- c(1,2,3,4)

variable = an object that you will define in R

<- = “assign”; tells R to save whatever comes on the right to whatever object is on the left.

c = combine : tells R to combine whatever happens in the parentheses

() = parentheses to group related terms

# = a placeholder for each number you store in the variable; separate the items with a comma and a space.

hist(dat$variable) For continuous variables : draws a histogram.
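
As a brief illustration of the pattern above, the sketch below creates the tired variable from the example and draws its histogram. Calling hist() directly on tired (rather than on dat$variable) is an assumption made here because no dataset has been loaded at this point.

```r
# Create a numeric variable by combining four values with c()
tired <- c(1, 2, 3, 4)

# Typing the name of the object prints its contents
tired

# Check that R stored the values as numbers
is.numeric(tired)   # TRUE

# Count how many values are stored
length(tired)       # 4

# Draw a histogram of the values
hist(tired)
```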

A.3 String Variables

variable <- c("name1", "name2", "name1", etc.)

emotion <- c("sad", "happy", "sad")

variable = an object that you will define in R

<- = “assign”; tells R to save whatever comes on the right to whatever object is on the left.

c = combine : tells R to combine whatever happens in the parentheses

() = parentheses to group related terms

"name1", "name2" = the text values you store in the variable; each item goes inside quotation marks, and the items are separated by a comma and a space.

as.factor(variable)

as.factor(emotion)

as.factor() # converts a string variable into a categorical factor
variable <- as.factor(variable) # “saves” this conversion as the original variable
plot(dat$variable) For categorical variables : draws a barplot. For continuous variables : plots each value of the variable (y-axis) against its index (x-axis).
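
The steps in this section can be sketched together as follows, using the emotion example from above. Calling plot() directly on emotion (rather than on dat$variable) is an assumption made here because no dataset has been loaded at this point.

```r
# Create a string (character) variable; each item goes in straight quotes
emotion <- c("sad", "happy", "sad")

# Before conversion, R treats the values as plain text
class(emotion)     # "character"

# Convert to a categorical factor and save over the original variable
emotion <- as.factor(emotion)

class(emotion)     # "factor"
levels(emotion)    # "happy" "sad" (levels are sorted alphabetically)
summary(emotion)   # counts per level: happy 1, sad 2

# For a factor, plot() draws a barplot of the counts
plot(emotion)
```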

A.4 R Commands for Importing and Navigating Data

R Command What it Does
dat <- read.csv("path/file.csv", stringsAsFactors = T) loads the data file into R (or use the “point & click method”); sets string variables to be categorical factor variables.
head(dat) looks at the first 6 rows of the data file
tail(dat) looks at the last 6 rows of the data file
nrow(dat) displays the number of rows (each row = an individual)
ncol(dat) displays the number of columns (each column = a variable)
names(dat) displays the names of the object (column names = names of variables)
dat$variable displays the variable from a dataset
dat$variable[i] displays the value in row i of the variable
dat[i, j] displays the value in row i and column j of the dataset
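
The commands in this table can be tried out with the minimal sketch below. So that it runs on its own, it first writes a small made-up data file to a temporary location; the column names (id, emotion, tired), the values, and the csv_path object are all hypothetical. With real data, csv_path would be replaced by the path to your own file. TRUE is the spelled-out form of T.

```r
# Build a small, made-up dataset and save it as a .csv file
# (hypothetical values, only so the example runs on its own)
example_data <- data.frame(
  id      = 1:8,
  emotion = c("sad", "happy", "sad", "happy", "happy", "sad", "happy", "sad"),
  tired   = c(1, 2, 3, 4, 4, 3, 2, 1)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(example_data, csv_path, row.names = FALSE)

# Load the file; stringsAsFactors = TRUE turns string columns into factors
dat <- read.csv(csv_path, stringsAsFactors = TRUE)

head(dat)      # first 6 rows
tail(dat)      # last 6 rows
nrow(dat)      # 8 rows (individuals)
ncol(dat)      # 3 columns (variables)
names(dat)     # "id" "emotion" "tired"

dat$tired      # the whole tired variable
dat$tired[3]   # the third value of tired: 3
dat[3, 2]      # row 3, column 2: the emotion of the third individual (sad)
dat[3, ]       # leave the column blank to display all of row 3
```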

A.5 R Commands for Visualizing Data

R Command What it Does
summary(dat) Reports descriptive statistics for all variables in the dataset.
summary(dat$variable) Reports descriptive statistics for a categorical variable (frequency / number of individuals in each level) or continuous variable (mean, range, etc.)
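as.numeric(dat$variable) Makes the variable numeric (for continuous graphs). Note: if the variable is currently a factor, this returns the internal level codes (1, 2, 3, ...), which may differ from the original values.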
as.factor(dat$variable) Makes the variable a categorical factor (for categorical graphs)
dat$variable <- as.factor(dat$variable) Assigns the as.factor output to the original variable. (In other words, this saves your new categorical factor variable by overwriting the old one.)
plot(dat$variable) For categorical variables : draws a barplot. For continuous variables : plots each value of the variable (y-axis) against its index (x-axis).
hist(dat$variable) For continuous variables : draws a histogram.
par(mfrow = c(i, j)) Splits your graphics window into i rows and j columns.
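
A minimal sketch of how these commands fit together is shown below. The dataset, its column names (group, tired), and its values are made up for illustration. The as.character() step is an addition not listed in the table; it is one common way to recover the original labels when converting a factor back to numbers.

```r
# A small made-up dataset (hypothetical values)
dat <- data.frame(
  group = c(1, 2, 1, 2, 2, 1),   # group codes stored as numbers
  tired = c(1, 2, 3, 4, 4, 3)
)

summary(dat)                     # descriptive statistics for every column

# group is really a category, so convert it and save over the original
dat$group <- as.factor(dat$group)
summary(dat$group)               # now reports the count in each level

# Caution: as.numeric() on a factor returns the level codes, not the labels.
# Converting to character first keeps the original labels.
as.numeric(as.character(dat$group))

# Split the graphics window into 1 row and 2 columns
par(mfrow = c(1, 2))
plot(dat$group)                  # categorical variable: barplot
hist(dat$tired)                  # continuous variable: histogram

# Return to a single plot per window
par(mfrow = c(1, 1))
```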

A.6 R Code : Descriptive Statistics

Below is a list of code we’ll use to calculate descriptive statistics in R.

R Command What It Does
summary(dat) Reports descriptive statistics for all variables in the dataset.
summary(dat$variable) Reports descriptive statistics for a continuous variable, or the frequency of each level for a categorical variable.

mean(dat$variable, na.rm = T) Reports the mean (average) of a variable; you must include the na.rm = T argument if there is missing data (otherwise R will return NA as the result).
median(dat$variable, na.rm = T) Reports the median (middle point) of a variable.
range(dat$variable, na.rm = T) Reports the lower limit and upper limit of the variable.
sd(dat$variable, na.rm = T) Reports the standard deviation of the variable.

hist(dat$variable)
abline(v = mean(dat$variable))
Draws a histogram, then adds a line to it at the specified value (here, a vertical line at the mean of dat$variable). You can replace v with h to draw a horizontal line. If the variable has missing data, use mean(dat$variable, na.rm = T) so that the mean is not NA. We will use abline() later in the semester in a different way.
par(mfrow = c(i, j)) Splits your graphics window into i rows and j columns (replace i and j with numbers)
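
The effect of na.rm is easiest to see on a small example. The sketch below uses a made-up variable with one missing value; the values are hypothetical and chosen only so the results are easy to check by hand. TRUE is the spelled-out form of T.

```r
# A made-up variable with one missing value (NA)
dat <- data.frame(variable = c(2, 4, NA, 6, 8))

mean(dat$variable)                   # NA, because of the missing value
mean(dat$variable, na.rm = TRUE)     # 5
median(dat$variable, na.rm = TRUE)   # 5
range(dat$variable, na.rm = TRUE)    # 2 8
sd(dat$variable, na.rm = TRUE)       # about 2.58

# Histogram with a vertical line at the mean
hist(dat$variable)
abline(v = mean(dat$variable, na.rm = TRUE))
```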