Psych 102 - Lab 1
The goal of this lab is to get more practice working in R and to practice loading (and evaluating) datasets. Answer the questions below, pasting screenshots of your code and output into a Google Doc as needed. Keep your document organized and easy to read (e.g., number your answers, write clean code) to help your GSI grade it.
Below, enter a math problem that used to give you difficulty as a kid, and run the code to see the result.
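As a rough illustration of what this could look like, here is a small sketch using two made-up problems (order of operations, and division with a remainder). The variable names are just placeholders; pick your own problem.

```r
# Order of operations: R evaluates parentheses first, then works through
# * and / from left to right.
tricky <- 8 / 2 * (2 + 2)
print(tricky)

# Long division with a remainder: %/% gives the whole-number quotient,
# %% gives the remainder.
quotient  <- 17 %/% 5
remainder <- 17 %% 5
print(c(quotient, remainder))
```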
Work with your classmates in discussion section to create another dataframe with three variables, at least one numeric and one “string”. Create that dataframe here, and use R code to report the following:
- the number of individuals (rows) in the dataset.
- the number of variables (columns) in the dataset.
- the names of the variables in the dataset.
- the value of the 5th row, 2nd column in the dataset.
- print a numeric variable from the dataset.
- print a categorical variable from the dataset.
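The items above can be sketched roughly as follows. The dataframe `classmates`, its column names, and all of its values are invented for illustration; your own dataframe will look different. `stringsAsFactors = FALSE` is set explicitly here only so the text columns stay as strings in any version of R.

```r
# Hypothetical example data (made up for illustration).
classmates <- data.frame(
  name  = c("Ada", "Ben", "Cleo", "Dev", "Eli", "Fay"),
  age   = c(19, 21, 20, 22, 19, 23),
  major = c("Psych", "CogSci", "Psych", "Bio", "Psych", "CogSci"),
  stringsAsFactors = FALSE
)

n_rows    <- nrow(classmates)    # number of individuals (rows)
n_cols    <- ncol(classmates)    # number of variables (columns)
var_names <- names(classmates)   # names of the variables
cell_5_2  <- classmates[5, 2]    # value in the 5th row, 2nd column

print(n_rows)
print(n_cols)
print(var_names)
print(cell_5_2)

print(classmates$age)     # a numeric variable
print(classmates$major)   # a categorical variable, stored here as strings
```

Indexing with square brackets always goes `[row, column]`, while `$` pulls out a whole column by name.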
In lecture, we wrote a for-loop to simulate the “Monty Hall” problem. Adapt this for-loop so that there are 100 doors (the contestant chooses one, then Monty opens 98 of the other doors, none of which hides the prize). Under these conditions, what is the probability of winning if you switch? If you don’t switch?
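One possible shape for such a simulation is sketched below. This is not the loop from lecture: the variable names, the seed, and the number of simulated games are all assumptions made for this example, so adapt your own lecture code rather than copying this structure.

```r
set.seed(102)      # arbitrary seed so the run is reproducible
n_doors <- 100     # number of doors in the game
n_sims  <- 10000   # number of simulated games (an arbitrary choice)

stay_wins   <- 0
switch_wins <- 0

for (i in 1:n_sims) {
  prize  <- sample(1:n_doors, 1)   # door hiding the prize
  choice <- sample(1:n_doors, 1)   # contestant's first pick

  # Monty opens 98 losing doors, so exactly one other door stays closed.
  if (choice == prize) {
    # The contestant already has the prize: the other closed door is a
    # randomly chosen losing door.
    others    <- setdiff(1:n_doors, choice)
    remaining <- others[sample.int(length(others), 1)]
  } else {
    # Otherwise Monty must leave the prize door closed.
    remaining <- prize
  }

  if (choice == prize)    stay_wins   <- stay_wins + 1
  if (remaining == prize) switch_wins <- switch_wins + 1
}

p_stay   <- stay_wins / n_sims     # estimated P(win) if you don't switch
p_switch <- switch_wins / n_sims   # estimated P(win) if you switch
print(c(stay = p_stay, switch = p_switch))
```

Because the estimates come from random draws, they will shift slightly with a different seed or number of games; increasing `n_sims` makes them more stable.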
Please try to complete these problems on your own after section. If you get stuck, post to the class Discord (and feel free to help others get unstuck). If you really get stuck, just explain what you tried. Thanks!
- Look over the supplemental readings for instructions on how to load a dataset into R from a .csv file and navigate it. Load one dataset posted to our “Datasets” folder: give the dataset a name, check the headers, and set stringsAsFactors = TRUE. Note that some datasets are saved as .xlsx files and need to be loaded using a package; feel free to skip these for now if this feels confusing. For the dataset you load, use R to report the following. Add comments to your code that describe what you learn from the output.
- the number of individuals (rows) in the dataset.
- the number of variables (columns) in the dataset.
- the names of the variables in the dataset.
- the value of the 5th row, 2nd column in the dataset.
- print a numeric variable from the dataset.
- print a categorical variable from the dataset.
- Print two variables from the dataset - what do you learn from just looking at these data? No need for summary statistics yet :)
- What is a question you might ask about a variable from this dataset? Why might this question matter?
- Read the article “Data Organization in Spreadsheets”. Choose one of the datasets uploaded to the Datasets folder on bCourses, and look over the dataset (and its corresponding Codebook). What are some ways that this dataset adheres to the article’s “best practices”? What are some ways that it does not? What is something you learned from this article? What did you have a question about?
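For the dataset-loading problem above, a minimal sketch of reading a .csv file and inspecting it might look like this. So that the example runs on its own, it first writes a tiny made-up file to a temporary location; the file contents, the name `study`, and the column names are all hypothetical. With a real dataset you would instead pass the path of the file you downloaded to `read.csv()`.

```r
# Write a small, made-up .csv file to a temporary location (stand-in for a
# file from the Datasets folder).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,condition,score",
  "1,control,12.5",
  "2,treatment,15.0",
  "3,control,11.0",
  "4,treatment,16.5",
  "5,control,13.0",
  "6,treatment,14.5"
), csv_path)

# Load the file: the first line holds the headers, and text columns are
# converted to factors (categorical variables).
study <- read.csv(csv_path, header = TRUE, stringsAsFactors = TRUE)

head(study)   # first few rows, to check that the headers loaded correctly
str(study)    # the type of each variable

print(nrow(study))        # number of individuals (rows)
print(ncol(study))        # number of variables (columns)
print(names(study))       # names of the variables
print(study[5, 2])        # value in the 5th row, 2nd column
print(study$score)        # a numeric variable
print(study$condition)    # a categorical variable (a factor)
```

Printing a factor also lists its levels, which is a quick way to see which categories a variable contains.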