Chapter 3 | Description

Welcome Back

Check-In Here : tinyurl.com/dvcminidata

Agenda

  • 8:55 - 9:10. Check-In.

  • 9:10 - 9:40. Describing Data with Brain.

  • 9:40 - 10:10. Describing Data with R.

  • 10:10 - 10:20. Break Time.

  • 10:20 - 11:00. Work on Lab 3.

Part 1 : Describing Data with Brain.

KEY IDEA : The Mean as Prediction

  • How would you feel… GOOD / BAD / INDIFFERENT
    • …if I told you that the average on the R Exam last semester was a 50%?
    • …if I told you that the average on the R Exam last semester was a 90%?
    • …if everyone you knew and loved called you “average”?
  • Why would you feel this way?

LET’S PLAY : WHERE’S THE LINE?

The histogram organizes the data, but you can see the same patterns in both graphs. The Mean = 3.4148148

The Mean As Prediction (and Error)

  • How can you visualize each of these components?
  • How might you translate this equation into a simpler language?

\[ \Huge y_i = \bar{Y} + \epsilon_i \]

  • \(y_i\) = individual \(i\)’s actual score on the variable (\(y\)) we are trying to predict.
  • \(\bar{Y}\) = our prediction (the mean, in this case)
  • \(\epsilon_i\) = residual error for individual \(i\) (the actual score minus the prediction: \(y_i - \bar{Y}\))
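
One way to see the three pieces of this equation is to compute them in R. This is a minimal sketch using a small made-up vector of scores; the numbers and the names `y`, `y_bar`, and `e` are illustrative assumptions, not the class data.

```r
# Made-up scores for illustration only (not the class data)
y <- c(2, 3, 3, 4, 5)

# The prediction: every individual gets the same guess, the mean
y_bar <- mean(y)

# The residual error: actual score minus prediction
e <- y - y_bar

# Each actual score is recovered as prediction + error
data.frame(actual = y, prediction = y_bar, error = e)

# Residuals around the mean cancel out (they sum to zero, up to rounding)
sum(e)
```

Scores above the mean get a positive error and scores below it get a negative error, which is why the errors balance out around the mean.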

Part 2 : Describing Data in R.

Problem 1. World Happiness Problems.

Download the “World Happiness” data and import it into R. We’ll be focusing on the variable used in the steps below.

  1. Load the data (I’ll call it d if you want to follow along with my code), check to make sure it loaded correctly, and report the sample size and names of the variables.
  2. Graph the variable “ladder” as a histogram. Change the arguments to make the graph “look nice”.
  3. Report the mean, median, range, and standard deviation for this variable. Add vertical lines to your graph above that illustrate the mean (in red), the median (in blue), and the standard deviation (in dashed red).
  4. Describe what each of these statistics tells you about the happiness of countries in the dataset. (Human brain stuff; no R needed! Goal is to connect the stats to knowledge and understanding of the variable.)
  5. Describe what additional questions you have about this variable (what do you NOT learn about happiness levels of countries from this variable, and what might you want to know?)
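
Steps 1 through 3 might look something like the sketch below. Several things here are assumptions: the file name `world_happiness.csv` is a placeholder, the stand-in data frame is simulated so the code runs without the download (it is NOT the real data), and drawing the standard deviation as lines at one SD below and above the mean is just one reasonable reading of step 3.

```r
# Step 1: load the data. The file name is a placeholder (assumption);
# use the name of the file you actually downloaded.
# d <- read.csv("world_happiness.csv")

# Stand-in so this sketch runs without the file: simulated values, NOT real data
set.seed(1)
d <- data.frame(ladder = round(rnorm(150, mean = 5.5, sd = 1.1), 2))

head(d)    # did it load correctly?
nrow(d)    # sample size
names(d)   # names of the variables

# Step 2: histogram, with arguments changed to make it "look nice"
hist(d$ladder,
     main = "World Happiness: Ladder Scores",
     xlab = "Ladder score",
     col = "gray80", border = "white", breaks = 15)

# Step 3: descriptive statistics (na.rm = TRUE skips missing values)
m   <- mean(d$ladder, na.rm = TRUE)
med <- median(d$ladder, na.rm = TRUE)
rng <- range(d$ladder, na.rm = TRUE)   # minimum and maximum
s   <- sd(d$ladder, na.rm = TRUE)
c(mean = m, median = med, min = rng[1], max = rng[2], sd = s)

# Vertical lines: mean (red), median (blue), mean +/- 1 SD (dashed red)
abline(v = m, col = "red", lwd = 2)
abline(v = med, col = "blue", lwd = 2)
abline(v = c(m - s, m + s), col = "red", lty = 2)
```

The `abline()` calls must come after `hist()`, because they draw on top of whichever graph is currently open.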

BREAK TIME : MEET BACK AT 10:20

Part 3 : Working on Lab 3, Problem 2.

In Small Groups: Graph a numeric variable from the World Happiness Dataset, run descriptive statistics (mean, median, range, standard deviation), and describe what these statistics tell you about the data in the graph. What do you learn about this variable? What other questions about this variable do you have?
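
If your group wants to reuse the same statistics for whichever variable you pick, a small helper function is one option. This is only a sketch: `describe_variable` is a hypothetical name (it is not part of the lab or of base R), and the example values are made up.

```r
# Hypothetical helper (not from the lab): descriptive statistics for one numeric vector
describe_variable <- function(x) {
  stopifnot(is.numeric(x))
  c(n      = sum(!is.na(x)),            # number of non-missing values
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE))
}

# Made-up values with one missing entry, for illustration only
x <- c(4.2, 5.1, 6.3, NA, 7.0)
describe_variable(x)
```

With the data loaded as `d`, as in Problem 1, you would pass in a column instead, for example `describe_variable(d$ladder)`.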