library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
mtcars %>% summarize(mean=mean(disp))
## mean
## 1 230.7219
mtcars %>% ggplot(aes(x=disp)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
This histogram is difficult to describe because the values do not seem to follow a single clear, continuous distribution, and that stays true even if we play around with the number of bins (using bins=10 or bins=50). You can report the number of modes (one or three, depending on how you read the plot), say that the distribution is skewed, and note that there are no clear outliers. There is no single correct answer: you just need to describe what you see.
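As a rough sketch of the bin experiment described above, the chunk below redraws the histogram with the two bin counts mentioned in the text and adds a numeric check of skew by comparing the mean with the median. The object names `p10`, `p50`, and `skew_check` are made up for this illustration, and the mean-versus-median comparison is only an informal indicator, not a formal test.

```r
library(dplyr)
library(ggplot2)

# Same histogram as above, with an explicit number of bins each time
p10 <- ggplot(mtcars, aes(x = disp)) + geom_histogram(bins = 10)
p50 <- ggplot(mtcars, aes(x = disp)) + geom_histogram(bins = 50)

# Informal skew check: a mean well above the median suggests a longer right tail
skew_check <- mtcars %>%
  summarize(mean = mean(disp), median = median(disp))

# Print p10 and p50 (one at a time) to compare the shapes
skew_check
```

Setting `bins` explicitly also silences the `stat_bin()` message about the default of 30 bins.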
This is exploratory data analysis, and the goal is to generate hypotheses that we will test statistically.
Note that the “self_contained: true” setting in the document header makes knitting produce a single HTML file that contains all the graphics and everything else needed to see your results. That file is what you’ll submit on Canvas.
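If you prefer to knit from the console instead of the Knit button, the same setting can be passed through the `rmarkdown` package. This is only a sketch: `homework.Rmd` is a hypothetical file name, and the `render()` call is left commented out so that the chunk runs without that file.

```r
library(rmarkdown)

# Build an HTML output format with images and other resources embedded in the file
fmt <- html_document(self_contained = TRUE)

# "homework.Rmd" is a placeholder; replace it with your own file name
# render("homework.Rmd", output_format = fmt)
```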