P2D1: Getting data and gun deaths
Partner presentations (10 minutes).
Find a partner in the class and share your project 1 code and charts.
- Compare your work to identify different ways that each of you completed the job.
- Discuss with each other how long it took for you to complete each chart.
- Share any cool Google resources that you found while completing the project.
Navigating GitHub and R packages
- devtools and GitHub R packages like waffle and beeswarm
- full_data.csv URL link for read_csv().
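As a minimal sketch of the read_csv() pattern: the function takes either a local path or a URL string as its first argument. The rows below are made-up stand-ins written to a temporary file so the example runs offline. They are not the real full_data.csv, and the column names are assumptions.

```r
library(readr)

# Made-up stand-in rows (NOT the real full_data.csv); column names are assumed.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("year,intent,race",
    "2012,Suicide,White",
    "2013,Homicide,Black"),
  tmp
)

# Swap `tmp` for the raw full_data.csv URL to read straight from GitHub.
guns <- read_csv(tmp, show_col_types = FALSE)
guns
```

On GitHub, the link to pass is the one behind the "Raw" button, not the page that renders the file in the browser.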
Exploring gun deaths
What is the problem with raw counts and minority groups?
# Population shares come from the Census QuickFacts table:
# https://www.census.gov/quickfacts/fact/table/US/POP010220
# The shares sum to more than 1 because Hispanic origin overlaps the race groups.
library(tibble)

dat_pop <- tibble(
  table_var = c("Asian/Pacific Islander",
                "Black", "Hispanic",
                "Native American/Native Alaskan", "White"),
  N = 331449281 * c(.061, .134, .185, .013, .763))
What are some ways to display the impact on small populations? What are the issues with those solutions?
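One possible approach, sketched here under stated assumptions, is to divide counts by population and report a rate per 100,000 people. The `deaths` column is a hypothetical placeholder (the same count of 1,000 in every group), chosen only to show that identical raw counts translate into very different rates. It is not real data.

```r
library(dplyr)
library(tibble)

groups <- c("Asian/Pacific Islander", "Black", "Hispanic",
            "Native American/Native Alaskan", "White")

# Population estimates built from the shares used in the notes.
dat_pop <- tibble(
  table_var = groups,
  N = 331449281 * c(.061, .134, .185, .013, .763)
)

# HYPOTHETICAL counts: 1,000 per group, for illustration only.
dat_counts <- tibble(table_var = groups, deaths = 1000)

# Join counts to populations and convert to a rate per 100,000 people.
dat_rate <- dat_counts %>%
  left_join(dat_pop, by = "table_var") %>%
  mutate(rate_per_100k = deaths / N * 1e5)

dat_rate
```

With equal counts, the smallest population ends up with the highest rate, which is exactly what a chart of raw counts hides.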
Trying the waffle chart
The plague of bar charts
It is difficult to understand why statisticians commonly limit their inquiries to Averages, and do not revel in more comprehensive views. Their souls seem as dull to the charm of variety as that of the native of one of our flat English counties, whose retrospect of Switzerland was that, if its mountains could be thrown into its lakes, two nuisances would be got rid of at once. (Francis Galton, Natural Inheritance, 1889)
The problem with bar charts:
- They display summaries of data, not the data.
- They make it hard to visualize uncertainty.
- They don’t facilitate easy comparisons beyond one category.
- They should be avoided with time-series data.
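As one illustration of the first two points, the sketch below plots every observation over a boxplot instead of drawing one bar per group. It uses the `mpg` data that ships with ggplot2 as a stand-in, which is an assumption made for the example and not part of the gun deaths data.

```r
library(ggplot2)

# mpg ships with ggplot2; it stands in for any grouped numeric data.
p <- ggplot(mpg, aes(x = class, y = hwy)) +
  # Boxplot summarizes the spread; its outlier points are hidden
  # because the jitter layer already draws every observation.
  geom_boxplot(outlier.shape = NA) +
  # Horizontal jitter only, so the y values stay exact.
  geom_jitter(width = 0.2, height = 0, alpha = 0.4)

p
```

A bar of group means would collapse each class to a single number. Here the spread and the number of observations in each class stay visible.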
Two excellent references:
Munging data with dplyr
Understanding case_when()
case_when() is particularly useful inside mutate() when you want to create a new variable that relies on a complex combination of existing variables. Write a short sentence that says what this code is doing.
library(dplyr)

starwars %>%
  select(name:mass, gender, species) %>%
  mutate(
    type = case_when(
      height > 200 | mass > 200 ~ "large",
      species == "Droid" ~ "robot",
      TRUE ~ "other"
    )
  )
#> # A tibble: 87 x 6
#> name height mass gender species type
#> <chr> <int> <dbl> <chr> <chr> <chr>
#> 1 Luke Skywalker 172 77 male Human other
#> 2 C-3PO 167 75 <NA> Droid robot
#> 3 R2-D2 96 32 <NA> Droid robot
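A detail worth checking when reading the code above: case_when() evaluates its conditions in order and keeps the first one that is TRUE for each row. The sketch below uses a small made-up table (`toy`, hypothetical rows, not from starwars) to show how swapping two conditions changes the result.

```r
library(dplyr)
library(tibble)

# Hypothetical rows, invented to exercise each branch.
toy <- tibble(
  name    = c("tall droid", "short droid", "tall human", "unknown"),
  height  = c(210, 96, 220, NA),
  species = c("Droid", "Droid", "Human", NA)
)

# Size is tested first, so a tall droid is labeled "large".
size_first <- toy %>%
  mutate(type = case_when(
    height > 200       ~ "large",
    species == "Droid" ~ "robot",
    TRUE               ~ "other"
  ))

# Species is tested first, so the same tall droid is labeled "robot".
species_first <- toy %>%
  mutate(type = case_when(
    species == "Droid" ~ "robot",
    height > 200       ~ "large",
    TRUE               ~ "other"
  ))

size_first$type
species_first$type
```

The row with missing values falls through to the `TRUE ~ "other"` branch in both versions, because a comparison against NA is never TRUE.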