Data wrangling with the Tidyverse II
Materials for class on Thursday, October 4, 2018
Slides
Download the slides from today’s lecture:
2016 elections, food security, and mortality
Setting up your project
(Practice makes perfect!)
Do the following:
Create a new RStudio project named “elections-food-death” (or something) and put it somewhere on your computer.
Navigate to that new project folder on your computer with Windows File Explorer or macOS Finder (i.e. however you look at files on your computer). Create a new folder in your project called “data”.
Download this CSV file:
clean_combined_data.csv
You’ll probably need to right click on the link and select “Save link as…” or something similar—often browsers will load the CSV file like a web page, which isn’t helpful.
The R code I used to clean and merge these three datasets can be seen here. The raw data can be downloaded here.
Using Windows File Explorer or macOS Finder, move the newly downloaded CSV file into the “data” folder you created.
Download this R Markdown file and place it in your newly created project (but not in your data folder; put it in the main directory):
elections-food-death-questions.Rmd
Again, you’ll probably need to right click on the link and select “Save link as…”
In the end, your project folder should be structured like this:
elections-food-death\
    elections-food-death-questions.Rmd
    elections-food-death.Rproj
    data\
        clean_combined_data.csv
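To see why the folder layout matters, here is a minimal sketch that rebuilds the same structure in a temporary folder and reads the CSV back. The stand-in file and its columns (county, value) are made up for illustration; the real clean_combined_data.csv will have different columns, and the chunk already provided in the R Markdown file may load the data differently.

```r
# Build a throwaway copy of the project layout in a temporary folder
project_dir <- file.path(tempdir(), "elections-food-death")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Stand-in for the downloaded CSV; these columns are HYPOTHETICAL
csv_path <- file.path(project_dir, "data", "clean_combined_data.csv")
write.csv(
  data.frame(county = c("A", "B"), value = c(1, 2)),
  csv_path,
  row.names = FALSE
)

# Read the file back from the "data" folder inside the project
combined <- read.csv(csv_path)
```

Inside the real RStudio project, the working directory is the project folder, so the path would likely shorten to "data/clean_combined_data.csv".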
Questions
I provided you with one chunk in the R Markdown file to load the pre-cleaned data. Choose a few of these questions and answer them with a table or a plot (or both). You should be able to answer most with filter(BLAH == "BLOOP") or group_by(BLAH) %>% summarize(SOMETHING = BLAH) and with ggplot().
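As a rough sketch of how those three patterns fit together, here is a toy example. The data frame and its column names (state, county, pct_republican) are invented for illustration and are not the actual columns in the class dataset; swap in the real names once you have looked at the data.

```r
library(dplyr)
library(ggplot2)

# Toy data with HYPOTHETICAL column names; the real CSV's columns may differ
toy <- data.frame(
  state = c("Utah", "Utah", "Ohio", "Ohio", "Ohio"),
  county = c("A", "B", "C", "D", "E"),
  pct_republican = c(70, 50, 55, 45, 65),
  stringsAsFactors = FALSE
)

# filter(): keep only the rows that match a condition
utah_only <- toy %>%
  filter(state == "Utah")

# group_by() + summarize(): collapse to one row per state
state_summary <- toy %>%
  group_by(state) %>%
  summarize(avg_pct_republican = mean(pct_republican))

# ggplot(): plot the summarized table as a bar chart
state_plot <- ggplot(state_summary, aes(x = state, y = avg_pct_republican)) +
  geom_col()
```

Printing state_summary gives you the table; printing state_plot gives you the plot.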
(Note: I have no idea what you will find here. This is a fishing expedition; see what interesting stories you can find!)
Filtering, grouping, and summarizing
- Which counties/states had the highest proportion of Republican votes in 2016?
- Which counties/states had the highest proportion of Democratic votes in 2016?
- Which counties/states had the greatest difference between Democratic and Republican votes in 2016? Which counties had the narrowest margins?
- Which counties/states have the highest mortality?
- Which counties/states have the greatest number of fast food restaurants? Grocery stores? Stores that accept SNAP?
- Which counties/states have the greatest proportion of people with low access to food? Children with low access to food?
- Anything else that looks interesting!
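One possible way to approach the "highest proportion" and "narrowest margin" questions is to compute new columns with mutate() and then sort with arrange(). The sketch below uses made-up vote counts and hypothetical column names (votes_dem, votes_gop), and it assumes only two parties for simplicity; the real data may store these values differently.

```r
library(dplyr)

# Toy data with HYPOTHETICAL column names and invented vote counts
toy <- data.frame(
  county = c("A", "B", "C", "D"),
  votes_dem = c(400, 510, 100, 300),
  votes_gop = c(600, 490, 300, 300),
  stringsAsFactors = FALSE
)

margins <- toy %>%
  mutate(
    total = votes_dem + votes_gop,              # assumes two parties only
    prop_gop = votes_gop / total,               # Republican share of the vote
    margin = abs(votes_dem - votes_gop) / total # gap between the two parties
  )

# Highest Republican share first
top_gop <- margins %>%
  arrange(desc(prop_gop))

# Narrowest races first
narrowest <- margins %>%
  arrange(margin)
```

To work at the state level instead, you could add a group_by() and summarize() step that sums the votes per state before computing the proportions.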
Relationships and correlations
What is the relationship between the following variables at a county (or state) level? Which counties (or states) have the strongest or weakest relationships? What could that relationship possibly mean?
- Mortality rate and low access to food
- Mortality rate and 2016 election results
- SNAP-eligible stores per 1000 and low access to food
- Fast food prevalence and low access to food
- 2016 election results and fast food prevalence
- Low access to food and 2016 election results
- Anything else that looks interesting!
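For the relationship questions, one reasonable starting point is cor() inside summarize(), plus a scatterplot. This is only a sketch: the column names (pct_low_access, mortality_rate) are hypothetical, and the numbers are constructed so that the relationship is perfectly positive in one state and perfectly negative in the other, which shows why looking within groups can matter.

```r
library(dplyr)
library(ggplot2)

# Toy data with HYPOTHETICAL column names and invented values
toy <- data.frame(
  state = rep(c("Utah", "Ohio"), each = 4),
  pct_low_access = c(10, 20, 30, 40, 10, 20, 30, 40),
  mortality_rate = c(600, 650, 700, 750, 900, 850, 800, 750),
  stringsAsFactors = FALSE
)

# Correlation across every county at once
overall_cor <- cor(toy$pct_low_access, toy$mortality_rate)

# Correlation within each state
state_cors <- toy %>%
  group_by(state) %>%
  summarize(correlation = cor(pct_low_access, mortality_rate))

# Scatterplot with a straight trend line, one panel per state
scatter <- ggplot(toy, aes(x = pct_low_access, y = mortality_rate)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ state)
```

In this toy data the two states cancel each other out in the overall correlation, so the per-state numbers tell a different story than the combined one. A correlation alone does not say what the relationship means; that part is up to you.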
Clearest and muddiest things
Go to this form and answer these three questions:
- What was the muddiest thing from class today? What are you still wondering about?
- What was the clearest thing from class today?
- What was the most exciting thing you learned?
I’ll compile the questions and send out answers after class.