The midterm assessment was designed to evaluate your ‘fundamental’ skills as a data scientist, corresponding to Course Objectives (1)-(4,5) in the syllabus. Those skills break down into:
- R practices, reproducibility
- dplyr, tidyr, forcats, stringr, lubridate
- purrr::map family
- ggplot2, gt
As we move forward this semester (today and after Spring Break, 3/5 - 3/13), we will continue to use and extend these ‘fundamental’ skills to get the most out of R and RStudio for data science and analysis. With those fundamentals in hand, we are going to learn how to use these ‘advanced’ communication and analysis tools:
- ggplotly, reactable
- gt with gtExtras (and flextable, ftExtra)
- flexdashboard
- flexdashboard
- sf or sp + tidyverse + ggmap and ggplot2
- ggplotly and/or leaflet
- R hosted website
- flexdashboards
- shiny apps
- dtplyr, collapse, h2o, sparklyr
- dbplyr
First, we’re going to briefly cover joining multiple data sources using dplyr, with spatial examples. Then we’re going to build on last week’s lecture and expand our spatial data toolbox in R with more advanced use of sf, ggmap, and tidycensus, as well as an introduction to fully interactive maps with leaflet. Finally, we will apply these tools in an activity where we create and edit a more advanced spatial dashboard together.
#Install the packages for today if you don't already have them
install.packages(c("sf", "ggmap", "tmap", "tidycensus", "leaflet", "osmdata", "tigris"))
R
Joining data from multiple sources is another aspect of data wrangling that was covered in PUBH 7461. It is an important part of working with real-world data, so we should make sure we’re on the same page heading into the final project.
Laura Le’s wonderful lecture on joining data in R, along with an example/activity using NYC flight data, can be found on Canvas.
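As a quick refresher, here is a minimal sketch of the three joins we will lean on most. The two tibbles, their column names, and all of the values are made up for illustration; they are not course data.

```r
library(dplyr)

# Hypothetical toy tables (made-up values, for illustration only)
county_pop <- tibble(
  county = c("Hennepin", "Ramsey", "Dakota"),
  pop    = c(100, 50, 40)
)
county_rent <- tibble(
  county    = c("Hennepin", "Ramsey", "Olmsted"),
  perc_rent = c(0.37, 0.40, 0.27)
)

# left_join keeps every row of county_pop; counties with no match get NA
left_df <- left_join(county_pop, county_rent, by = "county")

# inner_join keeps only the counties that appear in both tables
inner_df <- inner_join(county_pop, county_rent, by = "county")

# anti_join lists the rows of county_pop that found no match in county_rent
unmatched_df <- anti_join(county_pop, county_rent, by = "county")
```

Running an anti_join before a left_join is a cheap way to see which keys will end up as NA, which matters when the join key is a county name that may be spelled differently across sources.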
sf, ggmap, tidycensus, ggplot, plotly
sf Resources
Like many things in the R universe, the sf package has wonderful documentation and examples. Please spend some time reviewing these on your own.
sf + ggplotly
First, let’s install the ggthemes package for a few more thematic choices in our ggplots.
#Install ggthemes if necessary
if (!require(ggthemes)) {
  install.packages("ggthemes", quiet = TRUE)
}
#Call the library
library(ggthemes, quietly = TRUE)
Next, let’s read in our MN .shp file (from last week’s lecture).
#Packages used below (skip if already loaded)
library(sf)
library(tidyverse)
#Read in the shape file (don't make a tibble)
mn.df <- st_read("./data/USA_Counties/USA_Counties.shp", quiet = TRUE) %>%
  janitor::clean_names() %>%
  filter(state_name %in% "Minnesota")
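If you don't have the course shapefile handy, the same read-then-filter pattern can be tried on the North Carolina example shapefile that ships with the sf package. This is only a sketch: the object name nc.df and the two counties picked are arbitrary choices for illustration, and note that this file keeps its original upper-case column names.

```r
library(sf)
library(dplyr)

# Path to the example shapefile bundled with sf (North Carolina counties)
nc_path <- system.file("shape/nc.shp", package = "sf")

# Read the shapefile, then filter rows like any other data frame
nc.df <- st_read(nc_path, quiet = TRUE) %>%
  filter(NAME %in% c("Ashe", "Alleghany"))

# The result is still an sf object: the geometry column travels with the rows
class(nc.df)
st_crs(nc.df)
```

Because sf objects are data frames with a geometry column attached, most dplyr verbs work on them directly, which is what lets us wrangle first and map second.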
Next, let’s build our ggplot, adding a little more information with our usual data wrangling skills and using a better theme from ggthemes.
mn_pop.gg <- mn.df %>%
  dplyr::select(name, white:other, renter_occ, owner_occ, geometry) %>%
  rename(county = name) %>%
  pivot_longer(
    cols      = white:other, #tidy long data by category
    names_to  = "race_category",
    values_to = "race_pop"
  ) %>%
  mutate(
    race_category = str_replace_all(race_category, "_", " ") %>%
      str_to_title() %>%
      as_factor()
  ) %>%
  group_by(county) %>% #County level population
  mutate(county_pop = sum(race_pop)) %>%
  group_by(county, race_category) %>%
  summarise(
    perc_race = race_pop / county_pop,
    perc_rent = renter_occ / (renter_occ + owner_occ),
    geometry  = geometry
  ) %>%
  ungroup() %>%
  nest(data = c("race_category", "perc_race", "geometry")) %>%
  mutate(
    text_label = map_chr(.x = data,
                         ~str_c(
                           "\n",
                           .x$race_category,
                           ": ",
                           scales::percent(.x$perc_race, accuracy = 0.0001),
                           collapse = ""
                         )
                 ),
    text_label = str_c(county, "\nDemographics", text_label, "\nAvg. Rental Percentage: ", scales::percent(perc_rent, accuracy = 0.01))
  ) %>%
  unnest(data) %>%
  st_as_sf() %>%
  ggplot() +
  geom_sf(aes(fill = perc_rent, text = text_label),
          colour = "black", size = 0.8, alpha = 0.6) +
  labs(
    title = "2017 MN ACS Rent vs. Own % by County"
  ) +
  scale_fill_viridis_c("Percent Rental", labels = scales::percent) +
  theme_map() +
  theme(
    plot.title      = element_text(size  = 24,
                                   hjust = 0.5),
    legend.text     = element_text(size = 20),
    legend.title    = element_text(size = 20),
    legend.position = "right"
  )
#Plotly
ggplotly(mn_pop.gg,
         tooltip = "text",
         height  = 600,
         width   = 800) %>%
  style(hoveron = "fills")
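The densest step in the pipeline above is building one tooltip string per county with nest() and map_chr(). Here is that step in isolation, as a rough sketch on a toy tibble; the counties "A" and "B", the category shares, and the names toy.df and label.df are all made up for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(stringr)

# Hypothetical toy data: two counties, two categories each (made-up shares)
toy.df <- tibble(
  county        = c("A", "A", "B", "B"),
  race_category = c("White", "Other", "White", "Other"),
  perc_race     = c(0.8, 0.2, 0.6, 0.4)
)

label.df <- toy.df %>%
  # One row per county; the category rows move into a list-column `data`
  nest(data = c(race_category, perc_race)) %>%
  mutate(
    # For each nested tibble, paste "\n<category>: <percent>" for every row
    # and collapse the pieces into a single string
    text_label = map_chr(
      .x = data,
      ~ str_c("\n", .x$race_category, ": ",
              scales::percent(.x$perc_race, accuracy = 1),
              collapse = "")
    ),
    # Prepend the county name and a header line
    text_label = str_c(county, "\nDemographics", text_label)
  )

# Print one label to see the line breaks the tooltip will use
cat(label.df$text_label[1])
```

Nesting first means each county gets exactly one label, and unnest() afterwards restores the per-category rows with that label repeated on each of them.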