class: title-slide, center, middle
background-image: url(images/ouaerial.jpeg)
background-size: cover

# .heat[.fancy[Mapping and Interactive Graphics in R]]

## .heat[.fancy[Ani Ruhil]]

---
name: agenda

## .fancy[ .heat[ Agenda ]]

- building maps in R
- `leaflet` maps
- interactive/animated graphics with `highcharter`

---
class: inverse, middle, center

# .fancy[.salt[ maps with `ggplot2` ]]

---

We need a few libraries to get us started on our way ...

```r
library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
library(maptools)
library(ggthemes)
```

How about a county map of the USA? What about Ohio's 88 counties?

```r
map_data("county") -> usa            # get basic map data for all USA counties
subset(usa, region == "ohio") -> oh  # subset to counties in Ohio
names(oh)
```

```
## [1] "long"      "lat"       "group"     "order"     "region"    "subregion"
```

* long = longitude -- measures `east-west` position. The prime meridian is assigned the value of 0 degrees, and runs through Greenwich (England). Athens, Ohio has a longitude of -82.101255
* lat = latitude -- measures `north-south` position. The equator is defined as 0 degrees, the North Pole as 90 degrees north, and the South Pole as 90 degrees south.
  Athens, Ohio has a latitude of 39.329240
* group = an identifier that is unique for each subregion (here the counties)
* order = an identifier that indicates the `order in which the boundary lines should be drawn`
* region = string indicator for `regions` (here the states)
* subregion = string indicator for `sub-regions` (here the county names)

---

```r
ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat),
               fill = "white", color = "black") +
  ggtitle("a") # bad

ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat, group = group),
               fill = "white", color = "black") +
  ggtitle("b") # a slightly better basic map
```

.pull-left[
<img src="Module07_files/figure-html/map3b1-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="Module07_files/figure-html/map3b2-1.png" width="100%" style="display: block; margin: auto;" />
]

---

```r
ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat, group = group),
               fill = "white", color = "black") +
* coord_fixed(1.3) +
  ggtitle("c") # a better map

ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat, group = group, fill = subregion),
               color = "black", alpha = 0.3) +
* coord_fixed(1.3) +
* guides(fill = "none") +
  ggtitle("d") # a colored map
```

.pull-left[
<img src="Module07_files/figure-html/map4b1-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="Module07_files/figure-html/map4b2-1.png" width="100%" style="display: block; margin: auto;" />
]

---

## .fancy[ Labeling the counties ]

- to label counties we need to find the `centroid` of each county and then use the county names
- county names will have to be formatted into `titlecase`
- taking the mean/median of latitude/longitude will not work so we use specific code to find the `centroids`

```r
library(stringr)
str_to_title(oh$subregion) -> oh$county

library(sp)
getLabelPoint <- # Returns a county-named list of label points
  function(county){Polygon(county[c('long', 'lat')])@labpt}
by(oh, oh$county,
   getLabelPoint) -> centroids                          # Returns list
do.call("rbind.data.frame", centroids) -> centroids2   # Convert to Data Frame
rownames(centroids) -> centroids2$county
names(centroids2) <- c('clong', 'clat', "county")      # Appropriate Header
```

---

### Now the code for the labeled plot ...

```r
ggplot() +
  geom_polygon(
    data = oh,
    aes(x = long, y = lat, group = group),
    fill = "white", color = "gray") +
  coord_fixed(1.3) +
  geom_text(
    data = centroids2,
    aes(x = clong, y = clat, label = county),
    color = "darkblue", size = 1) +
  theme_map()
```

---

### ... and the plot itself

<img src="Module07_files/figure-html/map6b-1.png" width="70%" style="display: block; margin: auto;" />

---

## .fancy[ Using `fill` ]

What if we want to `fill` each county with grouped values of some variable such as population density, percent in poverty, median educational attainment, etc.?

(1) find and prepare the variable we want to use

(2) `merge` this variable with the data used to generate the map

(3) generate the map

```r
library(readxl)
read_excel("data/acpovertyOH.xlsx", sheet = "counties") -> acpovertyOH

c("ranking", "county", "child1216", "child0711",
  "all1216", "all0711") -> colnames(acpovertyOH)

# keep only county and child1216, then restore the boundary drawing order
merge(oh, acpovertyOH[, c(2:3)], by = "county",
      all.x = TRUE, sort = FALSE) -> my.df
my.df[order(my.df$order), ] -> my.df
```

---

### Now the code for the map ...

```r
ggplot() +
  geom_polygon(
    data = my.df,
    aes(x = long, y = lat, group = group, fill = child1216),
    color = "black"
  ) +
  coord_fixed(1.3) +
  geom_text(
    data = centroids2,
    aes(x = clong, y = clat, label = county),
    color = "black", size = 2.25
  ) +
  scale_fill_distiller(palette = "Spectral") +
  labs(fill = "Child Poverty %") +
  theme_map() +
  theme(legend.position = "bottom")
```

---

### ... and now the map itself

<img src="Module07_files/figure-html/map8b-1.png" width="70%" style="display: block; margin: auto;" />

---

That isn't a bad map but we could do better by creating `quartiles` (4 groups) or `quintiles` (5 groups), so that it is easier to pinpoint which county falls into which group

```r
library(dplyr)
my.df %>%
  mutate(
    grouped_poverty = cut(
      child1216,
      breaks = c(quantile(my.df$child1216, probs = seq(0, 1, by = 0.2))),
      labels = c("0-20", "20-40", "40-60", "60-80", "80-100"),
      include.lowest = TRUE)
  ) -> my.df
```

```r
ggplot() +
  geom_polygon(data = my.df,
               aes(x = long, y = lat, group = group, fill = grouped_poverty),
               color = "black") +
  coord_fixed(1.3) +
  geom_text(data = centroids2,
            aes(x = clong, y = clat, label = county),
            color = "white", size = 2.25) +
  scale_fill_brewer(palette = "Set1", direction = -1) +
  labs(fill = "Poverty Quintiles") +
  theme_map()
```

`map is on the following slide ...`

---

<img src="Module07_files/figure-html/map11-1.png" width="70%" style="display: block; margin: auto;" />

---
class: inverse, middle, center

# .fancy[.salt[ Using `urbnmapr` for maps ]]

---

```r
library(tidyverse)
library(urbnmapr)

states %>%
  ggplot(aes(long, lat, group = group)) +
  geom_polygon(fill = "grey", color = "#ffffff", size = 0.25) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45)
```

<img src="Module07_files/figure-html/urbanmapr1-1.png" width="65%" style="display: block; margin: auto;" />

---

```r
counties %>%
  ggplot(aes(long, lat, group = group)) +
  geom_polygon(fill = "grey", color = "#ffffff", size = 0.05) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45)
```

<img src="Module07_files/figure-html/urbanmapr2-1.png" width="65%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# .fancy[.salt[ Mapping with `leaflet` ]]

---

`leaflet` is an easy-to-learn JavaScript library for generating interactive maps, and the `leaflet` R package lets us drive it from R

```r
library(leaflet)
library(leaflet.extras)
library(widgetframe)

leaflet() %>%
  setView(lat =
          39.322577, lng = -82.106336, zoom = 14) %>%
  addTiles() %>%
  setMapWidgetStyle() %>%
  frameWidget(width = '1000', height = '320') -> m1
m1
```
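---

The basemap is just another layer, so it can be swapped. A minimal sketch, assuming the provider tile sets bundled with `leaflet` are reachable from your machine; the object name `m1b` is made up for this illustration

```r
library(leaflet)

# Same view as the map above, but with a lighter basemap.
# providers$CartoDB.Positron is one of the tile sets listed in leaflet::providers
leaflet() %>%
  setView(lat = 39.322577, lng = -82.106336, zoom = 14) %>%
  addProviderTiles(providers$CartoDB.Positron) -> m1b
m1b
```

- `addProviderTiles()` takes the place of `addTiles()`; `names(providers)` lists the available tile sets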
---

drop a pin on Building 21

```r
leaflet() %>%
  setView(lat = 39.322577, lng = -82.106336, zoom = 15) %>%
  addMarkers(lat = 39.319984, lng = -82.107084,
             popup = c("The Ridges, Building 21")) %>%
  addTiles() %>%
  setMapWidgetStyle() %>%
  frameWidget(width = '1000', height = '320') -> m2
m2
```
- `addMarkers()` draws a default marker, and `popup = ` sets the text shown when the marker is clicked

---

# .fancy[ `NYC Bike data` ]

Let us map some bike-share stations in New York City. The actual data-frame is large so we draw a random sample of 30 rows with the `sample_n()` command from `dplyr`

```r
load("data/citibike.RData")
library(dplyr)
citibike %>%
  sample_n(30) -> citibike2
```

```r
# NOTE: `start.station.name` is an assumed column name for the station's name;
# adjust it to match the columns in your copy of the data
leaflet(data = citibike2, width = "100%") %>%
  setView(lat = 40.74, lng = -73.99, zoom = 12) %>%
  addTiles() %>%
  addMarkers(data = citibike2,
             lat = ~start.station.latitude,
             lng = ~start.station.longitude,
             label = ~start.station.name,
             popup = ~start.station.name) %>%
  setMapWidgetStyle() %>%
  frameWidget(width = '1000', height = '320') -> m3
m3
```

---

- If you click on a marker you will see the station's name
- One can do a lot more in terms of customizing the markers but I leave that to you to explore
- `leaflet` will do more than just draw markers but unfortunately we do not have time to explore its other features
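---

As a starting point for that exploration, here is a hedged sketch of customized markers using `addCircleMarkers()`. The three points and the count column `n` are made up for illustration and are not rows from the bike-share data

```r
library(leaflet)

# Three made-up points in Manhattan, for illustration only
data.frame(
  name = c("Point A", "Point B", "Point C"),
  lat  = c(40.750, 40.742, 40.735),
  lng  = c(-73.990, -73.985, -73.995),
  n    = c(5, 12, 20)
) -> pts

leaflet(data = pts) %>%
  setView(lat = 40.74, lng = -73.99, zoom = 13) %>%
  addTiles() %>%
  addCircleMarkers(
    lat = ~lat, lng = ~lng,
    radius = ~sqrt(n) * 3,            # marker size scales with a numeric column
    color = "firebrick", stroke = FALSE, fillOpacity = 0.6,
    label = ~name,                    # shown on hover
    popup = ~paste0(name, ": ", n)    # shown on click
  ) -> m4
m4
```

- `label = ` shows on hover while `popup = ` shows on click, so the two can carry different text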
---
class: middle, inverse, center

# .fancy[.salt[ `patchwork` ... Multiple graphics on one canvas ]]

---

Often when you are building a visualization you end up needing to squeeze multiple graphics into a single canvas, like the example below

<img src="Module07_files/figure-html/patch2c-1.png" width="65%" style="display: block; margin: auto;" />

---

We can do this in many ways but the easiest library to use might be `patchwork`

- load `ggplot2` and `patchwork` (and any other libraries you plan to use for the plots)
- start by naming each plot; most of us end up naming them p1, p2, and so on (why? because those were the earliest examples on the web)
- then decide on how you want the plots arranged ... how many do you have? should they be side-by-side? two side-by-side and the third in a row below these two?

Let us create three plots

```r
library(ggplot2)
library(patchwork)
data(mtcars)

ggplot(mtcars, aes(x = factor(am, labels = c("Automatic", "Manual")), y = mpg)) +
  geom_boxplot() +
  labs(x = "Automatic/Manual", y = "Miles per gallon") -> p1

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Curb Weight", y = "Miles per gallon") -> p2

ggplot(mtcars, aes(x = qsec, y = mpg)) +
  geom_point() +
  labs(x = "Quarter Mile times", y = "Miles per gallon") +
  facet_wrap(~gear) -> p3
```

---

.pull-left[
```r
p1 + p2
```

<img src="Module07_files/figure-html/patch3c-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
```r
p1 + p2 + plot_layout(ncol = 1)
```

<img src="Module07_files/figure-html/patch3d-1.png" width="100%" style="display: block; margin: auto;" />
]

`plot_layout()` has several options; the key ones are

- `ncol`, `nrow`: number of columns/rows
- `byrow`: how the grid is filled, by rows first (`TRUE`) or by columns first (`FALSE`)
- `widths`, `heights`: relative widths/heights of each column and row in the grid. These values are recycled to match the dimensions of the grid.
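---

To see what relative `widths` do, here is a small sketch on `mtcars`; the names `pa`, `pb` and `combined` are made up for this illustration

```r
library(ggplot2)
library(patchwork)

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() -> pa

ggplot(mtcars, aes(x = factor(am), y = mpg)) +
  geom_boxplot() -> pb

# two columns; the first column is twice as wide as the second
pa + pb + plot_layout(ncol = 2, widths = c(2, 1)) -> combined
combined
```

- the values are relative, so `widths = c(2, 1)` and `widths = c(4, 2)` give the same split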
---

.pull-left[
fill row 1 with p1, p2, then row 2 with p1, p2

```r
p1 + p2 + p1 + p2 + plot_layout(ncol = 2, byrow = TRUE)
```

<img src="Module07_files/figure-html/patch4a-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
fill column 1 with p1, p2, then column 2 with p1, p2

```r
p1 + p2 + p1 + p2 + plot_layout(ncol = 2, byrow = FALSE)
```

<img src="Module07_files/figure-html/patch4b-1.png" width="100%" style="display: block; margin: auto;" />
]

---

.pull-left[
```r
(p1 + (p2 + p3) + plot_layout(ncol = 1))
```

<img src="Module07_files/figure-html/patch5a-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
```r
(p1 | p2 | p1) / p3
```

<img src="Module07_files/figure-html/patch5b-1.png" width="100%" style="display: block; margin: auto;" />
]

the `|` places plots side by side (separated by a vertical divide) and the `/` stacks them one above the other (separated by a horizontal divide)

---

```r
(p1 + (p2 + p3) + plot_layout(ncol = 1, heights = c(1,2)))
```

<img src="Module07_files/figure-html/patch5c-1.png" width="60%" style="display: block; margin: auto;" />

- see other settings in the `patchwork` documentation
- you can also explore `cowplot`

---
class: middle, inverse, center

# .fancy[.salt[ `highcharter` -- Interactive graphics ]]

---

`highcharter` is one of my favorite packages for dynamic plots because it builds them with ease and yet they are visually stunning (see below)
---

`Scatterplot`

```r
library(highcharter)
load("data/epa.RData")

# keep the 2019 models and draw a random sample of 100 of them
epa %>%
  filter(year == 2019) %>%
  sample_n(100) -> epa2

hchart(epa2, "scatter",
       hcaes(x = city08, y = highway08, group = make)) -> hc1

frameWidget(hc1, width = 1000, height = 350)
```
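---

If you do not have the `epa` data at hand, the same pattern can be tried on a built-in dataset. A minimal sketch with `mtcars`; the axis titles and the name `hc_demo` are choices made for this illustration

```r
library(highcharter)

# scatterplot of weight against mileage, one series per cylinder count
hchart(mtcars, "scatter",
       hcaes(x = wt, y = mpg, group = cyl)) %>%
  hc_xAxis(title = list(text = "Weight (1000 lbs)")) %>%
  hc_yAxis(title = list(text = "Miles per gallon")) -> hc_demo
hc_demo
```

- `hcaes()` plays the role that `aes()` plays in `ggplot2`, and `group = ` creates one series per group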
---

`Line chart`

```r
load("data/unemprate.RData")
library(lubridate)
year(urate$yearmonth) -> urate$year

urate %>%
  group_by(educ_group, year) %>%
  summarise(avg.urate = mean(rate, na.rm = TRUE)) -> urate2

hchart(urate2, "line",
       hcaes(x = year, y = avg.urate, group = educ_group)) -> hc2

frameWidget(hc2, width = 1000, height = 375)
```
---

#### `Dressing up the highcharter plot`

```r
hchart(urate2, "line",
       hcaes(x = year, y = avg.urate, group = educ_group)) %>%
  hc_title(
    text = "<span style=\"color:#e88e88\"> Unemployment Rates by Educational Group</span>",
    useHTML = TRUE) %>%
  hc_tooltip(table = TRUE, sort = TRUE, digits = 2) %>%
  hc_add_theme(hc_theme_flatdark()) -> hc3

frameWidget(hc3, width = 1050, height = 400)
```
