class: title-slide, center, middle
background-image: url(images/ouaerial.jpeg)
background-size: cover

# .heat[.fancy[Mapping and Interactive Graphics in R]]

## .heat[.fancy[Ani Ruhil]]

---
name: agenda

## .fancy[ .heat[ Agenda ]]

- building maps in R
- `leaflet` maps
- interactive/animated graphics with `highcharter`

---
class: inverse, middle, center

# .fancy[.salt[ maps with `ggplot2` ]]

---

We need a few libraries to get us started on our way ...

```r
library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
library(maptools)
library(ggthemes)
```

How about a county map of the USA? What about Ohio's 88 counties?

```r
map_data("county") -> usa # get basic map data for all USA counties
subset(usa, region == "ohio") -> oh # subset to counties in Ohio
names(oh)
```

```
## [1] "long"      "lat"       "group"     "order"     "region"    "subregion"
```

* long = longitude -- measures `east-west` position. The prime meridian is assigned the value of 0 degrees, and runs through Greenwich (England). Athens, Ohio has a longitude of -82.101255
* lat = latitude -- measures `north-south` position. The equator is defined as 0 degrees, the North Pole as 90 degrees north, and the South Pole as 90 degrees south.
Athens, Ohio has a latitude of 39.329240
* group = an identifier that is unique for each subregion (here the counties)
* order = an identifier that indicates the `order in which the boundary lines should be drawn`
* region = string indicator for `regions` (here the states)
* subregion = string indicator for `sub-regions` (here the county names)

---

```r
ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat),
               fill = "white", color = "black") +
  ggtitle("a") # bad

ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat, group = group),
               fill = "white", color = "black") +
  ggtitle("b") # a slightly better basic map
```

.pull-left[
<img src="Module07_files/figure-html/map3b1-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="Module07_files/figure-html/map3b2-1.png" width="100%" style="display: block; margin: auto;" />
]

---

```r
ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat, group = group),
               fill = "white", color = "black") +
*  coord_fixed(1.3) +
  ggtitle("c") # a better map

ggplot() +
  geom_polygon(data = oh, aes(x = long, y = lat, group = group, fill = subregion),
               color = "black", alpha = 0.3) +
*  coord_fixed(1.3) +
*  guides(fill = "none") +
  ggtitle("d") # a colored map
```

.pull-left[
<img src="Module07_files/figure-html/map4b1-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="Module07_files/figure-html/map4b2-1.png" width="100%" style="display: block; margin: auto;" />
]

---

## .fancy[ Labeling the counties ]

- to label counties we need to find the `centroid` of each county and then use the county names
- county names will have to be formatted into `titlecase`
- taking the mean/median of latitude/longitude will not work so we use specific code to find the `centroids`

```r
library(stringr)
str_to_title(oh$subregion) -> oh$county

library(sp)
getLabelPoint <- # Returns a county-named list of label points
  function(county){Polygon(county[c('long', 'lat')])@labpt}
by(oh, oh$county,
getLabelPoint) -> centroids # Returns list
do.call("rbind.data.frame", centroids) -> centroids2 # Convert to Data Frame
rownames(centroids) -> centroids2$county
names(centroids2) <- c('clong', 'clat', "county") # Appropriate Header
```

---

### Now the code for the labeled plot ...

```r
ggplot() +
  geom_polygon(
    data = oh,
    aes(x = long, y = lat, group = group),
    fill = "white", color = "gray") +
  coord_fixed(1.3) +
  geom_text(
    data = centroids2,
    aes(x = clong, y = clat, label = county),
    color = "darkblue", size = 1) +
  theme_map()
```

---

### ... and the plot itself

<img src="Module07_files/figure-html/map6b-1.png" width="70%" style="display: block; margin: auto;" />

---

## .fancy[ Using `fill` ]

What if we want to `fill` each county with grouped values of some variable such as population density, percent in poverty, median educational attainment, etc?

(1) find and prepare the variable we want to use
(2) `merge` this variable with the data used to generate the map
(3) generate the map

```r
library(readxl)
read_excel("data/acpovertyOH.xlsx", sheet = "counties") -> acpovertyOH
c("ranking", "county", "child1216", "child0711", "all1216", "all0711") -> colnames(acpovertyOH)
merge(oh, acpovertyOH[, c(2:3)], by = "county", all.x = TRUE, sort = FALSE) -> my.df
my.df[order(my.df$order), ] -> my.df
```

---

### Now the code for the map ...

```r
ggplot() +
  geom_polygon(
    data = my.df,
    aes(x = long, y = lat, group = group, fill = child1216),
    color = "black"
  ) +
  coord_fixed(1.3) +
  geom_text(
    data = centroids2,
    aes(x = clong, y = clat, label = county),
    color = "black", size = 2.25
  ) +
  scale_fill_distiller(palette = "Spectral") +
  labs(fill = "Child Poverty %") +
  theme_map() +
  theme(legend.position = "bottom")
```

---

### ...
and now the map itself

<img src="Module07_files/figure-html/map8b-1.png" width="70%" style="display: block; margin: auto;" />

---

That isn't a bad map but we could do better, by creating `quartiles` (4 groups) or `quintiles` (5 groups) so that it is easier to pinpoint which county falls into what group

```r
library(dplyr)
my.df %>%
  mutate(
    grouped_poverty = cut(
      child1216,
      breaks = c(quantile(my.df$child1216, probs = seq(0, 1, by = 0.2))),
      labels = c("0-20", "20-40", "40-60", "60-80", "80-100"),
      include.lowest = TRUE)
  ) -> my.df
```

```r
ggplot() +
  geom_polygon(data = my.df,
               aes(x = long, y = lat, group = group, fill = grouped_poverty),
               color = "black") +
  coord_fixed(1.3) +
  geom_text(data = centroids2,
            aes(x = clong, y = clat, label = county),
            color = "white", size = 2.25) +
  scale_fill_brewer(palette = "Set1", direction = -1) +
  labs(fill = "Poverty Quintiles") +
  theme_map()
```

`map is on the following slide ...`

---

<img src="Module07_files/figure-html/map11-1.png" width="70%" style="display: block; margin: auto;" />

---
class: inverse, middle, center

# .fancy[.salt[ Using `urbnmapr` for maps ]]

---

```r
library(tidyverse)
library(urbnmapr)
states %>%
  ggplot(aes(long, lat, group = group)) +
  geom_polygon(fill = "grey", color = "#ffffff", size = 0.25) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45)
```

<img src="Module07_files/figure-html/urbanmapr1-1.png" width="65%" style="display: block; margin: auto;" />

---

```r
counties %>%
  ggplot(aes(long, lat, group = group)) +
  geom_polygon(fill = "grey", color = "#ffffff", size = 0.05) +
  coord_map(projection = "albers", lat0 = 39, lat1 = 45)
```

<img src="Module07_files/figure-html/urbanmapr2-1.png" width="65%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# .fancy[.salt[ Mapping with `leaflet` ]]

---

`leaflet` is an easy-to-learn JavaScript library that generates interactive maps

```r
library(leaflet)
library(leaflet.extras)
library(widgetframe)
leaflet() %>%
  setView(lat =
39.322577, lng = -82.106336, zoom = 14) %>%
  addTiles() %>%
  setMapWidgetStyle() %>%
  frameWidget(width = '1000', height = '320') -> m1
m1
```
---

drop a pin on Building 21

```r
leaflet() %>%
  setView(lat = 39.322577, lng = -82.106336, zoom = 15) %>%
  addMarkers(lat = 39.319984, lng = -82.107084,
             popup = c("The Ridges, Building 21")) %>%
  addTiles() %>%
  setMapWidgetStyle() %>%
  frameWidget(width = '1000', height = '320') -> m2
m2
```
- `addMarkers()` draws the default marker and `popup = ` sets the text shown when that marker is clicked

---

# .fancy[ `NYC Bike data` ]

Let us map some bike-share stations in New York City. The actual data-frame is large so we draw a random sample of 30 rows with the `sample_n()` command from `dplyr`

```r
load("data/citibike.RData")
library(dplyr)
citibike %>%
  sample_n(30) -> citibike2
```

```r
leaflet(data = citibike2, width = "100%") %>%
  setView(lat = 40.74, lng = -73.99, zoom = 12) %>%
  addTiles() %>%
  addMarkers(data = citibike2,
             lat = ~start.station.latitude,
             lng = ~start.station.longitude,
             label = ~start.station.id,
             popup = ~start.station.name) %>%
  setMapWidgetStyle() %>%
  frameWidget(width = '1000', height = '320') -> m3
m3
```

---

- If you click on a marker you will see the station's name
- One can do a lot more in terms of customizing the markers but I leave that to you to explore
- `leaflet` will do more than just draw markers but unfortunately we do not have time to explore its other features
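
As one possible starting point for customizing markers, here is a minimal sketch that swaps the default pins for circle markers. The `stations` data frame and its column names are invented for illustration (they are not the Citi Bike data), and the radius and colors are arbitrary choices.

```r
library(leaflet)

# Hypothetical stations, invented for illustration only
stations <- data.frame(
  name = c("Station A", "Station B", "Station C"),
  lat = c(40.741, 40.735, 40.748),
  lng = c(-73.990, -73.998, -73.985)
)

leaflet(data = stations) %>%
  addTiles() %>%
  addCircleMarkers(
    lat = ~lat, lng = ~lng,
    radius = 6,          # circle radius in pixels
    color = "darkblue",  # outline (and default fill) color
    fillOpacity = 0.6,
    label = ~name,       # text shown on hover
    popup = ~name        # text shown on click
  ) -> m4
m4
```

Because no `setView()` is given, the map should zoom to fit the markers on its own.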
---
class: middle, inverse, center

# .fancy[.salt[ `patchwork` ... Multiple graphics on one canvas ]]

---

Often when you are building a visualization you end up needing to squeeze multiple graphics into a single canvas, like the example below

<img src="Module07_files/figure-html/patch2c-1.png" width="65%" style="display: block; margin: auto;" />

---

We can do this in many ways but the easiest library to use might be `patchwork`

- load `ggplot2` and `patchwork` (and any other libraries you plan to use for the plots)
- start by naming each plot; most of us end up naming them p1, p2, and so on (why? because those were the earliest examples on the web)
- then decide on how you want the plots ... how many do you have? should they be side-by-side? two side-by-side and the third in a row below these two?

Let us create three plots

```r
library(ggplot2)
library(patchwork)
data(mtcars)
ggplot(mtcars, aes(x = factor(am, labels = c("Automatic", "Manual")), y = mpg)) +
  geom_boxplot() +
  labs(x = "Automatic/Manual", y = "Miles per gallon") -> p1
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Curb Weight", y = "Miles per gallon") -> p2
ggplot(mtcars, aes(x = qsec, y = mpg)) +
  geom_point() +
  labs(x = "Quarter Mile times", y = "Miles per gallon") +
  facet_wrap(~gear) -> p3
```

---

.pull-left[
```r
p1 + p2
```

<img src="Module07_files/figure-html/patch3c-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
```r
p1 + p2 + plot_layout(ncol = 1)
```

<img src="Module07_files/figure-html/patch3d-1.png" width="100%" style="display: block; margin: auto;" />
]

`plot_layout()` has several options, key ones will be

- `ncol`, `nrow`: number of columns/rows
- `byrow`: how should the plots be embedded, by filling columns first or by filling rows first
- `widths`, `heights`: relative widths/heights of each column and row in the grid. Will get repeated to match the dimensions of the grid.
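
To make the `widths` option concrete, here is a small sketch with two throwaway plots. The names `pa`, `pb`, and `wide_left` are invented here so that they do not clash with `p1`, `p2`, and `p3`, and the 2:1 ratio is an arbitrary choice.

```r
library(ggplot2)
library(patchwork)

# Two simple scatterplots from the built-in mtcars data
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() -> pa
ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point() -> pb

# One row, two columns; the first column is twice as wide as the second
pa + pb + plot_layout(ncol = 2, widths = c(2, 1)) -> wide_left
wide_left
```

`heights` works the same way for rows, for example `plot_layout(ncol = 1, heights = c(1, 2))`.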
---

.pull-left[
fill row 1 with p1, p2, then row 2 with p1, p2

```r
p1 + p2 + p1 + p2 + plot_layout(ncol = 2, byrow = TRUE)
```

<img src="Module07_files/figure-html/patch4a-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
fill column 1 with p1, p2, then column 2 with p1, p2

```r
p1 + p2 + p1 + p2 + plot_layout(ncol = 2, byrow = FALSE)
```

<img src="Module07_files/figure-html/patch4b-1.png" width="100%" style="display: block; margin: auto;" />
]

---

.pull-left[
```r
(p1 + (p2 + p3) + plot_layout(ncol = 1))
```

<img src="Module07_files/figure-html/patch5a-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
```r
(p1 | p2 | p1) / p3
```

<img src="Module07_files/figure-html/patch5b-1.png" width="100%" style="display: block; margin: auto;" />
]

the `|` places plots side by side (in a row) and the `/` stacks them one above the other

---

```r
(p1 + (p2 + p3) + plot_layout(ncol = 1, heights = c(1,2)))
```

<img src="Module07_files/figure-html/patch5c-1.png" width="60%" style="display: block; margin: auto;" />

- see other settings [here](https://github.com/thomasp85/patchwork)
- you can also [explore `cowplot` here](https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html)

---
class: middle, inverse, center

# .fancy[.salt[ `highcharter` -- Interactive graphics ]]

---

[`highcharter`](http://jkunst.com/highcharter/) is one of my favorite packages for dynamic plots because it builds them with ease and yet they are visually stunning (see below)
---

`Scatterplot`

```r
library(highcharter)
load("data/epa.RData")
epa %>%
  filter(year == 2019) %>%
  sample_n(100) -> epa2
hchart(epa2, "scatter", hcaes(x = city08, y = highway08, group = make)) -> hc1
frameWidget(hc1, width = 1000, height = 350)
```
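
Since `data/epa.RData` is a local file, here is a sketch of the same kind of grouped scatterplot built on the built-in `mtcars` data instead. The variables, axis titles, and the names `mtcars2` and `hc_demo` are chosen only for illustration and are not part of the EPA example.

```r
library(highcharter)

# Built-in data standing in for the EPA sample (an assumption for this sketch)
mtcars2 <- mtcars
mtcars2$cyl <- factor(mtcars2$cyl)  # grouping variable: number of cylinders

hchart(mtcars2, "scatter", hcaes(x = wt, y = mpg, group = cyl)) %>%
  hc_xAxis(title = list(text = "Weight (1000 lbs)")) %>%
  hc_yAxis(title = list(text = "Miles per gallon")) -> hc_demo
hc_demo
```

Each level of `cyl` becomes its own series, so clicking a legend entry should toggle that group on and off.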
---

`Line chart`

```r
load("data/unemprate.RData")
library(lubridate)
year(urate$yearmonth) -> urate$year
urate %>%
  group_by(educ_group, year) %>%
  summarise(avg.urate = mean(rate, na.rm = TRUE)) -> urate2
hchart(urate2, "line", hcaes(x = year, y = avg.urate, group = educ_group)) -> hc2
frameWidget(hc2, width = 1000, height = 375)
```
---

#### `Dressing up the highcharter plot`

```r
hchart(urate2, "line", hcaes(x = year, y = avg.urate, group = educ_group)) %>%
  hc_title(text = "<span style=\"color:#e88e88\"> Unemployment Rates by Educational Group</span>",
           useHTML = TRUE) %>%
  hc_tooltip(table = TRUE, sort = TRUE, digits = 2) %>%
  hc_add_theme(hc_theme_flatdark()) -> hc3
frameWidget(hc3, width = 1050, height = 400)
```
--- class: right, middle <img class="circle" src="https://github.com/aniruhil.png" width="175px"/> # Find me at... [<svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 512 512"><path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"/></svg> @aruhil](http://twitter.com/aruhil) [<svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 512 512"><path d="M326.612 185.391c59.747 59.809 58.927 155.698.36 214.59-.11.12-.24.25-.36.37l-67.2 67.2c-59.27 59.27-155.699 59.262-214.96 0-59.27-59.26-59.27-155.7 0-214.96l37.106-37.106c9.84-9.84 26.786-3.3 27.294 10.606.648 17.722 3.826 35.527 9.69 52.721 1.986 5.822.567 12.262-3.783 16.612l-13.087 13.087c-28.026 28.026-28.905 73.66-1.155 101.96 28.024 28.579 74.086 28.749 102.325.51l67.2-67.19c28.191-28.191 28.073-73.757 0-101.83-3.701-3.694-7.429-6.564-10.341-8.569a16.037 16.037 0 0 1-6.947-12.606c-.396-10.567 3.348-21.456 11.698-29.806l21.054-21.055c5.521-5.521 14.182-6.199 20.584-1.731a152.482 152.482 0 0 1 20.522 17.197zM467.547 44.449c-59.261-59.262-155.69-59.27-214.96 0l-67.2 67.2c-.12.12-.25.25-.36.37-58.566 58.892-59.387 154.781.36 214.59a152.454 152.454 0 0 0 20.521 17.196c6.402 4.468 15.064 3.789 20.584-1.731l21.054-21.055c8.35-8.35 12.094-19.239 
11.698-29.806a16.037 16.037 0 0 0-6.947-12.606c-2.912-2.005-6.64-4.875-10.341-8.569-28.073-28.073-28.191-73.639 0-101.83l67.2-67.19c28.239-28.239 74.3-28.069 102.325.51 27.75 28.3 26.872 73.934-1.155 101.96l-13.087 13.087c-4.35 4.35-5.769 10.79-3.783 16.612 5.864 17.194 9.042 34.999 9.69 52.721.509 13.906 17.454 20.446 27.294 10.606l37.106-37.106c59.271-59.259 59.271-155.699.001-214.959z"/></svg> aniruhil.org](https://aniruhil.org) [<svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 512 512"><path d="M476 3.2L12.5 270.6c-18.1 10.4-15.8 35.6 2.2 43.2L121 358.4l287.3-253.2c5.5-4.9 13.3 2.6 8.6 8.3L176 407v80.5c0 23.6 28.5 32.9 42.5 15.8L282 426l124.6 52.2c14.2 6 30.4-2.9 33-18.2l72-432C515 7.8 493.3-6.8 476 3.2z"/></svg> ruhil@ohio.edu](mailto:ruhil@ohio.edu)