Migration Flows and Tidycensus

Census Ohio

{tidycensus} now includes migration flows, and boy is this exciting! Tapping Census data is becoming easier by the month, literally. Being so Ohio- and Athens-focused, I figured it would be nice to see where Athens’ residents came from. It would be even more interesting to look at some of our other counties and metropolitan areas.

Ani Ruhil
04-03-2021

Let us start by looking at the get_flows() function that is now part and parcel of {tidycensus}. The most recent span for which 5-year data are available is 2014-2018. The default measures are MOVEDIN, MOVEDOUT, and MOVEDNET, but additional variables can be specified by name. The variable list is available here.

library(tidyverse)
library(tidycensus)

get_flows(
  geography = "county",
  state = "OH",
  msa = NULL,
  county = NULL,
  year = 2018,
  variables = c("MOVEDIN", "MOVEDOUT", "MOVEDNET"),
  breakdown = NULL,
  breakdown_labels = FALSE,
  moe_level = 90,
  geometry = FALSE
  ) -> ohdf

glimpse(ohdf)
Rows: 57,420
Columns: 7
$ GEOID1     <chr> "39001", "39001", "39001", "39001", "39001", "390…
$ GEOID2     <chr> NA, NA, NA, NA, NA, NA, "04013", "04013", "04013"…
$ FULL1_NAME <chr> "Adams County, Ohio", "Adams County, Ohio", "Adam…
$ FULL2_NAME <chr> "Asia", "Asia", "Asia", "Europe", "Europe", "Euro…
$ variable   <chr> "MOVEDIN", "MOVEDOUT", "MOVEDNET", "MOVEDIN", "MO…
$ estimate   <dbl> 16, NA, NA, 11, NA, NA, 0, 40, -40, 4, 0, 4, 6, 0…
$ moe        <dbl> 23, NA, NA, 18, NA, NA, 21, 61, 61, 8, 29, 8, 9, …
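
The result is in long format, with one row per county pair and measure. As a quick sanity check, here is a minimal sketch that pivots the three measures into columns and confirms that MOVEDNET is MOVEDIN minus MOVEDOUT. It uses a hand-typed toy tibble (the names `toy`, `wide`, and `net_check` are mine) holding the first few estimates from the glimpse() output above, rather than a fresh API call.

```r
library(dplyr)
library(tidyr)

# Toy data: estimates copied by hand from the glimpse() output above.
# GEOID2 is NA for the international origin (Asia); "04013" is a domestic county.
toy <- tibble(
  GEOID1   = "39001",
  GEOID2   = rep(c(NA, "04013"), each = 3),
  variable = rep(c("MOVEDIN", "MOVEDOUT", "MOVEDNET"), times = 2),
  estimate = c(16, NA, NA, 0, 40, -40)
)

# One row per origin-destination pair, one column per measure
wide <- toy %>%
  pivot_wider(
    id_cols     = c(GEOID1, GEOID2),
    names_from  = variable,
    values_from = estimate
  ) %>%
  mutate(net_check = MOVEDIN - MOVEDOUT)

wide
```

Note that the international row carries only MOVEDIN; MOVEDOUT and MOVEDNET are NA there, which matters for the questions asked further down.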

The breakdown option lets you disaggregate the flows by the usual demographic indicators. The available options vary by year (2015 seems to be the most recent year for which this level of detail is available) and are listed here. For example, maybe we are curious about race in the 2011-2015 flows.

get_flows(
  geography = "county",
  state = "OH",
  msa = NULL,
  county = NULL,
  year = 2015,
  variables = c("MOVEDIN", "MOVEDOUT", "MOVEDNET"),
  breakdown = c("RACE"),
  breakdown_labels = TRUE,
  moe_level = 90,
  geometry = FALSE
  ) -> ohdf2

glimpse(ohdf2)
Rows: 292,215
Columns: 9
$ GEOID1     <chr> "39001", "39001", "39001", "39001", "39001", "390…
$ GEOID2     <chr> "12031", "12031", "12031", "12071", "12071", "120…
$ FULL1_NAME <chr> "Adams County, Ohio", "Adams County, Ohio", "Adam…
$ FULL2_NAME <chr> "Duval County, Florida", "Duval County, Florida",…
$ RACE       <chr> "00", "00", "00", "00", "00", "00", "00", "00", "…
$ RACE_label <chr> "All races", "All races", "All races", "All races…
$ variable   <chr> "MOVEDIN", "MOVEDOUT", "MOVEDNET", "MOVEDIN", "MO…
$ estimate   <dbl> 0, 84, -84, 2, 6, -4, 0, 38, -38, 6, 0, 6, 0, 3, …
$ moe        <dbl> 20, 131, 131, 4, 11, 12, 20, 61, 61, 9, 31, 9, 20…

ohdf2 %>%
  count(RACE_label)
# A tibble: 5 x 2
  RACE_label                                n
  <chr>                                 <int>
1 All races                             58443
2 Asian alone                           58443
3 Black or African American alone       58443
4 Other race alone or Two or more races 58443
5 White alone                           58443
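
Because every county pair is repeated once per race category, including an "All races" row, it is easy to double count when aggregating. The sketch below shows one way to drop that row before computing group shares. The estimates are MADE UP purely for illustration (they are not Census figures), "Example County" is a placeholder, and I am assuming the "All races" row is the total of the other four.

```r
library(dplyr)

# Toy data with MADE-UP estimates (not Census figures); only the column
# names and RACE_label values mirror the get_flows() output above.
toy_race <- tibble(
  FULL2_NAME = "Example County",
  RACE_label = c(
    "All races", "Asian alone", "Black or African American alone",
    "Other race alone or Two or more races", "White alone"
  ),
  variable   = "MOVEDIN",
  estimate   = c(100, 10, 20, 5, 65)
)

# Drop the total row, then compute each group's share of in-movers
shares <- toy_race %>%
  filter(variable == "MOVEDIN", RACE_label != "All races") %>%
  group_by(FULL2_NAME) %>%
  mutate(share = estimate / sum(estimate)) %>%
  ungroup()

shares
```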

Let us ask some questions of the data. (1) Where did most international arrivals come from? (2) If we focus on domestic arrivals, which counties have sent us the most residents?

ohdf %>%
  filter(
    variable == "MOVEDIN",
    is.na(GEOID2),
    FULL1_NAME == "Athens County, Ohio"
    ) %>%
  arrange(desc(estimate)) %>%
  select(FULL2_NAME, estimate, moe)
# A tibble: 4 x 3
  FULL2_NAME      estimate   moe
  <chr>              <dbl> <dbl>
1 Asia                 243   109
2 Europe               104    57
3 Africa                27    24
4 Central America       24    25

ohdf %>%
  filter(
    variable == "MOVEDIN",
    !is.na(GEOID2),
    FULL1_NAME == "Athens County, Ohio"
    ) %>%
  arrange(desc(estimate)) %>%
  select(FULL2_NAME, estimate, moe)
# A tibble: 316 x 3
   FULL2_NAME              estimate   moe
   <chr>                      <dbl> <dbl>
 1 Franklin County, Ohio       1041   219
 2 Cuyahoga County, Ohio        502   106
 3 Hamilton County, Ohio        482   133
 4 Delaware County, Ohio        365   135
 5 Fairfield County, Ohio       354   126
 6 Montgomery County, Ohio      345   152
 7 Stark County, Ohio           340   121
 8 Meigs County, Ohio           333   184
 9 Summit County, Ohio          276   123
10 Washington County, Ohio      254   136
# … with 306 more rows
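
These are survey estimates, so the margins of error deserve a look before reading too much into the rankings. A rough sketch, using the international-arrivals numbers printed above (typed in by hand; the column names `lower`, `upper`, `se`, and `differs_from_zero` are my own), builds the 90% interval for each estimate and flags those that cannot be distinguished from zero. The 1.645 divisor is the usual conversion from a 90% margin of error to a standard error.

```r
library(dplyr)

# Estimates and 90% MOEs copied from the international-arrivals table above
intl <- tibble(
  FULL2_NAME = c("Asia", "Europe", "Africa", "Central America"),
  estimate   = c(243, 104, 27, 24),
  moe        = c(109, 57, 24, 25)
)

intl_checked <- intl %>%
  mutate(
    lower = estimate - moe,            # lower bound of the 90% interval
    upper = estimate + moe,            # upper bound of the 90% interval
    se    = moe / 1.645,               # 90% MOE = 1.645 * standard error
    differs_from_zero = lower > 0      # FALSE when the interval includes 0
  )

intl_checked
```

The Central America estimate (24, give or take 25) is the one whose interval includes zero, so it should be read with some caution.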

Have we had more folks arrive or depart? Unfortunately, we can only ask this of domestic arrivals and departures since we have no information on where those who departed overseas went.

ohdf %>%
  filter(
    variable == "MOVEDNET",
    !is.na(GEOID2),
    FULL1_NAME == "Athens County, Ohio"
    ) %>%
  arrange(desc(estimate)) %>%
  select(FULL2_NAME, estimate, moe)
# A tibble: 316 x 3
   FULL2_NAME              estimate   moe
   <chr>                      <dbl> <dbl>
 1 Franklin County, Ohio        613   236
 2 Cuyahoga County, Ohio        413   128
 3 Hamilton County, Ohio        365   146
 4 Delaware County, Ohio        340   136
 5 Stark County, Ohio           335   122
 6 Montgomery County, Ohio      306   157
 7 Summit County, Ohio          276   123
 8 Fairfield County, Ohio       219   152
 9 Licking County, Ohio         164    64
10 Mahoning County, Ohio        158    80
# … with 306 more rows
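
Sorting shows the biggest net contributors, but answering the arrive-or-depart question means adding up the net flows. Here is a minimal sketch of that sum, using only the top three rows printed above (so the total is illustrative, not Athens County's actual net figure). The margin of error for the sum is approximated as the square root of the summed squared MOEs; {tidycensus} offers moe_sum() for this, but I compute it by hand here so the snippet needs nothing beyond {dplyr}.

```r
library(dplyr)

# Top three rows of the MOVEDNET table above, typed in by hand
top3 <- tibble(
  FULL2_NAME = c(
    "Franklin County, Ohio", "Cuyahoga County, Ohio", "Hamilton County, Ohio"
  ),
  estimate   = c(613, 413, 365),
  moe        = c(236, 128, 146)
)

net_total <- top3 %>%
  summarise(
    net     = sum(estimate),       # combined net in-migration
    net_moe = sqrt(sum(moe^2))     # approximate MOE of the sum
  )

net_total
```

Run against the full data, the same summarise() call would go after the filter() on MOVEDNET and !is.na(GEOID2), with na.rm = TRUE added if any estimates are missing.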

Now the much-beloved map, this time with {mapdeck}.

get_flows(
  geography = "county",
  state = "OH",
  msa = NULL,
  county = NULL,
  year = 2018,
  variables = c("MOVEDIN"),
  breakdown = NULL,
  breakdown_labels = FALSE,
  moe_level = 90,
  geometry = TRUE
  ) -> ohdf

library(mapdeck)

ohdf %>% 
  filter(
    !is.na(GEOID2), 
    variable == "MOVEDIN",
    FULL1_NAME == "Athens County, Ohio"
    ) %>% 
  slice_max(
    n = 10, 
    order_by = estimate
    ) %>% 
  mutate(
    width = estimate / 200,
    tooltip = paste0(
      scales::comma(estimate * 5, 1),
      " people moved from ", str_remove(FULL2_NAME, " County"),
      " to ", str_remove(FULL1_NAME, " County"),
      " between 2014 and 2018"
      )
    ) -> dfin

dfin %>% 
  mapdeck(
    location = c(-82.102559, 39.330710), 
    zoom = 10,
    style = mapdeck_style("streets"),
    pitch = 45
    ) %>% 
  add_arc(
    origin = "centroid1",
    destination = "centroid2",
    stroke_width = "width",
    auto_highlight = TRUE,
    highlight_colour = "#FFFFFFFF",
    tooltip = "tooltip"
  )
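
One practical note: {mapdeck} draws on Mapbox basemaps, so the map above will need a Mapbox access token to render the "streets" style. A small sketch of one way to register it follows. The environment variable name MAPBOX_TOKEN is my assumption (store the token wherever you prefer), while set_token() is the {mapdeck} function for making the token available to later mapdeck() calls.

```r
# Assumption: the Mapbox token is stored in an environment variable
# named MAPBOX_TOKEN (for example, via a line in ~/.Renviron)
token <- Sys.getenv("MAPBOX_TOKEN")

# Register the token only if it is set and {mapdeck} is installed
if (nzchar(token) && requireNamespace("mapdeck", quietly = TRUE)) {
  mapdeck::set_token(token)
}
```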

What if we are curious about the net migration in the largest metropolitan areas?

get_flows(
  geography = "metropolitan statistical area",
  msa = c(
    10420, 15940, 17140, 17460, 18140,
    19380, 26580, 45780, 49660
    ),
  year = 2018,
  variables = c("MOVEDNET"),
  moe_level = 90,
  geometry = TRUE
  ) -> ohdf3

ohdf3 %>% 
  filter(
    !is.na(GEOID2), 
    variable == "MOVEDNET"
    ) %>% 
  mutate(
    # MOVEDNET can be negative (net outflow), so size arcs by magnitude
    width = abs(estimate) / 200,
    tooltip = paste0(
      scales::comma(estimate * 5, 1),
      " net movers from ", str_remove(FULL2_NAME, " Metro Area"),
      " to ", str_remove(FULL1_NAME, " Metro Area"),
      " between 2014 and 2018"
      )
    ) -> dfin

dfin %>% 
  mapdeck(
    location = c(-82.923617, 40.341628),  # mapdeck expects c(longitude, latitude)
    style = mapdeck_style("light"),
    pitch = 45,
    zoom = 6  # wide enough to take in the whole state
    ) %>% 
  add_arc(
    origin = "centroid1",
    destination = "centroid2",
    stroke_width = "width",
    auto_highlight = TRUE,
    highlight_colour = "#FFFFFFFF",
    tooltip = "tooltip"
  )

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Ruhil (2021, April 3). From an Attican Hollow ...: Migration Flows and Tidycensus. Retrieved from https://aniruhil.org/posts/2021-04-02-migration-flows-and-tidycensus/

BibTeX citation

@misc{ruhil2021migration,
  author = {Ruhil, Ani},
  title = {From an Attican Hollow ...: Migration Flows and Tidycensus},
  url = {https://aniruhil.org/posts/2021-04-02-migration-flows-and-tidycensus/},
  year = {2021}
}