ndi: Neighborhood Deprivation Indices

Date repository last updated: January 23, 2024

Overview

The ndi package is a suite of R functions to compute various metrics of socio-economic deprivation and disparity in the United States. Some metrics are considered “spatial” because they consider the values of neighboring (i.e., adjacent) census geographies in their computation, while other metrics are “aspatial” because they only consider the value within each census geography. Two types of aspatial NDI are available: (1) based on Messer et al. (2006) and (2) based on Andrews et al. (2020) and Slotman et al. (2022) who use variables chosen by Roux and Mair (2010). Both are a decomposition of various demographic characteristics from the U.S. Census Bureau American Community Survey 5-year estimates (ACS-5; 2006-2010 onward) pulled by the tidycensus package. Using data from the ACS-5 (2005-2009 onward), the ndi package can also compute the (1) spatial Racial Isolation Index (RI) based on Anthopolos et al. (2011), (2) spatial Educational Isolation Index (EI) based on Bravo et al. (2021), (3) aspatial Index of Concentration at the Extremes (ICE) based on Feldman et al. (2015) and Krieger et al. (2016), (4) aspatial racial/ethnic Dissimilarity Index (DI) based on Duncan & Duncan (1955), (5) aspatial income or racial/ethnic Atkinson Index (AI) based on Atkinson (1970), (6) aspatial racial/ethnic Isolation Index (II) based on Shevky & Williams (1949; ISBN-13:978-0-837-15637-8) and Bell (1954), (7) aspatial racial/ethnic Correlation Ratio (V) based on Bell (1954) and White (1986), (8) aspatial racial/ethnic Location Quotient (LQ) based on Merton (1938) and Sudano et al. (2013), and (9) aspatial racial/ethnic Local Exposure and Isolation (LEx/Is) metric based on Bemanian & Beyer (2017). Also using data from the ACS-5 (2005-2009 onward), the ndi package can retrieve the aspatial Gini Index based on Gini (1921).
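
To make the “spatial” versus “aspatial” distinction concrete, the following is a minimal sketch on made-up counts for three tracts in a row. It is not the package's implementation: the tract counts, the adjacency structure, and the weights (1 for an adjacent tract, 1.5 for the tract itself) are assumptions chosen only for illustration.

```r
# Toy example: three census tracts in a row (1 - 2 - 3), so tract 2 borders both others.
# All counts below are made up for illustration.
total    <- c(1000, 800, 1200)  # total population per tract
subgroup <- c(100, 400, 900)    # population of the subgroup of interest per tract

# Aspatial view: each tract's proportion uses only its own counts
aspatial <- subgroup / total

# Spatial view: pool each tract with its adjacent tracts through a weight matrix.
# ASSUMPTION: weight 1 for an adjacent tract and 1.5 for the tract itself.
w <- matrix(c(1.5, 1,   0,
              1,   1.5, 1,
              0,   1,   1.5),
            nrow = 3, byrow = TRUE)

# Weighted subgroup count divided by weighted total count in each neighborhood
spatial <- as.vector(w %*% subgroup) / as.vector(w %*% total)

round(data.frame(tract = 1:3, aspatial, spatial), 3)
```

In this toy setting the spatial values are pulled toward those of the adjacent tracts, which is the behavior the spatial metrics are designed to capture.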

Installation

To install the release version from CRAN:

install.packages("ndi")

To install the development version from GitHub:

devtools::install_github("idblr/ndi")
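
Before running the usage example, it may help to check which of the packages it calls are already available. The sketch below uses only base R; the package list is taken from the calls in the usage script and may not be exhaustive.

```r
# Packages called in the usage example of this README
pkgs <- c("ndi", "dplyr", "ggplot2", "sf", "tidycensus", "tigris")

# TRUE/FALSE for each package: is it installed and loadable?
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need to be installed
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
}
```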

Available functions

Function Description
anthopolos Compute the spatial Racial Isolation Index (RI) based on Anthopolos et al. (2011)
atkinson Compute the aspatial Atkinson Index (AI) based on Atkinson (1970)
bell Compute the aspatial racial/ethnic Isolation Index (II) based on Shevky & Williams (1949; ISBN-13:978-0-837-15637-8) and Bell (1954)
bemanian_beyer Compute the aspatial racial/ethnic Local Exposure and Isolation (LEx/Is) metric based on Bemanian & Beyer (2017)
bravo Compute the spatial Educational Isolation Index (EI) based on Bravo et al. (2021)
duncan Compute the aspatial racial/ethnic Dissimilarity Index (DI) based on Duncan & Duncan (1955)
gini Retrieve the aspatial Gini Index based on Gini (1921)
krieger Compute the aspatial Index of Concentration at the Extremes (ICE) based on Feldman et al. (2015) and Krieger et al. (2016)
messer Compute the aspatial Neighborhood Deprivation Index (NDI) based on Messer et al. (2006)
powell_wiley Compute the aspatial Neighborhood Deprivation Index (NDI) based on Andrews et al. (2020) and Slotman et al. (2022) with variables chosen by Roux and Mair (2010)
sudano Compute the aspatial racial/ethnic Location Quotient (LQ) based on Merton (1938) and Sudano et al. (2013)
white Compute the aspatial racial/ethnic Correlation Ratio (V) based on Bell (1954) and White (1986)

The repository also includes the code to create the project hexagon sticker.

Available sample dataset

Data Description
DCtracts2020 A sample data set containing information about U.S. Census American Community Survey 5-year estimate data for the District of Columbia census tracts (2020). The data are obtained from the tidycensus package and formatted for the messer() and powell_wiley() functions input.

Author

See also the list of contributors who participated in this package, including:

Thank you to those who suggested additional metrics, including:

Getting Started

Usage

# ------------------ #
# Necessary packages #
# ------------------ #

library(ndi)
library(ggplot2)
library(sf) # a dependency for the "ndi" package
library(tidycensus) # a dependency for the "ndi" package
library(tigris)

# -------- #
# Settings #
# -------- #

## Access Key for census data download
### Obtain one at http://api.census.gov/data/key_signup.html
tidycensus::census_api_key("...") # INSERT YOUR OWN KEY FROM U.S. CENSUS API

# ---------------------- #
# Calculate NDI (Messer) #
# ---------------------- #

# Compute the NDI (Messer) values (2016-2020 5-year ACS) for Washington, D.C. census tracts
messer2020DC <- messer(state = "DC", year = 2020)

# ------------------------------ #
# Outputs from messer() function #
# ------------------------------ #

# A tibble containing the identification, geographic name, NDI (Messer) values, NDI (Messer) quartiles, and raw census characteristics for each tract
messer2020DC$ndi

# The results from the principal component analysis used to compute the NDI (Messer) values
messer2020DC$pca

# A tibble containing a breakdown of the missingness of the census characteristics used to compute the NDI (Messer) values
messer2020DC$missing

# -------------------------------------- #
# Visualize the messer() function output #
# -------------------------------------- #

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the NDI (Messer) values to the census tract geometry
DC2020messer <- dplyr::left_join(tract2020DC, messer2020DC$ndi, by = "GEOID")

# Visualize the NDI (Messer) values (2016-2020 5-year ACS) for Washington, D.C. census tracts

## Continuous Index
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = DC2020messer, 
                   ggplot2::aes(fill = NDI),
                   color = "white") +
  ggplot2::theme_bw() +
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Neighborhood Deprivation Index\nContinuous (Messer, non-imputed)",
                   subtitle = "Washington, D.C. tracts as the referent")

## Categorical Index (Quartiles)
### Rename "9-NDI not avail" level as NA for plotting
DC2020messer$NDIQuartNA <- factor(replace(as.character(DC2020messer$NDIQuart),
                                          DC2020messer$NDIQuart == "9-NDI not avail",
                                          NA),
                                  c(levels(DC2020messer$NDIQuart)[-5], NA))

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = DC2020messer, 
                   ggplot2::aes(fill = NDIQuartNA),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_d(guide = ggplot2::guide_legend(reverse = TRUE),
                                na.value = "grey50") +
  ggplot2::labs(fill = "Index (Categorical)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates") +
  ggplot2::ggtitle("Neighborhood Deprivation Index\nQuartiles (Messer, non-imputed)",
                   subtitle = "Washington, D.C. tracts as the referent")

# ---------------------------- #
# Calculate NDI (Powell-Wiley) #
# ---------------------------- #

# Compute the NDI (Powell-Wiley) values (2016-2020 5-year ACS) for Washington, D.C. census tracts
powell_wiley2020DC <- powell_wiley(state = "DC", year = 2020)
powell_wiley2020DCi <- powell_wiley(state = "DC", year = 2020, imp = TRUE) # impute missing values

# ------------------------------------ #
# Outputs from powell_wiley() function #
# ------------------------------------ #

# A tibble containing the identification, geographic name, NDI (Powell-Wiley) value, and raw census characteristics for each tract
powell_wiley2020DC$ndi

# The results from the principal component analysis used to compute the NDI (Powell-Wiley) values
powell_wiley2020DC$pca

# A tibble containing a breakdown of the missingness of the census characteristics used to compute the NDI (Powell-Wiley) values
powell_wiley2020DC$missing

# -------------------------------------------- #
# Visualize the powell_wiley() function output #
# -------------------------------------------- #

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the NDI (Powell-Wiley) values to the census tract geometry
DC2020powell_wiley <- dplyr::left_join(tract2020DC, powell_wiley2020DC$ndi, by = "GEOID")
DC2020powell_wiley <- dplyr::left_join(DC2020powell_wiley, powell_wiley2020DCi$ndi, by = "GEOID")

# Visualize the NDI (Powell-Wiley) values (2016-2020 5-year ACS) for Washington, D.C. census tracts

## Non-imputed missing tracts (Continuous)
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = DC2020powell_wiley, 
                   ggplot2::aes(fill = NDI.x),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Neighborhood Deprivation Index\nContinuous (Powell-Wiley, non-imputed)",
                   subtitle = "Washington, D.C. tracts as the referent")

## Non-imputed missing tracts (Categorical quintiles)
### Rename "9-NDI not avail" level as NA for plotting
DC2020powell_wiley$NDIQuintNA.x <- factor(replace(as.character(DC2020powell_wiley$NDIQuint.x),
                                                  DC2020powell_wiley$NDIQuint.x == "9-NDI not avail",
                                                  NA),
                                          c(levels(DC2020powell_wiley$NDIQuint.x)[-6], NA))

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = DC2020powell_wiley, 
                   ggplot2::aes(fill = NDIQuintNA.x),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_d(guide = ggplot2::guide_legend(reverse = TRUE),
                                na.value = "grey50") +
  ggplot2::labs(fill = "Index (Categorical)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Neighborhood Deprivation Index\nPopulation-weighted Quintiles (Powell-Wiley, non-imputed)",
                   subtitle = "Washington, D.C. tracts as the referent")

## Imputed missing tracts (Continuous)
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = DC2020powell_wiley, 
                   ggplot2::aes(fill = NDI.y),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Neighborhood Deprivation Index\nContinuous (Powell-Wiley, imputed)",
                   subtitle = "Washington, D.C. tracts as the referent")

## Imputed missing tracts (Categorical quintiles)
### Rename "9-NDI not avail" level as NA for plotting
DC2020powell_wiley$NDIQuintNA.y <- factor(replace(as.character(DC2020powell_wiley$NDIQuint.y), 
                                                  DC2020powell_wiley$NDIQuint.y == "9-NDI not avail",
                                                  NA), 
                                          c(levels(DC2020powell_wiley$NDIQuint.y)[-6], NA))

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = DC2020powell_wiley, 
                   ggplot2::aes(fill = NDIQuintNA.y),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_d(guide = ggplot2::guide_legend(reverse = TRUE),
                                na.value = "grey50") +
  ggplot2::labs(fill = "Index (Categorical)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Neighborhood Deprivation Index\nPopulation-weighted Quintiles (Powell-Wiley, imputed)",
                   subtitle = "Washington, D.C. tracts as the referent")

# --------------------------- #
# Compare the two NDI metrics #
# --------------------------- #

# Merge the two NDI metrics (Messer and Powell-Wiley, imputed)
ndi2020DC <- dplyr::left_join(messer2020DC$ndi, powell_wiley2020DCi$ndi, by = "GEOID", suffix = c(".messer", ".powell_wiley"))

# Check the correlation of the two NDI metrics (Messer and Powell-Wiley, imputed) as continuous values
cor(ndi2020DC$NDI.messer, ndi2020DC$NDI.powell_wiley, use = "complete.obs") # Pearson's r = 0.975

# Check the similarity of the two NDI metrics as categories (Messer quartiles vs. Powell-Wiley quintiles, imputed)
table(ndi2020DC$NDIQuart, ndi2020DC$NDIQuint)

# ---------------------------- #
# Retrieve aspatial Gini Index #
# ---------------------------- #

# Gini Index based on Gini (1921) from the ACS-5
gini2020DC <- gini(state = "DC", year = 2020)

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the Gini Index values to the census tract geometry
gini2020DC <- dplyr::left_join(tract2020DC, gini2020DC$gini, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = gini2020DC, 
                   ggplot2::aes(fill = gini),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Gini Index\nGrey color denotes no data",
                   subtitle = "Washington, D.C. tracts")

# --------------------------------------------------- #
# Compute spatial Racial Isolation Index (Anthopolos) #
# --------------------------------------------------- #

# Racial Isolation Index based on Anthopolos et al. (2011)
## Selected subgroup: Not Hispanic or Latino, Black or African American alone
ri2020DC <- anthopolos(state = "DC", year = 2020, subgroup = "NHoLB")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the RI (Anthopolos) values to the census tract geometry
ri2020DC <- dplyr::left_join(tract2020DC, ri2020DC$ri, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ri2020DC, 
                   ggplot2::aes(fill = RI),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Racial Isolation Index\nNot Hispanic or Latino, Black or African American alone (Anthopolos)",
                   subtitle = "Washington, D.C. tracts (not corrected for edge effects)")

# --------------------------------------------------- #
# Compute spatial Educational Isolation Index (Bravo) #
# --------------------------------------------------- #

# Educational Isolation Index based on Bravo et al. (2021)
## Selected subgroup: without four-year college degree
ei2020DC <- bravo(state = "DC", year = 2020, subgroup = c("LtHS", "HSGiE", "SCoAD"))

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the EI (Bravo) values to the census tract geometry
ei2020DC <- dplyr::left_join(tract2020DC, ei2020DC$ei, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ei2020DC, 
                   ggplot2::aes(fill = EI),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Educational Isolation Index\nWithout a four-year college degree (Bravo)",
                   subtitle = "Washington, D.C. tracts (not corrected for edge effects)")

# ----------------------------------------------------------------- #
# Compute aspatial Index of Concentration at the Extremes (Krieger) #
# ----------------------------------------------------------------- #

# Five Indices of Concentration at the Extremes based on Feldman et al. (2015) and Krieger et al. (2016)

ice2020DC <- krieger(state = "DC", year = 2020)

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the ICEs (Krieger) values to the census tract geometry
ice2020DC <- dplyr::left_join(tract2020DC, ice2020DC$ice, by = "GEOID")

# Plot ICE for Income
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ice2020DC, 
                   ggplot2::aes(fill = ICE_inc),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_gradient2(low = "#998ec3", mid = "#f7f7f7", high = "#f1a340", limits = c(-1,1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Index of Concentration at the Extremes\nIncome (Krieger)",
                   subtitle = "80th income percentile vs. 20th income percentile")

# Plot ICE for Education
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ice2020DC, 
                   ggplot2::aes(fill = ICE_edu),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_gradient2(low = "#998ec3", mid = "#f7f7f7", high = "#f1a340", limits = c(-1,1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Index of Concentration at the Extremes\nEducation (Krieger)",
                   subtitle = "less than high school vs. four-year college degree or more")

# Plot ICE for Race/Ethnicity
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ice2020DC, 
                   ggplot2::aes(fill = ICE_rewb),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_gradient2(low = "#998ec3", mid = "#f7f7f7", high = "#f1a340", limits = c(-1, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Index of Concentration at the Extremes\nRace/Ethnicity (Krieger)",
                   subtitle = "white non-Hispanic vs. black non-Hispanic")

# Plot ICE for Income and Race/Ethnicity Combined
## white non-Hispanic in 80th income percentile vs. black (including Hispanic) in 20th income percentile
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ice2020DC, 
                   ggplot2::aes(fill = ICE_wbinc),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_gradient2(low = "#998ec3", mid = "#f7f7f7", high = "#f1a340", limits = c(-1, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Index of Concentration at the Extremes\nIncome and race/ethnicity combined (Krieger)",
                   subtitle = "white non-Hispanic in 80th income percentile vs. black (incl. Hispanic) in 20th inc. percentile")

# Plot ICE for Income and Race/Ethnicity Combined
## white non-Hispanic in 80th income percentile vs. white non-Hispanic in 20th income percentile
ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ice2020DC, 
                   ggplot2::aes(fill = ICE_wpcinc),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_gradient2(low = "#998ec3", mid = "#f7f7f7", high = "#f1a340", limits = c(-1, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Index of Concentration at the Extremes\nIncome and race/ethnicity combined (Krieger)",
                   subtitle = "white non-Hispanic in 80th income percentile vs. white non-Hispanic in 20th income percentile")

# -------------------------------------------------------------------- #
# Compute aspatial racial/ethnic Dissimilarity Index (Duncan & Duncan) #
# -------------------------------------------------------------------- #

# Dissimilarity Index based on Duncan & Duncan (1955)
## Selected subgroup comparison: Not Hispanic or Latino, Black or African American alone
## Selected subgroup reference: Not Hispanic or Latino, white alone
## Selected large geography: census tract
## Selected small geography: census block group
di2020DC <- duncan(geo_large = "tract", geo_small = "block group", state = "DC",
                   year = 2020, subgroup = "NHoLB", subgroup_ref = "NHoLW")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the DI (Duncan & Duncan) values to the census tract geometry
di2020DC <- dplyr::left_join(tract2020DC, di2020DC$di, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = di2020DC, 
                   ggplot2::aes(fill = DI),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c(limits = c(0, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates")+
  ggplot2::ggtitle("Dissimilarity Index (Duncan & Duncan)\nWashington, D.C. census block groups to tracts",
                   subtitle = "Black non-Hispanic vs. white non-Hispanic")

# -------------------------------------------------------- #
# Compute aspatial racial/ethnic Atkinson Index (Atkinson) #
# -------------------------------------------------------- #

# Atkinson Index based on Atkinson (1970)
## Selected subgroup: Not Hispanic or Latino, Black or African American alone
## Selected large geography: census tract
## Selected small geography: census block group
## Default epsilon (0.5 or over- and under-representation contribute equally)
ai2020DC <- atkinson(geo_large = "tract", geo_small = "block group", state = "DC",
                     year = 2020, subgroup = "NHoLB")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the AI (Atkinson) values to the census tract geometry
ai2020DC <- dplyr::left_join(tract2020DC, ai2020DC$ai, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ai2020DC, 
                   ggplot2::aes(fill = AI),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c(limits = c(0, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates") +
  ggplot2::ggtitle("Atkinson Index (Atkinson)\nWashington, D.C. census block groups to tracts",
                   subtitle = expression(paste("Black non-Hispanic (", epsilon, " = 0.5)")))

# ----------------------------------------------------- #
# Compute aspatial racial/ethnic Isolation Index (Bell) #
# ----------------------------------------------------- #

# Isolation Index based on Shevky & Williams (1949; ISBN-13:978-0-837-15637-8) and Bell (1954)
## Selected subgroup: Not Hispanic or Latino, Black or African American alone
## Selected interaction subgroup: Not Hispanic or Latino, white alone
## Selected large geography: census tract
## Selected small geography: census block group
ii2020DC <- bell(geo_large = "tract", geo_small = "block group", state = "DC",
                 year = 2020, subgroup = "NHoLB", subgroup_ixn = "NHoLW")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the II (Bell) values to the census tract geometry
ii2020DC <- dplyr::left_join(tract2020DC, ii2020DC$ii, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = ii2020DC, 
                   ggplot2::aes(fill = II),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c(limits = c(0, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates") +
  ggplot2::ggtitle("Isolation Index (Bell)\nWashington, D.C. census block groups to tracts",
                   subtitle = "Black non-Hispanic vs. white non-Hispanic")

# -------------------------------------------------------- #
# Compute aspatial racial/ethnic Correlation Ratio (White) #
# -------------------------------------------------------- #

# Correlation Ratio based on Bell (1954) and White (1986)
## Selected subgroup: Not Hispanic or Latino, Black or African American alone
## Selected large geography: census tract
## Selected small geography: census block group
v2020DC <- white(geo_large = "tract", geo_small = "block group", state = "DC",
                 year = 2020, subgroup = "NHoLB")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the V (White) values to the census tract geometry
v2020DC <- dplyr::left_join(tract2020DC, v2020DC$v, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = v2020DC, 
                   ggplot2::aes(fill = V),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c(limits = c(0, 1)) +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates") +
  ggplot2::ggtitle("Correlation Ratio (White)\nWashington, D.C. census block groups to tracts",
                   subtitle = "Black non-Hispanic")

# --------------------------------------------------------- #
# Compute aspatial racial/ethnic Location Quotient (Sudano) #
# --------------------------------------------------------- #

# Location Quotient based on Merton (1938) and Sudano et al. (2013)
## Selected subgroup: Not Hispanic or Latino, Black or African American alone
## Selected large geography: state
## Selected small geography: census tract
lq2020DC <- sudano(geo_large = "state", geo_small = "tract", state = "DC",
                   year = 2020, subgroup = "NHoLB")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the LQ (Sudano) values to the census tract geometry
lq2020DC <- dplyr::left_join(tract2020DC, lq2020DC$lq, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = lq2020DC, 
                   ggplot2::aes(fill = LQ),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates") +
  ggplot2::ggtitle('Location Quotient (Sudano)\nWashington, D.C. census tracts vs. "state"',
                   subtitle = "Black non-Hispanic")

# ------------------------------------------------------------------------------------- #
# Compute aspatial racial/ethnic Local Exposure and Isolation (Bemanian & Beyer) metric #
# ------------------------------------------------------------------------------------- #

# Local Exposure and Isolation metric based on Bemanian & Beyer (2017)
## Selected subgroup: Not Hispanic or Latino, Black or African American alone
## Selected interaction subgroup: Not Hispanic or Latino, white alone
## Selected large geography: state
## Selected small geography: census tract
lexis2020DC <- bemanian_beyer(geo_large = "state", geo_small = "tract", state = "DC",
                              year = 2020, subgroup = "NHoLB", subgroup_ixn = "NHoLW")

# Obtain the 2020 census tracts from the "tigris" package
tract2020DC <- tigris::tracts(state = "DC", year = 2020, cb = TRUE)

# Join the LEx/Is (Bemanian & Beyer) values to the census tract geometry
lexis2020DC <- dplyr::left_join(tract2020DC, lexis2020DC$lexis, by = "GEOID")

ggplot2::ggplot() + 
  ggplot2::geom_sf(data = lexis2020DC, 
                   ggplot2::aes(fill = LExIs),
                   color = "white") +
  ggplot2::theme_bw() + 
  ggplot2::scale_fill_viridis_c() +
  ggplot2::labs(fill = "Index (Continuous)",
                caption = "Source: U.S. Census ACS 2016-2020 estimates") +
  ggplot2::ggtitle('Local Exposure and Isolation (Bemanian & Beyer) metric\nWashington, D.C. census tracts vs. "state"',
                   subtitle = "Black non-Hispanic vs. white non-Hispanic")
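
For readers who want to see what some of the aspatial metrics in the usage script measure, here is a minimal base R sketch of the textbook formulas for the Dissimilarity Index, the Location Quotient, and the Index of Concentration at the Extremes. The counts are made up, and the sketch does not reproduce how the package retrieves data, handles missing values, or rounds output.

```r
# Made-up counts for four small areas nested in one larger area
x <- c(50, 200, 10, 140)    # subgroup of interest (e.g., "NHoLB")
y <- c(300, 100, 390, 10)   # reference subgroup (e.g., "NHoLW")
n <- c(400, 350, 420, 200)  # total population of each small area

# Dissimilarity Index: half the summed absolute difference between each
# small area's share of the two subgroups; one value for the larger area
DI <- 0.5 * sum(abs(x / sum(x) - y / sum(y)))

# Location Quotient: subgroup proportion in each small area relative to
# the subgroup proportion in the larger area; one value per small area
LQ <- (x / n) / (sum(x) / sum(n))

# Index of Concentration at the Extremes: (one extreme - other extreme) / total,
# here reference subgroup minus subgroup of interest; ranges from -1 to 1
ICE <- (y - x) / n

DI
round(LQ, 3)
round(ICE, 3)
```

A Location Quotient above 1 indicates that the subgroup is over-represented in a small area relative to the larger area, and a value below 1 indicates under-representation.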

Funding

This package was originally developed while the author was a postdoctoral fellow supported by the Cancer Prevention Fellowship Program at the National Cancer Institute. Any modifications since December 05, 2022 were made while the author was an employee of Social & Scientific Systems, Inc., a division of DLH Corporation.

Acknowledgments

The messer() function functionalizes the code found in Hruska et al. (2022), available on an OSF repository, but with percent with income less than $30K added to the computation based on Messer et al. (2006). The messer() function also allows for the computation of NDI (Messer) for each year from 2010 through 2020 (the years for which the U.S. census characteristics are available to date). There was no code companion to compute NDI (Powell-Wiley) included in Andrews et al. (2020) or Slotman et al. (2022), but the package author worked directly with the latter manuscript authors to replicate their SAS code in R for the powell_wiley() function. Please note: the NDI (Powell-Wiley) values will not exactly match (but will highly correlate with) those found in Andrews et al. (2020) and Slotman et al. (2022) because the two studies used different statistical platforms (i.e., SPSS and SAS, respectively) that calculate the principal component analysis differently from R. The internal function to calculate the Atkinson Index is based on the Atkinson() function in the DescTools package.
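
As a rough illustration of how a deprivation index can be derived from a principal component analysis, the following sketch builds a continuous index and quartiles from synthetic data with stats::prcomp(). It is a generic sketch, not the code of messer() or powell_wiley(): the variable names, the synthetic data, and the choice to orient and standardize the first component are all assumptions made for the example.

```r
set.seed(42)  # reproducible synthetic data

# Synthetic stand-in for tract-level census characteristics (NOT real ACS data)
n_tracts <- 200
latent <- rnorm(n_tracts)  # unobserved "deprivation" driving all indicators
chars <- data.frame(
  pct_poverty    = latent + rnorm(n_tracts, sd = 0.5),
  pct_unemployed = latent + rnorm(n_tracts, sd = 0.5),
  pct_no_hs      = latent + rnorm(n_tracts, sd = 0.5),
  pct_crowded    = latent + rnorm(n_tracts, sd = 0.5)
)

# Principal component analysis on centered, unit-variance variables
pca <- stats::prcomp(chars, center = TRUE, scale. = TRUE)

# First principal component score as a continuous index
index <- pca$x[, 1]

# The sign of a principal component is arbitrary;
# orient it here so that higher values mean more deprivation
if (pca$rotation["pct_poverty", 1] < 0) index <- -index

# Standardize, then cut into quartiles (1 = least deprived, 4 = most deprived)
index <- as.vector(scale(index))
quartile <- cut(index,
                breaks = quantile(index, probs = seq(0, 1, 0.25)),
                labels = 1:4, include.lowest = TRUE)

table(quartile)
```

Because the sign and scaling of component scores are conventions rather than properties of the data, indices computed on different platforms are best compared by their correlation rather than by exact values.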

When citing this package for publication, please follow:

citation("ndi")

Questions? Feedback?

For questions about the package, please contact the maintainer Dr. Ian D. Buller or submit a new issue. Confirmation of the computation, feedback, and feature collaboration are welcome, especially from the authors of the references cited above.