Sunday, December 22, 2024

U.S. Families (Census Data)


The fundamental driver of human civilization is family. This post will consider U.S. Census data on the demographic and geographic distribution of marital status and children at the household level.

In a prior article, we looked at American Community Survey (ACS) data from the U.S. Census, establishing a baseline population for the United States split by age and sex. Marital status is reported in subject table S1201; the code below pulls the detailed table B12002 (sex by marital status by age), and for household child data I chose table B09002. My years of interest were every other (odd) year from 2011 to 2023, because that gives a glimpse of household family trends over a dozen years, certainly enough to give a flavor.
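As a rough illustration of how the table IDs above turn into the variable IDs used in the pull, the sketch below builds the odd-year sequence and two ID ranges. The helper name make_acs_ids() is hypothetical (it is not part of tidycensus), and the line ranges are taken from the variable vector listed further down.

```r
# Odd years from 2011 through 2023
years <- seq(2011, 2023, by = 2)

# Hypothetical helper: build ACS estimate IDs such as "B12002_004E"
# from a table ID and the line numbers within that table
make_acs_ids <- function(table_id, lines) {
  sprintf("%s_%03dE", table_id, lines)
}

# Male, never married: lines 4 through 17 of B12002 (15-17 years through 85+)
male_never_married <- make_acs_ids("B12002", 4:17)

# Married-couple families with own children: lines 3 through 7 of B09002
married_couple_children <- make_acs_ids("B09002", 3:7)

print(years)
print(head(male_never_married, 3))
print(married_couple_children)
```

Generating IDs this way is only a convenience; the explicit vector used in this post has the advantage that every ID sits next to a human-readable comment.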

# Load necessary libraries
library(tidycensus)
library(dplyr)
library(tidyr)

# Set parameters
# Set your Census API key
census_api_key("YOUR_CENSUS_API_KEY")

# Define the survey type (1-year ACS estimates)
survey <- "acs1"
# Define the years you want to pull data for
years <- c(2011, 2013, 2015, 2017, 2019, 2021, 2023)


# Define variables to pull from the Census API
variables <- c(

# B12002: sex by marital status by age (population 15 years and over)
  "B12002_004E", # male, never married, 15-17 years
  "B12002_005E", # male, never married, 18-19 years
  "B12002_006E", # male, never married, 20-24 years
  "B12002_007E", # male, never married, 25-29 years
  "B12002_008E", # male, never married, 30-34 years
  "B12002_009E", # male, never married, 35-39 years
  "B12002_010E", # male, never married, 40-44 years
  "B12002_011E", # male, never married, 45-49 years
  "B12002_012E", # male, never married, 50-54 years
  "B12002_013E", # male, never married, 55-59 years
  "B12002_014E", # male, never married, 60-64 years
  "B12002_015E", # male, never married, 65-74 years
  "B12002_016E", # male, never married, 75-84 years
  "B12002_017E", # male, never married, 85+ years
  "B12002_020E", # male, married, 15-17 years
  "B12002_021E", # male, married, 18-19 years
  "B12002_022E", # male, married, 20-24 years
  "B12002_023E", # male, married, 25-29 years
  "B12002_024E", # male, married, 30-34 years
  "B12002_025E", # male, married, 35-39 years
  "B12002_026E", # male, married, 40-44 years
  "B12002_027E", # male, married, 45-49 years
  "B12002_028E", # male, married, 50-54 years
  "B12002_029E", # male, married, 55-59 years
  "B12002_030E", # male, married, 60-64 years
  "B12002_031E", # male, married, 65-74 years
  "B12002_032E", # male, married, 75-84 years
  "B12002_033E", # male, married, 85+ years
  "B12002_036E", # male, separated, 15-17 years
  "B12002_037E", # male, separated, 18-19 years
  "B12002_038E", # male, separated, 20-24 years
  "B12002_039E", # male, separated, 25-29 years
  "B12002_040E", # male, separated, 30-34 years
  "B12002_041E", # male, separated, 35-39 years
  "B12002_042E", # male, separated, 40-44 years
  "B12002_043E", # male, separated, 45-49 years
  "B12002_044E", # male, separated, 50-54 years
  "B12002_045E", # male, separated, 55-59 years
  "B12002_046E", # male, separated, 60-64 years
  "B12002_047E", # male, separated, 65-74 years
  "B12002_048E", # male, separated, 75-84 years
  "B12002_049E", # male, separated, 85+ years
  "B12002_066E", # male, widowed, 15-17 years
  "B12002_067E", # male, widowed, 18-19 years
  "B12002_068E", # male, widowed, 20-24 years
  "B12002_069E", # male, widowed, 25-29 years
  "B12002_070E", # male, widowed, 30-34 years
  "B12002_071E", # male, widowed, 35-39 years
  "B12002_072E", # male, widowed, 40-44 years
  "B12002_073E", # male, widowed, 45-49 years
  "B12002_074E", # male, widowed, 50-54 years
  "B12002_075E", # male, widowed, 55-59 years
  "B12002_076E", # male, widowed, 60-64 years
  "B12002_077E", # male, widowed, 65-74 years
  "B12002_078E", # male, widowed, 75-84 years
  "B12002_079E", # male, widowed, 85+ years
  "B12002_081E", # male, divorced, 15-17 years
  "B12002_082E", # male, divorced, 18-19 years
  "B12002_083E", # male, divorced, 20-24 years
  "B12002_084E", # male, divorced, 25-29 years
  "B12002_085E", # male, divorced, 30-34 years
  "B12002_086E", # male, divorced, 35-39 years
  "B12002_087E", # male, divorced, 40-44 years
  "B12002_088E", # male, divorced, 45-49 years
  "B12002_089E", # male, divorced, 50-54 years
  "B12002_090E", # male, divorced, 55-59 years
  "B12002_091E", # male, divorced, 60-64 years
  "B12002_092E", # male, divorced, 65-74 years
  "B12002_093E", # male, divorced, 75-84 years
  "B12002_094E", # male, divorced, 85+ years
  "B12002_097E", # female, never married, 15-17 years
  "B12002_098E", # female, never married, 18-19 years
  "B12002_099E", # female, never married, 20-24 years
  "B12002_100E", # female, never married, 25-29 years
  "B12002_101E", # female, never married, 30-34 years
  "B12002_102E", # female, never married, 35-39 years
  "B12002_103E", # female, never married, 40-44 years
  "B12002_104E", # female, never married, 45-49 years
  "B12002_105E", # female, never married, 50-54 years
  "B12002_106E", # female, never married, 55-59 years
  "B12002_107E", # female, never married, 60-64 years
  "B12002_108E", # female, never married, 65-74 years
  "B12002_109E", # female, never married, 75-84 years
  "B12002_110E", # female, never married, 85+ years
  "B12002_113E", # female, married, 15-17 years
  "B12002_114E", # female, married, 18-19 years
  "B12002_115E", # female, married, 20-24 years
  "B12002_116E", # female, married, 25-29 years
  "B12002_117E", # female, married, 30-34 years
  "B12002_118E", # female, married, 35-39 years
  "B12002_119E", # female, married, 40-44 years
  "B12002_120E", # female, married, 45-49 years
  "B12002_121E", # female, married, 50-54 years
  "B12002_122E", # female, married, 55-59 years
  "B12002_123E", # female, married, 60-64 years
  "B12002_124E", # female, married, 65-74 years
  "B12002_125E", # female, married, 75-84 years
  "B12002_126E", # female, married, 85+ years
  "B12002_129E", # female, separated, 15-17 years
  "B12002_130E", # female, separated, 18-19 years
  "B12002_131E", # female, separated, 20-24 years
  "B12002_132E", # female, separated, 25-29 years
  "B12002_133E", # female, separated, 30-34 years
  "B12002_134E", # female, separated, 35-39 years
  "B12002_135E", # female, separated, 40-44 years
  "B12002_136E", # female, separated, 45-49 years
  "B12002_137E", # female, separated, 50-54 years
  "B12002_138E", # female, separated, 55-59 years
  "B12002_139E", # female, separated, 60-64 years
  "B12002_140E", # female, separated, 65-74 years
  "B12002_141E", # female, separated, 75-84 years
  "B12002_142E", # female, separated, 85+ years
  "B12002_159E", # female, widowed, 15-17 years
  "B12002_160E", # female, widowed, 18-19 years
  "B12002_161E", # female, widowed, 20-24 years
  "B12002_162E", # female, widowed, 25-29 years
  "B12002_163E", # female, widowed, 30-34 years
  "B12002_164E", # female, widowed, 35-39 years
  "B12002_165E", # female, widowed, 40-44 years
  "B12002_166E", # female, widowed, 45-49 years
  "B12002_167E", # female, widowed, 50-54 years
  "B12002_168E", # female, widowed, 55-59 years
  "B12002_169E", # female, widowed, 60-64 years
  "B12002_170E", # female, widowed, 65-74 years
  "B12002_171E", # female, widowed, 75-84 years
  "B12002_172E", # female, widowed, 85+ years
  "B12002_174E", # female, divorced, 15-17 years
  "B12002_175E", # female, divorced, 18-19 years
  "B12002_176E", # female, divorced, 20-24 years
  "B12002_177E", # female, divorced, 25-29 years
  "B12002_178E", # female, divorced, 30-34 years
  "B12002_179E", # female, divorced, 35-39 years
  "B12002_180E", # female, divorced, 40-44 years
  "B12002_181E", # female, divorced, 45-49 years
  "B12002_182E", # female, divorced, 50-54 years
  "B12002_183E", # female, divorced, 55-59 years
  "B12002_184E", # female, divorced, 60-64 years
  "B12002_185E", # female, divorced, 65-74 years
  "B12002_186E", # female, divorced, 75-84 years
  "B12002_187E", # female, divorced, 85+ years

# B09002: own children under 18 years by family type and age
  "B09002_003E", # married-couple, child(ren) 0-2 years
  "B09002_004E", # married-couple, child(ren) 3-4 years
  "B09002_005E", # married-couple, child(ren) 5 years
  "B09002_006E", # married-couple, child(ren) 6-11 years
  "B09002_007E", # married-couple, child(ren) 12-17 years
  "B09002_010E", # spdad, child(ren) 0-2 years
  "B09002_011E", # spdad, child(ren) 3-4 years
  "B09002_012E", # spdad, child(ren) 5 years
  "B09002_013E", # spdad, child(ren) 6-11 years
  "B09002_014E", # spdad, child(ren) 12-17 years
  "B09002_016E", # spmom, child(ren) 0-2 years
  "B09002_017E", # spmom, child(ren) 3-4 years
  "B09002_018E", # spmom, child(ren) 5 years
  "B09002_019E", # spmom, child(ren) 6-11 years
  "B09002_020E" # spmom, child(ren) 12-17 years
)

# Create a function to get the data
get_acs_data <- function(state, year, variables, survey) {
    # ifelse() cannot return NULL, so plain if/else picks the arguments
    get_acs(
        geography = if (state == "US") "us" else "state",
        variables = variables,
        survey = survey,
        state = if (state == "US") NULL else state, # If state is "US", set state to NULL
        year = year
    )
}

# Load fips_codes data
data(fips_codes, package = "tidycensus")

# Create the list of desired states for output (50 states plus DC)
states <- unique(fips_codes$state)[1:51]
states <- c(states, "US") # Add "US" to the list of states to get national totals

# Pull data for each state and year in a tryCatch block (to skip unavailable
# state-year combinations)
acs_data <- lapply(states, function(state) {
  lapply(years, function(year) {
    tryCatch({
      get_acs_data(state, year, variables, survey) %>%
        mutate(year = year) # Add year column to the data
    }, error = function(e) {
      message(paste("Skipping state", state, "year", year, "due to error:", e$message))
      data.frame()
    })
  }) %>%
  bind_rows()
}) %>%
bind_rows() # Combine the data into a single data frame
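
The skip-on-error pattern used in the pull can be tried without an API key by swapping in a stand-in fetcher. In the sketch below, fake_pull() is a hypothetical placeholder for get_acs_data() that fails for one state-year pair, and the estimates are made up.

```r
# Hypothetical stand-in for get_acs_data(): returns one toy row and fails
# for a single state-year combination to mimic an unavailable pull
fake_pull <- function(state, year) {
  if (state == "DC" && year == 2013) stop("no data for this combination")
  data.frame(state = state, year = year, estimate = 100)
}

# Every state-year combination, one per row
grid <- expand.grid(state = c("AL", "DC"), year = c(2011, 2013),
                    stringsAsFactors = FALSE)

# Pull each combination; return NULL on error so the failure drops out
pulls <- Map(function(state, year) {
  tryCatch(
    fake_pull(state, year),
    error = function(e) {
      message("Skipping state ", state, " year ", year, ": ", conditionMessage(e))
      NULL
    }
  )
}, grid$state, grid$year)

# Stack the successful pulls into one data frame
toy_acs <- do.call(rbind, unname(pulls))
print(toy_acs)
```

Flattening the state-year grid first means a single pass and a single combine step, at the cost of being slightly less explicit than nested loops.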

The nested lapply() calls above pull the raw data we require from the Census API. Now let's corral the data into two usable tibbles:

# Split the data into two dataframes: marital status, and marital status by presence
# and age of own children under 18 years
marital_status_data <- acs_data %>% filter(substr(variable, 1, 6) == "B12002")
marital_status_children_data <- acs_data %>% filter(substr(variable, 1, 6) == "B09002")

Then, I created a few variable maps to map variables to age group, sex, and marital status column values:

# Map sex to variable IDs for marital_status_data
sex_map <- list(
  Male = c("B12002_004", "B12002_005", "B12002_006",
           "B12002_007", "B12002_008", "B12002_009", "B12002_010",
           "B12002_011", "B12002_012", "B12002_013", "B12002_014",
           "B12002_015", "B12002_016", "B12002_017", "B12002_020",
           "B12002_021", "B12002_022", "B12002_023", "B12002_024",
           "B12002_025", "B12002_026", "B12002_027", "B12002_028",
           "B12002_029", "B12002_030", "B12002_031", "B12002_032",
           "B12002_033", "B12002_036", "B12002_037", "B12002_038",
           "B12002_039", "B12002_040", "B12002_041", "B12002_042",
           "B12002_043", "B12002_044", "B12002_045", "B12002_046",
           "B12002_047", "B12002_048", "B12002_049", "B12002_066",
           "B12002_067", "B12002_068", "B12002_069", "B12002_070",
           "B12002_071", "B12002_072", "B12002_073", "B12002_074",
           "B12002_075", "B12002_076", "B12002_077", "B12002_078",
           "B12002_079", "B12002_081", "B12002_082", "B12002_083",
           "B12002_084", "B12002_085", "B12002_086", "B12002_087",
           "B12002_088", "B12002_089", "B12002_090", "B12002_091",
           "B12002_092", "B12002_093", "B12002_094"),
  Female = c("B12002_097", "B12002_098", "B12002_099",
             "B12002_100", "B12002_101", "B12002_102", "B12002_103",
             "B12002_104", "B12002_105", "B12002_106", "B12002_107",
             "B12002_108", "B12002_109", "B12002_110", "B12002_113",
             "B12002_114", "B12002_115", "B12002_116", "B12002_117",
             "B12002_118", "B12002_119", "B12002_120", "B12002_121",
             "B12002_122", "B12002_123", "B12002_124", "B12002_125",
             "B12002_126", "B12002_129", "B12002_130", "B12002_131",
             "B12002_132", "B12002_133", "B12002_134", "B12002_135",
             "B12002_136", "B12002_137", "B12002_138", "B12002_139",
             "B12002_140", "B12002_141", "B12002_142", "B12002_159",
             "B12002_160", "B12002_161", "B12002_162", "B12002_163",
             "B12002_164", "B12002_165", "B12002_166", "B12002_167",
             "B12002_168", "B12002_169", "B12002_170", "B12002_171",
             "B12002_172", "B12002_174", "B12002_175", "B12002_176",
             "B12002_177", "B12002_178", "B12002_179", "B12002_180",
             "B12002_181", "B12002_182", "B12002_183", "B12002_184",
             "B12002_185", "B12002_186", "B12002_187")
)

# Define the marital status categories for marital_status_data
marital_map <- list(
  neverMarried = c("B12002_004", "B12002_005", "B12002_006", "B12002_007",
                   "B12002_008", "B12002_009", "B12002_010", "B12002_011",
                   "B12002_012", "B12002_013", "B12002_014", "B12002_015",
                   "B12002_016", "B12002_017", "B12002_097", "B12002_098",
                   "B12002_099", "B12002_100", "B12002_101", "B12002_102",
                   "B12002_103", "B12002_104", "B12002_105", "B12002_106",
                   "B12002_107", "B12002_108", "B12002_109", "B12002_110"),
  married = c("B12002_020", "B12002_021", "B12002_022", "B12002_023", "B12002_024",
              "B12002_025", "B12002_026", "B12002_027", "B12002_028", "B12002_029",
             "B12002_030", "B12002_031", "B12002_032", "B12002_033", "B12002_113",
             "B12002_114", "B12002_115", "B12002_116", "B12002_117", "B12002_118",
             "B12002_119", "B12002_120", "B12002_121", "B12002_122", "B12002_123",
             "B12002_124", "B12002_125", "B12002_126"),
  separated = c("B12002_036", "B12002_037", "B12002_038", "B12002_039", "B12002_040",
             "B12002_041", "B12002_042", "B12002_043", "B12002_044", "B12002_045",
                "B12002_046", "B12002_047", "B12002_048", "B12002_049", "B12002_129",
                "B12002_130", "B12002_131", "B12002_132", "B12002_133", "B12002_134",
                "B12002_135", "B12002_136", "B12002_137", "B12002_138", "B12002_139",
                "B12002_140", "B12002_141", "B12002_142"),
  widowed = c("B12002_066", "B12002_067", "B12002_068", "B12002_069", "B12002_070",
             "B12002_071", "B12002_072", "B12002_073", "B12002_074", "B12002_075",
             "B12002_076", "B12002_077", "B12002_078", "B12002_079", "B12002_159",
             "B12002_160", "B12002_161", "B12002_162", "B12002_163", "B12002_164",
             "B12002_165", "B12002_166", "B12002_167", "B12002_168", "B12002_169",
             "B12002_170", "B12002_171", "B12002_172"),
  divorced = c("B12002_081", "B12002_082", "B12002_083", "B12002_084", "B12002_085",
             "B12002_086", "B12002_087", "B12002_088", "B12002_089", "B12002_090",
             "B12002_091", "B12002_092", "B12002_093", "B12002_094", "B12002_174",
             "B12002_175", "B12002_176", "B12002_177", "B12002_178", "B12002_179",
             "B12002_180", "B12002_181", "B12002_182", "B12002_183", "B12002_184",
             "B12002_185", "B12002_186", "B12002_187")
)

# Map age groups to variable IDs for marital_status_data
age_map <- list(
  "15-17" = c("B12002_004", "B12002_020", "B12002_036", "B12002_066", "B12002_081",
             "B12002_097", "B12002_113", "B12002_129", "B12002_159", "B12002_174"),
  "18-19" = c("B12002_005", "B12002_021", "B12002_037", "B12002_067", "B12002_082",
             "B12002_098", "B12002_114", "B12002_130", "B12002_160", "B12002_175"),
  "20-24" = c("B12002_006", "B12002_022", "B12002_038", "B12002_068", "B12002_083",
             "B12002_099", "B12002_115", "B12002_131", "B12002_161", "B12002_176"),
  "25-29" = c("B12002_007", "B12002_023", "B12002_039", "B12002_069", "B12002_084",
             "B12002_100", "B12002_116", "B12002_132", "B12002_162", "B12002_177"),
  "30-34" = c("B12002_008", "B12002_024", "B12002_040", "B12002_070", "B12002_085",
             "B12002_101", "B12002_117", "B12002_133", "B12002_163", "B12002_178"),
  "35-39" = c("B12002_009", "B12002_025", "B12002_041", "B12002_071", "B12002_086",
             "B12002_102", "B12002_118", "B12002_134", "B12002_164", "B12002_179"),
  "40-44" = c("B12002_010", "B12002_026", "B12002_042", "B12002_072", "B12002_087",
             "B12002_103", "B12002_119", "B12002_135", "B12002_165", "B12002_180"),
  "45-49" = c("B12002_011", "B12002_027", "B12002_043", "B12002_073", "B12002_088",
             "B12002_104", "B12002_120", "B12002_136", "B12002_166", "B12002_181"),
  "50-54" = c("B12002_012", "B12002_028", "B12002_044", "B12002_074", "B12002_089",
             "B12002_105", "B12002_121", "B12002_137", "B12002_167", "B12002_182"),
  "55-59" = c("B12002_013", "B12002_029", "B12002_045", "B12002_075", "B12002_090",
             "B12002_106", "B12002_122", "B12002_138", "B12002_168", "B12002_183"),
  "60-64" = c("B12002_014", "B12002_030", "B12002_046", "B12002_076", "B12002_091",
             "B12002_107", "B12002_123", "B12002_139", "B12002_169", "B12002_184"),
  "65-74" = c("B12002_015", "B12002_031", "B12002_047", "B12002_077", "B12002_092",
             "B12002_108", "B12002_124", "B12002_140", "B12002_170", "B12002_185"),
  "75-84" = c("B12002_016", "B12002_032", "B12002_048", "B12002_078", "B12002_093",
             "B12002_109", "B12002_125", "B12002_141", "B12002_171", "B12002_186"),
  "85+" = c("B12002_017", "B12002_033", "B12002_049", "B12002_079", "B12002_094",
             "B12002_110", "B12002_126", "B12002_142", "B12002_172", "B12002_187")
)

child_age_map <- list(
  "0-2" = c("B09002_003", "B09002_010", "B09002_016"),
  "3-4" = c("B09002_004", "B09002_011", "B09002_017"),
  "5" = c("B09002_005", "B09002_012", "B09002_018"),
  "6-11" = c("B09002_006", "B09002_013", "B09002_019"),
  "12-17" = c("B09002_007", "B09002_014", "B09002_020")
)

child_parent_map <- list(
  "Married Couple" = c("B09002_003", "B09002_004", "B09002_005", "B09002_006",
                     "B09002_007"),
  "Single Parent (Dad)" = c("B09002_010", "B09002_011", "B09002_012", "B09002_013",
                     "B09002_014"),
  "Single Parent (Mom)" = c("B09002_016", "B09002_017", "B09002_018", "B09002_019",
                            "B09002_020")
)
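
Because each map runs from a label to a vector of variable IDs, one way to apply it is to flatten it into a two-column lookup table and join on variable. The sketch below uses small excerpts of the maps above; map_to_lookup() and the *_demo objects are hypothetical names, and the estimates are made up.

```r
library(dplyr)

# Small excerpts of the maps above, for illustration only
age_map_demo <- list(
  "15-17" = c("B12002_004", "B12002_020"),
  "18-19" = c("B12002_005", "B12002_021")
)
marital_map_demo <- list(
  "Never Married" = c("B12002_004", "B12002_005"),
  "Married"       = c("B12002_020", "B12002_021")
)

# Hypothetical helper: flatten a label -> variable IDs map into a lookup table
map_to_lookup <- function(map, value_name) {
  lookup <- data.frame(
    variable = unlist(map, use.names = FALSE),
    label = rep(names(map), lengths(map)),
    stringsAsFactors = FALSE
  )
  names(lookup)[2] <- value_name
  lookup
}

# Made-up rows shaped like the long-format output of the pull
toy <- data.frame(
  variable = c("B12002_004", "B12002_021", "B12002_020"),
  estimate = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Attach one label column per map
toy_labeled <- toy %>%
  left_join(map_to_lookup(age_map_demo, "age"), by = "variable") %>%
  left_join(map_to_lookup(marital_map_demo, "marital_status"), by = "variable")

print(toy_labeled)
```

A join leaves NA in the label column for any variable ID missing from a map, which makes gaps in the maps easy to spot.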
 
Now, we apply these variable maps to each tibble and drop the variable column.

# Add the 'sex' column to marital_status_data based on the variable values
marital_status_data <- marital_status_data %>%
  mutate(sex = case_when(
    variable %in% sex_map$Male ~ "M",
    variable %in% sex_map$Female ~ "F",
    TRUE ~ NA_character_
  ))

# Add the 'marital_status' column to marital_status_data based on the variable values
marital_status_data <- marital_status_data %>%
  mutate(marital_status = case_when(
    variable %in% marital_map$neverMarried ~ "Never Married",
    variable %in% marital_map$married ~ "Married",
    variable %in% marital_map$separated ~ "Separated",
    variable %in% marital_map$widowed ~ "Widowed",
    variable %in% marital_map$divorced ~ "Divorced",
    TRUE ~ NA_character_
  ))

# The remaining maps run from label to variable IDs, so invert each one into a
# named lookup vector (variable ID -> label) before indexing by variable
invert_map <- function(map) setNames(rep(names(map), lengths(map)), unlist(map))

# Add the 'age' column to marital_status_data based on the variable values
marital_status_data <- marital_status_data %>%
  mutate(age = unname(invert_map(age_map)[variable]))

# Add the 'child_age' column to marital_status_children_data based on the variable values
marital_status_children_data <- marital_status_children_data %>%
  mutate(child_age = unname(invert_map(child_age_map)[variable]))

# Add the 'child_parent' column to marital_status_children_data based on the variable values
marital_status_children_data <- marital_status_children_data %>%
  mutate(child_parent = unname(invert_map(child_parent_map)[variable]))

# Rename the GEOID column to st_fips and the NAME column to state
# (pipe from each tibble, not from acs_data, so the new columns are kept)
marital_status_data <- marital_status_data %>%
  rename(st_fips = GEOID, state = NAME)
marital_status_children_data <- marital_status_children_data %>%
  rename(st_fips = GEOID, state = NAME)

# Drop the variable column (and X, if present)
marital_status_data <- marital_status_data %>% select(-variable, -any_of("X"))
marital_status_children_data <- marital_status_children_data %>% select(-variable, -any_of("X"))

This gives us the following dataframes to work with:
    marital_status_data

    marital_status_children_data

Whereas marital_status_data gives us counts of people in each marital status for a given year, marital_status_children_data provides a glimpse into the distribution of children by age and family configuration (i.e., married couple or single-parent mom/dad). The cleanest overlap between these two dataframes is for married couples, such that we could look at Census-reported children by age for married households. Single-parent households for any given year could fall into any of the other marital status categories (i.e., never married, separated, widowed, or divorced).
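
As one way to read marital_status_data, estimates can be summed within each state and year and converted to shares by marital status. The sketch below runs on a made-up toy frame with the same column names; the numbers are placeholders, not Census values.

```r
library(dplyr)

# Made-up rows with the column names used in marital_status_data
toy_marital <- data.frame(
  state = "Alabama",
  year = 2023,
  marital_status = c("Never Married", "Married", "Divorced", "Married"),
  estimate = c(300, 500, 100, 100),
  stringsAsFactors = FALSE
)

# Sum across age and sex rows, then take each status as a share of the total
status_share <- toy_marital %>%
  group_by(state, year, marital_status) %>%
  summarise(people = sum(estimate), .groups = "drop") %>%
  group_by(state, year) %>%
  mutate(share = people / sum(people)) %>%
  ungroup()

print(status_share)
```

The same grouping applied to marital_status_children_data, with child_parent in place of marital_status, would give the share of children in each family configuration.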

To explore these data visually, I made a simple R Shiny app with two primary tabs: one to explore marital status data and a second to look at the distribution of children by age and year.
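
A two-tab layout along these lines could be sketched as follows. This is not the app described above, only a minimal skeleton on made-up placeholder data; the input and output IDs (year, marital_plot, children_plot) and the toy data frames are assumptions.

```r
library(shiny)

# Made-up placeholder data standing in for the two dataframes above
toy_marital <- data.frame(
  year = c(2021, 2021, 2023, 2023),
  marital_status = c("Married", "Never Married", "Married", "Never Married"),
  estimate = c(500, 300, 480, 330)
)
toy_children <- data.frame(
  year = c(2021, 2021, 2023, 2023),
  child_age = c("0-2", "3-4", "0-2", "3-4"),
  estimate = c(120, 90, 110, 95)
)

# One shared year selector above two tabs
ui <- fluidPage(
  titlePanel("U.S. Families (toy data)"),
  selectInput("year", "Year", choices = sort(unique(toy_marital$year))),
  tabsetPanel(
    tabPanel("Marital Status Data", plotOutput("marital_plot")),
    tabPanel("Children by Age", plotOutput("children_plot"))
  )
)

server <- function(input, output, session) {
  # Bar chart of people by marital status for the selected year
  output$marital_plot <- renderPlot({
    d <- toy_marital[toy_marital$year == as.numeric(input$year), ]
    barplot(d$estimate, names.arg = d$marital_status, ylab = "People")
  })
  # Bar chart of children by age group for the selected year
  output$children_plot <- renderPlot({
    d <- toy_children[toy_children$year == as.numeric(input$year), ]
    barplot(d$estimate, names.arg = d$child_age, ylab = "Children")
  })
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

Base graphics keep the skeleton free of extra dependencies; a real app would likely add state and sex filters.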

"This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau."



Jimmy Fisher


