California cities from CDTFA

Judging by the date listed on https://lab.data.ca.gov/dataset/california-city-boundaries-and-identifiers, which still says “03/24/25”, I’m guessing the city boundaries there haven’t been updated yet.

For a current project, I want the best boundaries available, so let’s work directly with the source pointed out by the CIO, GIS: https://gis.data.ca.gov/datasets/CDTFA::city-and-county-boundary-line-changes/explore?layer=0&location=33.924386%2C-118.008294%2C11.67

> library(sf)
> d <- read_sf("~/Downloads/City_and_County_Boundary_Line_Changes_217120214122415134.gpkg")

It turns out to have quite a few more rows than the 483 incorporated cities in California, but maybe just one more “thing” when counting unique names:

> nrow(d)
[1] 1512
> length(unique(d$CITY))
[1] 484

Let’s get a list of those cities in a convenient form (see https://github.com/fadend/ca_cities_data):

> cities <- read.csv(url("https://raw.githubusercontent.com/fadend/ca_cities_data/refs/heads/main/ca_cities.csv"), stringsAsFactors=FALSE)

> setdiff(d$CITY, cities$city_name)
[1] "Unincorporated" "Angels" "California" "Industry"

Agh. I’d forgotten about the inconsistencies around the naming for Angels Camp, California City, and City of Industry. So, we need to get rid of “Unincorporated” and rename those others.

mapping <- list(`Angels`="Angels Camp", `California`="California City", `Industry`="City of Industry")

# Do you have a nicer way to do this string replacement?
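replaceStrings <- function(x, mapping) {
  # Look up each element of x by name; unmatched elements come back with an NA name.
  replacement <- mapping[x]
  matched <- !is.na(names(replacement))
  # Assign each matched element its own replacement, not the whole mapping recycled.
  x[matched] <- unlist(replacement[matched], use.names=FALSE)
  return(x)
}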
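
One possibly nicer way to do the string replacement above, as a base R sketch: keep the old-to-new names in a named character vector and look them up with match(). The names renameWithLookup and lookup are hypothetical, made up for this illustration, and "Fresno" just stands in for any city that needs no renaming.

```r
# Hypothetical helper: `lookup` is a named character vector (old name -> new name).
renameWithLookup <- function(x, lookup) {
  hit <- match(x, names(lookup))  # NA wherever x has no entry in lookup
  # Keep the original value where there is no entry, else take the replacement.
  ifelse(is.na(hit), x, unname(lookup[hit]))
}

lookup <- c(Angels = "Angels Camp",
            California = "California City",
            Industry = "City of Industry")

renamed <- renameWithLookup(c("Angels", "Fresno", "Industry", "Angels"), lookup)
print(renamed)
# [1] "Angels Camp"      "Fresno"           "City of Industry" "Angels Camp"
```

Because every element is looked up individually, repeated names (and the CDTFA data has many repeated rows per city) each get the right replacement.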

> d$CITY <- replaceStrings(d$CITY, mapping)
> d <- subset(d, CITY != "Unincorporated")
> nrow(d)
[1] 799

> setequal(cities$city_name, d$CITY)
[1] TRUE

So, with this transform, we end up with the same set of cities but have some extra rows due to the geometries being split across multiple rows. Let’s finally combine the geometries so that each city gets one row:

library(dplyr)
d <- d %>% group_by(CITY) %>% summarise(SHAPE=st_union(SHAPE))

> nrow(d)
[1] 483

Tada!
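
To get a feel for what the group-and-union step does without downloading anything, here is a toy sketch, assuming sf and dplyr are installed. The unit_square helper, the toy data, and the city names "A" and "B" are all made up for illustration. No CRS is set, so the areas are plain planar units.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a unit square with its lower-left corner at (x, y).
unit_square <- function(x, y) {
  st_polygon(list(rbind(c(x, y), c(x + 1, y), c(x + 1, y + 1), c(x, y + 1), c(x, y))))
}

# City "A" is split across two adjacent squares; city "B" has a single square.
toy <- st_sf(
  CITY = c("A", "A", "B"),
  SHAPE = st_sfc(unit_square(0, 0), unit_square(1, 0), unit_square(5, 5))
)

# Same pattern as in the post: one row per city, geometries unioned.
merged <- toy %>% group_by(CITY) %>% summarise(SHAPE = st_union(SHAPE))

nrow(merged)     # 2: one row each for "A" and "B"
st_area(merged)  # 2 1: the two squares of "A" merged into one shape
```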

https://github.com/fadend/ca_cities_data/blob/main/ca_city_boundaries_from_cdtfa.R has code embodying the above. https://github.com/fadend/ca_cities_data/blob/main/ca_cities_wiki_names.gpkg has the output.

