I was really impressed to get an almost immediate reply from the CIO, GIS to my note about Whittier appearing to some extra stuff in the geometries provided here: https://lab.data.ca.gov/dataset/california-city-boundaries-and-identifiers.
They pointed to the California Department of Tax and Fee Administration as the original source of the geometries, which has been updated in the meantime: https://gis.data.ca.gov/datasets/CDTFA::city-and-county-boundary-line-changes/explore?layer=0&location=33.924502%2C-118.008294%2C11.67. The downstream version should change soon too.
Taking a quick look at the CDTFA version:
> library(fillpattern)
> library(dplyr)
> library(ggplot2)
> library(sf)
> d <- read_sf("~/Downloads/City_and_County_Boundary_Line_Changes_217120214122415134.gpkg")
> ggplot(subset(d, CITY == "Whittier")) + geom_sf()

Looks good! But the area again doesn’t match Wikipedia (14.7 mi^2)?!
> as_mi2 <- function(x) { units(x) <- "mi2"; return(x) }
> as_mi2(st_area(subset(d, CITY == "Whittier")))
Units: [mi^2]
[1] 0.08453412 21.48431463
Let’s look at the geometries:
> whittier <- read_sf("~/Downloads/Whittier_City_Boundary.geojson")
> stackSf <- function(d1, d2, geometryName="SHAPE") {
common <- intersect(names(d1), names(d2))
# Make sure geometries agree on naming and projection
st_geometry(d1) <- geometryName
st_geometry(d2) <- geometryName
d2 <- st_transform(d2, st_crs(d1))
return(bind_rows(list(d1[, common], d2[, common])))
}
> combined <- stackSf(cbind(subset(d, CITY == "Whittier"), source="CDTFA"), cbind(whittier, source="City"))
> ggplot(combined, aes(fill=source)) + geom_sf() + scale_fill_pattern(c("stripe45", "stripe135"), width=unit(1/100, 'npc'), alpha=0.5)

They seem to overlap perfectly. So, what gives with the differences in area? Turns out I misunderstood what st_area is doing.
Helpful discussion on Stack Overflow: User ifeanyi588 asks, “Shouldn’t the area of polygons remain the same when you transform the shapefile projection?” Jindra Lacko replies, “No, the area of polygons is not guaranteed to remain constant when transforming between different CRSes. The Earth is round, maps & computer screens are flat – something has to give; either area or shape has to be distorted somewhat.”
I had imagined — as I guess had this other person — that st_area was estimating surface area of the polygon on the curved surface of the Earth. Instead, it’s finding the area *after* projecting the polygon onto another surface.
I noticed this because bringing the city version of the data into the same CRS, the areas match:
> as_mi2(st_area(whittier))
Units: [mi^2]
[1] 0.05792207 14.74530709
> as_mi2(st_area(combined))
Units: [mi^2]
[1] 0.08453412 21.48431463 0.08451496 21.48561899
If we instead normalize the CDTFA data to use the CRS from the city’s data, the areas agree:
> as_mi2(st_area(st_transform(subset(d, CITY == "Whittier"), st_crs(whittier))))
Units: [mi^2]
[1] 0.05793522 14.74441109
What are those two transforms by the way?
> st_crs(d)
Coordinate Reference System:
User input: WGS 84 / Pseudo-Mercator
wkt:
PROJCRS["WGS 84 / Pseudo-Mercator",
...
ID["EPSG",3857]]
> st_crs(whittier)
Coordinate Reference System:
User input: WGS 84
wkt:
GEOGCRS["WGS 84",
...
ID["EPSG",4326]]
- https://epsg.io/3857: “WGS 84 / Pseudo-Mercator — Spherical Mercator, Google Maps, OpenStreetMap, Bing, ArcGIS, ESRI… Not a recognised geodetic system. Uses spherical development of ellipsoidal coordinates. Relative to WGS 84 / World Mercator (CRS code 3395) gives errors of 0.7 percent in scale and differences in northing of up to 43km in the map (21km on the ground).”
- https://epsg.io/4326: “WGS 84 — WGS84 – World Geodetic System 1984, used in GPS”
EPSG: 4326 uses a coordinate system on the surface of a sphere or ellipsoid of reference.
EPSG: 3857 uses a coordinate system PROJECTED from the surface of the sphere or ellipsoid to a flat surface.
Think of it as this way:
EPSG 4326 uses a coordinate system the same as a GLOBE (curved surface). EPSG 3857 uses a coordinate system the same as a MAP (flat surface).
https://gis.stackexchange.com/a/48958 (2013)
So, it looks like Whittier will soon be fixed. I’m also reminded of how much I have to learn and will come out of this at least being a little more mindful of thinking about projections.
And again, I’m really impressed with the CIO, GIS, for the quick, helpful reply. And I’m reminded of how appreciative I should be towards of all these other folks sharing valuable info online. Thank you!
Leave a Reply