Fraction north of Fresno

I had a Google Sheet with data like the following:

CityCount
Los Angeles10000
San Diego9000
San Jose1000
Tiburon10
Palo Alto5

I wanted to understand what fraction of the count was in cities lying north of Fresno — “Northern California”.

My first thought was to look up latitude using a geocoding service; the solution here using Google’s Geocoder API worked pretty well for me: https://www.googlecloudcommunity.com/gc/Exploring-Curating-Data/Get-latitude-longitude-for-any-location-through-Google-Sheets/m-p/570196. Also helpful for this was this list of bounding boxes for the US states (CSV here). But then I hit the daily limit for calls to the API.

So, instead I got the list of California cities and longitude + latitude from https://www.kaggle.com/datasets/camnugent/california-housing-feature-engineering and inserted it into a new worksheet “CACities”:

NameLatitudeLongitude
Adelanto34.582769-117.409214
Agoura Hills34.153339-118.761675
Alameda37.765206-122.241636
Albany37.886869-122.297747

Then in the first sheet, we can look up the latitude for a given city with a formula like the following:

=VLOOKUP($A2, CACities!$A$2:$B$460, 2, FALSE)

Looking in CACities, we see that Fresno has latitude 36.746842. Latitude increases as you go north (hitting 90 at the north pole) so we just need to look for latitudes greater than this.

To find the sum of counts north of Fresno, we can use the following:

=SUMIF(C:C, "> 36.746842", B:B)

To find the total sum of counts with a known latitude:

=SUMIF(C:C, "<>#N/A", B:B)

And putting these together, we can find the fraction (weighted by count) north of Fresno as:

=SUMIF(C:C, "> 36.746842", B:B) / SUMIF(C:C, "<>#N/A", B:B)

Is there a better way to do this?

A test spreadsheet with the above.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *