
P3D10: Spatial Calculations and Manipulations with GeoPandas

These slides map to the R example slides from Day 18, or P3D4.

  • Make sure you have GeoPandas installed, following the guide from day P3D9.
import pandas as pd
import numpy as np
import geopandas as gpd

import folium
import rtree

from plotnine import *

First, let's get our SafeGraph data into a usable form.

You can also read from a local copy of the file instead of the URL.

url_loc = "https://github.com/KSUDS/p3_spatial/raw/main/SafeGraph%20-%20Patterns%20and%20Core%20Data%20-%20Chipotle%20-%20July%202021/Core%20Places%20and%20Patterns%20Data/chipotle_core_poi_and_patterns.csv"

dat = pd.read_csv(url_loc)

Now we can subset the Chipotles to California and build a geometry column.

dat_cal = dat.query("region=='CA'")
dat_cal = gpd.GeoDataFrame(
    dat_cal.filter(["placekey", "latitude", "longitude", "median_dwell", "region"]),
    geometry=gpd.points_from_xy(dat_cal.longitude, dat_cal.latitude),
    crs='EPSG:4326')
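The argument order in points_from_xy is easy to get backwards, so here is a minimal sketch on a toy table. The stores DataFrame and its coordinates are hypothetical and are not SafeGraph records; only the column names mirror the ones used above.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical store table; the coordinates are illustrative only.
stores = pd.DataFrame({
    "placekey": ["a", "b"],
    "latitude": [34.05, 37.77],
    "longitude": [-118.25, -122.42],
})

# points_from_xy takes x (longitude) first, then y (latitude).
stores_geo = gpd.GeoDataFrame(
    stores,
    geometry=gpd.points_from_xy(stores["longitude"], stores["latitude"]),
    crs="EPSG:4326",
)

print(stores_geo.geometry.x.tolist())  # the longitudes
print(stores_geo.crs)
```

If the two arguments are swapped, the points still build without error but land in the wrong place, so checking .x and .y once is a cheap safeguard.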

Let's read in our county polygons.

county = gpd.read_parquet("personal/usa_counties.parquet")

Now we can build out some spatial calculations on our California counties. In the example below we want to calculate the distance from each county's center to KSU.

The code below builds our KSU point and projects it to EPSG:3310 (California Albers), whose units are meters.

from shapely.geometry import Point

ksu_df = pd.DataFrame({"lat": [34.037876], "long": [-84.58102]})
ksu = gpd.GeoDataFrame(ksu_df,
    geometry=gpd.points_from_xy(ksu_df.long, ksu_df.lat),
    crs='EPSG:4326')

# Project once, then pass scalar coordinates (not whole Series) to Point
ksu_3310 = ksu.geometry.to_crs(epsg = 3310)
point = Point(ksu_3310.x.iloc[0], ksu_3310.y.iloc[0])

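Our new wrangled California, calw. It is built from cal, the California subset of county (the subsetting step is not shown on this slide).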

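calw = (cal
    .assign(
        # areas and lengths are computed in EPSG:3310, whose units are meters
        gp_area = lambda x: x.geometry.to_crs(epsg = 3310).area,
        # 1 square meter is about 0.000247105 acres
        gp_acres = lambda x: x.gp_area * 0.000247105,
        aland_acres = lambda x: x.aland * 0.000247105,
        # ratio of water area to land area (a fraction, not a percentage)
        percent_water = lambda x: x.awater / x.aland,
        gp_center = lambda x: x.geometry.to_crs(epsg = 3310).centroid,
        gp_length = lambda x: x.geometry.to_crs(epsg = 3310).length,
        # distance in meters from each county centroid to the KSU point
        gp_distance = lambda x: x.gp_center.distance(point),
        # 24140.2 m is about 15 miles
        gp_buffer = lambda x: x.geometry.to_crs(epsg = 3310).buffer(24140.2)
))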
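The constants in calw are unit conversions, and a quick check makes them less magic. The sketch below uses the standard definitions of the international acre and mile, which are not taken from the slides. The land and water areas at the end are made-up numbers, used only to show how water divided by land differs from water as a share of total area.

```python
# Standard conversion definitions (assumptions, not from the data).
SQ_M_PER_ACRE = 4046.8564224   # one international acre in square meters
M_PER_MILE = 1609.344          # one international mile in meters

acres_per_sq_m = 1 / SQ_M_PER_ACRE      # the 0.000247105 factor
buffer_miles = 24140.2 / M_PER_MILE     # the buffer distance in miles

print(round(acres_per_sq_m, 9))
print(round(buffer_miles, 3))

# Hypothetical county: 3 units of land area, 1 unit of water area.
aland, awater = 3.0, 1.0
water_to_land = awater / aland               # what percent_water computes
water_share = awater / (aland + awater)      # water as a share of total area
print(water_to_land, water_share)
```

If a true percentage of total area is wanted, the second form multiplied by 100 is one option. The slides use the first form.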

Plotting Chipotle stores in California

We can plot our spatial series that we created in calw.

calw.gp_buffer.plot()
calw.gp_center.plot(color= "black")

With dat_cal from above we can plot the locations on the county map.

base = calw.plot(color="white", edgecolor="darkgrey")
dat_cal.plot(ax=base, color="red", markersize=5)

Counting Chipotle stores by county

GeoPandas spatial joins rely on a spatial index, which in the setup used here is provided by rtree, so make sure it is installed. We can then use gpd.sjoin() for our task.

# Spatially join each store to the county it falls in
dat_join_s1 = gpd.sjoin(dat_cal, calw)

# Now count stores by county
dat_join_merge = (dat_join_s1
    .groupby("name")
    .agg(counts = ('percent_water', 'size'))
    .reset_index())
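To see what the join and count produce, here is a minimal sketch on toy geometries. The two square "counties" and three "stores" are hypothetical and use arbitrary planar coordinates with no CRS. It relies on the default behavior of gpd.sjoin(), which matches geometries that intersect.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two hypothetical square "counties" in arbitrary planar units.
counties = gpd.GeoDataFrame(
    {"name": ["West", "East"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Three hypothetical stores: two inside West, one inside East.
stores = gpd.GeoDataFrame(
    {"placekey": ["a", "b", "c"]},
    geometry=[Point(0.2, 0.5), Point(0.7, 0.3), Point(1.5, 0.5)],
)

# Each store row picks up the attributes of the county it falls in.
joined = gpd.sjoin(stores, counties)

# Count joined rows per county name.
counts = joined.groupby("name").size().to_dict()
print(counts)
```

A point lying exactly on a shared border would match both polygons and be counted twice, which is worth keeping in mind with real county boundaries.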

calw_join = (calw
    .merge(dat_join_merge, on="name", how="left")
    .fillna(value={"counts":0}))
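The left merge matters because counties with no stores never appear in the grouped counts. A small pandas-only sketch with hypothetical county names shows the missing county coming back with a count of zero:

```python
import pandas as pd

# Hypothetical tables: three counties, but only two have stores.
counties = pd.DataFrame({"name": ["West", "East", "North"]})
store_counts = pd.DataFrame({"name": ["West", "East"], "counts": [2, 1]})

# how="left" keeps every county; unmatched rows get NaN, then 0.
filled = (counties
    .merge(store_counts, on="name", how="left")
    .fillna(value={"counts": 0}))

print(filled)
```

Because the unmatched row passes through NaN, the counts column ends up as floats. Appending .astype({"counts": int}) is one way to restore integers if that matters for the plot legend.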

Now we can plot the counties colored by their number of stores, with the store locations drawn on top.

base = calw_join.plot(
    edgecolor="darkgrey",
    column = "counts")
dat_cal.plot(ax=base, color="red", markersize=4)