Subsetting and Spatial Filtering
This page describes how to subset datasets spatially and temporally using eoforeststac.providers.subset.subset.
Spatial subsetting
subset() clips a dataset to a polygon geometry. Geometries must be in EPSG:4326 (geographic coordinates); they are reprojected to match the dataset CRS automatically.
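import geopandas as gpd
from eoforeststac.providers.subset import subset
# ds is a dataset opened earlier (not shown here).
roi = gpd.read_file("DE-Hai.geojson")
# Reproject to EPSG:4326 and merge all features into one geometry
# (union_all() requires geopandas 1.0 or later).
geometry = roi.to_crs("EPSG:4326").geometry.union_all()
ds_subset = subset(ds, geometry=geometry)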
The returned dataset has its spatial extent clipped to the bounding box of the geometry. Values outside the geometry are masked as NaN.
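The difference between the clipped extent and the NaN mask can be illustrated on a toy grid. The following is a minimal pure-Python sketch, not the implementation used by subset(): the helper names point_in_polygon and clip_and_mask are hypothetical, and the grid, coordinates, and triangle are made up for illustration.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies strictly inside the polygon ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def clip_and_mask(values, xs, ys, polygon):
    """Crop a 2-D grid to the polygon's bounding box, then set outside cells to NaN."""
    px = [p[0] for p in polygon]
    py = [p[1] for p in polygon]
    # Step 1: keep only rows and columns whose centres fall in the bounding box.
    cols = [j for j, x in enumerate(xs) if min(px) <= x <= max(px)]
    rows = [i for i, y in enumerate(ys) if min(py) <= y <= max(py)]
    # Step 2: inside the box, keep values within the polygon and mask the rest.
    return [
        [values[i][j] if point_in_polygon(xs[j], ys[i], polygon) else math.nan
         for j in cols]
        for i in rows
    ]

# Toy 5 x 5 grid with cell centres at 0..4 and a triangular region of interest.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [[float(10 * i + j) for j in range(5)] for i in range(5)]
triangle = [(0.5, 0.5), (3.7, 0.5), (0.5, 3.7)]

clipped = clip_and_mask(values, xs, ys, triangle)
# The result is 3 x 3 (the bounding box), with NaN in the corner outside the triangle.
```

The output keeps the rectangular shape of the bounding box, so cells that lie inside the box but outside the polygon are still present and hold NaN.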
Temporal subsetting
Pass a time= tuple of ISO date strings to restrict the time dimension:
ds_subset = subset(
    ds,
    geometry=geometry,
    time=("2010-01-01", "2020-12-31"),
)
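Malformed or reversed dates are easier to catch before the call than after. The sketch below uses only the standard library to validate a pair of ISO date strings. The helper make_time_range is hypothetical and not part of eoforeststac. How subset() itself reacts to invalid dates is not documented here.

```python
from datetime import date

def make_time_range(start: str, end: str) -> tuple[str, str]:
    """Validate two ISO date strings and return them as a (start, end) tuple."""
    start_date = date.fromisoformat(start)  # raises ValueError on malformed input
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError(f"start {start} is after end {end}")
    return (start_date.isoformat(), end_date.isoformat())

time_range = make_time_range("2010-01-01", "2020-12-31")
```

The resulting tuple can then be passed as time=time_range.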
Combined spatial and temporal subsetting
Pass geometry= and time= in the same call to restrict the dataset in space and time at once:
ds_subset = subset(
    ds,
    geometry=geometry,
    time=("2015-01-01", "2020-12-31"),
)
Triggering computation
All operations are lazy (Dask-backed). Call .compute() to load data into memory:
ds_loaded = ds_subset.compute()
For large datasets, select only the variables and time steps you need before calling .compute(), so that only that selection is loaded into memory:
ds_loaded = ds_subset[["agb"]].sel(time="2020-01-01").compute()
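A rough size estimate shows why narrowing the selection matters. The sketch below is plain arithmetic with a hypothetical helper, estimated_size_mib, and made-up dimensions (11 time steps on a 2000 x 2000 grid of 4-byte values). On a real xarray object, the nbytes attribute reports the uncompressed size without loading the data.

```python
import math

def estimated_size_mib(shape, bytes_per_value=4):
    """Uncompressed in-memory size of an array with the given shape, in MiB."""
    return math.prod(shape) * bytes_per_value / 1024**2

# Hypothetical subset: 11 time steps on a 2000 x 2000 pixel grid, 4 bytes per value.
all_steps = estimated_size_mib((11, 2000, 2000))
single_step = estimated_size_mib((1, 2000, 2000))

print(f"all time steps: {all_steps:.1f} MiB, one time step: {single_step:.1f} MiB")
```

Selecting a single time step reduces the memory needed by a factor equal to the number of time steps, and dropping unused variables reduces it further.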
Notes
The geometry must be in EPSG:4326. If your geometry is in a different CRS, use geopandas to reproject it first.
subset() clips to the bounding box of the geometry and then applies a polygon mask. Pixels outside the polygon are set to NaN.
If the dataset has no time dimension, the time= argument is silently ignored.
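Because a time= argument on a dataset without a time dimension is ignored without any message, an explicit check in calling code can make the situation visible. The sketch below is one possible guard. check_time_argument is a hypothetical helper, not part of eoforeststac, and plain tuples stand in for the dataset's dimension names.

```python
import warnings

def check_time_argument(dims, time):
    """Return False and warn if a time range is given but there is no time dimension."""
    if time is not None and "time" not in dims:
        warnings.warn("dataset has no 'time' dimension; time= will have no effect")
        return False
    return True

# Stand-in for the dimension names of a dataset that does have a time axis.
has_time = check_time_argument(("time", "y", "x"), ("2015-01-01", "2020-12-31"))
```

With an xarray dataset, the first argument would be ds.dims, which supports the same membership test.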