CQL2 Filtering#
This notebook demonstrates the use of pystac-client to use CQL2 filtering. The server needs to support this and advertise conformance as the https://api.stacspec.org/v1.0.0-rc.1/item-search#filter
class in the conformsTo
attribute of the root API.
This should be considered an experimental feature. This notebook uses the Microsoft Planetary Computer API, as it is currently the only public CQL2 implementation.
[1]:
# set pystac_client logger to DEBUG to see API calls
import logging
from copy import deepcopy
import geopandas as gpd
import pandas as pd
from shapely.geometry import shape
from pystac_client import Client
logging.basicConfig()
logger = logging.getLogger("pystac_client")
logger.setLevel(logging.INFO)
# convert a list of STAC Items into a GeoDataFrame
def items_to_geodataframe(items):
_items = []
for i in items:
_i = deepcopy(i)
_i["geometry"] = shape(_i["geometry"])
_items.append(_i)
gdf = gpd.GeoDataFrame(pd.json_normalize(_items))
for field in ["properties.datetime", "properties.created", "properties.updated"]:
if field in gdf:
gdf[field] = pd.to_datetime(gdf[field])
gdf.set_index("properties.datetime", inplace=True)
return gdf
[2]:
# STAC API root URL
URL = "https://planetarycomputer.microsoft.com/api/stac/v1"
# custom headers
headers = []
cat = Client.open(URL, headers=headers)
Initial Search Parameters#
Here we perform a search with the Client.search
function, providing a geometry (intersects
) a datetime range (datetime
), and filtering by Item properties (filter
) using CQL2-JSON.
[3]:
import json
import hvplot.pandas # noqa: F401
# AOI around Delfzijl, in the north of The Netherlands
geom = {
"type": "Polygon",
"coordinates": [
[
[6.42425537109375, 53.174765470134616],
[7.344360351562499, 53.174765470134616],
[7.344360351562499, 53.67393435835391],
[6.42425537109375, 53.67393435835391],
[6.42425537109375, 53.174765470134616],
]
],
}
params = {
"max_items": 100,
"collections": "landsat-8-c2-l2",
"intersects": geom,
"datetime": "2018-01-01/2020-12-31",
}
# reusable search function
def search_fetch_plot(params, filt):
# limit sets the # of items per page so we can see multiple pages getting fetched
params["filter"] = filt
search = cat.search(**params)
items = list(search.items_as_dicts()) # safe b/c we set max_items = 100
# DataFrame
items_df = pd.DataFrame(items_to_geodataframe(items))
print(f"{len(items_df.index)} items found")
field = "properties.eo:cloud_cover"
return items_df.hvplot(
y=field, label=json.dumps(filt), frame_height=500, frame_width=800
)
CQL2 Filters#
Below are examples of several different CQL2 filters on the eo:cloud_cover
property. Up to 100 Items are fetched and the eo:cloud_cover values plotted.
[4]:
filt = {"op": "lte", "args": [{"property": "eo:cloud_cover"}, 10]}
search_fetch_plot(params, filt)
33 items found
[4]:
[5]:
filt = {"op": "gte", "args": [{"property": "eo:cloud_cover"}, 80]}
search_fetch_plot(params, filt)
92 items found
[5]:
[6]:
filt = {
"op": "and",
"args": [
{"op": "lte", "args": [{"property": "eo:cloud_cover"}, 60]},
{"op": "gte", "args": [{"property": "eo:cloud_cover"}, 40]},
],
}
search_fetch_plot(params, filt)
41 items found
[6]: