CQL2 Filtering#
This notebook demonstrates the use of pystac-client to use CQL2 filtering. The server needs to support this and advertise conformance as the https://api.stacspec.org/v1.0.0-rc.1/item-search#filter
class in the conformsTo
attribute of the root API.
This should be considered an experimental feature. This notebook uses the Microsoft Planetary Computer API, as it is currently the only public CQL2 implementation.
[1]:
from pystac_client import Client
# set pystac_client logger to DEBUG to see API calls
import logging
logging.basicConfig()
logger = logging.getLogger('pystac_client')
logger.setLevel(logging.INFO)
# function for creating GeoDataFrame from Items
from copy import deepcopy
import geopandas as gpd
import pandas as pd
from shapely.geometry import shape
# convert a list of STAC Items into a GeoDataFrame
def items_to_geodataframe(items):
_items = []
for i in items:
_i = deepcopy(i)
_i['geometry'] = shape(_i['geometry'])
_items.append(_i)
gdf = gpd.GeoDataFrame(pd.json_normalize(_items))
for field in ['properties.datetime', 'properties.created', 'properties.updated']:
if field in gdf:
gdf[field] = pd.to_datetime(gdf[field])
gdf.set_index('properties.datetime', inplace=True)
return gdf
[2]:
# STAC API root URL
URL = 'https://planetarycomputer.microsoft.com/api/stac/v1'
# custom headers
headers = []
cat = Client.open(URL, headers=headers)
Initial Search Parameters#
Here we perform a search with the Client.search
function, providing a geometry (intersects
) a datetime range (datetime
), and filtering by Item properties (filter
) using CQL2-JSON.
[3]:
# AOI around Delfzijl, in the north of The Netherlands
geom = {
"type": "Polygon",
"coordinates": [
[
[
6.42425537109375,
53.174765470134616
],
[
7.344360351562499,
53.174765470134616
],
[
7.344360351562499,
53.67393435835391
],
[
6.42425537109375,
53.67393435835391
],
[
6.42425537109375,
53.174765470134616
]
]
]
}
params = {
"max_items": 100,
"collections": "landsat-8-c2-l2",
"intersects": geom,
"datetime": "2018-01-01/2020-12-31",
}
import hvplot.pandas
import json
# reusable search function
def search_fetch_plot(params, filt):
# limit sets the # of items per page so we can see multiple pages getting fetched
params['filter'] = filt
search = cat.search(**params)
items = list(search.items_as_dicts()) # safe b/c we set max_items = 100
# DataFrame
items_df = pd.DataFrame(items_to_geodataframe(items))
print(f"{len(items_df.index)} items found")
field = 'properties.eo:cloud_cover'
return items_df.hvplot(y=field, label=json.dumps(filt), frame_height=500, frame_width=800)
CQL2 Filters#
Below are examples of several different CQL2 filters on the eo:cloud_cover
property. Up to 100 Items are fetched and the eo:cloud_cover values plotted.
[4]:
filt = {
"op": "lte",
"args": [{"property": "eo:cloud_cover"}, 10]
}
search_fetch_plot(params, filt)
33 items found
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:39: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
return dataset.data.dtypes[idx].type
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:39: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
return dataset.data.dtypes[idx].type
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:362: FutureWarning: The behavior of DatetimeProperties.to_pydatetime is deprecated, in a future version this will return a Series containing python datetime objects instead of an ndarray. To retain the old behavior, call `np.array` on the result
dts = [dt.replace(tzinfo=None) for dt in data.dt.to_pydatetime()]
[4]:
[5]:
filt = {
"op": "gte",
"args" : [{"property": "eo:cloud_cover"}, 80]
}
search_fetch_plot(params, filt)
92 items found
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:39: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
return dataset.data.dtypes[idx].type
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:39: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
return dataset.data.dtypes[idx].type
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:362: FutureWarning: The behavior of DatetimeProperties.to_pydatetime is deprecated, in a future version this will return a Series containing python datetime objects instead of an ndarray. To retain the old behavior, call `np.array` on the result
dts = [dt.replace(tzinfo=None) for dt in data.dt.to_pydatetime()]
[5]:
[6]:
filt = {
"op": "and",
"args": [
{
"op":"lte",
"args": [{"property": "eo:cloud_cover"}, 60]
},
{
"op": "gte",
"args": [{"property": "eo:cloud_cover"}, 40]
}
]
}
search_fetch_plot(params, filt)
41 items found
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:39: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
return dataset.data.dtypes[idx].type
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:39: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
return dataset.data.dtypes[idx].type
/home/docs/checkouts/readthedocs.org/user_builds/pystac-client/conda/stable/lib/python3.9/site-packages/holoviews/core/data/pandas.py:362: FutureWarning: The behavior of DatetimeProperties.to_pydatetime is deprecated, in a future version this will return a Series containing python datetime objects instead of an ndarray. To retain the old behavior, call `np.array` on the result
dts = [dt.replace(tzinfo=None) for dt in data.dt.to_pydatetime()]
[6]: