CQL2 Filtering#

This notebook demonstrates the use of pystac-client to use CQL2 filtering. The server needs to support this and advertise conformance as the https://api.stacspec.org/v1.0.0-rc.1/item-search#filter class in the conformsTo attribute of the root API.

This should be considered an experimental feature. This notebook uses the Microsoft Planetary Computer API, as it is currently the only public CQL2 implementation.

[1]:
from pystac_client import Client

# set pystac_client logger to DEBUG to see API calls
import logging
logging.basicConfig()
logger = logging.getLogger('pystac_client')
logger.setLevel(logging.INFO)

# function for creating GeoDataFrame from Items
from copy import deepcopy
import geopandas as gpd
import pandas as pd
from shapely.geometry import shape

# convert a list of STAC Items into a GeoDataFrame
def items_to_geodataframe(items):
    _items = []
    for i in items:
        _i = deepcopy(i)
        _i['geometry'] = shape(_i['geometry'])
        _items.append(_i)
    gdf = gpd.GeoDataFrame(pd.json_normalize(_items))
    for field in ['properties.datetime', 'properties.created', 'properties.updated']:
        if field in gdf:
            gdf[field] = pd.to_datetime(gdf[field])
    gdf.set_index('properties.datetime', inplace=True)
    return gdf
[2]:
# STAC API root URL
URL = 'https://planetarycomputer.microsoft.com/api/stac/v1'

# custom headers
headers = []

cat = Client.open(URL, headers=headers)

Initial Search Parameters#

Here we perform a search with the Client.search function, providing a geometry (intersects) a datetime range (datetime), and filtering by Item properties (filter) using CQL2-JSON.

[3]:
# AOI around Delfzijl, in the north of The Netherlands
geom = {
    "type": "Polygon",
    "coordinates": [
      [
        [
          6.42425537109375,
          53.174765470134616
        ],
        [
          7.344360351562499,
          53.174765470134616
        ],
        [
          7.344360351562499,
          53.67393435835391
        ],
        [
          6.42425537109375,
          53.67393435835391
        ],
        [
          6.42425537109375,
          53.174765470134616
        ]
      ]
    ]
}

params = {
    "max_items": 100,
    "collections": "landsat-8-c2-l2",
    "intersects": geom,
    "datetime": "2018-01-01/2020-12-31",
}

import hvplot.pandas
import json

# reusable search function
def search_fetch_plot(params, filt):
    # limit sets the # of items per page so we can see multiple pages getting fetched
    params['filter'] = filt
    search = cat.search(**params)
    items = list(search.items_as_dicts()) # safe b/c we set max_items = 100
    # DataFrame
    items_df = pd.DataFrame(items_to_geodataframe(items))
    print(f"{len(items_df.index)} items found")
    field = 'properties.eo:cloud_cover'
    return items_df.hvplot(y=field, label=json.dumps(filt), frame_height=500, frame_width=800)

CQL2 Filters#

Below are examples of several different CQL2 filters on the eo:cloud_cover property. Up to 100 Items are fetched and the eo:cloud_cover values plotted.

[4]:
filt = {
    "op": "lte",
    "args": [{"property": "eo:cloud_cover"}, 10]
}

search_fetch_plot(params, filt)
33 items found
[4]:
[5]:
filt = {
    "op": "gte",
    "args" : [{"property": "eo:cloud_cover"}, 80]
}

search_fetch_plot(params, filt)
92 items found
[5]:
[6]:
filt = {
    "op": "and",
    "args": [
      {
          "op":"lte",
          "args": [{"property": "eo:cloud_cover"}, 60]
      },
      {
          "op": "gte",
          "args": [{"property": "eo:cloud_cover"}, 40]
      }
    ]
}

search_fetch_plot(params, filt)
41 items found
[6]: