Tooling to enable generic keyset pagination across sources by expressing the keyset in CQL2. #720
Draft
gadomski (Member) reviewed on Apr 29, 2025 and left a comment:
Core concept seems fine on first glance. One bit of weirdness with the "fetch one extra" model is that I fetch to arrow record batches in stac-duckdb, so peeling off the last item feels a little weird? ... I guess it's fine if we're returning JSON, but if we add geoparquet responses that breaks down a bit?
In crates/api/src/items.rs:
/// This trait defines methods for checking JSON operations, combining JSON values,
/// and creating JSON filters for various conditions.
/// A trait for performing JSON-based operations.
pub trait JsonOps {
gadomski (Member):
Since this is JSON+cql2, should it live in cql2-rs?
David William Bitner (Collaborator, Author):
You kind of need to fetch the one extra in some way or another, because you need to know whether there are any more records.
I would definitely want to wrap that JSON stuff into cql2-rs (using Expr directly, rather than JSON). It was just quicker to mock things up this way.
PROOF OF CONCEPT, NOT FOR MERGING
This exposes a next_page_cql2(Item) function on the Items struct that uses Items.sortby to generate a CQL2 expression. Appended to the existing CQL2 expression, it makes the query start fetching data beginning with the passed-in Item.
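To make the idea concrete, here is a minimal sketch of how a keyset filter could be derived from the sort keys. It is not the code in this PR: `SortKey`, `keyset_filter`, and `and_filter` are hypothetical names, the output is CQL2 text rather than an `Expr`, and it assumes that the sort keys together identify an item uniquely (for example by ending with `id`) and that none of the values are null.

```rust
/// Sort direction for one sort key.
#[derive(Clone, Copy)]
pub enum Direction {
    Asc,
    Desc,
}

/// One sort key plus the boundary item's value for it, already rendered
/// as a CQL2 text literal (hypothetical type, for illustration only).
pub struct SortKey<'a> {
    pub field: &'a str,
    pub direction: Direction,
    pub literal: &'a str,
}

/// Build a filter that matches the boundary item and everything sorted after it.
/// For keys (a, b, c) ascending this yields:
/// (a > va) OR (a = va AND b > vb) OR (a = va AND b = vb AND c >= vc)
pub fn keyset_filter(keys: &[SortKey<'_>]) -> Option<String> {
    if keys.is_empty() {
        return None;
    }
    let mut branches = Vec::new();
    for (i, key) in keys.iter().enumerate() {
        // Only the last key is inclusive, so the boundary item itself is returned.
        let is_last = i + 1 == keys.len();
        let op = match (key.direction, is_last) {
            (Direction::Asc, false) => ">",
            (Direction::Asc, true) => ">=",
            (Direction::Desc, false) => "<",
            (Direction::Desc, true) => "<=",
        };
        // All earlier keys must tie with the boundary item.
        let mut terms: Vec<String> = keys[..i]
            .iter()
            .map(|k| format!("{} = {}", k.field, k.literal))
            .collect();
        terms.push(format!("{} {} {}", key.field, op, key.literal));
        branches.push(format!("({})", terms.join(" AND ")));
    }
    Some(branches.join(" OR "))
}

/// Append the keyset filter to an existing filter, if there is one.
pub fn and_filter(existing: Option<&str>, keyset: &str) -> String {
    match existing {
        Some(filter) => format!("({}) AND ({})", filter, keyset),
        None => keyset.to_string(),
    }
}
```

Making only the final key inclusive is what lets the extra row fetched by the previous request become the first row of the next page.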
The anticipated workflow is that the Client tries to fetch one more row than the requested limit. If this extra row exists, the Client creates a next link that has the additional expression applied. If the Client was itself called with a next link, that link is then returned as the prev link.
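The workflow can be sketched roughly as follows. This is an illustration under assumptions, not the Client's actual API: `Page` and `paginate` are hypothetical names, rows are assumed to arrive as an in-memory `Vec` already fetched with `limit + 1`, and `make_token` stands in for whatever turns the extra row into the filter carried by the next link.

```rust
/// One page of results plus the tokens for its links (hypothetical type).
pub struct Page<T> {
    pub items: Vec<T>,
    /// Present only when the backend returned the extra row.
    pub next: Option<String>,
    /// The token this request was called with, if any.
    pub prev: Option<String>,
}

/// Split rows fetched with `limit + 1` into a page and its link tokens.
pub fn paginate<T>(
    mut rows: Vec<T>,
    limit: usize,
    incoming: Option<String>,
    make_token: impl Fn(&T) -> String,
) -> Page<T> {
    // The extra row proves there is more data; it is peeled off rather than returned.
    let extra = if rows.len() > limit {
        rows.truncate(limit + 1);
        rows.pop()
    } else {
        None
    };
    Page {
        items: rows,
        next: extra.as_ref().map(|row| make_token(row)),
        prev: incoming,
    }
}
```

With a limit of 2 and three rows returned, the third row is dropped from the page and only used to build the next token; a final page that returns `limit` rows or fewer gets no next link.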
I think that if we architect this right in the Client, this could allow keyset-based pagination for any STAC API regardless of backend.
As we look at using rustac as the primary entry point for creating an API for pgstac / stac-geoparquet, this lets us add in this functionality in a reusable way.
Additionally, as we chatted about before, I think we should move all the logic for parsing other parameters (items, collections, datetime, ...) so that those just get mapped into CQL2 expressions. Then we can centralize the logic for either "solving" expressions or converting them into SQL in cql2-rs. Basically, I'm proposing that anything that filters data should get converted to CQL2.
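As a rough sketch of what "everything becomes CQL2" could look like, the following folds a few search parameters into a single CQL2 text filter. `SearchParams` and `to_cql2_text` are hypothetical and not part of rustac or cql2-rs; a real version would presumably build `Expr` values instead of strings, and this one assumes that the datetime parameter has already been parsed into a closed start/end interval.

```rust
/// A simplified stand-in for the search parameters (hypothetical type).
#[derive(Default)]
pub struct SearchParams {
    pub ids: Vec<String>,
    pub collections: Vec<String>,
    /// Closed interval as (start, end) RFC 3339 strings.
    pub datetime: Option<(String, String)>,
    /// A filter the user already supplied as CQL2 text.
    pub filter: Option<String>,
}

/// Render a CQL2 text string literal, doubling embedded single quotes.
fn quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Render `property IN (...)`, or nothing when the list is empty.
fn in_list(property: &str, values: &[String]) -> Option<String> {
    if values.is_empty() {
        return None;
    }
    let list: Vec<String> = values.iter().map(|v| quote(v)).collect();
    Some(format!("{} IN ({})", property, list.join(", ")))
}

/// Fold every filtering parameter into one CQL2 text expression.
pub fn to_cql2_text(params: &SearchParams) -> Option<String> {
    let mut clauses = Vec::new();
    clauses.extend(in_list("id", &params.ids));
    clauses.extend(in_list("collection", &params.collections));
    if let Some((start, end)) = &params.datetime {
        clauses.push(format!(
            "T_INTERSECTS(datetime, INTERVAL({}, {}))",
            quote(start),
            quote(end)
        ));
    }
    if let Some(filter) = &params.filter {
        // Parenthesize so an OR inside the user's filter keeps its meaning.
        clauses.push(format!("({})", filter));
    }
    if clauses.is_empty() {
        None
    } else {
        Some(clauses.join(" AND "))
    }
}
```

Once every parameter ends up in one expression, a pagination filter is just one more clause to AND onto it, and the backend only needs to understand CQL2.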