fix(waterdata): raise RequestTooLarge for an unchunkable over-budget request (not a silent 414)#309

Draft
thodson-usgs wants to merge 1 commit into DOI-USGS:main from thodson-usgs:fix/chunker-size-check-unchunkable

@thodson-usgs
Collaborator

Problem

ChunkPlan's "no chunkable axes" branch returned immediately without sizing the request; its comment even said "if that produces an over-budget URL, the server (or httpx itself) rejects." A single large CQL-text filter with one big IN (...) clause has no top-level OR and therefore yields no chunk axis, so it was shipped verbatim and failed with an opaque HTTP 414 rather than a RequestTooLarge:

from dataretrieval.waterdata import get_daily

# One CQL-text filter holding 1000 site ids in a single IN (...) clause
ids = ", ".join(f"'USGS-{i:08d}'" for i in range(1000))
get_daily(filter=f"monitoring_location_id IN ({ids})", filter_lang="cql-text")
# 17 KB request -> 414, no actionable error
# (the equivalent get_daily(monitoring_location_id=[...]) chunks fine)
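
As a rough check on the 17 KB figure, the size of the filter text can be computed directly. This is a minimal sketch using only the standard library; it measures the filter string alone, not the full URL, and the percent-encoding step is an assumption about how a query string would typically be encoded rather than a description of what httpx actually sends.

```python
from urllib.parse import quote

ids = ", ".join(f"'USGS-{i:08d}'" for i in range(1000))
cql_filter = f"monitoring_location_id IN ({ids})"

# Raw size of the filter text in bytes (all ASCII here).
raw_bytes = len(cql_filter.encode("utf-8"))

# Percent-encoding quotes, spaces, commas, and parentheses only makes it longer.
encoded_bytes = len(quote(cql_filter, safe="").encode("ascii"))

print(raw_bytes)                    # 17026
print(encoded_bytes > raw_bytes)    # True
```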

Fix

Size-check the no-axes path: if the single request fits the byte limit, pass through exactly as before (the common hot path); if it's over budget there's nothing to split, so raise RequestTooLarge with actionable guidance (narrow the query / simplify the filter / split manually) instead of shipping it.
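
The behaviour described above can be sketched roughly as follows. This is not the ChunkPlan code from this PR: `plan_requests`, `MAX_URL_BYTES`, the 8000-byte budget, and the example base URL are all hypothetical, and `RequestTooLarge` here is a local stand-in for the library's exception of the same name.

```python
from urllib.parse import urlencode


class RequestTooLarge(Exception):
    """Local stand-in for the library's exception (illustration only)."""


MAX_URL_BYTES = 8000  # assumed budget, not the library's actual limit


def plan_requests(base_url, params, chunk_axes, max_bytes=MAX_URL_BYTES):
    """Hypothetical planner: return the URLs to send for one logical request."""
    if chunk_axes:
        # The chunked path is unchanged by this fix and is out of scope here.
        raise NotImplementedError("chunked path not sketched")

    # No chunkable axes: size the single request before shipping it.
    url = f"{base_url}?{urlencode(params)}"
    size = len(url.encode("utf-8"))
    if size <= max_bytes:
        return [url]  # common hot path: pass through as before

    # Over budget with nothing to split: fail early with guidance.
    raise RequestTooLarge(
        f"request is {size} bytes (limit {max_bytes}) and has no chunkable "
        "axis; narrow the query, simplify the filter, or split it manually"
    )


BASE = "https://example.invalid/items"  # placeholder URL
small = plan_requests(BASE, {"monitoring_location_id": "USGS-00000001"}, [])

ids = ", ".join(f"'USGS-{i:08d}'" for i in range(1000))
try:
    plan_requests(BASE, {"filter": f"monitoring_location_id IN ({ids})"}, [])
    raised = False
except RequestTooLarge:
    raised = True
```

The point of the design is that the pass-through case costs only one URL build and one length check, while the over-budget case surfaces a typed, explainable error on the client instead of a bare 414 from the server.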

Verification

(a) 1000-id CQL IN filter -> RequestTooLarge ✓   (was: shipped -> 414)
(b) small scalar query    -> passthrough (0 axes), no raise ✓
(c) monitoring_location_id=[2000 ids] -> chunked (1 axis), no raise ✓

The chunking suite passes. The old test that asserted "pass an over-budget request through (the server may 414)" now expects RequestTooLarge, and a new case covers fits→passthrough. ruff is clean.
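
For a caller who hits RequestTooLarge on a CQL IN filter, the "split manually" guidance might look something like the sketch below. `split_in_filter` is a hypothetical helper, not part of this PR or the library. It assumes ASCII ids with no embedded quotes, and it budgets against the raw filter text only, so a real caller would want to leave headroom for percent-encoding and the rest of the URL.

```python
def split_in_filter(field, ids, max_bytes):
    """Greedily pack ids into `field IN (...)` filters of at most max_bytes.

    A single id whose filter alone exceeds max_bytes still gets its own
    (oversize) filter; this sketch does not try to handle that case.
    """
    prefix, suffix, sep = f"{field} IN (", ")", ", "
    empty_size = len(prefix) + len(suffix)
    filters, batch, size = [], [], empty_size

    for value in ids:
        literal = f"'{value}'"
        extra = len(literal) + (len(sep) if batch else 0)
        if batch and size + extra > max_bytes:
            # Current batch is full: emit it and start a new one.
            filters.append(prefix + sep.join(batch) + suffix)
            batch, size, extra = [], empty_size, len(literal)
        batch.append(literal)
        size += extra

    if batch:
        filters.append(prefix + sep.join(batch) + suffix)
    return filters


site_ids = [f"USGS-{i:08d}" for i in range(1000)]
filters = split_in_filter("monitoring_location_id", site_ids, max_bytes=2000)
```

Each resulting filter could then be passed to its own get_daily(filter=..., filter_lang="cql-text") call and the results concatenated. Where the query allows it, passing the ids as monitoring_location_id=[...] remains simpler, since that form is chunked automatically.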

🤖 Generated with Claude Code

…request

ChunkPlan's "no chunkable axes" branch returned immediately without sizing the
request, deliberately leaving an over-budget URL for the server to reject. So a
single large CQL-text `filter` with one big `IN (...)` clause (no top-level
`OR`, hence no chunk axis) was shipped verbatim and failed with an opaque HTTP
414 — and not even RequestTooLarge. (The equivalent
monitoring_location_id=[...] chunks fine.)

Size-check the no-axes path: if the single request fits, pass through as
before; if it's over budget there's nothing to split, so raise RequestTooLarge
with actionable guidance instead of shipping it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>