fix(waterdata): get_ratings — don't abort on one bad feature, don't silently truncate #304
Draft
thodson-usgs wants to merge 1 commit into
Two reliability issues in the ratings getter:

1. The per-feature download loop caught only (httpx.HTTPError, ValueError, OSError), but _download_and_parse -> _raise_for_non_200 raises the module's typed errors (RateLimited / ServiceUnavailable / RuntimeError, all RuntimeError subclasses), and a feature missing its data asset raises LookupError. So one rate-limited / failed / malformed feature aborted the entire multi-site call instead of being logged and skipped. Broadened the except to cover RuntimeError and LookupError.

2. _search sent `limit` verbatim and returned only the first page, silently truncating large result sets despite the docstring. It now clamps the page size to the service max (10,000) and follows the STAC `next` link until exhausted, returning all matching features. (Behavior change for the narrow case it fixes: >1-page queries, or small explicit `limit`s, now return all matches; common default queries are unchanged.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
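For reviewers, here is a minimal stdlib-only sketch of the "log and skip" behavior that item 1 describes. It is not the module's code: `RateLimited` here is a local stand-in class, `fake_download_and_parse` and `download_all` are hypothetical names, and `httpx.HTTPError` is left out of the `except` tuple only so the snippet runs without third-party packages.

```python
import logging

logger = logging.getLogger("ratings_sketch")


class RateLimited(RuntimeError):
    """Stand-in for the module's typed 429 error (a RuntimeError subclass)."""


def fake_download_and_parse(feature):
    """Hypothetical downloader that fails in the ways the PR describes."""
    if feature["id"] == "rate-limited":
        raise RateLimited("429 Too Many Requests")
    # A missing assets.data.href raises KeyError here, which is a
    # LookupError subclass (assumption: the real code raises LookupError).
    href = feature["assets"]["data"]["href"]
    return f"parsed:{href}"


def download_all(features, download=fake_download_and_parse):
    """Download each feature, logging and skipping the ones that fail."""
    results = {}
    for feature in features:
        try:
            results[feature["id"]] = download(feature)
        except (RuntimeError, LookupError, ValueError, OSError) as exc:
            # One bad feature is reported, then the loop moves on.
            logger.warning("Failed to download / parse %s: %s", feature["id"], exc)
    return results


features = [
    {"id": "ok", "assets": {"data": {"href": "https://example.invalid/ok.rdb"}}},
    {"id": "rate-limited", "assets": {"data": {"href": "https://example.invalid/x.rdb"}}},
    {"id": "no-asset", "assets": {}},
]
ratings = download_all(features)
```

With the narrower tuple `(ValueError, OSError)`, the second feature's `RateLimited` would propagate out of `download_all` and the first feature's result would be lost with it.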
Problem
Two reliability issues in `get_ratings`:

1. One bad feature aborts the whole batch. The per-feature download loop catches `(httpx.HTTPError, ValueError, OSError)`, but `_download_and_parse` → `_raise_for_non_200` raises the module's typed errors: `RateLimited` (429), `ServiceUnavailable` (5xx), and `RuntimeError` (other 4xx). These are all `RuntimeError` or subclasses of it, not `httpx.HTTPError`. (A feature missing its `assets.data.href` also raises `LookupError`.) So a single rate-limited, failed, or malformed feature in a multi-site request escapes the loop and kills the whole call instead of being logged and skipped as intended.
2. Large result sets are silently truncated. `_search` sent `limit` verbatim and returned only the first page (no pagination, no clamp) despite the docstring describing `limit` as a "page size … (capped at 10000)." A query matching more than one page lost the remainder with no indication.

Fix
- Broadened the `except` to also catch `RuntimeError` (covers the typed errors) and `LookupError`.
- `_search` now clamps the page size to the service max (10,000) and follows the STAC `next` link until exhausted.

Verification (live API + mock)
- `_search(bbox=[-95, 40, -92, 42], limit=2)` now returns 177 features (previously 2).
- With `_download_and_parse` patched to raise `RateLimited`, `get_ratings(...)` logs `Failed to download / parse …` and returns `{}`: skipped, not raised (pre-fix it raised).
- `get_ratings("USGS-01104475", download_and_parse=False)` → 1 feature.
- `ruff` clean.

🤖 Generated with Claude Code
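The pagination change can be illustrated with a small in-memory sketch: clamp the requested page size to the service maximum, then keep following the link whose `rel` is `next` until none is returned. This is an assumption-laden illustration, not the actual `_search` implementation: `search_all`, `make_fake_fetch`, the `fetch(url, params)` callable, and the `fake://` URLs are all hypothetical, and the fake endpoint only mimics the general shape of a STAC FeatureCollection response.

```python
MAX_PAGE_SIZE = 10_000  # service maximum quoted in the PR


def search_all(fetch, first_url, limit=MAX_PAGE_SIZE):
    """Collect features from every page by following rel="next" links.

    `fetch(url, params)` is a hypothetical callable that returns a parsed
    STAC-style FeatureCollection dict.
    """
    page_size = max(1, min(limit, MAX_PAGE_SIZE))  # clamp to the service max
    features = []
    url, params = first_url, {"limit": page_size}
    while url is not None:
        page = fetch(url, params)
        features.extend(page.get("features", []))
        url = next(
            (link["href"] for link in page.get("links", []) if link.get("rel") == "next"),
            None,
        )
        # The next link already encodes the query, so send no extra params.
        params = None
    return features


def make_fake_fetch(total):
    """Build an in-memory stand-in for the search endpoint with `total` items."""
    items = [{"id": f"feature-{i}"} for i in range(total)]
    calls = []

    def fetch(url, params):
        calls.append((url, params))
        if params is not None:  # first request: page size comes from params
            offset, size = 0, params["limit"]
        else:  # follow-up request: offset and size come from the next link
            query = dict(pair.split("=") for pair in url.split("?", 1)[1].split("&"))
            offset, size = int(query["offset"]), int(query["limit"])
        links = []
        if offset + size < total:
            href = f"fake://search?offset={offset + size}&limit={size}"
            links.append({"rel": "next", "href": href})
        return {
            "type": "FeatureCollection",
            "features": items[offset:offset + size],
            "links": links,
        }

    return fetch, calls


fetch, calls = make_fake_fetch(total=7)
all_features = search_all(fetch, "fake://search", limit=2)
```

In this sketch `limit` acts purely as a page size, which matches the behavior change noted in the PR: a small explicit `limit` costs more requests but no longer drops results.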