fix(wqp): preserve leading zeros on code columns (HUCs, parameter codes, FIPS) by thodson-usgs · Pull Request #311 · DOI-USGS/dataretrieval-python

thodson-usgs · 2026-05-30T19:21:01Z

Problem

All nine WQP getters read the response with a bare
pd.read_csv(StringIO(text), delimiter=",", low_memory=False), which infers code columns as int/float and silently drops their significant leading zeros:

USGS parameter code  "00060"     -> 60
HUC8                 "07090002"  -> 7090002
FIPS / qualifier codes           -> numeric, zeros lost

R dataRetrieval reads these as character. (The NWIS RDB path is unaffected — it pins site_no/parm_cd to str already.)
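The inference problem can be reproduced outside the package with pandas alone. The snippet below is a minimal sketch, not the library's code: the CSV text and variable names are illustrative. It also shows why a blanket `dtype=str` is not a good remedy, since it turns the measurement column into text as well.

```python
from io import StringIO

import pandas as pd

# Illustrative one-row response; column names follow the WQX3.0-style example in this PR.
text = "Location_HUCEightDigitCode,USGSpcode,ResultMeasureValue\n07090002,00060,1.5\n"

# Bare read: pandas infers integer dtypes for the code columns and drops the zeros.
inferred = pd.read_csv(StringIO(text), delimiter=",", low_memory=False)
print(inferred["USGSpcode"].iloc[0])

# Blanket dtype=str keeps the zeros but also turns the value column into text.
as_text = pd.read_csv(StringIO(text), delimiter=",", dtype=str)
print(as_text["USGSpcode"].iloc[0], type(as_text["ResultMeasureValue"].iloc[0]))
```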

Fix

Add a _read_wqp_csv helper (used by all nine read sites): read the header, then re-read with dtype=str for any column whose name is a code/identifier — ends with "code", or contains "identifier"/"huc"/"fips". This covers both the legacy and WQX3.0 column schemas while leaving value columns (e.g. ResultMeasureValue) numeric.

Verification

csv = "Location_HUCEightDigitCode,USGSpcode,ResultMeasureValue\n07090002,00060,1.5\n"
_read_wqp_csv(csv)
#   HUC8 -> "07090002"   (was np.int64(7090002))
#   pcode-> "00060"      (was np.int64(60))
#   ResultMeasureValue -> 1.5 (float, unchanged)

Added a regression test; the full wqp suite (15 tests) passes. df.shape/df.size and the derived *DateTime columns are unchanged by the dtype shift. ruff clean.

Note: the committed WQX3 fixture (wqp3_results.txt) was itself generated post-corruption (its HUC cell is already 7090002), so the regression test uses a constructed row that actually carries a leading zero.
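One way to check whether a fixture can exercise this regression at all is to read a column as text and look for cells that begin with a zero. The helper below is a hypothetical sketch (`has_leading_zero_cell` is not part of the package), and both CSV strings are constructed stand-ins rather than the contents of wqp3_results.txt.

```python
from io import StringIO

import pandas as pd


def has_leading_zero_cell(csv_text: str, column: str) -> bool:
    """Return True if any cell in `column`, read as text, is digits starting with 0."""
    raw = pd.read_csv(StringIO(csv_text), dtype=str, usecols=[column])
    # str.match anchors at the start; "$" requires the whole cell to be digits.
    return bool(raw[column].str.match(r"0\d+$", na=False).any())


# Stand-in for a fixture saved after the zeros were already lost.
corrupted_like = "Location_HUCEightDigitCode\n7090002\n"
# Constructed row that actually carries a leading zero.
constructed = "Location_HUCEightDigitCode\n07090002\n"

print(has_leading_zero_cell(corrupted_like, "Location_HUCEightDigitCode"))
print(has_leading_zero_cell(constructed, "Location_HUCEightDigitCode"))
```

A test built only on the first kind of input would pass with or without the fix, which is why a constructed row is needed.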

🤖 Generated with Claude Code

…es, FIPS)

The nine WQP getters read responses with a bare `pd.read_csv(StringIO(text), delimiter=",", low_memory=False)`, which infers code columns as int/float and silently drops their significant leading zeros: a USGS parameter code "00060" became 60, HUC8 "07090002" became 7090002. (R dataRetrieval reads these as character.)

Add a `_read_wqp_csv` helper that reads the header, then re-reads with `dtype=str` for any column whose name is a code/identifier (ends with "code", or contains "identifier"/"huc"/"fips") — covering both the legacy and WQX3.0 column schemas — while leaving value columns numeric. All nine read sites use it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

thodson-usgs wants to merge 1 commit into DOI-USGS:main from thodson-usgs:fix/wqp-preserve-leading-zeros
