T-303: sector heuristic on HSJ/XFORM — guard verification + regression test #247
Merged
…sion test

build_reduced_dataset() omits the hierarchy_blocks/tip_labels/n_orig_chars/hsj_alpha/sankoff_* fields, so rss_search/xss_search are already guarded to fall back under HSJ/XFORM (commit e5ff294, same class as the T-275 guard).

css_search needs no guard: it never builds a reduced dataset. It runs tbr_search() with a sector_mask against the full ds, so score_tree() dispatches hsj_score()/Sankoff with complete data and the sector-internal heuristic is correct for every scoring mode. Documented this with an in-code comment so it is not re-flagged.

Approach (b) (copy the fields into the reduced dataset) is not tractable: the HTU pseudo-tip is a Fitch from_above state-set with no valid HSJ tip_labels or Sankoff tip_costs representation, so the reduced dataset cannot be made correct for those modes without new from-above machinery in both scoring kernels.

Adds a Tier-2 regression test driving the full HSJ + sectorial pipeline (rss/xss guarded, css on full ds) and asserting that it completes, stays self-consistent, and is deterministic across identical-seed runs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ms609 added a commit that referenced this pull request on Jun 12, 2026
T-303 (P2): Sector heuristic degrades silently on HSJ/XFORM datasets
build_reduced_dataset() in src/ts_sector.cpp does not copy hierarchy_blocks, tip_labels, n_orig_chars, hsj_alpha, or sankoff_* into the reduced dataset. Since rd.data.scoring_mode is copied, an unguarded sector would dispatch hsj_score()/Sankoff against empty hierarchy/Sankoff data and silently degrade to Fitch-only scoring. The result is a wrong internal accept/reject heuristic (missed improvements, accept-then-revert churn). Final acceptance scores are unaffected, because they are recomputed on the full dataset. EW, IW, and PROFILE are fine.

What this PR does

rss_search and xss_search (the only two routines that call build_reduced_dataset) are already guarded on cpp-search (commit e5ff2942), mirroring the existing T-275 guard. No change needed there.

css_search documented as safe. It is the third sector routine but is not affected: it never builds a reduced dataset. It runs tbr_search() with a sector_mask against the full ds, so score_tree() dispatches hsj_score()/Sankoff with complete data and its internal heuristic is correct for every scoring mode. Added an in-code comment so it is not re-flagged. (Guarding it would wrongly disable a working HSJ/XFORM path.)

Why not approach (b) (copy the fields)

Not tractable here. The HTU pseudo-tip is a Fitch from_above state-set with no valid HSJ tip_labels (original-character tokens) or Sankoff tip_costs representation, so the reduced dataset cannot be made correct for those modes without new from-above machinery in both scoring kernels. Approach (a) is the conservative, T-275-consistent fix.

Note on testability
T-303 is silent by construction: final scores are always recomputed on the full dataset, so a guard regression cannot be caught by an absolute-score assertion. The test therefore locks in pipeline stability/determinism (it would catch a crash or score desync introduced by the guarded sector path).
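To make the shape of the guard and of the stability check concrete, here is a minimal, self-contained sketch. It is not the project's code: `ScoringMode`, `SectorPath`, `choose_sector_path`, `full_score`, and `run_pipeline` are hypothetical stand-ins, the "tree" is just a vector of integers, and what the real guard falls back to is not specified in this PR. The sketch only illustrates the two properties described above: HSJ/XFORM never take the reduced-dataset path, and a seeded run is self-consistent and repeatable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the project's scoring modes.
enum class ScoringMode { EW, IW, PROFILE, HSJ, XFORM };

// Which route a sector routine takes (names are assumptions).
enum class SectorPath { Reduced, Fallback };

// Guard in the spirit of the rss/xss guard: modes whose scorers need fields
// that a reduced dataset would not carry must not use the reduced path.
SectorPath choose_sector_path(ScoringMode mode) {
    const bool needs_full_data =
        mode == ScoringMode::HSJ || mode == ScoringMode::XFORM;
    return needs_full_data ? SectorPath::Fallback : SectorPath::Reduced;
}

// Toy "full dataset" score: a position-weighted sum (lower is better).
double full_score(const std::vector<int>& tree) {
    long total = 0;
    for (std::size_t i = 0; i < tree.size(); ++i) {
        total += static_cast<long>(tree[i]) * static_cast<long>(i + 1);
    }
    return static_cast<double>(total);
}

struct RunResult {
    SectorPath path;        // route chosen by the guard
    double reported_score;  // score tracked during the search
    double rescored;        // score recomputed from the final tree
};

// Toy seeded search: propose swaps, keep a swap only if the full score
// improves, otherwise undo it. Final acceptance always uses full_score().
RunResult run_pipeline(ScoringMode mode, std::uint32_t seed) {
    std::mt19937 rng(seed);
    std::vector<int> tree{5, 3, 8, 1, 9, 2};
    double best = full_score(tree);
    for (int step = 0; step < 200; ++step) {
        const std::size_t i = rng() % tree.size();
        const std::size_t j = rng() % tree.size();
        std::swap(tree[i], tree[j]);
        const double candidate = full_score(tree);
        if (candidate < best) {
            best = candidate;             // accept the move
        } else {
            std::swap(tree[i], tree[j]);  // revert the move
        }
    }
    // Self-consistency: the tracked score matches a fresh rescore.
    assert(best == full_score(tree));
    return RunResult{choose_sector_path(mode), best, full_score(tree)};
}
```

A test in this style would call `run_pipeline(ScoringMode::HSJ, seed)` twice with the same seed and compare `reported_score` between the runs and against `rescored`. As the note above says, a check like this catches crashes and score desync, not a silently weaker heuristic.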
Dispatched agent t303. GHA agent-check: run 27398557784. Found by /red-team area 5 (2026-05-26). PROFILE+IW are fine.
🤖 Generated with Claude Code