Red-team CRAN hardening: T-309–T-315 + CRAN gates + UB (stable surface) by ms609 · Pull Request #252 · ms609/TreeSearch

ms609 · 2026-06-15T16:25:41Z

Pre-CRAN-2.0.0 red-team hardening pass (dispatched via /dispatch), scoped to the stable public surface. Morphy access points were deliberately left untouched (owned by the parallel internalization work).

Confirmed fixes

Correctness / safety

T-309 (P1): driven_search() seeded its bandit RNG via raw GetRNGstate/unif_rand/PutRNGstate, which executed on parallel_resample worker threads — R API off-thread + a race on .Random.seed. Now seeded via ts::make_rng() (thread-local seed in parallel, R's RNG in serial).
T-311 / T-314 (P1/P2): LeastSquares* silently returned garbage with RSS = 0 on NA/Inf/asymmetric/duplicate-label distance matrices (the residual_ss NaN→0 clamp defeated the ok flag). .LSMatrix() now validates finite + symmetric + unique labels; LeastSquaresFit() guards < 3 tips (also closes a reserve((size_t)-1) underflow).
DAT-001 (P2, UBSAN): 1u << 32 undefined behaviour in simplify_patterns() when n_states == 32.

Frozen-API correctness (permanent at 2.0.0)

T-310 (P1): SearchControl()$pruneReinsertNni was stored as integer; now logical as documented (C++ as<int>() still coerces it).
T-313 (P2): Ratchet(returnAll = TRUE) now consistently returns a multiPhylo with a per-tree "score" attribute (matches the documented convention; previously a bare phylo or an unclassed list).

CRAN gates

T-312 / CRAN-003: drop the CRAN-forbidden Remotes: field; WideSample() degrades gracefully (runtime guard + guarded examples) when the non-CRAN MaxMin Suggest is absent.
CRAN-001: ScoreSpectrum example loaded Lobo from TreeSearch (it lives in TreeTools) — an R CMD check ERROR.
CRAN-006: declare graphics/utils in DESCRIPTION Imports.
T-315: AdditionTree() guards non-phyDat input; remove dead grad vector.

Validation

Compiles + installs clean; targeted tests (SearchControl, Ratchet/CustomSearch, LeastSquares, AdditionTree) pass — 0 failures.
Full R CMD check / UBSAN-ASAN / parallel reproducibility deferred to CI.

Not included (filed for follow-up)

RSP-001/002 (parallel-Resample interrupt contaminates support — resample R layer), DAT-002/003/004, CRAN-004 (\value gaps), CRAN-005 (NEWS heading), CRAN-009 (spelling), CRAN-010/011 (\donttest on flagship examples). Full ledger: dev/red-team/2026-06-15-cran-redteam-findings.md.

🤖 Generated with Claude Code

@return

- T-309: seed driven_search bandit RNG via ts::make_rng() instead of raw GetRNGstate/unif_rand/PutRNGstate, so parallel Resample workers never call R's RNG API off-thread (data race on .Random.seed). - T-310: SearchControl()$pruneReinsertNni stored as.logical not as.integer (documented Logical; frozen at 2.0.0). C++ as<int>() still coerces it. - T-311/T-314: validate the least-squares dist matrix (finite, symmetric, unique labels) in .LSMatrix() and guard LeastSquaresFit() at <3 tips, turning silent garbage (RSS=0, no warning) into informative errors. - T-312: drop the CRAN-forbidden Remotes: field; WideSample() now degrades gracefully with an informative error when the Suggested MaxMin is absent. - T-313: Ratchet(returnAll=TRUE) now consistently returns a multiPhylo with a per-tree "score" attribute (matches the documented convention), instead of a bare phylo / unclassed list; @return documented. - T-315: AdditionTree() guards non-phyDat input; remove dead `grad` vector. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- CRAN-001: ScoreSpectrum example loads `Lobo` from TreeTools (not TreeSearch, where it does not exist) — was an R CMD check ERROR. - CRAN-002: guard the WideSample examples behind requireNamespace("MaxMin") so they don't error at check time when the GitHub-only Suggest is absent (pairs with the WideSample() runtime guard from the previous commit). - CRAN-006: declare `graphics` and `utils` in DESCRIPTION Imports (both used via importFrom in NAMESPACE). - DAT-001: avoid `1u << 32` undefined behaviour in simplify_patterns() when n_states == MAX_STATES (32); build the all-ones mask directly (UBSAN). - Regenerate Ratchet/ScoreSpectrum/WideSample .Rd (roxygen2 8.0.0). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Removing the CRAN-forbidden Remotes: field left setup-r-dependencies unable to resolve the MaxMin Suggest (GitHub-only). List it in Config/Needs/check so pak installs it from GitHub for every workflow using `needs: check`, without reintroducing Remotes. MaxMin remains a true optional Suggest (WideSample degrades gracefully and its examples are guarded). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

In tbr_search the post-move constraint-violation check runs map_constraint_nodes() on the rearranged topology. The equal-score tabu-rejection branch restored the topology but, unlike the general !accepted reject path, did not re-map constraint nodes / DFS timestamps, leaving cd describing the rejected post-move tree. The next clip's regraft_violates_constraint() then read stale DFS timestamps, mis-filtering MUST_INSIDE / MUST_OUTSIDE candidates during constrained search (live by default: tabuSize = 100, equal-score drift/plateau phases). Mirror the re-sync the !accepted path already performs. Found by /dispatch red-team 2026-06-15 (topology-invariants re-run). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

An unknown taxon name (character sequence) or out-of-range index (numeric sequence) made match()/taxa[] yield NA, which propagated to the C++ kernel as NA_INTEGER (INT_MIN); `addition_order[i] - 1` then underflowed to INT_MAX and build_three_taxon_tree() wrote tree.parent[INT_MAX] out of bounds. Validate the resolved sequence against the dataset's taxa before calling the kernel. Found by /dispatch red-team 2026-06-15 (Wagner/AdditionTree, WGN-01). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ms609 · 2026-06-15T16:52:07Z

Three fixes landed after the description above (see commits):

T-316 — stale constraint DFS metadata after an equal-score tabu rejection in tbr_search (the general !accepted path already guards this exact hazard; the tabu path didn't). Constrained search, default-path; possible P1.
WGN-01 (P1) — AdditionTree(sequence=) out-of-bounds write when an unknown taxon name / out-of-range index becomes NA → INT_MIN underflow in the kernel; R-side guard added.
CI — Config/Needs/check: ms609/MaxMin (dependency install broke when Remotes: was dropped; sources the GitHub-only MaxMin without reintroducing the forbidden field).

All validated locally (compile + targeted tests, 0 failures; WGN-01 invalid input now errors instead of OOB). Follow-ups filed in to-do.md T-317–T-323; full ledger at dev/red-team/2026-06-15-cran-redteam-findings.md.

Config/Needs/check does not let pak resolve the bare `MaxMin` Suggests dependency -- it only installs listed packages, it doesn't map the Suggests name to the GitHub source -- so dependency install still failed. Restore the Remotes field with only ms609/MaxMin (TreeTools is now on CRAN and no longer needs it). Per maintainer: drop Remotes at the last moment before CRAN submission. The WideSample() graceful-degradation guard from T-312 stays. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… through messages - WideSample() tier 4 now takes the bare index vector ExactMaxMin() returns (the list/$indices return no longer exists), fixing the test-WideSample '$ operator is invalid for atomic vectors' errors. - PrepareDataProfile() emits its 'inapplicable tokens' and 'no informative characters' notices via cli::cli_inform() (a message condition) instead of cli_alert() (uncatchable stdout), so callers can suppressMessages() and tests can expect_message() them. Add a capture test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…straint fallback warning - T-318: XPIWE eff_k/phi loop guards obs==0 (all-missing pattern); previously obs in the denominator gave f=Inf capped to xpiwe_max_f, a spurious finite penalty on a pattern with no observed tips. Now neutral (eff_k=concavity, phi=1) so its zero step count stays unweighted. - T-323/WGN-04: AdditionTree() rejects NaN/NA/non-scalar concavity, which previously slipped past 'is.finite() && <= 0' and silently selected equal weights in the kernel. - T-323/WGN-02: ts_wagner_tree warns once when the constraint filter exhausts every legal edge and falls back to the unconstrained root edge, so a constraint-violating addition tree is never returned silently. - T-323/WGN-03: drop dead 'n_added' parameter from wagner_edge_violates_constraint. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…lengths Follow-up to the WGN-01 OOB class, from a red-team sweep of the Rcpp boundary: - AdditionTree(sequence=): reject non-positive, fractional, out-of-range or duplicated numeric indices. R's taxa[i] would otherwise silently drop (i <= 0), truncate (fractional) or recycle, returning a tree that ignores the requested addition order instead of erroring. - make_dataset() and the three inline dataset-building entry points (ts_*_search) now validate weight/levels/min_steps/obs_count lengths against n_patterns/n_states before build_dataset() indexes them by raw pointer. A short vector was an out-of-bounds read reachable by a direct internal call; public wrappers always size them correctly, so this only converts UB into a clean error. Verified: WGN-10 (Sankoff ch_tip stride) was a FALSE POSITIVE — sankoff_score_char steps t*stride from a base already offset by ch*max_states, so indexing is correct. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… fix null-tree warning - CC-NORM-01: ClusteringConcordance(return='char') discarded the normalize-aware ret and always returned charInfo/charMax, so normalize=TRUE (the default) silently behaved identically to normalize=FALSE. Return ret, which equals charInfo/charMax when normalize=FALSE (zero=0) and subtracts the random expectation baseline when normalize=TRUE. Found by /dispatch red-team. - CC-WARN-06: null-tree branch warned 'without `dataset`' (copy-paste); now '`tree`'. Note: QC-ISAMBIG-02 (QuartetConcordance crash on ?/- tokens) was REFUTED - colnames(contrast) is the level names on every standard/bundled path, never NULL; QuartetConcordance scores ?, - and real morphological data cleanly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…gests CRAN gate fixes (found by /dispatch red-team): - Wrap the MaximizeParsimony()/TaxonInfluence() searches in \donttest{} so the example does not exceed CRAN's example-runtime budget. - Guard the PlotTools::SpectrumLegend() call (PlotTools is a Suggests) behind requireNamespace(), so --run-examples does not error when PlotTools is absent. - Drop a stray '#'' that rendered a literal comment as the first example line. (TAXI-EX-001 'congreveLamsdellMatrices[[42]][1:8, ] subscript error' was REFUTED: [.phyDat accepts [taxa, sites]; the call returns a valid 8-taxon phyDat.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@return

…value gate) CRAN-004: exported Carter1(), Log2Carter1(), LogCarter1(), MaddisonSlatkin() (all in Carter1.Rd) and WithOneExtraStep() had no \value section, a R CMD check warning. Add accurate @return descriptions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@return

…dom-normalized My earlier commit (b6063c3) was wrong. ClusteringConcordance(return='char') has returned charInfo/charMax (entropy-scaled, no random-expectation baseline) for the logical-normalize case since the original implementation (#205), and that is deliberate: - charInfo is MutualClusteringInfo() against the whole tree, whereas the analytic "zero" baseline is per-single-split expected MI -- subtracting it mixes incompatible quantities. - The author gated the random-normalized "ret" behind is.numeric(normalize) (Monte-Carlo only, where the baseline is same-scale empirical randMean+MCSE). - The entropy-weighted normalized variant was explicitly abandoned (see the note below the @return docs: "imperfect matches come to dominate"). Restore the shipped charInfo/charMax and add a comment so it is not re-"fixed". The unrelated CC-WARN-06 null-tree warning-text fix (same earlier commit) stays. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ClusteringConcordance(return = "char", normalize = <integer>) errored with "Old and new labels must match" whenever any character carried ambiguous tokens. Such characters drop different tips, so `charSplits` holds Splits over heterogeneous tip sets; scoring a *list* of random trees against that list in a single MutualClusteringInfo() call cannot reconcile a common label set. Score each random tree separately (as the working `charInfo` call already does), then stack. Adds a regression test covering ambiguous characters. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

When RecodeHierarchy() encodes a hierarchy character, a "?" in a controlling character becomes tip state -1 (fully ambiguous) and an unknown secondary combination becomes -2 (any present state). The ts_driven_search() path (MaximizeParsimony) interprets these correctly, but the ts_sankoff_test() path used by TreeLength(inapplicable = "xform") skipped any state < 0 when building the Sankoff tip-cost array, leaving every state at INF. The result: any "?" in a hierarchy character inflated the whole-tree length to Inf. Mirror the driven-search handling: -1 -> all states cost 0; -2 -> present states (1..ns-1) cost 0. No out-of-bounds access was possible (both the old guard and the new -2 loop are range-safe), so this is a correctness fix, not a memory-safety one. Adds an end-to-end regression test. Found by red-team scoring/weighting sweep 2026-06-15; verified by live exec. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…cluster) Five robustness fixes from the scoring/weighting red-team sweep (2026-06-15), each verified by live execution and covered by a regression test: - RTS-001 ParsSim(): a tree with all-zero edge lengths left sample.int(prob = ) with no positive probability ("too few positive probabilities"). .safe_sample_idx() now falls back to a uniform draw when every candidate weight is zero/NA, as it already did for absent edge lengths. - RTS-003 CharacterHierarchy(): a sub-controller listed both explicitly and as a named sub-block (e.g. the documented list(2, 3, 4, 5, "3" = 9:10)) was added to `dependents` twice; deduplicate so ValidateHierarchy() does not later flag it as appearing in multiple blocks. - RTS-004 .ScaleWeight(): a negative weight reached the C++ scorer as a negative int (undefined behaviour) and NA/NaN/Inf surfaced only as an opaque "missing value where TRUE/FALSE needed". Reject non-finite/negative weights up front; zero weights remain valid. - RTS-005 ParsSim(): concavity = -Inf or NaN slipped past the is.finite() guard and silently ran as equal weights. Reject them explicitly; +Inf (equal weights) and "profile" remain valid. - RTS-007 MaximumLength(): the generic documented compress = TRUE but S3 dispatch used the method default (FALSE), so the per-character result was returned. Align the generic signature with the actual behaviour and with MinimumLength(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The sectorial-search partition counts (`xssPartitions`/`cssPartitions`) divide the tip count in the C++ kernel: `int target = tree.n_tip / n_partitions` (`xss_partition()`). SearchControl() coerced them with as.integer() and applied no floor, so `MaximizeParsimony(dat, control = SearchControl(xssPartitions = 0L))` reached `xss_partition(tree, 0)` -> integer division by zero -> an uncatchable SIGFPE that killed the whole R session (reproduced: R exit code 127 on a 24-tip dataset, since XSS runs once n_tips >= 2 * sectorMinSize). Two layers: (a) SearchControl() now rejects a non-positive/NA partition count with a clear message; (b) xss_partition() defensively clamps n_partitions to >= 1 so no caller (including a direct internal call or the css path) can trigger the SIGFPE. Regression test in test-SearchControl.R. Found by red-team C++ rearrangement-kernel sweep 2026-06-15; verified by reproducing the crash. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

TreePool::add() enters its eviction branch as soon as `entries_.size() >= max_size`; with max_size <= 0 that is true on the very first insert, so it dereferences `entries_[0]` on an empty vector -- an out-of-bounds read that segfaults (reproduced: R exit 139 on a 24-tip search with poolMaxSize = 0L). SearchControl() applied no floor, and the resample / parallel paths build the pool from the same value. Two layers, mirroring the xssPartitions fix: (a) SearchControl() rejects a non-positive/NA poolMaxSize; (b) the TreePool constructor clamps max_size to >= 1 so no caller (resample, parallel, or a raw list control) can segfault. Regression tests in test-SearchControl.R (R-side rejection + a C-side clamp canary that runs a search with a raw poolMaxSize = 0 list control). Found by extending the RT-CPP-01 lead with a SearchControl parameter fuzz (2026-06-15); crash reproduced and fix verified. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The "TBR search correct with IW scoring" test built min_steps via apply(at$contrast, 2, ...), iterating over contrast *columns* (states) and so yielding a length-nStates vector instead of one value per character. That formerly read out of bounds in the C++ scorer; the min_steps length guard added in a8fab90 now correctly rejects it, surfacing the latent test bug as a hard error ("`min_steps` length (5) must equal the number of characters (50)"). Compute min_steps the canonical way -- MinimumLength(ds, compress = TRUE), mirroring MaximizeParsimony()'s own IW setup -- so it has one entry per pattern. Only this test was affected; every sibling IW test already uses MinimumLength(). Engine and guard are unchanged and correct. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

In MaximizeParsimony(inapplicable = "xform") with two or more recoded hierarchy blocks of differing n_states, the live search path (score_tree, ScoringMode::XFORM) built each block's SankoffChar.cost_matrix by copying the whole [max_states x max_states] storage block verbatim while setting n_states to the per-character count. sankoff_score_char reads rows at the per-character stride (cost_matrix[s * n_states + t]); for any block with n_states < max_states every row s > 0 was then read from the zero-padded gap, silently treating loss/transition costs as 0 and undercounting the tree score. MaximizeParsimony reported a too-low score and optimised toward the wrong tree, disagreeing with TreeLength() (which scores via the independent, correct ts_sankoff_test kernel). Compact the top-left n_states x n_states block down to the per-character stride when populating SankoffChar, exactly as ts_sankoff_test does. With a single hierarchy block, or blocks of equal n_states, max_states == n_states and the bug was masked, so existing single-block xform tests were unaffected. Verified end-to-end: a 6-taxon, two-block (n_states 3 and 5) dataset reported score 6 before the fix and 8 after, with TreeLength on the returned tree giving 8, restoring MaximizeParsimony/TreeLength agreement. Regression test added (test-ts-xform.R). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ms609 and others added 5 commits June 15, 2026 17:07

ms609 and others added 15 commits June 15, 2026 17:54

ms609 merged commit 55e304b into cpp-search Jun 16, 2026
5 of 7 checks passed

ms609 deleted the fix/cran-redteam-2026-06-15 branch June 16, 2026 05:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Red-team CRAN hardening: T-309–T-315 + CRAN gates + UB (stable surface)#252

Red-team CRAN hardening: T-309–T-315 + CRAN gates + UB (stable surface)#252
ms609 merged 20 commits into
cpp-searchfrom
fix/cran-redteam-2026-06-15

ms609 commented Jun 15, 2026

Uh oh!

ms609 commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ms609 commented Jun 15, 2026

Confirmed fixes

Validation

Not included (filed for follow-up)

Uh oh!

ms609 commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant