Skip to content

Red-team CRAN hardening: T-309–T-315 + CRAN gates + UB (stable surface)#252

Merged
ms609 merged 20 commits into
cpp-searchfrom
fix/cran-redteam-2026-06-15
Jun 16, 2026
Merged

Red-team CRAN hardening: T-309–T-315 + CRAN gates + UB (stable surface)#252
ms609 merged 20 commits into
cpp-searchfrom
fix/cran-redteam-2026-06-15

Conversation

@ms609

@ms609 ms609 commented Jun 15, 2026

Copy link
Copy Markdown
Owner

Pre-CRAN-2.0.0 red-team hardening pass (dispatched via /dispatch), scoped to the stable public surface. Morphy access points were deliberately left untouched (owned by the parallel internalization work).

Confirmed fixes

Correctness / safety

  • T-309 (P1): driven_search() seeded its bandit RNG via raw GetRNGstate/unif_rand/PutRNGstate, which executed on parallel_resample worker threads — R API off-thread + a race on .Random.seed. Now seeded via ts::make_rng() (thread-local seed in parallel, R's RNG in serial).
  • T-311 / T-314 (P1/P2): LeastSquares* silently returned garbage with RSS = 0 on NA/Inf/asymmetric/duplicate-label distance matrices (the residual_ss NaN→0 clamp defeated the ok flag). .LSMatrix() now validates finite + symmetric + unique labels; LeastSquaresFit() guards < 3 tips (also closes a reserve((size_t)-1) underflow).
  • DAT-001 (P2, UBSAN): 1u << 32 undefined behaviour in simplify_patterns() when n_states == 32.

Frozen-API correctness (permanent at 2.0.0)

  • T-310 (P1): SearchControl()$pruneReinsertNni was stored as integer; now logical as documented (C++ as<int>() still coerces it).
  • T-313 (P2): Ratchet(returnAll = TRUE) now consistently returns a multiPhylo with a per-tree "score" attribute (matches the documented convention; previously a bare phylo or an unclassed list).

CRAN gates

  • T-312 / CRAN-003: drop the CRAN-forbidden Remotes: field; WideSample() degrades gracefully (runtime guard + guarded examples) when the non-CRAN MaxMin Suggest is absent.
  • CRAN-001: ScoreSpectrum example loaded Lobo from TreeSearch (it lives in TreeTools) — an R CMD check ERROR.
  • CRAN-006: declare graphics/utils in DESCRIPTION Imports.
  • T-315: AdditionTree() guards non-phyDat input; remove dead grad vector.

Validation

  • Compiles + installs clean; targeted tests (SearchControl, Ratchet/CustomSearch, LeastSquares, AdditionTree) pass — 0 failures.
  • Full R CMD check / UBSAN-ASAN / parallel reproducibility deferred to CI.

Not included (filed for follow-up)

RSP-001/002 (parallel-Resample interrupt contaminates support — resample R layer), DAT-002/003/004, CRAN-004 (\value gaps), CRAN-005 (NEWS heading), CRAN-009 (spelling), CRAN-010/011 (\donttest on flagship examples). Full ledger: dev/red-team/2026-06-15-cran-redteam-findings.md.

🤖 Generated with Claude Code

ms609 and others added 5 commits June 15, 2026 17:07
- T-309: seed driven_search bandit RNG via ts::make_rng() instead of raw
  GetRNGstate/unif_rand/PutRNGstate, so parallel Resample workers never
  call R's RNG API off-thread (data race on .Random.seed).
- T-310: SearchControl()$pruneReinsertNni stored as.logical not as.integer
  (documented Logical; frozen at 2.0.0). C++ as<int>() still coerces it.
- T-311/T-314: validate the least-squares dist matrix (finite, symmetric,
  unique labels) in .LSMatrix() and guard LeastSquaresFit() at <3 tips,
  turning silent garbage (RSS=0, no warning) into informative errors.
- T-312: drop the CRAN-forbidden Remotes: field; WideSample() now degrades
  gracefully with an informative error when the Suggested MaxMin is absent.
- T-313: Ratchet(returnAll=TRUE) now consistently returns a multiPhylo with
  a per-tree "score" attribute (matches the documented convention), instead
  of a bare phylo / unclassed list; @return documented.
- T-315: AdditionTree() guards non-phyDat input; remove dead `grad` vector.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- CRAN-001: ScoreSpectrum example loads `Lobo` from TreeTools (not
  TreeSearch, where it does not exist) — was an R CMD check ERROR.
- CRAN-002: guard the WideSample examples behind requireNamespace("MaxMin")
  so they don't error at check time when the GitHub-only Suggest is absent
  (pairs with the WideSample() runtime guard from the previous commit).
- CRAN-006: declare `graphics` and `utils` in DESCRIPTION Imports (both used
  via importFrom in NAMESPACE).
- DAT-001: avoid `1u << 32` undefined behaviour in simplify_patterns() when
  n_states == MAX_STATES (32); build the all-ones mask directly (UBSAN).
- Regenerate Ratchet/ScoreSpectrum/WideSample .Rd (roxygen2 8.0.0).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Removing the CRAN-forbidden Remotes: field left setup-r-dependencies unable
to resolve the MaxMin Suggest (GitHub-only). List it in Config/Needs/check so
pak installs it from GitHub for every workflow using `needs: check`, without
reintroducing Remotes. MaxMin remains a true optional Suggest (WideSample
degrades gracefully and its examples are guarded).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In tbr_search the post-move constraint-violation check runs
map_constraint_nodes() on the rearranged topology. The equal-score
tabu-rejection branch restored the topology but, unlike the general
!accepted reject path, did not re-map constraint nodes / DFS timestamps,
leaving cd describing the rejected post-move tree. The next clip's
regraft_violates_constraint() then read stale DFS timestamps, mis-filtering
MUST_INSIDE / MUST_OUTSIDE candidates during constrained search (live by
default: tabuSize = 100, equal-score drift/plateau phases). Mirror the
re-sync the !accepted path already performs.

Found by /dispatch red-team 2026-06-15 (topology-invariants re-run).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
An unknown taxon name (character sequence) or out-of-range index (numeric
sequence) made match()/taxa[] yield NA, which propagated to the C++ kernel as
NA_INTEGER (INT_MIN); `addition_order[i] - 1` then underflowed to INT_MAX and
build_three_taxon_tree() wrote tree.parent[INT_MAX] out of bounds. Validate the
resolved sequence against the dataset's taxa before calling the kernel.

Found by /dispatch red-team 2026-06-15 (Wagner/AdditionTree, WGN-01).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ms609

ms609 commented Jun 15, 2026

Copy link
Copy Markdown
Owner Author

Three fixes landed after the description above (see commits):

  • T-316 — stale constraint DFS metadata after an equal-score tabu rejection in tbr_search (the general !accepted path already guards this exact hazard; the tabu path didn't). Constrained search, default-path; possible P1.
  • WGN-01 (P1) — AdditionTree(sequence=) out-of-bounds write when an unknown taxon name / out-of-range index becomes NAINT_MIN underflow in the kernel; R-side guard added.
  • CIConfig/Needs/check: ms609/MaxMin (dependency install broke when Remotes: was dropped; sources the GitHub-only MaxMin without reintroducing the forbidden field).

All validated locally (compile + targeted tests, 0 failures; WGN-01 invalid input now errors instead of OOB). Follow-ups filed in to-do.md T-317–T-323; full ledger at dev/red-team/2026-06-15-cran-redteam-findings.md.

ms609 and others added 15 commits June 15, 2026 17:54
Config/Needs/check does not let pak resolve the bare `MaxMin` Suggests
dependency -- it only installs listed packages, it doesn't map the Suggests
name to the GitHub source -- so dependency install still failed. Restore the
Remotes field with only ms609/MaxMin (TreeTools is now on CRAN and no longer
needs it). Per maintainer: drop Remotes at the last moment before CRAN
submission. The WideSample() graceful-degradation guard from T-312 stays.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… through messages

- WideSample() tier 4 now takes the bare index vector ExactMaxMin() returns
  (the list/$indices return no longer exists), fixing the test-WideSample
  '$ operator is invalid for atomic vectors' errors.
- PrepareDataProfile() emits its 'inapplicable tokens' and 'no informative
  characters' notices via cli::cli_inform() (a message condition) instead of
  cli_alert() (uncatchable stdout), so callers can suppressMessages() and tests
  can expect_message() them. Add a capture test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…straint fallback warning

- T-318: XPIWE eff_k/phi loop guards obs==0 (all-missing pattern); previously
  obs in the denominator gave f=Inf capped to xpiwe_max_f, a spurious finite
  penalty on a pattern with no observed tips. Now neutral (eff_k=concavity,
  phi=1) so its zero step count stays unweighted.
- T-323/WGN-04: AdditionTree() rejects NaN/NA/non-scalar concavity, which
  previously slipped past 'is.finite() && <= 0' and silently selected equal
  weights in the kernel.
- T-323/WGN-02: ts_wagner_tree warns once when the constraint filter exhausts
  every legal edge and falls back to the unconstrained root edge, so a
  constraint-violating addition tree is never returned silently.
- T-323/WGN-03: drop dead 'n_added' parameter from
  wagner_edge_violates_constraint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lengths

Follow-up to the WGN-01 OOB class, from a red-team sweep of the Rcpp boundary:

- AdditionTree(sequence=): reject non-positive, fractional, out-of-range or
  duplicated numeric indices. R's taxa[i] would otherwise silently drop
  (i <= 0), truncate (fractional) or recycle, returning a tree that ignores
  the requested addition order instead of erroring.
- make_dataset() and the three inline dataset-building entry points
  (ts_*_search) now validate weight/levels/min_steps/obs_count lengths against
  n_patterns/n_states before build_dataset() indexes them by raw pointer. A
  short vector was an out-of-bounds read reachable by a direct internal call;
  public wrappers always size them correctly, so this only converts UB into a
  clean error.

Verified: WGN-10 (Sankoff ch_tip stride) was a FALSE POSITIVE — sankoff_score_char
steps t*stride from a base already offset by ch*max_states, so indexing is correct.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… fix null-tree warning

- CC-NORM-01: ClusteringConcordance(return='char') discarded the normalize-aware
  ret and always returned charInfo/charMax, so normalize=TRUE (the default)
  silently behaved identically to normalize=FALSE. Return ret, which equals
  charInfo/charMax when normalize=FALSE (zero=0) and subtracts the random
  expectation baseline when normalize=TRUE. Found by /dispatch red-team.
- CC-WARN-06: null-tree branch warned 'without `dataset`' (copy-paste); now
  '`tree`'.

Note: QC-ISAMBIG-02 (QuartetConcordance crash on ?/- tokens) was REFUTED -
colnames(contrast) is the level names on every standard/bundled path, never
NULL; QuartetConcordance scores ?, - and real morphological data cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gests

CRAN gate fixes (found by /dispatch red-team):
- Wrap the MaximizeParsimony()/TaxonInfluence() searches in \donttest{} so the
  example does not exceed CRAN's example-runtime budget.
- Guard the PlotTools::SpectrumLegend() call (PlotTools is a Suggests) behind
  requireNamespace(), so --run-examples does not error when PlotTools is absent.
- Drop a stray '#'' that rendered a literal comment as the first example line.

(TAXI-EX-001 'congreveLamsdellMatrices[[42]][1:8, ] subscript error' was REFUTED:
[.phyDat accepts [taxa, sites]; the call returns a valid 8-taxon phyDat.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…value gate)

CRAN-004: exported Carter1(), Log2Carter1(), LogCarter1(), MaddisonSlatkin()
(all in Carter1.Rd) and WithOneExtraStep() had no \value section, a
R CMD check warning. Add accurate @return descriptions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dom-normalized

My earlier commit (b6063c3) was wrong. ClusteringConcordance(return='char')
has returned charInfo/charMax (entropy-scaled, no random-expectation baseline)
for the logical-normalize case since the original implementation (#205), and
that is deliberate:

- charInfo is MutualClusteringInfo() against the whole tree, whereas the
  analytic "zero" baseline is per-single-split expected MI -- subtracting it
  mixes incompatible quantities.
- The author gated the random-normalized "ret" behind is.numeric(normalize)
  (Monte-Carlo only, where the baseline is same-scale empirical randMean+MCSE).
- The entropy-weighted normalized variant was explicitly abandoned (see the
  note below the @return docs: "imperfect matches come to dominate").

Restore the shipped charInfo/charMax and add a comment so it is not re-"fixed".
The unrelated CC-WARN-06 null-tree warning-text fix (same earlier commit) stays.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ClusteringConcordance(return = "char", normalize = <integer>) errored with
"Old and new labels must match" whenever any character carried ambiguous
tokens. Such characters drop different tips, so `charSplits` holds Splits over
heterogeneous tip sets; scoring a *list* of random trees against that list in a
single MutualClusteringInfo() call cannot reconcile a common label set.

Score each random tree separately (as the working `charInfo` call already
does), then stack. Adds a regression test covering ambiguous characters.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When RecodeHierarchy() encodes a hierarchy character, a "?" in a controlling
character becomes tip state -1 (fully ambiguous) and an unknown secondary
combination becomes -2 (any present state). The ts_driven_search() path
(MaximizeParsimony) interprets these correctly, but the ts_sankoff_test()
path used by TreeLength(inapplicable = "xform") skipped any state < 0 when
building the Sankoff tip-cost array, leaving every state at INF. The result:
any "?" in a hierarchy character inflated the whole-tree length to Inf.

Mirror the driven-search handling: -1 -> all states cost 0; -2 -> present
states (1..ns-1) cost 0. No out-of-bounds access was possible (both the old
guard and the new -2 loop are range-safe), so this is a correctness fix, not
a memory-safety one. Adds an end-to-end regression test.

Found by red-team scoring/weighting sweep 2026-06-15; verified by live exec.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cluster)

Five robustness fixes from the scoring/weighting red-team sweep (2026-06-15),
each verified by live execution and covered by a regression test:

- RTS-001 ParsSim(): a tree with all-zero edge lengths left
  sample.int(prob = ) with no positive probability ("too few positive
  probabilities"). .safe_sample_idx() now falls back to a uniform draw when
  every candidate weight is zero/NA, as it already did for absent edge lengths.

- RTS-003 CharacterHierarchy(): a sub-controller listed both explicitly and as
  a named sub-block (e.g. the documented list(2, 3, 4, 5, "3" = 9:10)) was
  added to `dependents` twice; deduplicate so ValidateHierarchy() does not
  later flag it as appearing in multiple blocks.

- RTS-004 .ScaleWeight(): a negative weight reached the C++ scorer as a
  negative int (undefined behaviour) and NA/NaN/Inf surfaced only as an opaque
  "missing value where TRUE/FALSE needed". Reject non-finite/negative weights
  up front; zero weights remain valid.

- RTS-005 ParsSim(): concavity = -Inf or NaN slipped past the is.finite()
  guard and silently ran as equal weights. Reject them explicitly; +Inf
  (equal weights) and "profile" remain valid.

- RTS-007 MaximumLength(): the generic documented compress = TRUE but S3
  dispatch used the method default (FALSE), so the per-character result was
  returned. Align the generic signature with the actual behaviour and with
  MinimumLength().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The sectorial-search partition counts (`xssPartitions`/`cssPartitions`) divide
the tip count in the C++ kernel: `int target = tree.n_tip / n_partitions`
(`xss_partition()`). SearchControl() coerced them with as.integer() and applied
no floor, so `MaximizeParsimony(dat, control = SearchControl(xssPartitions = 0L))`
reached `xss_partition(tree, 0)` -> integer division by zero -> an uncatchable
SIGFPE that killed the whole R session (reproduced: R exit code 127 on a
24-tip dataset, since XSS runs once n_tips >= 2 * sectorMinSize).

Two layers: (a) SearchControl() now rejects a non-positive/NA partition count
with a clear message; (b) xss_partition() defensively clamps n_partitions to
>= 1 so no caller (including a direct internal call or the css path) can
trigger the SIGFPE. Regression test in test-SearchControl.R.

Found by red-team C++ rearrangement-kernel sweep 2026-06-15; verified by
reproducing the crash.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TreePool::add() enters its eviction branch as soon as
`entries_.size() >= max_size`; with max_size <= 0 that is true on the very
first insert, so it dereferences `entries_[0]` on an empty vector -- an
out-of-bounds read that segfaults (reproduced: R exit 139 on a 24-tip search
with poolMaxSize = 0L). SearchControl() applied no floor, and the resample /
parallel paths build the pool from the same value.

Two layers, mirroring the xssPartitions fix: (a) SearchControl() rejects a
non-positive/NA poolMaxSize; (b) the TreePool constructor clamps max_size to
>= 1 so no caller (resample, parallel, or a raw list control) can segfault.
Regression tests in test-SearchControl.R (R-side rejection + a C-side clamp
canary that runs a search with a raw poolMaxSize = 0 list control).

Found by extending the RT-CPP-01 lead with a SearchControl parameter fuzz
(2026-06-15); crash reproduced and fix verified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The "TBR search correct with IW scoring" test built min_steps via
apply(at$contrast, 2, ...), iterating over contrast *columns* (states)
and so yielding a length-nStates vector instead of one value per
character. That formerly read out of bounds in the C++ scorer; the
min_steps length guard added in a8fab90 now correctly rejects it,
surfacing the latent test bug as a hard error
("`min_steps` length (5) must equal the number of characters (50)").

Compute min_steps the canonical way -- MinimumLength(ds, compress = TRUE),
mirroring MaximizeParsimony()'s own IW setup -- so it has one entry per
pattern. Only this test was affected; every sibling IW test already uses
MinimumLength(). Engine and guard are unchanged and correct.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In MaximizeParsimony(inapplicable = "xform") with two or more recoded
hierarchy blocks of differing n_states, the live search path (score_tree,
ScoringMode::XFORM) built each block's SankoffChar.cost_matrix by copying the
whole [max_states x max_states] storage block verbatim while setting n_states
to the per-character count. sankoff_score_char reads rows at the per-character
stride (cost_matrix[s * n_states + t]); for any block with n_states <
max_states every row s > 0 was then read from the zero-padded gap, silently
treating loss/transition costs as 0 and undercounting the tree score.
MaximizeParsimony reported a too-low score and optimised toward the wrong
tree, disagreeing with TreeLength() (which scores via the independent,
correct ts_sankoff_test kernel).

Compact the top-left n_states x n_states block down to the per-character
stride when populating SankoffChar, exactly as ts_sankoff_test does. With a
single hierarchy block, or blocks of equal n_states, max_states == n_states
and the bug was masked, so existing single-block xform tests were unaffected.

Verified end-to-end: a 6-taxon, two-block (n_states 3 and 5) dataset reported
score 6 before the fix and 8 after, with TreeLength on the returned tree
giving 8, restoring MaximizeParsimony/TreeLength agreement. Regression test
added (test-ts-xform.R).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ms609 ms609 merged commit 55e304b into cpp-search Jun 16, 2026
5 of 7 checks passed
@ms609 ms609 deleted the fix/cran-redteam-2026-06-15 branch June 16, 2026 05:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant