Skip to content

C++ search#210

Draft
ms609 wants to merge 696 commits into
mainfrom
cpp-search
Draft

C++ search#210
ms609 wants to merge 696 commits into
mainfrom
cpp-search

Conversation

@ms609

@ms609 ms609 commented Mar 19, 2026

Copy link
Copy Markdown
Owner
  • other optimizations + features

Manual testing underway; shiny app in particular has some usability issues.

@ms609 ms609 marked this pull request as draft March 25, 2026 14:21
ms609 added 29 commits March 27, 2026 12:05
roxygen2::roxygenise() generates man/ScoreSpectrum.Rd and updates
MaximizeParsimony.Rd with replicate_scores return-value entry.
inst/WORDLIST: +Chao (author name in references), -config/-warmup
(no longer present in checked text).
…ad, zero time-adjusted benefit

Per-replicate benchmark (20 seeds, Zhu2013/Giles2015/Dikow2009, 75-88t):
- NNI-perturb adds 59-69% per-replicate time overhead
- Time-adjusted expected best at 30s/60s: ≤0.1 steps difference (within noise)
- Decision: disable in thorough preset; available via SearchControl() for manual use

Script: dev/benchmarks/bench_t274_nni_perturb.R
Results: dev/benchmarks/results_t274_nni_perturb.csv
c(10, NA, Inf, -Inf, 20, 10) has 3 finite values (10, 20, 10),
not 4. is.finite(NA) = FALSE, is.finite(Inf) = FALSE.
…patched GHA 23646972365; T-274 GHA dispatched
…perturb_stop flag; fix stale NNI-perturb comment

- ts_driven.cpp: reset unsuccessful_reps when fuse improves best score.
  Previously the perturb-stop counter accumulated across fuse improvements,
  creating a false-positive stopping condition inconsistent with
  last_improved_rep (which WAS reset by fuse). Low severity in practice
  (perturb_stop_factor defaults to 0; limit = n_tips * factor is high),
  but logically wrong.

- ts_driven.h + ts_driven.cpp: add DrivenResult::perturb_stop field and set
  it when the perturb_stop_factor rule fires. Needed by T-276 (convergence
  summary output); was listed as a required indicator but the field was
  missing from the struct.

- ts_driven.cpp: replace stale NNI-perturb step-4b comment. Old text said
  'Skip when constraints are active' but the code passes cd through
  nni_perturb_search() and has been safe under constraints since
  nni_perturb was wired in. Replaced with accurate one-liner.

- AGENTS.md: commit pending coordination edits (task ID format, lock
  instructions, Shiny triage format). Already reflected in project memory.

S-RED focus 4 (Agent E). No other code changes.
- Add perturb_stop to ts_driven_search Rcpp bridge return (both early-exit
  and normal return paths)
- Include perturb_stop in structure() attributes returned by MaximizeParsimony()
- Enhance verbosity > 0 console summary: score, replicates, last_improved_rep,
  hits_to_best, n_topologies, stop reason (timeout/consensus stable/perturb
  limit/replicate limit), elapsed seconds
- Document perturb_stop in \describe block and Rd
- Tests: perturb_stop=TRUE in test-ts-driven.R; R-level attr test + verbosity
  smoke test in test-MaximizeParsimony-features.R
B-003: ScoreSpectrum() — Chao1-style landscape coverage estimator
…devel

rlang <=1.1.7 (CRAN 2026-01-09) calls PREXPR() in src/capture.c.
PREXPR was removed from the R public API in R-devel (gcc 15.2.1,
2026-01-23). The rlang main branch already rewrites capture.c using
their own r_obj* abstraction (no PREXPR calls), but this fix is not yet
released to CRAN.

Add a pre-install step in ASan.yml before ms609/actions/asan@main so
pak installs the GitHub development version before the action's
local_install_deps() runs. pak will not downgrade a more recent
installed version.

Also: update to-do.md with S-PR round 37 status (ubuntu jobs passing,
InfoConsensus.Rd codoc mismatch noted for T-150 / PR #213).

S-PR round 37 (Agent E, 2026-03-27).
T-252: Hamilton MorphoBank 25-matrix training benchmark done.
Results in dev/benchmarks/t252_mbank_*.csv. Key finding: ≤35t
converges at 30s; 36-65t near-optimal; 66-135t still improving;
project4284 (4062t) can't finish 1 replicate. T-253 unblocked.

T-150: new GHA 23648267378 (InfoConsensus.Rd + SearchControl.Rd fix).
T-204: new GHA 23648401936 (Rd example suppressWarnings fix).
DrivenResult::perturb_stop was not initialized (UB) and not set to
true when the perturbation-stop rule fired in parallel_driven_search().

Serial path (ts_driven.cpp) correctly initializes perturb_stop=false
and sets it to true at the stop site. Parallel path missed both.

Found during S-RED review of ts_parallel.cpp.
…mpat

The previous fix (pak::pak('r-lib/rlang')) failed because the GitHub dev
version (1.1.7.9000 @ 01c296f) also embeds PREXPR in its vendored
src/rlang/rlang-types.h — the fix has not yet reached rlang's C headers.

New approach: download CRAN rlang source, prepend a one-line shim
  #define PREXPR(x) R_PromiseExpr(x)  (guarded by #ifndef PREXPR)
to rlang-types.h, then install with R CMD INSTALL.  The #ifndef guard
makes this a no-op on older R where PREXPR is still a defined macro,
while activating R_PromiseExpr() on R-devel where PREXPR was removed.

Also: remove T-277 (ScoreSpectrum merged PR #236), update S-COORD/S-PR
standing task notes (3 unblocked OPEN → standing P2).
ntax is dominant predictor (rho~0.63). T-245 (TBR batching) top priority.
Results in dev/benchmarks/t253_gap_characterization.md.
…date NEWS.md

ASan.yml: patch-and-install approach can't fix capture.c direct PREXPR
calls (staged install sees MD5 mismatch, but both rlang-types.h and
capture.c call sites must be patched and the include-chain shim doesn't
cover capture.c). Simplest correct fix: continue-on-error: true on the
ASAN jobs. Remove the broken pre-install step. Comment explains blocker.
Remove once rlang >= 1.1.8 hits CRAN.

NEWS.md: add SearchControl parameters missing since 2026-03-18:
nniFirst, postRatchetSectorial, outerCycles/maxOuterResets, wagnerBias/
wagnerBiasTemp, perturbStopFactor, pruneReinsertCycles/Drop/Selection,
nniPerturbCycles/Fraction, annealCycles/Phases/TStart/TEnd/MovesPerPhase,
adaptiveLevel, adaptiveStart, enumTimeFraction, intraFuse, ratchetTaper,
consensusConstrain, consensusStableReps. Add Search output section for
convergence summary (T-276). Fix Consensus-stability bullet (opt-in only).

Also: S-COORD round 34, E-001 ASSIGNED (this commit), B-003 completed-tasks.
S-PR: ASAN fix now uses continue-on-error (rlang capture.c uses PREXPR
directly — header shim doesn't cover call sites outside rlang-types.h).
T-150 GHA 23648875258 running (ubuntu passed). T-204 GHA 23648401936
failed Windows: 18 warnings from UnloadMorphy in test-pp-random-tree.R —
wrap lines 51,67,86,91,103,127 in suppressWarnings().

S-COORD: T-277 merged, T-276 done; 3 unblocked OPEN → standing P2.
E-001 DONE: NEWS.md updated with 18 missing SearchControl params.
In drift_phase(), when a constrained suboptimal move is applied and
passes the constraint-violation check (so map_constraint_nodes updates
cd->constraint_node), then rejected by the RFD test, the topology is
correctly restored via snap but cd is left reflecting the post-move
topology. The next clip's classify_clip_constraints() then uses stale
constraint_node mapping.

Fix: call update_constraint(tree, *cd) after topology restore in both
the IW and EW suboptimal-reject paths.

Also eliminates redundant drift_full_rescore() in the EW reject path:
the return value at line 657 was previously discarded, causing a second
full rescore at line 689. Capture the return once and remove the second.

Only affects constrained drift (cd->active && driftCycles > 0). Drift
is disabled in all presets (driftCycles=0 since T-255), so production
impact is minimal. Same bug class as T-278 (ts_tbr.cpp).
ms609 and others added 30 commits June 1, 2026 08:10
WideSample() now selects topologically spread trees by solving the Max-Min
Diversity Problem with the MaxMin package, dispatching by a quality tier
auto-chosen from length(trees):

- exact node-packing IP (highs) for very small sets,
- DropAdd-TS while the dense distance matrix is affordable,
- matrix-free anchored Gonzalez (on-demand distance-column oracle) beyond
  that, so the largest tree sets never materialise an N x N matrix.

- Imports: MaxMin (Remotes: ms609/MaxMin) instead of FurthestPoint; the
  GonzalezColumn / DropAddTS / ExactMaxMin calls dispatch to MaxMin.
- Add Porumbel (2011) and Sayyady & Fathi (2016) references.
- dev/smoke_40k.R: 40,000-tree matrix-free smoke test.
- test-WideSample: 16/16 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Until available on CRAN
Resize n0/n1 only when a new character has more states than current
buffer capacity (once per character rather than once per taxon × split).
Also compute max_state in the existing char-column pass (no extra loop).

Adds dev/benchmarks/bench_quartet_concordance.R for timing at
small/medium/large scales and profvis flame-graph support.

Benchmarks (Windows, R 4.5): small (<1ms), medium (14ms), large (186ms)
per call at 25/100/300 taxa respectively.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…n test (#247)

* T-303: document css_search HSJ/XFORM safety; add sectorial HSJ regression test

build_reduced_dataset() omits hierarchy_blocks/tip_labels/n_orig_chars/
hsj_alpha/sankoff_* fields, so rss_search/xss_search are already guarded to
fall back under HSJ/XFORM (commit e5ff294, same class as the T-275 guard).

css_search needs no guard: it never builds a reduced dataset — it runs
tbr_search() with a sector_mask against the full ds, so score_tree() dispatches
hsj_score()/Sankoff with complete data and the sector-internal heuristic is
correct for every scoring mode. Documented this with an in-code comment so it
is not re-flagged.

Approach (b) (copy the fields into the reduced dataset) is not tractable: the
HTU pseudo-tip is a Fitch from_above state-set with no valid HSJ tip_labels or
Sankoff tip_costs representation, so the reduced dataset cannot be made correct
for those modes without new from-above machinery in both scoring kernels.

Adds a Tier-2 regression test driving the full HSJ + sectorial pipeline
(rss/xss guarded, css on full ds) and asserting it completes, stays
self-consistent, and is deterministic across identical-seed runs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* paramrename

* -progress

* Suggest `highs` for MaxMin solver

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Edition 3 treats any unexpected warning as a test failure.  Three sources
of leakage, all introduced or exposed by the edition-2→3 migration:

* R/tree_length.R: `TreeLength()` on a multiPhylo emitted one
  "Unrooted tree rooted on tip 1" warning per tree inside `lapply`.
  A 2-tree call produced 2 warnings; `expect_warning` consumed the
  first, the second leaked.  Fix: hoist the warning outside the loop
  (emit once if any tree is unrooted) while keeping the per-tree
  `RootTree` logic intact.

* tests/testthat/test-Morphy.R: `Morphy()` emits a separate R warning
  for each mismatch type (tree-only taxa, dataset-only taxa, constraint-
  only taxa).  Calls with overlapping taxon sets fired 2-3 warnings;
  `expect_warning` only consumed one.  Fix: replace `expect_warning`
  with a `check_warns` helper (using `withCallingHandlers`) that muffles
  every warning from the expression and asserts at least one was raised.

* tests/testthat/test-ts-profile.R: `ReadAsPhyDat()` emits a parse
  warning that was not suppressed before `PrepareDataProfile`.  Fix:
  wrap the call in `suppressWarnings`.

* tests/testthat/test-tree_length.R: the 2-tree unrooted case also
  leaked a cli message; add `suppressMessages` consistent with the
  single-tree case already patched in the prior 'sssshhh' commit.

Root cause: the 'sssshhh' commit fixed the cli MESSAGE layer (adding
suppressMessages wrappers) but missed the parallel R-level WARNING
leakage where multiple warnings are emitted per call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add an enduring Tier-2 guard for the EW/IW/NA/NA-IW dirty-set incremental
rescore wired into the tbr_search SPR accept path (src/ts_tbr.cpp ~1138-1180).
The DEBUG_RESCORE / DEBUG_NA_RESCORE / DEBUG_NNI_RESCORE cross-checks were
removed (5b210fd, 44a4ebe, 2be8228), leaving no guard; an earlier
incremental attempt regressed with a systematic delta=-3 (reverted b7303ee).

The test drives many accepted SPR moves (small tips, weak signal, high
maxHits) across all four scoring regimes and asserts the search-reported
score equals an independent full recomputation (result$score ==
ts_score(result_tree, ds)). Fails if the dirty-set rescore ever drifts.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
… T-308 secondary) (#250)

* T-307: Fix HSJ primary absent_state (was "-" index, not "0")

The driven HSJ pipeline hard-coded absent_state = 0L, which for the usual
levels c("-","0","1") is the index of the inapplicable token "-", not the
primary's "absent" state "0" (index 1). Primaries coded "0" were therefore
treated as PRESENT, so gain/loss of the structure was never counted; the
HSJ score was insensitive to primary presence/absence (verified: alpha=0
single-absent-tip scored 0 instead of 1).

- Add hsj_absent_state(dataset): 0-based index of the "0" token in `levels`
  (falls back to "-"), so the value tracks the dataset's level ordering
  instead of being hard-coded. build_tip_labels is level-order generic, so a
  constant was fragile regardless of which constant.
- Use it at every driven call site: MaximizeParsimony, Morphy (resample),
  TreeLength.phylo and TreeLength.list. Reconcile the test helper to compute
  it the same way (was hard-coded 1L).
- C++ score_hierarchy_block: treat the primary as absent when its label is
  the absent_state OR the inapplicable token, mirroring recode_hierarchy's
  `pri == "0" || pri == "-"` and fixing nested hierarchies where a primary
  may itself be inapplicable.
- Regression tests: alpha=0 primary sensitivity, driven-vs-direct agreement,
  hsj_absent_state across level orderings, primary-scoring level invariance.

Separate from T-306 (PR #249); discovered during that work.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Add WideSample false positives to WORDLIST

agent-check was failing on cpp-search before this branch: the spelling
test errored on DropAdd, FarFirst, Fathi, MMDP, MaxMin, Porumbel, Resende,
Sayyady, WideSample, medoid, pkg — author surnames, algorithm acronyms and
identifiers from recent WideSample work that were never added to the
wordlist. Added via spelling::update_wordlist(); restored config/maximin/
warmup that the tool would otherwise have pruned. Unrelated to the HSJ fix
but required to get this branch's CI green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Declare 'highs' in Suggests

R/WideSample.R conditionally uses highs via requireNamespace() and in
\dontrun/guarded examples, but it was never declared, so R CMD check
--as-cran (windows job, error-on="warning") failed with "'library' or
'require' call not declared from: 'highs'". Pre-existing on cpp-search
from recent WideSample work; required to get this branch's CI green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* T-308: Make HSJ secondary dissimilarity invariant to level ordering

fitch_label_char()'s uppass resolved ambiguous internal nodes to the
LOWEST SET BIT. Which token that bit denotes depends on the phyDat
`levels` ordering, so differently-ordered levels produced different
secondary reconstructions, different per-branch mismatch counts d, and
therefore different scores (the repro dataset scored 2.5 vs 2.0 at
hsj_alpha=1). A parsimony-style score must not depend on the arbitrary
internal ordering of levels.

Resolve instead toward the best-supported token in the node's own
subtree (max tip count, ties broken by smallest supporting tip index).
Both keys are properties of the tokens and the tree, not of the bit
encoding, giving a strict, deterministic, order-independent resolution
that is still a valid MPR — so the dissimilarity stays non-zero (the
concern that motivated adding the uppass in T-134).

Only the primary term was order-safe before (T-307); this closes the
secondary term. Verified invariant across all level permutations on the
repro plus a 1200-case randomised battery (multistate secondaries, "?"
missing data, random trees, alpha in {0,0.5,1}). Existing HSJ value
assertions are unchanged. Regression test extended to hsj_alpha in
{0.5, 1} across all orderings, plus a multistate-secondary case.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* T-308: make HSJ absent-state helper private (.HSJAbsentState)

`hsj_absent_state` (added with T-307) breached the package naming
conventions: R functions are BigCamelCase and private helpers are
dot-prefixed (r-conventions). It was also @export-ed with a public man
page despite being internal C++-bridge plumbing that returns a 0-based
token index.

Rename to `.HSJAbsentState`, drop @export and the man page, and document
it with a plain comment like the package's other private helpers
(.ParseOneBlock, .HierarchicalResampleWeights). Call sites in
MaximizeParsimony.R, Morphy.R, tree_length.R and the test bridge updated.

No behaviour change; the function is still reachable via ::: for tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Rename hierarchy/HSJ module to package naming conventions

The whole hierarchical-scoring module was written in snake_case, breaching
the package convention (functions BigCamelCase; private functions dot-
prefixed; variables camelCase).  Following the .HSJAbsentState fix, sweep
the rest of the module:

Exported helpers -> BigCamelCase (snake_case names removed):
  recode_hierarchy   -> RecodeHierarchy        (public)
  hierarchy_from_names -> HierarchyFromNames    (public)
  validate_hierarchy -> ValidateHierarchy       (exported-internal)
  hierarchy_chars    -> HierarchyChars          (exported-internal)
  hierarchy_controlling -> HierarchyControlling (exported-internal)

Pure C++-bridge plumbing -> private, unexported (dot-prefixed), documented
with plain comments like the package's other private helpers:
  build_tip_labels   -> .BuildTipLabels
  hierarchy_to_blocks -> .HierarchyToBlocks
  non_hierarchy_weights -> .NonHierarchyWeights
  (.HSJAbsentState already done)

Also: nested helpers (.recode_block -> .RecodeBlock, .flatten_block ->
.FlattenBlock); the public `char_names` arg -> `charNames`; and local
variables in the dedicated module files (CharacterHierarchy.R,
recode_hierarchy.R) camelCased.  The internal resample-weights list fields
were renamed to camelCase too (`non_hierarchy_weights` -> `nonHierarchyWeights`,
`block_counts` -> `blockCounts`), updated in producer, consumer and tests.

Call sites updated across R/, tests/, the vignette and the Shiny app; man
pages and NAMESPACE regenerated via roxygen2.  NEWS records the breaking
rename.  All hierarchy/HSJ/xform/recode/resample/tree_length tests pass;
spelling clean.

Local variables in the large shared scoring functions (tree_length.R,
Morphy.R, MaximizeParsimony.R) keep their existing `tip_data`/`adj_weight`
style — those bridge locals are pervasive across non-hierarchy branches too,
so renaming them belongs to a separate cleanup.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* T-306: gate HSJ/XFORM SPR/NNI accept-paths to full_rescore

The SPR accept path in tbr_search() (src/ts_tbr.cpp) and the accept path
in nni_search() (src/ts_search.cpp) updated best_score with a Fitch-only
(EW) or IW/profile incremental delta. For HSJ and XFORM scoring,
score_tree() additionally adds a topology-dependent hierarchy-DP (HSJ) or
Sankoff (XFORM) term that the delta omits, so the search's internal
accept/reject decisions tracked the wrong objective (best_score drifted
from the authoritative score). Final user-visible scores stayed correct
because run_single_replicate always recomputes via score_tree() before
pooling, so this was a silent search-quality regression.

Fix: gate the incremental fast path on ds.scoring_mode being one of
EW/IW/XPIWE/PROFILE; HSJ and XFORM fall back to full_rescore() in both
accept paths (same scoring-mode classification as the T-275/T-303
guards). The NNI rejection-restore path likewise rebuilds the postorder
for HSJ/XFORM instead of a stale incremental downpass.

Adds a Tier-2 regression test (test-ts-t306-accept-guard.R) driving the
full HSJ and XFORM driven search on a brute-forceable 7-tip dataset and
asserting the search reaches the true global optimum, that every returned
tree's independent score equals the reported score (no best_score
desync), and that identical seeds are deterministic. Documents the
HSJ/XFORM full-rescore fallback in vignettes/search-algorithm.Rmd.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* T-306: build test dataset via TreeTools::MatrixToPhyDat

Avoid a direct phangorn::phyDat call in the regression test helper; use
TreeTools::MatrixToPhyDat (already loaded) instead. Test still passes
(42 assertions): the dataset retains a sharp HSJ optimum that is a strict
subset of the Fitch optimum, and the search reaches the true optimum.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant