Score MorphyLength via native kernel for inapplicable data (Stage 1 of MorphyLib removal)#251
Open
ms609 wants to merge 7 commits into
Open
Score MorphyLength via native kernel for inapplicable data (Stage 1 of MorphyLib removal)#251ms609 wants to merge 7 commits into
ms609 wants to merge 7 commits into
Conversation
…f MorphyLib removal)
MorphyLength()/MorphyTreeLength() route the default inapplicable gap mode
through the native Fitch kernel (FastCharacterLength) instead of vendored
MorphyLib. Fixes a known MorphyLib bug scoring `{1-}` ambiguous-with-
inapplicable tokens (Lobo pattern 93: native 2 vs MorphyLib 3), aligning the
legacy custom-search scorers (TreeSearch, Ratchet, Jackknife, custom.Rmd)
with the already-correct MaximizeParsimony.
- PhyDat2Morphy()/SingleCharMorphy() attach precomputed native scoring data
(.NativeMorphyData/.SingleCharNative); the Morphy pointer is retained so
Morphy()'s C-loop and introspection are untouched.
- Rare ambiguous/extra gap modes keep the MorphyLib path for now.
- Not aiming for MorphyLib-equivalence: native scores are the correct v2.0
behaviour. Verified MorphyTreeLength == TreeLength across Lobo + CL
datasets and per character via SingleCharMorphy.
Stage 1 of dropping MorphyLib entirely from 2.0; later stages migrate
Morphy()'s preorder_morphy/morphy_iw/morphy_profile loop and delete the
vendored sources.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Captures the migration of Morphy()'s preorder_morphy/morphy_iw/morphy_profile loop and the ambiguous/extra gap modes to native (Stage 2), and the deletion of vendored MorphyLib sources/bindings/introspection (Stage 3). Flags the Morphy() approach (swap closures vs redirect to MaximizeParsimony) as needing maintainer review, since both change superseded-path behaviour. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Morphy() was the redundant MorphyLib R-loop search (EW/IW/profile + ratchet + constraints), never released on CRAN (not in 1.8.0), sitting between native MaximizeParsimony() and the custom-criteria TreeSearch()/Ratchet(). Everything it did is covered natively, so it is deleted rather than migrated. - Delete Morphy() + its private helpers (.EWScore/.IWScore/.ProfileScore/ .TBRSearch/.CombineResults/.ReplaceResults/.Time/.DateTime) from R/Morphy.R. - Replace MaximizeParsimony()'s legacy-parameter delegation (do.call(Morphy,..)) with an informative error. - Drop export(Morphy); delete man/Morphy.Rd + man/dot-CombineResults.Rd; fix dangling [Morphy()] doc links (WhenFirstHit, EasyTrees, MaximizeParsimony). - Delete test-Morphy.R (+ snapshots); drop Morphy() calls from test-CustomSearch. Leaves morphy_iw/morphy_profile dead (Morphy() was their only caller), to be removed in Stage 3. Full suite green: FAIL 0, PASS 11201. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the vendored Morphy Phylogenetic Library entirely. All parsimony
scoring now runs through the native C++ kernel, which implements the
Brazeau-Guillerme-Smith inapplicable algorithm correctly (including the
`{1-}` token MorphyLib mis-scored). The native engine's own Wagner
builder (ts_wagner.cpp) is independent of MorphyLib, so the library
deletes wholesale - no MPLstate extraction needed.
Custom-search resampling (Jackknife/MorphyBootstrap) now resamples
character weights functionally and scores them with the native kernel,
replacing the in-place mpl_set_charac_weight mutation that the Stage-1
native MorphyLength path silently ignored (a latent regression). The
custom TreeScorer/EdgeSwapper API is unchanged; a new regression test
asserts resampled weights affect the score.
- PhyDat2Morphy/SingleCharMorphy native-only; UnloadMorphy a no-op;
RandomTreeScore native (kept RandomMorphyTree + pure C RANDOM_TREE)
- Removed mpl_* bindings, MorphyWeights/SetMorphyWeights/GapHandler/
MorphyErrorCheck/summary.morphyPtr, GetMorphyLength/C_MorphyLength,
MORPHYLENGTH/RANDOM_TREE_SCORE, morphy_iw/morphy_profile/preorder_morphy
- Deleted 17 vendored MorphyLib C/C++ sources; trimmed TreeSearch-init.c
and build_postorder.h; regenerated RcppExports/NAMESPACE/man
Only the default `inapplicable` gap treatment is supported natively.
Full suite green; the only failures are pre-existing WideSample/MaxMin
API drift, unrelated to this change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Synthesise the best of PR #251 (this branch) and PR #216 (T-200, Agent B), which independently decoupled the R-loop search from MorphyLib. Adopted from #216: - Efficient native scorer: PrepareData() pre-extracts the scoring matrices once and EdgeListScore() calls ts_fitch_score() directly, instead of rebuilding a phylo and re-extracting tip data on every score. - `concavity` parameter on TreeSearch()/Ratchet()/Jackknife(): Inf = equal weights (default), finite = implied weights, "profile" = profile parsimony. Custom InitializeData hooks still receive the dataset unchanged. - custom.Rmd implied-weights section rewritten from ~150 lines of hand-rolled scaffolding to a few lines using `concavity` directly. Kept from #251 (and fixing #216's staleness): - MorphyLib stays fully deleted (#216 retained it as a fallback). - Score weights use .ScaleWeight() (fractional-weight aware) and Inf for the equal-weights concavity, matching current TreeLength(); #216 used as.integer(weight) and -1.0, predating those changes. - Functional resampling + a regression test (doubling weights doubles score). Naming: the layer is renamed to purpose-based names rather than #216's `*Native*` prefix, because once MorphyLib is gone "Native" no longer contrasts with anything. PhyDat2Morphy->PrepareData, UnloadMorphy->ReleaseData, is.morphyPtr->is.ParsimonyData, SingleCharMorphy->SingleCharData, MorphyLength->EdgeListScore, MorphyTreeLength->TreeScore, MorphyBootstrap->BootstrapTree, RandomMorphyTree->RandomPostorderTree. The old names remain as .Deprecated() aliases. TreeScore() == TreeLength() verified across equal weights, implied weights and profile parsimony; full suite green (only the pre-existing WideSample/MaxMin API-drift failures remain, unrelated to this change). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
MorphyLib C/C++ code was fully removed in 33dc949; the Copyright field is now factually wrong. Drop it. Brazeau and Guillerme contributed the BGS inapplicable-data algorithm, not a vendored library, so both are recredited as ctb with an accurate comment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What & why
Routes
MorphyLength()/MorphyTreeLength()for the default inapplicable gap mode through the native Fitch kernel (FastCharacterLength) instead of vendored MorphyLib.This fixes a known MorphyLib bug scoring
{1-}(ambiguous-with-inapplicable) tokens — e.g. Lobo pattern 93 scores 2 (correct BGS) vs MorphyLib 3. Legacy custom-search scorers (TreeSearch,Ratchet,Jackknife, the custom-criteria vignette) now match the already-correctMaximizeParsimony(native kernel).Per maintainer decision: not aiming for MorphyLib bit-equivalence — native scores are the correct v2.0 behaviour.
How
PhyDat2Morphy()/SingleCharMorphy()attach precomputed native scoring data (.NativeMorphyData/.SingleCharNative). The Morphy pointer is retained, soMorphy()s C-loop and MorphyLib introspection are untouched.ambiguous/extragap modes keep the MorphyLib path for now (helper returnsNULL-> falls through).Validation
MorphyTreeLength == round(TreeLength)and per-characterSingleCharMorphysums ==TreeLength, verified across Lobo + congreveLamsdell datasets on pectinate & balanced trees (newtest-native-morphy-equivalence.R). Lobo pectinate = 273 (was 274 under the MorphyLib bug). NB CI may surface legacy tests that assert the old MorphyLib scores; those should be updated to the native-correct values.Staging (drop MorphyLib entirely from 2.0)
MorphyLengthpath.Morphy()spreorder_morphy/morphy_iw/morphy_profileloop + ambiguous/extra modes to native.mpl_*bindings, introspection helpers; update DESCRIPTION/Makevars.🤖 Generated with Claude Code