fix(tf): reject non-prefix use_spin layouts in the spin helper#5727
fix(tf): reject non-prefix use_spin layouts in the spin helper#5727wanghan-iapcm wants to merge 1 commit into
Conversation
The legacy TensorFlow spin implementation assumes spin-enabled types form a contiguous prefix of the type map. The SE-A sel extension takes the first ntypes_spin selections (sel_a[:ntypes_spin]), and the coordinate splitting (se_a.py), force splitting (model/ener.py), and bias merging (fit/ener.py) all address the virtual block with a dense real->virtual offset (i + len(use_spin)). For a non-prefix layout such as use_spin=[False, True], these read the wrong real/virtual type ranges or raise deep inside the graph, and nothing rejected the configuration up front. Rather than refactor all four sites in this legacy backend (the maintained PyTorch backend already supports the sparse layout via Spin.spin_type), guard against it: Spin now rejects a use_spin where a non-spin type precedes a spin-enabled one, with a clear message telling the user to list spin-enabled types first. This turns a silent-wrong result or an obscure crash into an actionable error and documents the invariant. Adds source/tests/tf/test_spin_prefix_guard.py: non-prefix layouts ([False, True] and [True, False, True]) raise ValueError, while prefix layouts ([True, False], [True, True]) and an all-non-spin list ([False, False]) are accepted. The prefix requirement previously had no test. Fix deepmodeling#5680
|
Warning Review limit reached
Next review available in: 3 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #5727 +/- ##
==========================================
- Coverage 81.26% 81.14% -0.13%
==========================================
Files 988 988
Lines 110877 110890 +13
Branches 4234 4234
==========================================
- Hits 90103 89978 -125
- Misses 19249 19388 +139
+ Partials 1525 1524 -1 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Problem
Fixes #5680. The legacy TensorFlow spin implementation assumes spin-enabled types form a contiguous prefix of the type map. The SE-A
selextension takes the firstntypes_spinselections (sel_a[:ntypes_spin]), and the coordinate splitting (deepmd/tf/descriptor/se_a.py), force splitting (deepmd/tf/model/ener.py), and bias merging (deepmd/tf/fit/ener.py) all address the virtual block with a dense real-to-virtual offset (i + len(use_spin)). For a non-prefix layout such asuse_spin=[False, True], these read the wrong real/virtual type ranges or raise deep inside the graph, and nothing rejected the configuration up front. In practice all supported spin models list spin-enabled types first, so the broken layout went unnoticed.Approach
The maintained PyTorch backend already handles the sparse/non-prefix layout via
Spin.spin_type. Rather than refactor all four coupled sites in this legacy backend (which has almost no coverage and where a subtle mistake would silently corrupt training), this guards against the unsupported layout: the TFSpinhelper now rejects ause_spinwhere a non-spin type precedes a spin-enabled one, with a message telling the user to list spin-enabled types first. This turns a silent-wrong result or an obscure crash into an actionable error and documents the invariant in one place.Test
Adds
source/tests/tf/test_spin_prefix_guard.py: non-prefix layouts ([False, True]and[True, False, True]) raiseValueError, while prefix layouts ([True, False],[True, True]) and an all-non-spin list ([False, False]) are accepted. The existing TF spin model test uses a valid[True, False]prefix and continues to pass. The prefix requirement previously had no test.