Skip to content

fix(tf): reject non-prefix use_spin layouts in the spin helper#5727

Open
wanghan-iapcm wants to merge 1 commit into
deepmodeling:masterfrom
wanghan-iapcm:guard-tf-spin-prefix
Open

fix(tf): reject non-prefix use_spin layouts in the spin helper#5727
wanghan-iapcm wants to merge 1 commit into
deepmodeling:masterfrom
wanghan-iapcm:guard-tf-spin-prefix

Conversation

@wanghan-iapcm

Copy link
Copy Markdown
Collaborator

Problem

Fixes #5680. The legacy TensorFlow spin implementation assumes spin-enabled types form a contiguous prefix of the type map. The SE-A sel extension takes the first ntypes_spin selections (sel_a[:ntypes_spin]), and the coordinate splitting (deepmd/tf/descriptor/se_a.py), force splitting (deepmd/tf/model/ener.py), and bias merging (deepmd/tf/fit/ener.py) all address the virtual block with a dense real-to-virtual offset (i + len(use_spin)). For a non-prefix layout such as use_spin=[False, True], these read the wrong real/virtual type ranges or raise deep inside the graph, and nothing rejected the configuration up front. In practice all supported spin models list spin-enabled types first, so the broken layout went unnoticed.

Approach

The maintained PyTorch backend already handles the sparse/non-prefix layout via Spin.spin_type. Rather than refactor all four coupled sites in this legacy backend (which has almost no coverage and where a subtle mistake would silently corrupt training), this guards against the unsupported layout: the TF Spin helper now rejects a use_spin where a non-spin type precedes a spin-enabled one, with a message telling the user to list spin-enabled types first. This turns a silent-wrong result or an obscure crash into an actionable error and documents the invariant in one place.

Test

Adds source/tests/tf/test_spin_prefix_guard.py: non-prefix layouts ([False, True] and [True, False, True]) raise ValueError, while prefix layouts ([True, False], [True, True]) and an all-non-spin list ([False, False]) are accepted. The existing TF spin model test uses a valid [True, False] prefix and continues to pass. The prefix requirement previously had no test.

The legacy TensorFlow spin implementation assumes spin-enabled types form a
contiguous prefix of the type map. The SE-A sel extension takes the first
ntypes_spin selections (sel_a[:ntypes_spin]), and the coordinate splitting
(se_a.py), force splitting (model/ener.py), and bias merging (fit/ener.py) all
address the virtual block with a dense real->virtual offset (i + len(use_spin)).
For a non-prefix layout such as use_spin=[False, True], these read the wrong
real/virtual type ranges or raise deep inside the graph, and nothing rejected
the configuration up front.

Rather than refactor all four sites in this legacy backend (the maintained
PyTorch backend already supports the sparse layout via Spin.spin_type), guard
against it: Spin now rejects a use_spin where a non-spin type precedes a
spin-enabled one, with a clear message telling the user to list spin-enabled
types first. This turns a silent-wrong result or an obscure crash into an
actionable error and documents the invariant.

Adds source/tests/tf/test_spin_prefix_guard.py: non-prefix layouts
([False, True] and [True, False, True]) raise ValueError, while prefix layouts
([True, False], [True, True]) and an all-non-spin list ([False, False]) are
accepted. The prefix requirement previously had no test.

Fix deepmodeling#5680
@dosubot dosubot Bot added the bug label Jul 3, 2026
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Warning

Review limit reached

@wanghan-iapcm, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 3 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: df1579fa-538d-4c3d-abc5-3881a21ed113

📥 Commits

Reviewing files that changed from the base of the PR and between bbc3908 and 956e4bb.

📒 Files selected for processing (2)
  • deepmd/tf/utils/spin.py
  • source/tests/tf/test_spin_prefix_guard.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@wanghan-iapcm wanghan-iapcm requested a review from njzjz July 3, 2026 17:51
@github-actions github-actions Bot added the Python label Jul 3, 2026
@codecov

codecov Bot commented Jul 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.14%. Comparing base (dd38b35) to head (956e4bb).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5727      +/-   ##
==========================================
- Coverage   81.26%   81.14%   -0.13%     
==========================================
  Files         988      988              
  Lines      110877   110890      +13     
  Branches     4234     4234              
==========================================
- Hits        90103    89978     -125     
- Misses      19249    19388     +139     
+ Partials     1525     1524       -1     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Code scan] Map TensorFlow spin virtual types by use_spin

1 participant