arena: support permuted Hadamard add/subt/mult on Tensor<ArenaTensor> ToT #555

Merged
evaleev merged 3 commits into master from evaleev/fix/arena-tot-permuted-binary-ops
May 25, 2026

evaleev commented May 23, 2026

Problem

The permuted arena ToT × arena ToT overloads of add, subt, and mult (scaled and unscaled) on TA::Tensor<ArenaTensor> throw "permuted ... of a tensor-of-tensors is not yet supported". This blocks CSV/PNO-based coupled-cluster in MPQC, whose residual evaluates permuted ToT Hadamard products at the tile-op level (the binary Mult/Add tile op calls left.mult(right, perm) directly).

Fix

By the time a permuted product reaches a tile op, the expression engine has already brought both operands to a common (congruent) layout, so the elementwise product/sum is valid and perm is purely the result permutation. Compute the unpermuted result, then apply perm as a post-pass via permute(), which already handles arena ToT — a shallow outer-cell reindex (arena_permute_shallow) plus an inner-slab rewrite (arena_inner_permute) when the bipartite permutation's inner part is non-trivial. This mirrors the existing numeric × arena permuted-mult branches.
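
The "compute unpermuted, then permute" pattern can be sketched on a toy type. This is only an illustration under stated assumptions: `ToyToT`, its `mult` overloads, and `permute` are hypothetical stand-ins invented for this example, not the TiledArray API, and only the outer (cell-level) part of the permutation is modeled. For rank 2 the only non-identity permutation is the transpose, so the permutation convention does not matter here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy stand-in for a rank-2 tensor-of-tensors tile:
// every outer cell holds one flat inner tensor. Not a TiledArray type.
struct ToyToT {
  std::array<std::size_t, 2> extent;        // outer extents {rows, cols}
  std::vector<std::vector<double>> cells;   // outer cells, row-major

  std::vector<double>& at(std::size_t i, std::size_t j) {
    return cells[i * extent[1] + j];
  }
  const std::vector<double>& at(std::size_t i, std::size_t j) const {
    return cells[i * extent[1] + j];
  }
};

// Unpermuted Hadamard product. The operands are assumed to be congruent
// already, which is what the expression engine is said to guarantee.
ToyToT mult(const ToyToT& l, const ToyToT& r) {
  assert(l.extent == r.extent);
  ToyToT out{};
  out.extent = l.extent;
  out.cells.resize(l.cells.size());
  for (std::size_t c = 0; c < l.cells.size(); ++c) {
    assert(l.cells[c].size() == r.cells[c].size());
    out.cells[c].resize(l.cells[c].size());
    for (std::size_t k = 0; k < l.cells[c].size(); ++k)
      out.cells[c][k] = l.cells[c][k] * r.cells[c][k];
  }
  return out;
}

// Outer-only permutation of the cells (identity or transpose for rank 2).
ToyToT permute(const ToyToT& t, const std::array<std::size_t, 2>& perm) {
  ToyToT out{};
  out.extent[0] = t.extent[perm[0]];
  out.extent[1] = t.extent[perm[1]];
  out.cells.resize(t.cells.size());
  for (std::size_t i = 0; i < t.extent[0]; ++i)
    for (std::size_t j = 0; j < t.extent[1]; ++j) {
      const std::array<std::size_t, 2> src{{i, j}};
      out.at(src[perm[0]], src[perm[1]]) = t.at(i, j);
    }
  return out;
}

// Permuted product expressed as a post-pass over the unpermuted result.
ToyToT mult(const ToyToT& l, const ToyToT& r,
            const std::array<std::size_t, 2>& perm) {
  return permute(mult(l, r), perm);
}
```

The same wrapper shape would apply to the add and subt variants: only the elementwise kernel changes, while the permutation step stays a single call on the finished result.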

Covers all six overloads that shared the stub: add(perm), add(factor,perm), subt(perm), subt(factor,perm), mult(perm), mult(factor,perm).

Validation

A previously-failing MPQC PNO-CCSD job (H₂O/cc-pVDZ, PaoPnoRMP2CCk) now runs to convergence (13 iterations, E = −76.23928119138472) with no exception.

evaleev added 3 commits May 23, 2026 17:02
… ToT

The permuted, arena ToT x arena ToT overloads of add, subt, and mult
(scaled and unscaled) previously threw "permuted ... of a
tensor-of-tensors is not yet supported". This blocked CSV/PNO-based
coupled-cluster, whose residual evaluates permuted ToT Hadamard
products at the tile-op level (a binary Mult/Add op calling
left.mult(right, perm) etc.).

By the time a permuted product reaches a tile op, the expression engine
has already brought both operands to a common (congruent) layout, so the
elementwise product/sum is valid and perm is purely the result
permutation. Compute the unpermuted result, then apply perm as a
post-pass via permute(), which already handles arena ToT: a shallow
outer-cell reindex (arena_permute_shallow) plus an inner-slab rewrite
(arena_inner_permute) when the bipartite permutation's inner part is
non-trivial. This mirrors the existing numeric x arena permuted-mult
branches.
…ction

A mixed inner-Scale product (Tensor<ArenaTensor> ToT x plain Tensor ->
ToT) under an outer Contraction with a non-identity inner *result*
permutation crashed in ArenaTensor::axpy_to(other, factor, perm), which
rejects in-place permutation of view cells.

The Scale path pushed the inner perm onto the per-cell op (via the
fallback axpy_to(..., perm)) and dropped it from total_perm, while
make_contraction_arena_plan bailed on any non-identity inner perm --
leaving the view result cell both unshaped and asked to permute itself.

Mirror the inner-Contraction view handling instead: for view (arena)
result cells, carry the full perm in total_perm so op_'s post-processing
permute applies the inner result perm as a slab-level rewrite, and pass
an identity inner perm to make_contraction_arena_plan so it builds the
plan (pre-shaping result cells unpermuted) and selects the perm-free
fused scale op. Owning inner cells keep applying the inner perm in the
per-cell scale op (outer-only total_perm), unchanged.
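
As a rough illustration of why a slab-level rewrite suits view cells, the sketch below transposes every inner cell while writing a fresh slab, so no view is ever asked to permute itself in place. All names here (`CellView`, `Slab`, `transpose_inner`) are hypothetical and invented for this example. The layout (row-major rank-2 inner cells at fixed offsets in one shared buffer) is an assumption, not the actual arena layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical view cell: a rank-2 inner tensor stored row-major,
// starting at `offset` inside a shared slab. It owns no memory.
struct CellView {
  std::size_t offset;
  std::size_t rows;
  std::size_t cols;
};

// Hypothetical arena: one contiguous buffer plus the views into it.
struct Slab {
  std::vector<double> data;
  std::vector<CellView> cells;
};

// Post-pass inner permutation (transpose) done as a slab rewrite.
// Each cell keeps its offset because a transpose preserves the
// element count; only the extents and the element order change.
Slab transpose_inner(const Slab& in) {
  Slab out{std::vector<double>(in.data.size()), in.cells};
  for (std::size_t c = 0; c < in.cells.size(); ++c) {
    const CellView& v = in.cells[c];
    for (std::size_t i = 0; i < v.rows; ++i)
      for (std::size_t j = 0; j < v.cols; ++j)
        out.data[v.offset + j * v.rows + i] =
            in.data[v.offset + i * v.cols + j];
    out.cells[c].rows = v.cols;
    out.cells[c].cols = v.rows;
  }
  return out;
}
```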
…ions

Commit 34711c8 made the ToT x scalar Scale contraction always hand an
identity inner perm to make_contraction_arena_plan, so the plan is built
and the perm-free fused scale op is selected, with the inner result perm
applied downstream by op_'s post-processing permute. That is correct for
view (arena) inner cells -- make_total_perm carries the full perm for
them -- but is_contraction_arena_tot_v is also true for owning legacy
TA::Tensor ToT inner cells, and for those make_total_perm carries only
the outer perm. So an owning ToT Scale contraction with a non-identity
inner result permutation lost the inner perm entirely: identity plan +
perm-free op + outer-only total_perm, producing a wrong-inner-layout
result (and, under distributed eval, a malformed result whose deferred
destruction aborted at a later fence).

Mirror the make_total_perm view/owning split here: pass an identity
inner perm only for view cells; for owning cells pass inner(perm_) so
the plan bails (nullopt) on a non-identity inner perm and the per-cell
op applies it, exactly as before 34711c8. Restores einsum_manual/
different_nested_ranks and einsum_tot_t/ilkj_nm_eq_ij_mn_times_kl.
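
The invariant behind the view/owning split is that a non-identity inner result permutation must be applied exactly once, either by the post-processing permute or by the per-cell op. The sketch below models only that bookkeeping. Every name in it (`Choice`, `choose`, `make_toy_plan`, `regressed_owning_choice`, `times_inner_perm_applied`) is hypothetical, and the plan builder is reduced to its bail-out rule as described above. It requires C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <numeric>
#include <optional>
#include <vector>

using Perm = std::vector<std::size_t>;

bool is_identity(const Perm& p) {
  for (std::size_t i = 0; i < p.size(); ++i)
    if (p[i] != i) return false;
  return true;
}

Perm identity_perm(std::size_t n) {
  Perm p(n);
  std::iota(p.begin(), p.end(), std::size_t{0});
  return p;
}

enum class InnerCell { View, Owning };

// What the Scale path hands downstream (toy model).
struct Choice {
  Perm plan_inner_perm;        // inner perm passed to the plan builder
  bool total_perm_has_inner;   // does the post-processing permute carry it?
};

// The split described above: identity for view cells, inner perm for owning.
Choice choose(const Perm& inner_perm, InnerCell kind) {
  if (kind == InnerCell::View)
    return {identity_perm(inner_perm.size()), true};
  return {inner_perm, false};
}

// Toy model of the regression: owning cells also got an identity plan
// perm, while their total perm stayed outer-only.
Choice regressed_owning_choice(const Perm& inner_perm) {
  return {identity_perm(inner_perm.size()), false};
}

// Toy plan builder: bails (nullopt) on any non-identity inner perm.
struct ToyPlan {};
std::optional<ToyPlan> make_toy_plan(const Perm& inner_perm) {
  if (!is_identity(inner_perm)) return std::nullopt;
  return ToyPlan{};
}

// Counts who applies the inner perm: the post-processing permute, and/or
// the per-cell op that takes over when the plan bails.
int times_inner_perm_applied(const Choice& c) {
  const bool per_cell_op_applies =
      !make_toy_plan(c.plan_inner_perm).has_value();
  return (c.total_perm_has_inner ? 1 : 0) + (per_cell_op_applies ? 1 : 0);
}
```

With a non-identity inner perm such as `{1, 0}`, both `choose(..., InnerCell::View)` and `choose(..., InnerCell::Owning)` give a count of one, while `regressed_owning_choice` gives zero, which matches the lost inner permutation described in this commit.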

evaleev merged commit 75573cf into master May 25, 2026
9 checks passed
evaleev deleted the evaleev/fix/arena-tot-permuted-binary-ops branch May 25, 2026 04:16