Skip to content

evmigration(multisig): catch --nosort destination mismatch at sign time + operator-facing docs#176

Merged
mateeullahmalik merged 2 commits into
masterfrom
matee/multisig-nosort-operator-guardrails
Jun 23, 2026
Merged

evmigration(multisig): catch --nosort destination mismatch at sign time + operator-facing docs#176
mateeullahmalik merged 2 commits into
masterfrom
matee/multisig-nosort-operator-guardrails

Conversation

@mateeullahmalik

Copy link
Copy Markdown
Contributor

Summary

Eliminates the most common multisig-migration operator footgun: building the destination EVM multisig without --nosort, which breaks the mirror-source signer-index invariant and currently fails late with a cryptic CLI error.

Adds (1) a pre-check in migrate-multisig.sh sign that catches the mismatch and prints an exact rebuild recipe, and (2) prominent --nosort callouts + a new "Step 0: Build the destination EVM multisig" section in the operator docs.

No protocol / state-machine / consensus change. Script + docs only.

Background

generate-proof-payload preserves the order in which --new-sub-pub-keys is passed (does not sort). However, Cosmos SDK's lumerad keys add --multisig does sort sub-pubkeys by bytes by default. Because legacy secp256k1 and new eth_secp256k1 pubkey bytes sort differently, the default sort produces a destination multisig whose members land at different signer indices than the legacy side. The on-chain mirror-source rule then rejects the proof, and migrate-multisig.sh sign (when a co-signer holds both halves) errors out with:

legacy key "X" is signer index N, but new key "Y" is signer index M;
multisig migration requires the same signer position to approve both halves

…leaving the operator to figure out that they need to delete the destination key, rebuild it with --nosort, and have the coordinator regenerate proof.json.

This was hit live during a local rc5 rehearsal that ran the full 31-account foundation migration; 11/26 multisigs failed on sign-proof before the --nosort root cause was identified. With --nosort the same batch succeeded first-try, end to end.

Changes

1. scripts/migrate-multisig.sh — sign-time index-alignment pre-check

When a co-signer passes both --from and --new-key, compute each pubkey's index in the proof file's sub_pub_keys arrays. If they disagree, exit 11 with an actionable error including the exact lumerad keys add ... --nosort rebuild command and the full 4-step remediation. Avoids round-tripping through the binary's terse error.

2. docs/evm-integration/user-guides/migration-scripts.md

  • Adds Step 0: Build the destination EVM multisig to the Multisig Migration section with the explicit --nosort recipe (was completely missing — operators following only the scripts walkthrough would hit the trap with zero warning).
  • Adds a ⚠️ Critical callout at the top of the section pointing to Step 0.

3. docs/evm-integration/user-guides/migration.md

  • Promotes the existing --nosort sentence (previously a single line buried at the end of a bullet) to a prominent ⚠️ callout block with the copy-pasteable lumerad keys add command.

Risk / Rollback

  • Pre-check only adds an earlier exit on a path that was already failing (just less helpfully). Cannot create false positives that weren't already rejected on-chain.
  • New exit code 11 is distinct from existing codes (1, 3, 8, 9, 10) — no collision with downstream tooling.
  • Docs are additive.
  • Rollback: revert the commit.

Verification

  • bash -n clean
  • shellcheck clean
  • migrate-multisig.sh sign --help still renders
  • Live-fire validated during local rc5 rehearsal: with --nosort + canonical signer order, 26/26 batch multisig migrations succeed; without --nosort (the trap path) the pre-check now fires at the right moment with the right remediation.

Definition of Done

  • Script change is minimal and behind the existing both-sides-signing path
  • Docs callouts surface --nosort at every entry point operators read
  • shellcheck + bash -n clean
  • Validated against the same workload that originally exposed the trap

Operator-facing diff (the new error)

[ERROR] signer-index mismatch: legacy key 'team_3_2' is at index 2,
[ERROR]   but new key 'team_3_2_evm' is at index 0 in proof.json
[ERROR]
[ERROR] Multisig migration requires the same signer position on both sides.
[ERROR] The destination EVM multisig was almost certainly built WITHOUT --nosort,
[ERROR] so its sub-pub-keys were re-sorted by bytes and no longer mirror the legacy
[ERROR] member order.
[ERROR]
[ERROR] Remediation — rebuild the destination multisig in legacy member order:
[ERROR]   1. Get legacy member order from chain:
[ERROR]      lumerad query auth account <legacy-multisig-address> -o json \
[ERROR]        | jq -r '.account.value.pub_key.public_keys[].key'
[ERROR]   2. Identify each member's eth_secp256k1 sub-key (in the SAME order).
[ERROR]   3. Recreate the destination key WITH --nosort:
[ERROR]        lumerad keys delete <new-multisig-key> -y
[ERROR]        lumerad keys add <new-multisig-key> \
[ERROR]          --multisig=<eth-sub-1>,<eth-sub-2>,...,<eth-sub-N> \
[ERROR]          --multisig-threshold=2 \
[ERROR]          --nosort
[ERROR]   4. Re-run 'migrate-multisig.sh generate' with the rebuilt key, then redistribute
[ERROR]      the fresh proof.json to co-signers.

Cosmos SDK's `keys add --multisig` sorts sub-pubkeys by bytes by default.
Legacy secp256k1 and new eth_secp256k1 pubkeys sort differently, so a
destination multisig built without --nosort lands its members at different
signer indices than the legacy multisig. This breaks the mirror-source
invariant and a co-signer's both-sides partial fails with a terse 'signer
index N vs M' error from the binary - leaving the operator to figure out
that they need to rebuild the destination key with --nosort.

This change adds a pre-check in `migrate-multisig.sh sign`: when both
--from and --new-key are passed, compute each pubkey's index in the proof
file's sub_pub_keys, and if they disagree print an actionable remediation
including the exact 'lumerad keys add ... --nosort' command to rebuild the
destination. Exit 11.

Also adds:
- A '0. Build the destination EVM multisig' section to migration-scripts.md
  with the explicit --nosort recipe (was previously omitted from the
  scripts walkthrough entirely).
- A prominent --nosort callout to migration.md alongside the existing
  member-order note (was a single sentence buried in a bullet).

Discovered during the local rc5 rehearsal that re-ran the full 31-account
foundation migration; without --nosort 11/26 batch multisigs failed on
sign-proof.

No protocol or state-machine change. Script + docs only.

Signed-off-by: Matee ullah Malik <mateeullahmalik@hotmail.com>
@mateeullahmalik mateeullahmalik force-pushed the matee/multisig-nosort-operator-guardrails branch from 1bc5b29 to bd8dcbe Compare June 23, 2026 14:09
@a-ok123 a-ok123 requested review from akobrin1 and Copilot June 23, 2026 19:02

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces multisig-migration operator errors by detecting destination multisig member-order mismatches at sign time (commonly caused by omitting --nosort) and by making the required --nosort workflow prominent in the operator docs.

Changes:

  • Add a signer-index alignment pre-check in migrate-multisig.sh sign (when both --from and --new-key are provided) with an actionable remediation path and distinct exit code.
  • Update migration docs to prominently call out --nosort and provide copy/pasteable destination multisig build instructions.
  • Add a new “Step 0: Build the destination EVM multisig” section to the scripts guide.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
scripts/migrate-multisig.sh Adds sign-time signer-index mismatch detection and operator-facing remediation instructions.
docs/evm-integration/user-guides/migration.md Promotes --nosort requirement into a prominent callout with an example command.
docs/evm-integration/user-guides/migration-scripts.md Adds an explicit Step 0 for building the destination multisig with --nosort and adds a critical warning callout.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread scripts/migrate-multisig.sh
Comment thread scripts/migrate-multisig.sh
Comment thread docs/evm-integration/user-guides/migration-scripts.md
…ookup

Copilot review pointed out that 'sub_pub_keys | to_entries[] | select(.value==$pk) | .key'
emits multiple lines if duplicates exist, which would corrupt from_idx/new_idx as
multi-line strings and break the string-compare mismatch check. While real multisigs
don't carry duplicate sub-pubkeys, 'index($pk) // empty' is the deterministic primitive:
returns the first (single) index or empty on no match. Simpler and robust to
hypothetical degenerate input.

Behavior on the happy path is identical; touches only the two new jq lookups added
in 1bc5b29.
@mateeullahmalik mateeullahmalik self-assigned this Jun 23, 2026
@mateeullahmalik mateeullahmalik merged commit f2d1608 into master Jun 23, 2026
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants