
FIX IA3 _ConvNd.unmerge bias uses torch.mul instead of torch.div#3149

Open
Chessing234 wants to merge 1 commit into huggingface:main from Chessing234:fix/ia3-convnd-unmerge-bias-div

Conversation

@Chessing234 (Contributor) commented Apr 11, 2026

Bug

_ConvNd.unmerge in src/peft/tuners/ia3/layer.py does not undo the bias scaling that _ConvNd.merge applied. After a merge() -> unmerge() round-trip on an IA3 non-feedforward Conv2d/Conv3d layer whose base layer's bias is not None, the bias ends up multiplied by ia3_l ** 2 instead of being restored to its pre-merge value, and each subsequent disable_adapters-triggered unmerge compounds the error.

Root cause

In _ConvNd.merge (line 254–256) the bias is scaled with torch.mul:

if not self.is_feedforward and (base_layer.bias is not None):
    scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)

and in _ConvNd.unmerge (line 280–283) the bias is "un-scaled" by also using torch.mul — exactly the same op:

if not self.is_feedforward and (base_layer.bias is not None):
    scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
    orig_dtype = base_layer.bias.data.dtype
    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)

So bias_merged = bias * ia3_l and then bias_unmerged = bias_merged * ia3_l = bias * ia3_l ** 2, rather than bias.

The corresponding weight path in the same function (line 278) uses torch.div correctly:

base_layer.weight.data = torch.div(base_layer.weight.data, ia3_scaling + 1e-8).to(orig_dtype)

and the sibling Linear.unmerge bias path at line 152–155 also does it correctly:

if not self.is_feedforward and (base_layer.bias is not None):
    scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
    orig_dtype = base_layer.bias.data.dtype
    base_layer.bias.data = torch.div(base_layer.bias.data, scaling.data + 1e-8).to(orig_dtype)

so the intended shape of the fix is unambiguous — _ConvNd.unmerge was copy-pasted from merge for the bias branch and never flipped to a div.

Worked example

Start with bias = [1.0, 2.0] and IA3 scaling = [3.0, 0.5].

step            current code   expected
after merge     [3.0, 1.0]     [3.0, 1.0]
after unmerge   [9.0, 0.5]     [1.0, 2.0]
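The arithmetic above can be checked directly. This is a plain-Python sketch of the element-wise scaling math (lists standing in for the tensors, `*` and `/` standing in for torch.mul and torch.div):

```python
bias = [1.0, 2.0]
scaling = [3.0, 0.5]

# merge(): bias * scaling
merged = [b * s for b, s in zip(bias, scaling)]
assert merged == [3.0, 1.0]

# buggy unmerge(): multiplies again -> bias * scaling ** 2
buggy = [b * s for b, s in zip(merged, scaling)]
assert buggy == [9.0, 0.5]

# fixed unmerge(): divides (with the 1e-8 guard) -> original bias
fixed = [b / (s + 1e-8) for b, s in zip(merged, scaling)]
assert all(abs(f - o) < 1e-6 for f, o in zip(fixed, bias))
```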

Fix

Mirror the Linear.unmerge pattern: divide by scaling.data + 1e-8 (the + 1e-8 guards against any IA3 scaling entry that happens to be zero — same tolerance used on the weight line just above and in Linear.unmerge).

                 if not self.is_feedforward and (base_layer.bias is not None):
                     scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
                     orig_dtype = base_layer.bias.data.dtype
-                    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)
+                    base_layer.bias.data = torch.div(base_layer.bias.data, scaling.data + 1e-8).to(orig_dtype)

One-line change, symmetric with the weight path right above and with Linear.unmerge in the same file. No effect on the feedforward or bias is None branches.
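On the `+ 1e-8` guard specifically: it matters whenever a learned ia3_l entry is exactly zero, since a plain division would blow up while the guarded one stays finite. A scalar sketch of that edge case (plain Python floats, not the actual tensor code; torch would return inf rather than raise):

```python
scaling = 0.0  # an IA3 scaling entry that happens to be zero
merged_bias = 1.0 * scaling  # merge() zeroes out this bias entry

# Unguarded division fails on Python floats (torch.div would yield inf/nan).
try:
    merged_bias / scaling
    raised = False
except ZeroDivisionError:
    raised = True
assert raised

# The guarded division used in the fix stays finite.
# (The original bias value is unrecoverable here, since merge()
# already multiplied it by zero, but the math no longer diverges.)
restored = merged_bias / (scaling + 1e-8)
assert restored == 0.0
```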

`_ConvNd.unmerge` in `src/peft/tuners/ia3/layer.py` is supposed to
undo the corresponding `_ConvNd.merge` by inverting the (IA)^3
scaling applied to the base layer's weight **and** bias. The weight
path correctly divides by the IA3 scaling (line 278):

    base_layer.weight.data = torch.div(
        base_layer.weight.data, ia3_scaling + 1e-8
    ).to(orig_dtype)

but the bias path right below it accidentally uses `torch.mul`
again, the same op `merge` uses:

    # merge (line 256)
    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)
    ...
    # unmerge (line 283)  <-- BUG: should be torch.div
    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)

So a merge followed by an unmerge leaves the bias scaled by
`ia3_l ** 2` instead of being restored to its original value. For
any non-feedforward IA3 Conv layer (Conv2d / Conv3d) whose base
layer has `bias is not None`, `merge() -> unmerge()` permanently
corrupts the bias — and subsequent `disable_adapters`-triggered
unmerges compound the error.

The sibling `Linear.unmerge` in the same file (line 155) already
uses the correct pattern:

    base_layer.bias.data = torch.div(
        base_layer.bias.data, scaling.data + 1e-8
    ).to(orig_dtype)

Mirror that pattern in `_ConvNd.unmerge`: divide by
`scaling.data + 1e-8` so the bias unmerge is the inverse of the
bias merge, and the `+ 1e-8` tolerance guards against division by
any IA3 scaling entry that happens to be zero.
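The inverse property the fix restores can be stated as a small round-trip check. This is a hypothetical regression sketch, with `merge_bias` / `unmerge_bias` helpers that mirror the merged and patched logic element-wise (not PEFT's actual API):

```python
import random

def merge_bias(bias, scaling):
    # mirrors _ConvNd.merge: bias * ia3_l
    return [b * s for b, s in zip(bias, scaling)]

def unmerge_bias(bias, scaling):
    # mirrors the patched _ConvNd.unmerge: bias / (ia3_l + 1e-8)
    return [b / (s + 1e-8) for b, s in zip(bias, scaling)]

random.seed(0)
for _ in range(100):
    bias = [random.uniform(-5, 5) for _ in range(4)]
    scaling = [random.uniform(0.1, 3.0) for _ in range(4)]
    restored = unmerge_bias(merge_bias(bias, scaling), scaling)
    # merge() -> unmerge() should be (near-)identity on the bias
    assert all(abs(r - b) < 1e-5 for r, b in zip(restored, bias))
```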
@githubnemo (Collaborator) left a comment
Hey, nice find.

Can you also reduce the tolerance in test_custom_models.py for test_merge? (to verify that the fix works)

--- a/tests/testing_common.py
+++ b/tests/testing_common.py
@@ -608,7 +608,7 @@ class PeftCommonTester:
                 atol, rtol = 1e-3, 1e-3
             if (config.peft_type in {"IA3", "LORA", "OFT"}) and (model_id in conv_ids):
                 # for some reason, the Conv introduces a larger error
-                atol, rtol = 0.3, 0.01
+                atol, rtol = 1e-2, 0.01
             if (config.peft_type == "OFT") and not is_decoder:
                 atol, rtol = 0.3, 0.01
             if model_id == "trl-internal-testing/tiny-Llama4ForCausalLM":

It would also be nice if you could find out why test_safe_merge doesn't detect this issue.
