OpenPBR - Support importing glTF surface tinting#18430
OpenPBR - Support importing glTF surface tinting#18430MiiBond wants to merge 5 commits intoBabylonJS:masterfrom
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Please make sure to label your PR with "bug", "new feature" or "breaking change" label(s). |
|
Snapshot stored with reference name: Test environment: To test a playground add it to the URL, for example: https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/refs/pull/18430/merge/index.html#WGZLGJ#4600 Links to test your changes to core in the published versions of the Babylon tools (does not contain changes you made to the tools themselves): https://playground.babylonjs.com/?snapshot=refs/pull/18430/merge To test the snapshot in the playground with a playground ID add it after the snapshot query string: https://playground.babylonjs.com/?snapshot=refs/pull/18430/merge#BCU1XR#0 If you made changes to the sandbox or playground in this PR, additional comments will be generated soon containing links to the dev versions of those tools. |
🟢 Memory Leak Test Results13 passed, 0 leaked out of 13 scenarios 🟢 All memory leak tests passed — no leaks detected. Passed Scenarios (13)
|
|
WebGL2 visualization test reporter: |
⚡ Performance Test Results🟢 All performance tests passed — no regressions detected. |
|
Visualization tests for WebGPU |
Popov72
left a comment
There was a problem hiding this comment.
Code review of the OpenPBR surface-tinting import path. Overall approach (move surface tint into the coat layer when there's volumetric attenuation, with a generic GPU texture-processor for the merge math) looks sound, and the snapshot-then-mutate sequencing is correct.
Main concerns to address before merge:
- A few texture leaks when intermediate
*TexturesAsynccalls reject mid-chain. gammaSpaceis left at the default (true) on resultProceduralTextures, which is wrong for non-color outputs (roughness, weight, normal) — they end up flagged as sRGB even though they hold linear data.- The
IMaterialLoadingAdapter.finalize→finalizeAsyncrename silently breaks any third-party adapter that implemented the old name (no shim, no deprecation cycle). - New 632-line public utility (
textureProcessor.ts) and the new OpenPBR finalize behaviour ship without unit or visual tests.
See inline comments for details.
| ), | ||
| this._material.getScene(), | ||
| TextureColorSpace.SRGB | ||
| ); |
There was a problem hiding this comment.
Texture leak on rejection mid-chain.
This pattern:
const newCoatColor = await MultiplyTexturesAsync(
name,
await LerpTexturesAsync("lerpExistingCoat", ...), // L1
await LerpTexturesAsync("lerpSurfaceColor", ...), // L2
...
);evaluates the inner awaits sequentially. If L1 resolves (allocating a ProceduralTexture with a dispose callback on the operand) and then L2 rejects — or MultiplyTexturesAsync itself rejects after consuming its arguments — L1's GPU texture is never disposed because nothing downstream gets to call a.dispose?.() / b.dispose?.().
Same leak shape exists inside MultiplyTexturesAsync / MaxTexturesAsync / LerpTexturesAsync: if await _RenderAsync(pt) throws, it disposes pt itself but the a.dispose?.() / b.dispose?.() calls below it are skipped, leaking any intermediate textures that were passed in.
Suggestion: in the three *TexturesAsync helpers, wrap the render in try/catch that disposes the input operands on the failure path:
try {
await _RenderAsync(pt);
} catch (e) {
a.dispose?.();
b.dispose?.();
t?.dispose?.();
throw e;
}
a.dispose?.();
b.dispose?.();
t?.dispose?.();(or factor that into a small helper). With that, the call-site here is safe even when await L1 succeeds and await L2 rejects.
| CreateTextureWithFactorOperand(origCoatWeightTexture, origCoatWeightCol4, TextureChannel.R), | ||
| this._material.getScene(), | ||
| TextureColorSpace.SRGB | ||
| ), |
There was a problem hiding this comment.
Sequential GPU work that could run in parallel.
The lerpExistingCoat and lerpSurfaceColor lerps are independent — they share no inputs and neither depends on the other's result — yet the await … await … arg-list pattern serializes them. For assets with many transmissive materials this measurably slows import.
Consider:
const [lerpExisting, lerpSurface] = await Promise.all([
LerpTexturesAsync("lerpExistingCoat", ...),
LerpTexturesAsync("lerpSurfaceColor", ...),
]);
const newCoatColor = await MultiplyTexturesAsync(
"newCoatColor (" + this._material.name + ")",
lerpExisting,
lerpSurface,
this._material.getScene(),
TextureColorSpace.SRGB,
);If you adopt the try/catch leak fix from the other comment, you'll want to combine the two: on Promise.all rejection, dispose any operand whose promise resolved. Something like:
const results = await Promise.allSettled([...]);
if (results.some(r => r.status === "rejected")) {
for (const r of results) if (r.status === "fulfilled") r.value.dispose?.();
throw (results.find(r => r.status === "rejected") as PromiseRejectedResult).reason;
}| }, | ||
| }; | ||
|
|
||
| const pt = new ProceduralTexture(name, outputSize, _ShaderName, scene, options); |
There was a problem hiding this comment.
gammaSpace is not configured on the result texture.
BaseTexture._gammaSpace defaults to true, so every ProceduralTexture produced here is flagged as sRGB regardless of what the operation actually wrote. That's only correct when the caller passed outputColorSpace: TextureColorSpace.SRGB and OUTPUT_SRGB re-encoded the linear result back into sRGB.
For the non-color outputs in copySurfaceToCoatAsync (coatRoughnessTexture, coatWeightTexture, geometryCoatNormalTexture), no OUTPUT_SRGB is applied — the texture holds linear data — but it's still being assigned to a material slot with gammaSpace = true. Anything downstream that linearizes based on gammaSpace (manual sRGB→linear in shaders, env conversions, screenshot pipelines, asset re-export) will get incorrectly decoded values.
Fix: set the gamma flag to mirror the chosen output space. Easiest place is here, right after the new ProceduralTexture(...):
const pt = new ProceduralTexture(name, outputSize, _ShaderName, scene, options);
pt.gammaSpace = false; // default: linear outputand then in each *TexturesAsync, after creating pt set pt.gammaSpace = outputColorSpace === TextureColorSpace.SRGB; (or pass the output color space into _CreateProcessorTexture).
| * so callers can rely on onCompleteObservable for fully processed materials. | ||
| */ | ||
| finalize?(): void; | ||
| finalizeAsync?(): Promise<void> | void; |
There was a problem hiding this comment.
Backward-compat break on a public interface.
IMaterialLoadingAdapter is exported, and renaming finalize?() → finalizeAsync?() is a silent break: any third-party adapter that previously implemented finalize() will simply stop being called — no compile error, no runtime warning, and the material just won't get its post-load fixups.
Per the repo backcompat conventions, optional public API rename should go through a deprecation cycle. Suggested shape:
export interface IMaterialLoadingAdapter {
/** @deprecated Use finalizeAsync instead. */
finalize?(): void;
finalizeAsync?(): Promise<void> | void;
...
}and on the loader side, prefer finalizeAsync but fall back to finalize (with a one-time Logger.Warn about the deprecation) so existing implementors keep working for at least one release.
| * are linearized before use. Defaults to `TextureColorSpace.Linear`. | ||
| * @returns An operand that evaluates to the sampled texture value | ||
| */ | ||
| export function CreateTextureOperand(texture: Nullable<BaseTexture>, channel?: TextureChannel, colorSpace?: TextureColorSpace): ITextureProcessOperand { |
There was a problem hiding this comment.
No tests for this 632-line new public API surface.
MultiplyTexturesAsync / MaxTexturesAsync / LerpTexturesAsync, the constant-folding fast paths, the UV-transform propagate-vs-bake decision, the channel/colorspace combinations, and the OUTPUT_SRGB round-trip all ship without coverage. Some of these (_AllTransformsMatch choosing propagate vs. bake, _EvalConstant fallbacks, the channel→define mapping) are easy to regress and easy to unit-test.
At minimum I'd want:
- All-constant fold:
Multiply(factor-only, factor-only)returns{ texture: null, factor }with the right value, no GPU pass. - Single-texture path with default channel/colorspace.
OUTPUT_SRGBround-trip:sRGB-in → linear-process → OUTPUT_SRGB → sRGB-outproduces the expected pixels at a few key values (0, 0.04, 0.5, 1.0).- Mixed transform paths: matching transforms → propagated to result; differing transforms → baked, identity on result.
There was a problem hiding this comment.
Okay, I asked Claude to generate tests as you described and it added 46 new tests. They look okay but I wonder if it's too much...
CPU constant fold (19 tests) — No GPU pass, no scene needed. Tests every operation (Multiply, Max, Lerp, Invert, ExtractMaxChannel, ExtractChannel) against exact values, verifies texture: null and no dispose, and covers all ChannelMask variants including the edge cases (alpha=1 for RGB mask, zeros for excluded channels). Verifies _capturedPTs stays empty to confirm no GPU work happened.
sRGB formula round-trip (7 tests) — Pure JS implementations of the IEC 61966-2-1 linearize/encode formulas (matching the GLSL OPERAND_X_SRGB and OUTPUT_SRGB blocks). Tests encode(linearize(v)) ≈ v at the four key values you asked for (0, 0.04045, 0.5, 1.0), plus boundary values and monotonicity. Pixel-level GPU correctness is left as a note for the visualization suite.
GPU path structure (15 tests) — Uses a lightweight FakeProceduralTexture (hoisted with vi.hoisted so it works across the mock boundary) that immediately "compiles" and "renders" without WebGL. Verifies: non-null texture + dispose on the result; dispose actually releases the underlying PT; correct shader defines emitted for OUTPUT_SRGB, OPERAND_A_TEXTURE, OPERAND_A_SRGB, channel swizzles, output-mask flags; and auto-dispose of intermediate textures when chained.
UV transform propagation (3 tests) — Verifies that matching transforms are propagated to the result (UV offset copied, no MATRIX defines), differing transforms are baked (UV stays identity, both OPERAND_A_MATRIX and OPERAND_B_MATRIX emitted), and single-operand operations (InvertTextureAsync) always propagate.
| /** | ||
| * | ||
| */ | ||
| public async finalizeAsync(): Promise<void> { |
There was a problem hiding this comment.
No visual test added for the OpenPBR surface-tinting import behavior.
Per visual-tests.instructions.md, new rendering features need visualization coverage, and this PR explicitly changes the rendered result of any glTF that combines:
KHR_materials_transmission+KHR_materials_volume(with non-white baseColor or baseColorTexture), orKHR_materials_diffuse_transmission+KHR_materials_volume, optionally with an existingKHR_materials_clearcoat.
Given you already have a test asset (the screenshot in the PR description), adding it as a Playground snippet + entry in packages/tools/tests/test/visualization/config.json with dependsOn: ["PBR", "glTF", "Loaders"] would lock in the behaviour. Both the simple case (no existing coat) and the merge case (existing coat layer) are worth covering as separate entries since they hit different code paths in copySurfaceToCoatAsync.
Also, while you're here: this method has an empty doc comment block (/**\n *\n */) — either remove it or reference the interface contract.
There was a problem hiding this comment.
I've made a PR here for an asset that I want to create a vis test with:
BabylonJS/Assets#145
| if (finalizePromise) { | ||
| this._completePromises.push(finalizePromise); | ||
| } | ||
| } |
There was a problem hiding this comment.
Promise.all(this._completePromises) has no .catch handler.
This was already true pre-PR, but finalizeAsync introduces real failure modes (shader compile failure, lost device, OOM during procedural texture allocation) that previously didn't exist on this path. With the current code at ~line 553:
Promise.all(this._completePromises).then(() => {
...
this._parent._setState(GLTFLoaderState.COMPLETE);
this._parent.onCompleteObservable.notifyObservers(undefined);
...
this.dispose();
});any rejection becomes an unhandled promise rejection, the loader never reaches COMPLETE, onCompleteObservable never fires, and dispose() is never called from this path — so consumers awaiting completion just hang.
Suggested fix: add a .catch that mirrors the outer rejection path:
Promise.all(this._completePromises).then(
() => { /* ...COMPLETE... */ },
(error) => {
if (!this._disposed) {
this._parent.onErrorObservable.notifyObservers(error);
this._parent.onErrorObservable.clear();
this.dispose();
}
},
);| for (const adapter of Array.from(this._materialAdapters)) { | ||
| adapter.finalize?.(); | ||
| } | ||
| this._materialAdapters.clear(); |
There was a problem hiding this comment.
dispose() while finalizeAsync is in flight leaks textures and writes to a disposed material.
The new flow kicks finalizeAsync off in the LOADING→READY transition and pushes its promise into _completePromises. If something disposes the loader before those promises resolve (external cancel, error elsewhere in the load), dispose() here clears _completePromises.length = 0 synchronously, but the in-flight MultiplyTexturesAsync / LerpTexturesAsync keep running. They will:
- Allocate
ProceduralTextures on the (now possibly disposed) scene. - Eventually run their continuation that does
this.coatColorTexture = newCoatColor.texture;etc. on a material whose owning adapter is gone.
Minimal mitigation: have OpenPBRMaterialLoadingAdapter track a _disposed flag (set when the loader disposes its adapter set), and have copySurfaceToCoatAsync check it before each assignment, disposing the just-produced texture instead of binding it. Also worth pt.dispose()-ing any in-flight procedural textures owned by the adapter on dispose.
There was a problem hiding this comment.
I've gone with an AbortController that I check between each textureProcessor operation. Will that address your concern?
| // scene render loop would call _shouldRender() → true, then re-render the | ||
| // PT on the next frame — potentially with already-disposed input textures, | ||
| // producing a black result. Removing it here prevents that re-render. | ||
| const scene = pt.getScene(); |
There was a problem hiding this comment.
Fragile workaround for the scene.proceduralTextures re-render issue.
The diagnosis in the comment above is spot-on, but reaching into scene.proceduralTextures to splice the entry out is a layering violation that will silently break if ProceduralTexture ever changes how it registers itself.
Worth noting: the existing textureMerger.ts (sibling file in this folder, used by glTFMaterialExporter for baseColor+opacity and metallic+roughness packing) uses a much simpler pattern and apparently does not need this workaround:
// textureMerger.ts:221-230
return await new Promise<ProceduralTexture>((resolve, reject) => {
proceduralTexture.executeWhenReady(() => {
try {
proceduralTexture.render();
resolve(proceduralTexture);
} catch (error) {
reject(error instanceof Error ? error : new Error(String(error)));
}
});
});The differences vs. _RenderAsync here:
executeWhenReady(PT-level) instead ofeffect.executeWhenCompiled(effect-level).- No manual
isReady()/getEffect()priming. - No
scene.proceduralTexturessplice — and the merger has been in production use without (to my knowledge) reports of the re-render-with-disposed-inputs bug described above.
Before adopting the splice workaround, please verify why the merger's pattern doesn't suffer from the same issue. Two possibilities:
executeWhenReadygates on a different signal thanexecuteWhenCompiledand happens to also bump_currentRefreshId, suppressing the re-render. If so, switching to that pattern is the fix — and you can delete this whole splice block.- The merger has the same latent bug but it just hasn't surfaced because its inputs (the textures from the source material) outlive the merge operation. In that case the merger should also be patched, and we should extract the fix into a shared helper rather than have two divergent versions.
Either way, the two utilities should converge on one pattern. Two more general alternatives if that investigation comes up empty:
- Don't enter the scene's PT list at all. Add an option (e.g.
IProceduralTextureCreationOptions.skipSceneRegistration) that suppresses thescene.proceduralTextures.push(this)in the constructor when set. Both this file andtextureMergerthen opt in. - Make
refreshRate = -1actually mean "never auto-render". The current_shouldRender()treats_currentRefreshId === -1as "at least render once" and renders on the first frame regardless ofrefreshRate. A small change there (treat negativerefreshRateas never auto-render) would let both this code and the merger just callpt.render()and not need any cleanup.
There was a problem hiding this comment.
Here's my thoughts on this:
textureMerger returns the ProceduralTexture to the caller and never disposes its inputs. They remain alive for the lifetime of the output texture. So even if the scene loop re-renders it once, it produces the correct result — the inputs are still there.
textureProcessor disposes its input textures immediately after _RenderAsync resolves (a.dispose?.(), b.dispose?.(), etc.). Those are intermediate textures that are no longer needed once the output is written. So when the scene loop's re-render fires, the inputs have been freed and the shader samples garbage, which is why I added the explicit splicing of the array.
I've added the skipSceneRegistration option as you suggested and that seems to work.
|
You have changed file(s) that made possible changes to the sandbox. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/SANDBOX/refs/pull/18430/merge/ |
|
You have made possible changes to the playground. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/PLAYGROUND/refs/pull/18430/merge/ The snapshot playground with the CDN snapshot (only when available): Note that neither Babylon scenes nor textures are uploaded to the snapshot directory, so some playgrounds won't work correctly. |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
You have changed file(s) that made possible changes to the sandbox. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/SANDBOX/refs/pull/18430/merge/ |
|
You have made possible changes to the playground. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/PLAYGROUND/refs/pull/18430/merge/ The snapshot playground with the CDN snapshot (only when available): Note that neither Babylon scenes nor textures are uploaded to the snapshot directory, so some playgrounds won't work correctly. |
|
You have changed file(s) that made possible changes to the sandbox. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/SANDBOX/refs/pull/18430/merge/ |
|
You have made possible changes to the playground. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/PLAYGROUND/refs/pull/18430/merge/ The snapshot playground with the CDN snapshot (only when available): Note that neither Babylon scenes nor textures are uploaded to the snapshot directory, so some playgrounds won't work correctly. |
|
You have changed file(s) that made possible changes to the sandbox. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/SANDBOX/refs/pull/18430/merge/ |
|
You have made possible changes to the playground. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/PLAYGROUND/refs/pull/18430/merge/ The snapshot playground with the CDN snapshot (only when available): Note that neither Babylon scenes nor textures are uploaded to the snapshot directory, so some playgrounds won't work correctly. |
|
You have changed file(s) that made possible changes to the sandbox. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/SANDBOX/refs/pull/18430/merge/ |
|
You have made possible changes to the playground. https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/PLAYGROUND/refs/pull/18430/merge/ The snapshot playground with the CDN snapshot (only when available): Note that neither Babylon scenes nor textures are uploaded to the snapshot directory, so some playgrounds won't work correctly. |
OpenPBR doesn't support surface-only tinting of volumetric materials so, when we import a glTF that uses both surface tinting and volumes, we need to use the coat layer to mimic this tinting.
i.e. If the glTF has
KHR_materials_transmissionandbaseColororKHR_materials_diffuse_transmissionanddiffuseTransmissionColoralong with a volume described byKHR_materials_volume, we need to do some special handling for OpenPBR. This basically involves copying base properties like roughness, ior, etc. to the coat layer so that the resulting material looks mostly the same as it would in a native glTF renderer.If the material already has a coat layer, we need to some additional logic to merge the existing coat with the new coat and this requires some texture baking. I've also added some logic for this that also handles baking UV transforms when they don't match between textures.
I have an asset that I may submit to BabylonAssets that can be used to test this behaviour:
