
WebGPURenderer: Compute modelViewMatrix using GPU #29299

Merged · 13 commits merged into mrdoob:dev on Sep 3, 2024

Conversation

@sunag (Collaborator) commented Sep 2, 2024

Related issue: #28719
Related: https://lxjk.github.io/2017/10/01/Stop-Using-Normal-Matrix.html

Performance

This change is part of the integration work for #28719 aimed at reducing CPU usage. It brought a performance gain of around 25% for scenes with many objects.

default: 12.10 ms (old) → 8.85 ms (now)
bundle:   5.67 ms (old) → 3.27 ms (now)

Precision

You can use highPrecisionModelViewMatrix for all MVP computations globally or only for selected materials.

Global usage:

// global
import { highPrecisionModelViewMatrix, highPrecisionModelNormalViewMatrix } from 'three/tsl';

const renderer = new THREE.WebGPURenderer( { antialias: true } );
renderer.nodes.modelViewMatrix = highPrecisionModelViewMatrix;
renderer.nodes.modelNormalViewMatrix = highPrecisionModelNormalViewMatrix;

Single Material / Object

import { cameraProjectionMatrix, highPrecisionModelViewMatrix, highPrecisionModelNormalViewMatrix, positionLocal, normalLocal } from 'three/tsl';

const material = new THREE.NodeMaterial();
material.vertexNode = cameraProjectionMatrix.mul( highPrecisionModelViewMatrix ).mul( positionLocal );
material.normalNode = highPrecisionModelNormalViewMatrix.transformDirection( normalLocal );
  • model* nodes use the GPU.
  • highPrecision* nodes use the CPU.

Computing modelViewMatrix on the GPU is the default. Since these are all nodes, you can also build your own, as sketched below.
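
As one illustration of that flexibility, here is a minimal sketch (not from the PR) that drives the vertex stage from a plain mat4 uniform filled on the CPU each frame; customModelView and updateCustomModelView are hypothetical names, and mesh/camera stand in for your own objects:

import * as THREE from 'three/webgpu';
import { uniform, cameraProjectionMatrix, positionLocal } from 'three/tsl';

// Plain mat4 uniform holding a model-view matrix computed on the CPU.
const customModelView = uniform( new THREE.Matrix4() );

const material = new THREE.NodeMaterial();
material.vertexNode = cameraProjectionMatrix.mul( customModelView ).mul( positionLocal );

// Called each frame before rendering; the multiplication runs in 64-bit on the CPU.
function updateCustomModelView( mesh, camera ) {

	customModelView.value.multiplyMatrices( camera.matrixWorldInverse, mesh.matrixWorld );

}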


github-actions bot commented Sep 2, 2024

📦 Bundle size

Full ESM build, minified and gzipped.

              Before (min / gzip)    After (min / gzip)     Diff (min / gzip)
WebGL         685.1 kB / 169.6 kB    685.1 kB / 169.6 kB    +0 B / +0 B
WebGPU        821.7 kB / 220.6 kB    822.6 kB / 220.8 kB    +862 B / +223 B
WebGPU Nodes  821.3 kB / 220.5 kB    822.1 kB / 220.7 kB    +1.28 kB / +315 B

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

              Before (min / gzip)    After (min / gzip)     Diff (min / gzip)
WebGL         461.9 kB / 111.4 kB    461.9 kB / 111.4 kB    +0 B / +0 B
WebGPU        522.1 kB / 140.7 kB    521.7 kB / 140.6 kB    -441 B / -61 B
WebGPU Nodes  478.8 kB / 130.5 kB    478.3 kB / 130.5 kB    -43.78 kB / -57 B

@sunag sunag changed the title WebGPURenderer: Compute modelViewMatrix using GPU WebGPURenderer: Compute modelViewMatrix using GPU - WIP Sep 2, 2024
@WestLangley (Collaborator)

We do something similar here for instancing, but this only works if the columns of the matrix are orthogonal, which is not true in general.

See this explanation -- especially the last sentence.

@sunag (Collaborator, Author) commented Sep 2, 2024

Thanks @WestLangley!

Now I just need to find another approach for the webgpu_postprocessing_motion_blur example :)

@sunag sunag changed the title WebGPURenderer: Compute modelViewMatrix using GPU - WIP WebGPURenderer: Compute modelViewMatrix using GPU Sep 3, 2024
@sunag sunag marked this pull request as ready for review September 3, 2024 00:36
@sunag sunag added this to the r169 milestone Sep 3, 2024
@gkjohnson (Collaborator)

I'm not as familiar with shader nodes, so I may be misunderstanding what's happening here, but here are my two cents based on the description:

This change could cause precision issues when using large coordinates, since GPU calculations use 32-bit math. This has not been an uncommon issue with instanced and skinned meshes that have large position values (bone and instance matrices are multiplied into the modelView matrix on the GPU). See here and here. I suspect it will be even more common if this is done on the GPU for every mesh.

Assume the camera is far from the origin (e.g. orbiting a to-scale globe model with a radius of 6.3e6 meters), meaning the objects in frame also have extremely large positional values. If the modelView matrix is calculated on the CPU, 64-bit precision is used, so any error resulting from the calculations is much smaller than when the same calculations are done with 32 bits on the GPU. This can cause very noticeable jitter artifacts during rendering.
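
A small illustration (not from the comment) of the float32 resolution involved: near 6.3e6 the spacing between representable 32-bit values is about 0.5 m, so sub-meter motion can quantize away entirely.

const x = 6.3e6;    // meters from the origin, e.g. a point on a to-scale globe
const y = x + 0.1;  // move 10 cm

console.log( y - x );                               // 0.1 with 64-bit CPU math
console.log( Math.fround( y ) - Math.fround( x ) ); // 0, the 10 cm step is lost in float32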

@RenaudRohlinger (Collaborator)

In that case, similar to how logarithmicDepthBuffer improves depth precision at the cost of performance, we could introduce another option like renderer.normalMatrix (or somethingNormalCPU...) for 64-bit precision when jitter artifacts occur.
This would still allow high-precision modelView matrix calculations on the CPU, would align with the existing logarithmicDepthBuffer logic, and would improve performance while giving developers an option to increase precision in demanding scenarios.

@sunag (Collaborator, Author) commented Sep 3, 2024

I like the idea of having a point of origin relative to the camera, as presented in item 3.2.1 of this article: https://www.diva-portal.org/smash/get/diva2:275843/FULLTEXT02.

@gkjohnson (Collaborator)

> we could introduce another option like renderer.normalMatrix

To be clear, is this just for normal matrices? This PR involves moving both the model-view matrix multiplication (and, implicitly, the normal matrix generation) to the GPU, right? In that case we'd want to name it something indicating it's for more than just normal matrices.

> I like the idea of having a point of origin relative to the camera as presented in item 3.2.1 of this article https://www.diva-portal.org/smash/get/diva2:275843/FULLTEXT02.

This is what multiplying the model and view matrices on the CPU achieves, i.e. what WebGLRenderer is already doing.

@sunag (Collaborator, Author) commented Sep 3, 2024

I think the idea is to have global matrices relative to the camera position. It's not what we do today.

@WestLangley (Collaborator)

Restatement of my previous comment:

The technique proposed in this PR will only be correct when the columns of the model view matrix are orthogonal. The columns will typically not be orthogonal when, for example,

(a) a non-uniformly-scaled parent has a rotated child (see the sketch after this list),

(b) a user-provided object matrix has non-orthogonal columns.
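
A short illustration (not from the comment) of case (a) using plain three.js math classes: a non-uniformly-scaled parent with a rotated child yields a world matrix with non-orthogonal columns, and transforming a normal by that matrix directly diverges from the inverse-transpose normal matrix.

import * as THREE from 'three';

const parent = new THREE.Object3D();
parent.scale.set( 1, 3, 1 );        // non-uniformly-scaled parent

const child = new THREE.Object3D();
child.rotation.z = Math.PI / 4;     // rotated child
parent.add( child );
parent.updateMatrixWorld( true );

// Correct: inverse-transpose of the upper 3x3 of the world matrix.
const normalMatrix = new THREE.Matrix3().getNormalMatrix( child.matrixWorld );

const n = new THREE.Vector3( 0, 1, 0 );
const correct = n.clone().applyMatrix3( normalMatrix ).normalize();

// Naive: rotate the normal by the model matrix itself, which is only valid
// when its columns are orthogonal; here it gives a different direction.
const naive = n.clone().transformDirection( child.matrixWorld );

console.log( correct, naive ); // ≈ (-0.95, 0.32, 0) vs ≈ (-0.32, 0.95, 0)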

@WestLangley (Collaborator)

Maybe revisit #5974, instead.

@gkjohnson (Collaborator)

> I think the idea is to have global matrices relative to the camera position. It's not what we do today.

This is no different than calculating a model-view matrix, though, as far as I understand. The model-view matrix places the object relative to the camera. Perhaps you're imagining something different, but in order to maintain these matrices you have to multiply the existing world matrix by the inverse of the camera world matrix. You can either do that before rendering or maintain it on each object, but I'm not sure of the value of the latter, since it just makes things more difficult to maintain and removes the ability to render with multiple cameras without recalculating everything. Either way the same (if not more) matrix multiplication has to happen, and everything has to be recalculated when the camera moves.

I may need a more concrete explanation to understand the differences in what's being suggested.

@aardgoose (Contributor)

I have been experimenting with a similar idea (obviously restricted to uniform scaling), but made it an opt-in via object.static as proposed in #28719. Thus the existing, known-to-be-correct behavior is preserved, while a lighter-CPU variant is available for render bundles (to get lighting working, light uniforms need to move into a shared bindGroup, etc.).

https://github.com/aardgoose/three.js/tree/freeze2

@sunag (Collaborator, Author) commented Sep 3, 2024

This would not use a matrix multiplication; it would be a simple subtraction of the object's world-matrix position from the camera's world position, so that for the GPU the camera's world position is effectively always zero. It is certainly something else to compute on the CPU, just as viewMatrix and normalMatrix are today, and not ideal for render bundles. But since three.js currently has no dedicated API for "huge open world" scenes, and given the issues you presented in WebGLRenderer, a viewMatrix calculated on the CPU does not solve problems such as attached SkinnedMesh, InstancedMesh and probably others that world matrices relative to the camera position should resolve.
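
A minimal sketch of that subtraction (the helper name computeCameraRelativeMatrix is hypothetical; the renderer would still apply the camera's rotation separately):

import * as THREE from 'three';

const _relativeMatrix = new THREE.Matrix4();

// Copy the object's world matrix and subtract the camera's world position from
// its translation in 64-bit on the CPU, so the GPU sees the camera at the origin.
function computeCameraRelativeMatrix( object, camera, target = _relativeMatrix ) {

	target.copy( object.matrixWorld );

	target.elements[ 12 ] -= camera.matrixWorld.elements[ 12 ];
	target.elements[ 13 ] -= camera.matrixWorld.elements[ 13 ];
	target.elements[ 14 ] -= camera.matrixWorld.elements[ 14 ];

	return target;

}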

It is also noticeable that most of those issues are related to incorrect use of scale, where 1 meter should be 1.0.

I don't think there is a perfect solution here. This PR prioritizes performance while remaining functional in situations where the camera moves around 10 kilometers from the center of the scene, which seems reasonable to me; in the remaining specific cases it is usually the scene that moves instead.

We could have Nodes to deal with these situations; since TSL Fn calls are deferred, we would have no problem defining how the viewMatrix is constructed for a given object. These are other possibilities to be studied.

@sunag (Collaborator, Author) commented Sep 3, 2024

> We could have Nodes to deal with these situations; since TSL Fn calls are deferred, we would have no problem defining how the viewMatrix is constructed for a given object. These are other possibilities to be studied.

It seems like the best way to close this issue:
You can use highPrecisionModelViewMatrix globally or for specific cases, for example:

Global usage:

// global
import { highPrecisionModelViewMatrix } from 'three/tsl';

const renderer = new THREE.WebGPURenderer( { antialias: true } );
renderer.nodes.modelViewMatrix = highPrecisionModelViewMatrix; // it will replace all MVP with this modelView node

Single Material / Object

import { cameraProjectionMatrix, highPrecisionModelViewMatrix, positionLocal } from 'three/tsl';

const material = new THREE.NodeMaterial();
material.vertexNode = cameraProjectionMatrix.mul( highPrecisionModelViewMatrix ).mul( positionLocal );

highPrecisionModelViewMatrix will use the CPU and modelViewMatrix will use the GPU.
modelViewMatrix will be the default.

@sunag sunag merged commit 29cb17f into mrdoob:dev Sep 3, 2024
12 checks passed
@sunag sunag deleted the dev-performance-3 branch September 3, 2024 15:20
@WestLangley (Collaborator)

Master branch (WebGLRenderer and WebGPURenderer):

[screenshot: master]

This PR (WebGPURenderer):

[screenshot: 169dev]

This is because this PR computes incorrect normals on the GPU... not surprising, based on my comments above.

@sunag (Collaborator, Author) commented Sep 3, 2024

@WestLangley Could you share the code of this test?

@WestLangley (Collaborator)

WebGPU dev branch fiddle: https://jsfiddle.net/La1e5gmz/

@sunag (Collaborator, Author) commented Sep 3, 2024

I'm checking that out, thanks. Maybe I'll try something like that, but I still need to do some testing.

The code below is just an abstraction

const modelNormalMatrix = ( object ) => ... new Matrix3().getNormalMatrix( object.matrixWorld )
const normalView = cameraViewMatrix.transformDirection( modelNormalMatrix.mul( normal ) );
