-
-
Notifications
You must be signed in to change notification settings - Fork 35.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WebGPURenderer: partial compute() shader support for WebGL backend #27367
Conversation
The introduction of a compute shaders fallback in WebGL is amazing. Moving ahead with this seems like a strategic step, especially considering the intricate task of aligning the WebGL and WebGPU backends. I am worried that waiting too long to merge this draft might lead to potential conflicts with future updates of the |
It seems that |
Amazing work so far. |
Maybe we should convert the I can take care of it if you want. /cc @aardgoose @sunag |
@sunag the audio example is one that can't be replicated, at least easily, because it relies on using storage buffer indexes other than InstanceIndex so you just get the same audio out that you put in. |
@RenaudRohlinger using interleaved buffers and remaining compatible with the existing webGPU storage buffer code would be difficult I suspect (I haven't looked deeply though), I presume you would have to define the storage buffers as structs in wgsl to keep buffers compatible (apart from alignment differences). |
I believe it would be doable without affecting the WebGPU side. I just gave it a try but I couldn't figure out why the buffers were not correctly bounded (no error and no result to the screen, definitely not the best type of issue). I'd say we can merge it as it is and try to improve it through interleaved array buffer with another PR. In the meantime developers can just try to be careful on how they pack their datas 😄. |
I think it's great. I think we can think about |
…rdoob#27367) * compute support * code bot fixes * add StorageBufferAttribute * adapt to storagetBufferAttribute * dynamically realign the size of storage buffer based on backend * compute examples takes too much time to init for puppeteer * Add StorageBufferAttribute.create() --------- Co-authored-by: aardgoose <[email protected]>
Should we close #25273 then? |
console.warn( 'Abstract class.' ); | ||
const gl = this.gl; | ||
|
||
this.discard = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this flag? Maybe RASTERIZER_DISCARD can be just always enabled in beginCompute and disabled in finishCompute? /ping @aardgoose @sunag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did have it that way at first, but for some reason which I can't remember that had issues. If I get a chance I will try and reproduce the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked again, the check deals with nested compute shaders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... Feels a bit strange for me. So you mean one shader starts executing before another is finished? Is this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment needs fixing. Actually the problem is that compute passes can be nested and indeed are in the compute_points example, which is where I hit this issue:
three.js/examples/webgpu_compute_points.html
Line 121 in bd33e67
renderer.compute( precomputeShaderNode().compute( particleNum ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought about this again and I think the current approach is completely equivalent to always calling enable
in compute
(the only possibility is that it could be called multiple times -- but I guess it wouldn't break?) and disable
in finishCompute
(it already does this). Can you please check this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIth nested calls you get the sequence, seen in practice in compute_points: I originally had enable in discard in beginCompute().
beginCompute()
beginCompute() <- called from onBeforeRender()
compute()
endCompute()
compute() <- discard disabled by previous endCompute()
endCompute()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compute() <- discard disabled by previous endCompute()
No -- I'm suggesting that it be enabled in compute() instead of beginCompute() (currently it is the same).
|
||
} | ||
|
||
static create( count, itemSize, typeClass = Float32Array ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of using this method we can overload the constuctor? Like so: /ping @aardgoose @sunag
constructor( array, itemSize, typeClass = Float32Array ) {
if ( ArrayBuffer.isView( arrray ) === false ) array = new typeClass( array * itemSize );
super( array, itemSize );
this.isStorageBufferAttribute = true;
}
'webgpu_portal' | ||
'webgpu_portal', | ||
|
||
// WebGPU idleTime and parseTime too low |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to increase idleTime -- I think from these two only it could be relevant here.
Adds support for compute shaders where
storageBuffers
are only accessed byinstanceIndex
.Areas that need work are a possible helper StorageAttribute subclass of
InstancedBufferAttribute
to handle different alignment characteristics of WebGL and WebGPU and coordinate system differences.With minimal changes this allows the webgpu_compute_particles, webgpu_compute_particles_rain and webgpu_compute_points examples to run.
With #27355 the webgpu_compute_audio runs but passes the original audio through unchanged, demonstrating the
instanceIndex
limitation above.This PR replaces the WebGPU storage buffers with attribute/varying pairs with transform feedback in the WebGL shaders.
(as a result buffer allocation is doubled for these buffers)
This also includes (again) the movement of VAO creation into the draw call, now with full material/geometry caching of VAOs which allows other examples to reuse VAOs correctly across objects. This is required because the VAO required is not constant for draw calls when vertex attributes from storage buffers are used.
Also fixed confusion of return of NodeBuilderState or NodeBuilder from nodes.getForCompute().