Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native unclipped depth on supported platforms #16095

Merged
merged 11 commits into from
Dec 3, 2024

Conversation

JMS55
Copy link
Contributor

@JMS55 JMS55 commented Oct 25, 2024

Objective

Solution

  • Rename things to clarify that we want unclipped depth for directional light shadow views, and need some way of disabling the GPU's builtin depth clipping
  • Use DEPTH_CLIP_CONTROL instead of the fragment shader emulation on supported platforms
    • Pass only the clip position depth instead of the whole clip position between vertex->fragment shader (no idea if this helps performance or not, compiler might optimize it anyways)
  • Meshlets
    • HW raster always uses DEPTH_CLIP_CONTROL since it targets a more limited set of platforms
    • SW raster was not handling DEPTH_CLAMP_ORTHO correctly, it ended up pretty much doing nothing.
    • This PR made me realize that SW raster technically should have depth clipping for all views that are not directional light shadows, but I decided not to bother writing it. I'm not sure that it ever matters in practice. If proven otherwise, I can add it.

Testing

  • Did you test these changes? If so, how?
    • Lighting example. Both opaque (no fragment shader) and alpha masked geometry (fragment shader emulation) are working with depth_clip_control, and both work when it's turned off. Also tested meshlet example.
  • Are there any parts that need more testing?
    • Performance. I can't figure out a good test scene.
  • How can other people (reviewers) test your changes? Is there anything specific they need to know?
    • Toggle depth_clip_control_supported in prepass/mod.rs line 323 to turn this PR on or off.
  • If relevant, what platforms did you test these changes on, and are there any important ones you can't test?
    • Native

Migration Guide

  • MeshPipelineKey::DEPTH_CLAMP_ORTHO is now MeshPipelineKey::UNCLIPPED_DEPTH_ORTHO
  • The DEPTH_CLAMP_ORTHO shaderdef has been renamed to UNCLIPPED_DEPTH_ORTHO_EMULATION
  • clip_position_unclamped: vec4<f32> is now unclipped_depth: f32

@JMS55 JMS55 added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times labels Oct 25, 2024
Copy link
Contributor

@pcwalton pcwalton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@JMS55 JMS55 changed the title Native depth clamp on supported platforms Native unclipped depth on supported platforms Oct 25, 2024
@JMS55
Copy link
Contributor Author

JMS55 commented Oct 25, 2024

Technically I think I should have a version of the meshlet software raster that does depth clipping for all other views besides directional light shadows, but I'm not 100% sure if this even matters in practice so I've elected to just ignore it for now.

@JMS55 JMS55 marked this pull request as ready for review October 25, 2024 06:24
@JMS55
Copy link
Contributor Author

JMS55 commented Oct 25, 2024

Ready for review. Please post perf tests, I was unable to find a good comparison. Also please double check that the fragment shader is actually omitted when the emulation is off, renderdoc was confusing when I tried to check.

@BenjaminBrienen BenjaminBrienen added S-Needs-Review Needs reviewer attention (from anyone!) to move forward D-Straightforward Simple bug fixes and API improvements, docs, test and examples D-Shaders This code uses GPU shader languages labels Oct 27, 2024
@JMS55 JMS55 added this to the 0.16 milestone Oct 30, 2024
@JMS55
Copy link
Contributor Author

JMS55 commented Oct 30, 2024

@superdump when we set the frag depth to the unclipped depth, should we be writing 1 / unclipped depth or something? Is what we have correct?

@superdump
Copy link
Contributor

Running many_cubes --benchmark --shadows on M1 Max main (yellow) vs this PR (red):
Screenshot 2024-11-18 at 11 52 42
This PR looks slightly slower. I did check that depth clip control was supported. Seems odd...

@JMS55
Copy link
Contributor Author

JMS55 commented Nov 19, 2024

:/ yeah that's what I saw... Confusing.

@Elabajaba
Copy link
Contributor

Elabajaba commented Nov 19, 2024

Noticeably faster on the GPU for shadow passes, though overall framerate doesn't improve due to still being CPU bound.

Left: 2 runs without the PR. Right: 2 runs with the PR
image

I'd expect a bigger win on iGPUs and mobile GPUs where you're way more bandwidth limited than a 6800xt at 720p lol. Number of shaded pixels dropped from 68,743,094 to 17,371,373 according to radeon gpu profiler.

Tested on bistro in scene_viewer with vsync disabled, winit update modes set to continuous, depth prepass added to the camera, camera transform Transform::from_translation(Vec3::new(-17.654163, 2.798615, -6.699705)).with_rotation(Quat::from_slice(&[0.0038612718, -0.7603651, 0.0045205816, 0.6494687])),

@JMS55
Copy link
Contributor Author

JMS55 commented Nov 19, 2024

Ok, so it doesn't improve FPS because we're still CPU bound, but on scenes with a lot of shadow fragments, we are faster now. I think we're good to merge :)

@IceSentry IceSentry added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 1, 2024
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Dec 3, 2024
Merged via the queue into bevyengine:main with commit d221665 Dec 3, 2024
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times D-Shaders This code uses GPU shader languages D-Straightforward Simple bug fixes and API improvements, docs, test and examples S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Use native depth clip control instead of manual DEPTH_CLAMP_ORTHO where possible
8 participants