[Merged by Bors] - Frustum culling #2861

superdump · 2021-09-23T11:53:33Z

Objective

Implement frustum culling for much better performance on more complex scenes. With the Amazon Lumberyard Bistro scene, I was getting roughly 15fps without frustum culling and 60+fps with frustum culling on a MacBook Pro 16 with i9 9980HK 8c/16t CPU and Radeon Pro 5500M.

macOS does weird things with vsync, so even with vsync off it looked like other applications or the desktop window compositor were sometimes interfering. The difference could be even greater, as I occasionally saw 90+fps.

Solution

- Until the Primitive Shapes RFC (bevyengine/rfcs#12) is completed, I wanted to implement at least some of the bounding volume functionality we need in order to unblock a bunch of rendering features and optimisations such as frustum culling, fitting the directional light orthographic projection to the relevant meshes in the view, clustered forward rendering, etc.
- I have added `Aabb`, `Frustum`, and `Sphere` types with only the necessary intersection tests for the algorithms used. I also added `CubemapFrusta`, which contains a `[Frustum; 6]` and can be used by cube maps such as environment maps and point light shadow maps.
  - I did a bit of benchmarking and optimisation on the intersection tests. I compared the [rafx parallel-comparison bitmask approach](https://github.com/aclysma/rafx/blob/c91bd5fcfdfa3f4d1b43507c32d84b94ffdf1b2e/rafx-visibility/src/geometry/frustum.rs#L64-L92) with a naïve loop that has an early-out when a bounding volume is outside any one of the `Frustum` planes and found them to be very similar, so I chose the simpler and more readable option. I also compared `Vec3` and `Vec3A`, and promoting `Vec3`s to `Vec3A` improved culling performance significantly because `Vec3A` operations use SIMD optimisations where `Vec3` uses plain scalar operations.
- When loading glTF models, the vertex attribute accessors generally store the minimum and maximum values, which allows for adding AABBs to meshes loaded from glTF for free.
- For meshes without an AABB (`PbrBundle` deliberately does not have an AABB by default), a system is executed that scans over the vertex positions to find the minimum and maximum values along each axis. This is used to construct the AABB.
- The `Frustum::intersects_obb` and `Sphere::intersects_obb` algorithm is from Foundations of Game Engine Development 2: Rendering by Eric Lengyel. There is no OBB type yet; rather, an AABB and the model matrix are passed in as arguments. This calculates a 'relative radius' of the AABB with respect to the plane normal (the plane normal in the `Sphere` case being something I came up with as the direction pointing from the centre of the sphere to the centre of the AABB) such that it can then do a sphere-sphere intersection test in practice.
- `RenderLayers` were copied over from the current renderer.
- `VisibleEntities` was copied over from the current renderer and a `CubemapVisibleEntities` was added to support `PointLight`s for now. `VisibleEntities` are added to views (cameras and lights) and contain a `Vec<Entity>` that is populated by culling/visibility systems that run in PostUpdate of the app world. They are then iterated over in the render world, for example to queue up meshes to be drawn by lights for shadow maps and by cameras for the main pass.
- `Visibility` and `ComputedVisibility` components were added. The `Visibility` component is user-facing so that, for example, the entity can be marked as not visible in an editor. `ComputedVisibility`, on the other hand, is the result of the culling/visibility systems and takes `Visibility` into account. So if an entity is marked as not being visible in its `Visibility` component, the culling/visibility intersection tests are skipped and `ComputedVisibility` is simply marked as false.
- The `ComputedVisibility` is used to decide which meshes to extract.
- I had to add a way to get the far plane from the `CameraProjection` in order to define an explicit far frustum plane for culling. This should perhaps be optional, as it is not always desired and in that case testing 5 planes instead of 6 is a performance win.
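As a rough illustration of the naïve early-out loop and the 'relative radius' test described in the list above, here is a minimal sketch. It assumes inward-facing plane normals and uses plain `[f32; 3]` arrays rather than `Vec3A`. The `Model` struct, the field names, and the `Vec<Plane>` storage are hypothetical stand-ins for illustration, not the types added in this PR.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Plane with an inward-facing normal (assumed convention):
/// a point `p` is on the inside when `dot(normal, p) + d > 0`.
#[derive(Clone, Copy)]
struct Plane {
    normal: Vec3,
    d: f32,
}

/// Box in model space.
struct Aabb {
    center: Vec3,
    half_extents: Vec3,
}

/// Hypothetical stand-in for the model matrix: the three world-space
/// axes (columns of the rotation/scale part) plus a translation.
struct Model {
    axes: [Vec3; 3],
    translation: Vec3,
}

impl Model {
    fn transform_point(&self, p: Vec3) -> Vec3 {
        let mut out = self.translation;
        for i in 0..3 {
            for k in 0..3 {
                out[k] += self.axes[i][k] * p[i];
            }
        }
        out
    }
}

impl Aabb {
    /// Radius of the oriented box as seen along `normal`: the sum of the
    /// projected lengths of its three half-extent vectors.
    fn relative_radius(&self, normal: Vec3, axes: &[Vec3; 3]) -> f32 {
        (0..3)
            .map(|i| dot(normal, axes[i]).abs() * self.half_extents[i])
            .sum()
    }
}

/// Holding the planes in a Vec lets the far plane be left out (5 planes).
struct Frustum {
    planes: Vec<Plane>,
}

impl Frustum {
    fn intersects_obb(&self, aabb: &Aabb, model: &Model) -> bool {
        let center = model.transform_point(aabb.center);
        for plane in &self.planes {
            let radius = aabb.relative_radius(plane.normal, &model.axes);
            // Early-out: the box lies entirely on the outside of this plane.
            if dot(plane.normal, center) + plane.d + radius <= 0.0 {
                return false;
            }
        }
        true
    }
}
```

The loop stops at the first plane that rejects the box, so most culled objects cost only one or two plane tests. Dropping the far plane simply means building the `Frustum` with five planes.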

I think that's about all. I discussed some of the design with @cart on Discord already so hopefully it's not too far from being mergeable. It works well at least. 😄
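For meshes that arrive without a precomputed AABB, the scan over vertex positions mentioned in the description could look roughly like the following. This is a sketch under assumptions: positions are given as a slice of `[f32; 3]`, and the `compute_aabb` name and centre/half-extents layout are illustrative rather than taken from the PR.

```rust
#[derive(Debug, PartialEq)]
struct Aabb {
    center: [f32; 3],
    half_extents: [f32; 3],
}

/// Scans the positions once, tracking the per-axis minimum and maximum.
/// Returns `None` for an empty mesh, which has no meaningful bounds.
fn compute_aabb(positions: &[[f32; 3]]) -> Option<Aabb> {
    let first = *positions.first()?;
    let (mut min, mut max) = (first, first);
    for p in &positions[1..] {
        for k in 0..3 {
            min[k] = min[k].min(p[k]);
            max[k] = max[k].max(p[k]);
        }
    }
    let mut center = [0.0; 3];
    let mut half_extents = [0.0; 3];
    for k in 0..3 {
        center[k] = 0.5 * (max[k] + min[k]);
        half_extents[k] = 0.5 * (max[k] - min[k]);
    }
    Some(Aabb { center, half_extents })
}
```

For glTF meshes this scan is unnecessary, since the accessor's stored minimum and maximum can be used as `min` and `max` directly.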

superdump · 2021-10-29T09:50:17Z

Updated on top of #2741 (normal maps PR.)

superdump · 2021-11-04T12:27:11Z

Updated again on top of updated #2741.


superdump · 2021-11-04T22:52:48Z

Updated on top of pipelined-rendering, which involved dropping the 'draw sprites based on VisibleEntities' commit as visibility is used to extract the sprite or not, then batching is done, and then drawing is done based on the batches.
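To make the "visibility decides what gets extracted" idea concrete, here is a small sketch of how a user-facing `Visibility` flag and a culling test might combine into `ComputedVisibility`. The `Renderable` struct, the `u32` entity ids, and the `check_visibility` signature are simplifications invented for this example. The real systems run over ECS queries.

```rust
struct Visibility {
    is_visible: bool,
}

struct ComputedVisibility {
    is_visible: bool,
}

/// Hypothetical flattened entity record, standing in for an ECS query item.
struct Renderable {
    id: u32,
    visibility: Visibility,
    computed: ComputedVisibility,
}

/// Updates `ComputedVisibility` for every renderable and returns the ids
/// that a view would store in its list of visible entities.
fn check_visibility(
    renderables: &mut [Renderable],
    mut in_frustum: impl FnMut(u32) -> bool,
) -> Vec<u32> {
    let mut visible_entities = Vec::new();
    for r in renderables.iter_mut() {
        // `&&` short-circuits, so hidden entities never reach the
        // (comparatively expensive) intersection test.
        r.computed.is_visible = r.visibility.is_visible && in_frustum(r.id);
        if r.computed.is_visible {
            visible_entities.push(r.id);
        }
    }
    visible_entities
}
```

An extraction step can then simply skip any entity whose `ComputedVisibility` is false, without consulting the per-view list.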

I only quickly tested that 2D (bevymark_pipelined) and 3D (3d_scene_pipelined, load_gltf_pipelined) work on native.

The WebGL2 testing I was doing was on top of this visibility branch, FYI. I don't think there is anything in here that prevents WebGL2 compatibility.

pipelined/bevy_render2/src/view/visibility/mod.rs

pipelined/bevy_render2/src/primitives/mod.rs

pipelined/bevy_pbr2/src/render/light.rs

cart · 2021-11-07T02:58:28Z

pipelined/bevy_pbr2/src/bundle.rs

+        self.data.iter_mut()
+    }
+}
+
 /// A component bundle for "point light" entities
 #[derive(Debug, Bundle, Default)]
 pub struct PointLightBundle {


Not all lights need to cast shadows, and given how expensive point light shadows are, it's probably worth adding a way to disable shadows for lights. This should also skip entity visibility calculations.

I have another branch for that, which lands at the point where I knew it would be needed: before clustered forward rendering. Can we add it in after #3072, or do I have to move it in here?

I'm definitely down to wait!

We can do it in whatever order works best for you :)

Then I'll wait on this one as it's coming just after depth prepass and alpha modes. Thanks. :)

cart · 2021-11-07T21:45:36Z

bors r+


bors · 2021-11-07T22:04:30Z

Pull request successfully merged into pipelined-rendering.

Build succeeded:

@cart

# Objective

Add depth prepass and support for opaque, alpha mask, and alpha blend modes for the 3D PBR target.

## Solution

NOTE: This is based on top of #2861 frustum culling. Just lining it up to keep @cart loaded with the review train. 🚂

There are a lot of important details here. Big thanks to @cwfitzgerald of wgpu, naga, and rend3 fame for explaining how to do it properly!

* An `AlphaMode` component is added that defines whether a material should be considered opaque, an alpha mask (with a cutoff value that defaults to 0.5, the same as glTF), or transparent and should be alpha blended
* Two depth prepasses are added:
  * Opaque does a plain vertex stage
  * Alpha mask does the vertex stage but also a fragment stage that samples the colour for the fragment and discards if its alpha value is below the cutoff value
  * Both are sorted front to back, not that it matters for these passes. (Maybe there should be a way to skip sorting?)
* Three main passes are added:
  * Opaque and alpha mask passes use a depth comparison function of Equal such that only the geometry that was closest is processed further, due to early-z testing
  * The transparent pass uses the Greater depth comparison function so that only transparent objects that are closer than anything opaque are rendered
  * The opaque fragment shading is as before except that alpha is explicitly set to 1.0
  * Alpha mask fragment shading sets the alpha value to 1.0 if it is equal to or above the cutoff, as defined by glTF
  * Opaque and alpha mask are sorted front to back (again not that it matters as we will skip anything that is not equal... maybe sorting is no longer needed here?)
  * Transparent is sorted back to front. Transparent fragment shading uses the alpha blending over operator

Co-authored-by: Carter Anderson <[email protected]>
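The per-mode alpha handling described above can be sketched on the CPU side as follows. This is only an illustration of the rules as stated (explicit 1.0 for opaque, discard below the cutoff for alpha mask, "over" blending for transparent). The function names `resolve_alpha` and `blend_over` are made up for the example, and the real logic lives in shaders and pipeline state.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum AlphaMode {
    Opaque,
    /// Alpha mask with a cutoff value (glTF's default is 0.5).
    Mask(f32),
    Blend,
}

/// Returns the alpha to output for a fragment, or `None` when the
/// fragment should be discarded.
fn resolve_alpha(mode: AlphaMode, sampled_alpha: f32) -> Option<f32> {
    match mode {
        // Opaque ignores the sampled alpha entirely.
        AlphaMode::Opaque => Some(1.0),
        // Alpha mask: fully opaque at or above the cutoff, discarded below.
        AlphaMode::Mask(cutoff) => {
            if sampled_alpha >= cutoff {
                Some(1.0)
            } else {
                None
            }
        }
        // Blend keeps the sampled alpha for the "over" operator.
        AlphaMode::Blend => Some(sampled_alpha),
    }
}

/// "Over" operator for a single colour channel, assuming the source
/// colour is not premultiplied by alpha.
fn blend_over(src: f32, src_alpha: f32, dst: f32) -> f32 {
    src * src_alpha + dst * (1.0 - src_alpha)
}
```

Because the "over" operator depends on what is already in the target, the transparent pass needs the back-to-front ordering mentioned above, whereas the opaque and alpha mask results are independent of draw order.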

superdump mentioned this pull request Sep 23, 2021

Renderer Rework: Initial Merge Tracking Issue #2535

Closed

64 tasks

superdump force-pushed the visibility branch from a4be430 to 83f3563 Compare September 23, 2021 21:16

inodentry added the A-Rendering, C-Feature, and S-Needs-Review labels Sep 24, 2021

superdump force-pushed the visibility branch from 83f3563 to dd8bfba Compare October 8, 2021 22:17

mockersf mentioned this pull request Oct 22, 2021

[Merged by Bors] - Freeing memory held by visible entities vector #3009

Closed

superdump force-pushed the visibility branch from dd8bfba to cc3f4be Compare October 29, 2021 09:48

superdump force-pushed the visibility branch from cc3f4be to cb33bd0 Compare November 4, 2021 12:26

superdump added 19 commits November 4, 2021 23:30

Add primitives to support frustum culling

563e396

Copy over RenderLayers from the old renderer

0371410

Add functionality to Mesh to compute an AABB

bc9faf3

Load AABBs from glTF models

b44698b

Add far() method to CameraProjection trait

dd47c76

Add visibility types and systems

087fd7b

Add VisibleEntities and Frustum to camera and light bundles as appropriate

ea546a9

Add Visibility and ComputedVisibility to entities to be drawn

b4fd252

Extract mesh/sprite entities based on ComputedVisibility

34f785c

Extract VisibleEntities for cameras

c1954bc

Enable VisibilityPlugin within the ViewPlugin

2cb2490

Remove unnecessary Camera.far member

2a4cf48

Pass the plane normal to Aabb::relative_radius()

27287cd

More is unnecessary.

Add sphere - obb intersection test

273d8fa

Add a CubeFrusta type containing 6 frusta for cube maps

69d2440

Add CubeFrustaVisibleEntities for use with cube maps

e25b1da

Add culling for lights

69d824d

Minor renaming

7c03a3d

CubeFrusta -> CubemapFrusta CubeFrustaVisibleEntities -> CubemapVisibleEntities

Unify bounded entity query into visible entity query for performance

69845d2

superdump added 3 commits November 4, 2021 23:44

Register types to support loading from glTF models

e0c88f4

Fix the logic of the point light sphere vs mesh obb test

d5c2c2c

If they do not intersect, then the mesh is not relevant for the light.

Remove unnecessary With<PointLight> in check_light_visibility query

1fe1df6

superdump force-pushed the visibility branch from cb33bd0 to 1fe1df6 Compare November 4, 2021 22:48

superdump mentioned this pull request Nov 5, 2021

[Merged by Bors] - Add support for opaque, alpha mask, and alpha blend modes #3072

Closed

alice-i-cecile mentioned this pull request Nov 5, 2021

Fix frustum culling w/ sprite transform scaling #2753

Closed

cart reviewed Nov 7, 2021

View reviewed changes

Address review comments

5727a9d

bors bot changed the title ~~Frustum culling~~ [Merged by Bors] - Frustum culling Nov 7, 2021

bors bot closed this Nov 7, 2021

alice-i-cecile mentioned this pull request Dec 28, 2021

CPU Frustum Culling #1333

Closed
