
[Merged by Bors] - separate tonemapping and upscaling passes #3425

Conversation

@jakobhellermann (Contributor) commented Dec 23, 2021

Attempt to make features like bloom #2876 easier to implement.

This PR:

  • Moves the tonemapping from pbr.wgsl into a separate pass
  • also adds a separate upscaling pass after tonemapping which writes to the swap chain (this enables resolution-independent rendering and post-processing after tonemapping)
  • adds an hdr bool to the camera which controls whether the pbr and sprite shaders render into a Rgba16Float texture
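For illustration, a minimal sketch of opting a camera in, assuming the flag is exposed as a plain `hdr` field on the `Camera` component (Bevy 0.9-era API; names may differ from the final code):

```rust
use bevy::prelude::*;

fn setup(mut commands: Commands) {
    commands.spawn(Camera3dBundle {
        camera: Camera {
            // The main passes render into an Rgba16Float texture;
            // tonemapping and upscaling then run as separate passes.
            hdr: true,
            ..default()
        },
        ..default()
    });
}
```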

Open questions:

  • ~should the 2d graph work the same as the 3d one?~ it is the same now
  • ~The current solution is a bit inflexible: while you can add a post-processing pass that writes to e.g. the hdr_texture, you can't write to a separate user_postprocess_texture while reading the hdr_texture and tell the tonemapping pass to read from user_postprocess_texture instead. If the tonemapping and upscaling render graph nodes took a TextureView instead of the view entity this would almost work, but the bind groups for their respective input textures are already created in the Queue render stage in a hardcoded order.~ solved by creating the bind groups in the render node (see the sketch below)
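A rough sketch of that solution, with hypothetical names on the Bevy 0.9-era render-graph API: the node takes its input texture as a graph slot and builds the bind group inside `run`, so it matches whatever texture is wired in (hdr_texture, user_postprocess_texture, or anything else):

```rust
use bevy::{
    prelude::*,
    render::{
        render_graph::{Node, NodeRunError, RenderGraphContext, SlotInfo, SlotType},
        render_resource::{
            BindGroupDescriptor, BindGroupEntry, BindGroupLayout, BindingResource, Sampler,
        },
        renderer::RenderContext,
    },
};

// Assumed resource holding the pass's bind group layout and sampler.
#[derive(Resource)]
struct PostProcessPipeline {
    layout: BindGroupLayout,
    sampler: Sampler,
}

struct PostProcessNode;

impl Node for PostProcessNode {
    fn input(&self) -> Vec<SlotInfo> {
        // The node is fed a TextureView, so callers decide what it reads.
        vec![SlotInfo::new("source", SlotType::TextureView)]
    }

    fn run(
        &self,
        graph: &mut RenderGraphContext,
        render_context: &mut RenderContext,
        world: &World,
    ) -> Result<(), NodeRunError> {
        let source = graph.get_input_texture("source")?;
        let pipeline = world.resource::<PostProcessPipeline>();

        // Created per run instead of in the Queue stage, so the input is
        // never stale and no hardcoded ordering is required.
        let _bind_group = render_context
            .render_device
            .create_bind_group(&BindGroupDescriptor {
                label: Some("postprocess_input_bind_group"),
                layout: &pipeline.layout,
                entries: &[
                    BindGroupEntry {
                        binding: 0,
                        resource: BindingResource::TextureView(source),
                    },
                    BindGroupEntry {
                        binding: 1,
                        resource: BindingResource::Sampler(&pipeline.sampler),
                    },
                ],
            });

        // ...begin the render pass and draw the fullscreen triangle here...
        Ok(())
    }
}
```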

New render graph:

![render_graph](https://user-images.githubusercontent.com/22177966/147767249-57dd4229-cfab-4ec5-9bf3-dc76dccf8e8b.png)

Before:

![render_graph_old](https://user-images.githubusercontent.com/22177966/147284579-c895fdbd-4028-41cf-914c-e1ffef60e44e.png)

@github-actions bot added the S-Needs-Triage (This issue needs to be labelled) label Dec 23, 2021
@alice-i-cecile added the A-Rendering (Drawing game state to the screen) and C-Usability (A targeted quality-of-life change that makes Bevy easier to use) labels and removed the S-Needs-Triage label Dec 23, 2021
@ChangeCaps commented Dec 23, 2021

Would it maybe make sense to do post processing after the main pass driver node rather than inside the 3d sub graph?

@jakobhellermann (Contributor, Author)

> Would it maybe make sense to do post processing after the main pass driver node rather than inside the 3d sub graph?

I don't think so, because I could imagine situations where the 2d and the 3d cameras would have different post processing settings, which wouldn't work if it was after the main_pass_driver.

@jakobhellermann force-pushed the tonemapping-upscaling-passes branch 3 times, most recently from 49c5153 to 45865f1 on December 24, 2021 11:35
@jakobhellermann (Contributor, Author)

CI failure is because the breakout example has no 3d camera and doesn't write to the intermediate textures, so wgpu needs COPY_DST to zero the texture. Will be fixed either by using the same post processing in the 2d subgraph or by gfx-rs/wgpu#2307.

@jakobhellermann force-pushed the tonemapping-upscaling-passes branch 3 times, most recently from 11c326a to 294bcb1 on December 30, 2021 15:51
@jakobhellermann (Contributor, Author)

New render graph:

[image: render_graph]

The tonemapping node does nothing if hdr isn't enabled, because then the pbr.wgsl or sprite.wgsl shaders already do tone mapping.

hdr can be configured on cameras, but is automatically true for 3d and false for 2d.

@jakobhellermann force-pushed the tonemapping-upscaling-passes branch from 294bcb1 to 2d4ac35 on December 30, 2021 16:07
@superdump (Contributor) left a comment


This on-the-whole looks good to me. Just a few smaller comments really.

Resolved review threads (outdated): crates/bevy_render/src/view/mod.rs (×2), crates/bevy_core_pipeline/src/upscaling/upscaling.wgsl
BindGroupLayoutEntry {
    binding: 1,
    visibility: ShaderStages::FRAGMENT,
    ty: BindingType::Sampler(SamplerBindingType::NonFiltering),
@superdump (Contributor):

I feel like this should at least be an option like nearest-neighbour / bilinear. Then later we can add AMD FSR, NVIDIA DLSS, Intel's offering, something TAA-based as part of the TAA node perhaps, or whatever else we like.

Random curiosity that I haven't thought about before: is non-filtering practically the same as filtering with FilterMode::Nearest?
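On the curiosity: as far as wgpu's validation rules go, a `NonFiltering` sampler binding just requires that the bound sampler uses `Nearest` for every filter, so the sampled result matches a `Filtering` binding whose sampler is set to `FilterMode::Nearest`. A plain-wgpu sketch (function name hypothetical):

```rust
// A sampler valid for BindingType::Sampler(SamplerBindingType::NonFiltering).
// Binding a Linear-filtered sampler to a NonFiltering slot fails bind group
// validation; with every filter set to Nearest, the sampling behavior is the
// same under either binding type.
fn create_nearest_sampler(device: &wgpu::Device) -> wgpu::Sampler {
    device.create_sampler(&wgpu::SamplerDescriptor {
        label: Some("upscaling_nearest_sampler"),
        mag_filter: wgpu::FilterMode::Nearest,
        min_filter: wgpu::FilterMode::Nearest,
        mipmap_filter: wgpu::FilterMode::Nearest,
        ..Default::default()
    })
}
```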

@jakobhellermann (Contributor, Author):

I added an UpscalingPipelineKey with support for specifying the mode, but it's not exposed in a usable way yet.

Hdr and the upscaling mode should probably be configuration options on the render target? I'll wait until the new render target changes are finalized.
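For illustration, a sketch of the shape such a key could take (hypothetical definition, not necessarily the PR's actual `UpscalingPipelineKey`): everything the specialized pipeline varies on, hashable so the pipeline cache creates one pipeline per distinct key.

```rust
use bevy::render::render_resource::TextureFormat;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum UpscalingMode {
    Nearest,
    Linear,
}

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct UpscalingPipelineKey {
    // Nearest-neighbour vs. bilinear, per superdump's suggestion above.
    mode: UpscalingMode,
    // The format of the texture the pass writes to.
    output_format: TextureFormat,
}
```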

@superdump (Contributor):

I guess two bind groups would be needed, one filtering and one not.

[[stage(vertex)]]
fn vs_main([[builtin(vertex_index)]] in_vertex_index: u32) -> VertexOutput {
    let uv = vec2<f32>(f32((in_vertex_index << 1u) & 2u), f32(in_vertex_index & 2u));
    let position = vec4<f32>(uv * vec2<f32>(2.0, -2.0) + vec2<f32>(-1.0, 1.0), 0.0, 1.0);
@superdump (Contributor):

Is this approach preferred to a hard-coded fullscreen triangle? Seems like it requires some more operations perhaps even though both are practically insignificant performance-wise I guess.

@jakobhellermann (Contributor, Author):

I have no idea.

Is the alternative you had in mind

var vertices: array<vec3<f32>, 6> = ...;
vertices[in_vertex_index]

?

@superdump (Contributor) commented Mar 21, 2022:

I mean a single triangle that is larger than the visible normalised device coordinates in x,y and is specially crafted so that the UVs work out. I thought this would have insignificant performance impact but after looking it up, it seems it can definitely make a significant performance difference!
https://twitter.com/SebAaltonen/status/705831150019866624?s=20&t=wCGuq2EdqA2jYKx6euNkHw
https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/

In x,y, normalised device coordinates run from -1,-1 at the lower-left to 1,1 at the upper-right, and UVs are interpolated across a triangle. So: you need 3 vertices, you need to cover the NDC x,y space, and you want the UVs to run from 0,0 at the bottom-left to 1,1 at the top-right of the visible region. Note that the sides of the NDC x,y square have length 2. If you put vertices at (-1,-1), (3,-1), (-1,3) they form a right-angled triangle that covers the screen. The UVs should be (0,0), (2,0), (0,2) so that they interpolate correctly. So for vertex indices 0,1,2, that can be expressed as (unless I made mistakes):

    out.clip_position = vec4<f32>(
        f32(in.vertex_index & 1u),
        f32(in.vertex_index >> 1u),
        0.25, // z: 0.25 * 4.0 - 1.0 = 0.0, inside wgpu's [0, 1] depth range
        0.5   // w: 0.5 * 4.0 - 1.0 = 1.0
    ) * 4.0 - 1.0;
    out.uv = vec2<f32>(
        f32(in.vertex_index & 1u),
        f32(in.vertex_index >> 1u),
    ) * 2.0;

And then draw 3 vertices. The * 4.0 - 1.0 being outside the vec4 is I believe because of SIMD.

EDIT: I see now that it is already using a fullscreen triangle. I wonder why I wrote this before. And that it has UV 0,0 top-left. There are still some simplifications to make. I'll make a specific suggestion for those. Interesting to discover that the fullscreen triangle is faster though. :)

@superdump (Contributor):

Ah, commented further above. This will be a common vertex stage so I think we should add a fullscreen.wgsl or something, with just the vertex stage, and use it via an import.
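For illustration, a sketch of what that shared vertex stage could look like (the file name fullscreen.wgsl is hypothetical), reusing the UV derivation from this PR's existing pass, in the same pre-1.0 WGSL syntax:

```wgsl
struct FullscreenVertexOutput {
    [[builtin(position)]] position: vec4<f32>;
    [[location(0)]] uv: vec2<f32>;
};

// One triangle at (-1,1), (3,1), (-1,-3) in NDC covers the screen,
// with UV (0,0) at the top-left; draw it with draw(0..3, 0..1).
[[stage(vertex)]]
fn fullscreen_vertex([[builtin(vertex_index)]] in_vertex_index: u32) -> FullscreenVertexOutput {
    var out: FullscreenVertexOutput;
    out.uv = vec2<f32>(f32((in_vertex_index << 1u) & 2u), f32(in_vertex_index & 2u));
    out.position = vec4<f32>(out.uv * vec2<f32>(2.0, -2.0) + vec2<f32>(-1.0, 1.0), 0.0, 1.0);
    return out;
}
```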

@superdump (Contributor)

Any chance you could address the feedback and update this on top of main @jakobhellermann ? It would be good to get this and bloom in for 0.7. :)

@jakobhellermann force-pushed the tonemapping-upscaling-passes branch from 2d4ac35 to 1af8ae3 on March 20, 2022 12:18
@jakobhellermann force-pushed the tonemapping-upscaling-passes branch from 1af8ae3 to 60fe4f0 on March 20, 2022 12:19
@jakobhellermann marked this pull request as ready for review on March 20, 2022 12:22
@jakobhellermann force-pushed the tonemapping-upscaling-passes branch from 60fe4f0 to 21ba00d on March 20, 2022 12:25
@DGriffin91 (Contributor)

Does "using the runtime surface format" for the TextureFormat mean that using a texture as the camera RenderTarget for other formats than TextureFormat::bevy_default() like RGBA16 will work? That would be super cool.

I think disabling hdr by default makes sense. Better to be able to move forward with this than to block on figuring out the best MSAA solution.

@cart (Member) commented Oct 24, 2022

Does "using the runtime surface format" for the TextureFormat mean that using a texture as the camera RenderTarget for other formats than TextureFormat::bevy_default() like RGBA16 will work? That would be super cool.

Doing this would require specializing the pipeline on each camera's RenderTarget format. We could do this, but idk if it's worth it.

@cart (Member) commented Oct 24, 2022

Ah wait hold on. Actually it should just work! I wasn't thinking properly. The final blit in UpscalingNode will make it work "as expected".

@JMS55 (Contributor) commented Oct 24, 2022

> The biggest downside here is that the blit step becomes "required" as the main pass render pipelines are no longer aware of what the surface supports (by default).

If I understand this correctly, you mean the downside is that we can't render to the surface directly, and always need a blit? Imo, this is an upside. While developing TAA, I ran into the major issue where you can't bind+sample the camera's texture, as it could be a surface. Having it always be a texture helps greatly.


> We can revisit this once we have an alternative AA implementation and/or we implement a reversible intermediate tonemapping solution (as discussed above).

From what I understand, TAA will also need a "reversible intermediate tonemapping". It's not required, you can do TAA before/after (in this case, after, in bevy's current setup), but each choice has some downsides.

@cart (Member) commented Oct 25, 2022

> If I understand this correctly, you mean the downside is that we can't render to the surface directly, and always need a blit? Imo, this is an upside. While developing TAA, I ran into the major issue where you can't bind+sample the camera's texture, as it could be a surface. Having it always be a texture helps greatly.

Yup, the one downside is that on lower-end devices like phones, blits can have a non-trivial performance cost (from what I've heard). What that cost is... I have no clue. For now I'm cool with simplifying this case here / assuming a blit, and we can build a "fast path" if there are numbers to back it up.

@cart (Member) commented Oct 25, 2022

Just tested (1) on both web and android: they both work! I did test that using Bgra8UnormSrgb for this value does work on the web (when it used to fail), but I chose to default to Rgba8UnormSrgb everywhere instead of Bgra8UnormSrgb for maximum default compatibility (in case users ever decide to use this value for swapchains). Afaik the theory behind preferring Bgra only applies if you're writing directly to the swapchain, which we're no longer doing. And there are questions about the actual value of doing this in the first place. Better to favor maximum compatibility.

We do have one more issue we'll probably need to solve: we enable Msaa by default, but webgl (at least on my reasonably high end machine) doesn't support multisampling hdr textures (Rgba16Float). Disabling hdr by default will hide the problem / make everything continue working by default. But the second we turn hdr textures on by default, web will be broken by default (unless we change the behavior).

@cart (Member) commented Oct 25, 2022

(Just pushed the changes for (1). The commits are above my previous two messages)

@cart added this to the Bevy 0.9 milestone Oct 25, 2022
@cart (Member) commented Oct 25, 2022

Just disabled hdr by default (and added a warning to the docs describing the caveats of enabling it).

@cart (Member) commented Oct 25, 2022

@DGriffin91 Haha I take back my takeback on this:

> Doing this would require specializing the pipeline on each camera's RenderTarget format. We could do this, but idk if it's worth it.

> Ah wait hold on. Actually it should just work! I wasn't thinking properly. The final blit in UpscalingNode will make it work "as expected".

We do need to specialize the final blit pipeline (or at least, create a new pipeline for each target texture type). Right now it is hard coded to the SurfaceTextureFormat, which will fail for cameras using arbitrary formats. We can and should specialize on this. I'll follow up this PR with that change.
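A sketch of the shape that follow-up could take (hypothetical names, not the actual follow-up change): implement `SpecializedRenderPipeline` keyed on the target's `TextureFormat`, so the pipeline cache creates one blit pipeline per format actually encountered.

```rust
use bevy::render::render_resource::{
    ColorTargetState, ColorWrites, RenderPipelineDescriptor, SpecializedRenderPipeline,
    TextureFormat,
};

struct BlitPipeline {
    // Everything format-independent (shader, layout, etc.), prepared once.
    base_descriptor: RenderPipelineDescriptor,
}

impl SpecializedRenderPipeline for BlitPipeline {
    type Key = TextureFormat;

    fn specialize(&self, key: Self::Key) -> RenderPipelineDescriptor {
        let mut descriptor = self.base_descriptor.clone();
        // Write to the camera's actual target format instead of a
        // hard-coded SurfaceTextureFormat.
        descriptor.fragment.as_mut().unwrap().targets = vec![Some(ColorTargetState {
            format: key,
            blend: None,
            write_mask: ColorWrites::ALL,
        })];
        descriptor
    }
}
```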

@DGriffin91 (Contributor)

I personally only really want Rgba16Float to work, fwiw.

@cart (Member) commented Oct 25, 2022

I just got arbitrary textures working :)
(I'll still leave that for another PR though)

@cart (Member) commented Oct 25, 2022

Just fixed UI. It was broken in this PR because it was rendering directly to the swapchain / out texture. It now uses the target texture instead (post msaa resolve, post main pass, post tonemapping, pre upscaling).
However, this means that plugins inserting things into the graph with orders ambiguous with UI will need to orient themselves relative to the ui pass now. We should discuss introducing bevy_ui-agnostic points in the graph to make it possible to do this generically.

@cart (Member) commented Oct 26, 2022

bors r+

bors bot pushed a commit that referenced this pull request Oct 26, 2022
@bors bot changed the title from "separate tonemapping and upscaling passes" to "[Merged by Bors] - separate tonemapping and upscaling passes" Oct 26, 2022
@bors bot closed this Oct 26, 2022
james7132 pushed a commit to james7132/bevy that referenced this pull request Oct 28, 2022

impl FromWorld for TonemappingPipeline {
    fn from_world(render_world: &mut World) -> Self {
        let tonemap_texture_bind_group = render_world
Contributor commented:

This is a bind group layout, not a bind group, so the variable name is misleading.

@cart mentioned this pull request Nov 3, 2022
bors bot pushed a commit that referenced this pull request Nov 11, 2022
# Objective

- Closes #5262 
- Fix color banding caused by quantization.

## Solution

- Adds dithering to the tonemapping node from #3425.
- This is inspired by Godot's default "debanding" shader: https://gist.github.com/belzecue/
- Unlike Godot:
  - debanding happens after tonemapping. My understanding is that this is preferred because we are running the debanding at the last moment before quantization (`[f32, f32, f32, f32]` -> `[u8, u8, u8, u8]`). This ensures we aren't biasing the dithering strength by applying it in a different (linear) color space.
  - This code instead uses and references the original source: Valve at GDC 2015.

![Screenshot from 2022-11-10 13-44-46](https://user-images.githubusercontent.com/2632925/201218880-70f4cdab-a1ed-44de-a88c-8759e77197f1.png)
![Screenshot from 2022-11-10 13-41-11](https://user-images.githubusercontent.com/2632925/201218883-72393352-b162-41da-88bb-6e54a1e26853.png)


## Additional Notes 

Real-time rendering to standard dynamic range outputs is limited to 8 bits of depth per color channel. Internally we keep everything in full 32-bit precision (`vec4<f32>`) inside passes and 16-bit between passes until the image is ready to be displayed, at which point the GPU implicitly converts our `vec4<f32>` into a single 32-bit value per pixel, with each channel (rgba) getting 8 of those 32 bits.

### The Problem

8 bits of color depth is simply not enough precision to make each step invisible - we only have 256 values per channel! Human vision can perceive steps in luma down to about 14 bits of precision. When drawing a very slight gradient, the transitions between steps become visible because neighboring pixels all jump to the next "step" of precision at the same time.

### The Solution

One solution is to simply output in HDR - more bits of color data means the transitions between bands become smaller. However, not everyone has hardware that supports 10+ bit color depth. Additionally, 10-bit color doesn't even fully solve the issue: banding will still produce coherent bands on shallow gradients, though the steps will be harder to perceive.

The solution in this PR adds noise to the signal before it is "quantized" or resampled from 32 to 8 bits. Done naively, it's easy to add unneeded noise to the image. To ensure dithering is correct and absolutely minimal, noise is added *within* one step of the output color depth. When converting from the 32-bit to the 8-bit signal, the value is rounded to the nearest 8-bit value (0 - 255). Banding occurs around the transition from one value to the next, let's say from 50-51. Dithering will never add more than +/-0.5 steps of noise, so the pixels near this transition might round to 50 instead of 51 but will never round more than one step. This means that the output image won't have excess variance:
  - in a gradient from 49 to 51, there will be a step between each band at 49, 50, and 51.
  - Done correctly, the modified image of this gradient will never have adjacent pixels more than one step (0-255) from each other.
  - I.e. when scanning across the gradient you should expect to see:
```
                  |-band-| |-band-| |-band-|
Baseline:         49 49 49 50 50 50 51 51 51
Dithered:         49 50 49 50 50 51 50 51 51
Dithered (wrong): 49 50 51 49 50 51 49 51 50
```

![Screenshot from 2022-11-10 14-12-36](https://user-images.githubusercontent.com/2632925/201219075-ab3f46be-d4e9-4869-b66b-a92e1706f49e.png)
![Screenshot from 2022-11-10 14-11-48](https://user-images.githubusercontent.com/2632925/201219079-ec5d2add-817d-487a-8fc1-84569c9cda73.png)




You can see from above how correct dithering "fuzzes" the transition between bands to reduce distinct steps in color, without adding excess noise.
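For reference, a sketch of the Valve GDC 2015 screen-space dither described above (function name hypothetical; the shipped shader may differ slightly): one pseudo-random offset per pixel, scaled to stay inside a single 8-bit step.

```wgsl
fn screen_space_dither(frag_coord: vec2<f32>) -> vec3<f32> {
    // Per-pixel pseudo-random value from the fragment position
    // (Vlachos, "Advanced VR Rendering", GDC 2015).
    var dither = vec3<f32>(dot(vec2<f32>(171.0, 231.0), frag_coord));
    // Decorrelate the three channels.
    dither = fract(dither / vec3<f32>(103.0, 71.0, 97.0));
    // Center on zero and keep the amplitude under one step of u8 output.
    return (dither - 0.5) / 255.0;
}

// Applied after tonemapping, just before the implicit 8-bit quantization:
//     output_color = tonemapped_color + screen_space_dither(in.position.xy);
```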

### HDR

The previous section (and this PR) assumes the final output is to an 8-bit texture, however this is not always the case. When Bevy adds HDR support, the dithering code will need to take the per-channel depth into account instead of assuming it to be 0-255. Edit: I talked with Rob about this and it seems like the current solution is okay. We may need to revisit once we have actual HDR final image output.

---

## Changelog

### Added

- All pipelines now support deband dithering. This is enabled by default in 3D, and can be toggled in the `Tonemapping` component in camera bundles. Banding is a graphical artifact created when the rendered image is crunched from high precision (f32 per color channel) down to the final output (u8 per channel in SDR). This results in subtle gradients becoming blocky due to the reduced color precision. Deband dithering applies a small amount of noise to the signal before it is "crunched", which breaks up the hard edges of blocks (bands) of color. Note that this does not add excess noise to the image, as the amount of noise is less than a single step of a color channel - just enough to break up the transition between color blocks in a gradient.


Co-authored-by: Carter Anderson <[email protected]>
ItsDoot pushed a commit to ItsDoot/bevy that referenced this pull request Feb 1, 2023
ItsDoot pushed a commit to ItsDoot/bevy that referenced this pull request Feb 1, 2023
…5264)