Sprite Batching (#3060)
This implements the following:

* **Sprite Batching**: Collects sprites in a vertex buffer to draw many sprites with a single draw call. Sprites are batched by their `Handle<Image>` within a specific z-level. When possible, sprites are opportunistically batched _across_ z-levels (when no sprites with a different texture exist between two sprites with the same texture on different z-levels). With these changes, I can now get ~130,000 sprites at 60fps on the `bevymark_pipelined` example.
* **Sprite Color Tints**: The `Sprite` type now has a `color` field. Non-white color tints result in a specialized render pipeline that passes the color in as a vertex attribute. I chose to specialize this because passing vertex colors has a measurable cost (without colors I get ~130,000 sprites on bevymark, with colors I get ~100,000 sprites). "Colored" sprites cannot be batched with "uncolored" sprites, but I think this is fine because a "colored" sprite is generally much more likely to need batching with other "colored" sprites than an "uncolored" sprite is to need batching with a "colored" sprite.
* **Sprite Flipping**: Sprites can be flipped on their x or y axis using `Sprite::flip_x` and `Sprite::flip_y`. This is also true for `TextureAtlasSprite`.
* **Simpler BufferVec/UniformVec/DynamicUniformVec Clearing**: Improved the clearing interface by removing the need to know the size of the final buffer at the initial clear.
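
The batching rule in the first bullet can be illustrated with a minimal, self-contained sketch. This is not the code from this commit: `TextureId` is a hypothetical stand-in for `Handle<Image>`, and `ExtractedSprite`, `Batch`, and `batch_sprites` are simplified names assumed for illustration. Sprites are sorted back to front, and consecutive sprites that share a texture are merged into one draw range, which also merges across z-levels when no other texture sits in between.

```rust
// Hypothetical stand-in for `Handle<Image>`; not the real Bevy type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TextureId(u32);

// Simplified illustration of an extracted sprite (assumed fields).
#[derive(Clone, Copy, Debug)]
struct ExtractedSprite {
    z: f32,
    texture: TextureId,
}

/// One draw call: a contiguous run of sorted sprites sharing a texture.
#[derive(Debug, PartialEq)]
struct Batch {
    texture: TextureId,
    range: std::ops::Range<usize>,
}

fn batch_sprites(sprites: &mut [ExtractedSprite]) -> Vec<Batch> {
    // Sort back to front. Ties on z are ordered by texture so that sprites
    // on the same z-level with the same texture end up adjacent.
    sprites.sort_by(|a, b| a.z.total_cmp(&b.z).then(a.texture.0.cmp(&b.texture.0)));

    let mut batches: Vec<Batch> = Vec::new();
    for (i, sprite) in sprites.iter().enumerate() {
        // Extend the current batch while the texture stays the same,
        // even if the z value changes.
        if let Some(last) = batches.last_mut() {
            if last.texture == sprite.texture {
                last.range.end = i + 1;
                continue;
            }
        }
        batches.push(Batch {
            texture: sprite.texture,
            range: i..i + 1,
        });
    }
    batches
}

fn main() {
    let (a, b) = (TextureId(0), TextureId(1));
    let mut sprites = vec![
        ExtractedSprite { z: 1.0, texture: a },
        ExtractedSprite { z: 0.0, texture: a },
        ExtractedSprite { z: 2.0, texture: b },
        ExtractedSprite { z: 0.0, texture: b },
        ExtractedSprite { z: 3.0, texture: b },
    ];
    for batch in batch_sprites(&mut sprites) {
        println!("texture {:?}: sprites {:?}", batch.texture, batch.range);
    }
}
```

In this example the two `b` sprites at z = 2 and z = 3 share one batch because no `a` sprite lies between them, while the `a` sprite at z = 1 forces a separate batch between the z = 0 and z = 2 `b` sprites.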

![image](https://user-images.githubusercontent.com/2694663/140001821-99be0d96-025d-489e-9bfa-ba19c1dc9548.png)


Note that this moves sprites away from entity-driven rendering and back to extracted lists. We _could_ use entities here, but it necessitates that an intermediate list is allocated / populated to collect and sort extracted sprites. This redundant copy, combined with the normal overhead of spawning extracted sprite entities, brings bevymark down to ~80,000 sprites at 60fps. I think making sprites a bit more fixed (by default) is worth it. I view this as acceptable because batching makes normal entity-driven rendering pretty useless anyway (and we would want to batch most custom materials too). We can still support custom shaders with custom bindings, we'll just need to define a specific interface for it.
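
To make the "extracted list" idea concrete, here is a rough sketch under stated assumptions. The commit does register an `ExtractedSprites` resource, but the fields and methods below (`sprites`, `extract`, `sort_back_to_front`, `texture_id`) are hypothetical and only illustrate the pattern: one list is cleared and refilled every frame and then sorted in place, so no second intermediate collection is needed.

```rust
// Simplified, assumed shape of an extracted sprite.
#[derive(Clone, Copy, Debug)]
struct ExtractedSprite {
    z: f32,
    texture_id: u32,
}

// Assumed shape of the per-frame list; the real resource may differ.
#[derive(Default)]
struct ExtractedSprites {
    sprites: Vec<ExtractedSprite>,
}

impl ExtractedSprites {
    /// Refill the list for a new frame. `Vec::clear` keeps the allocation,
    /// so steady-state frames do not reallocate.
    fn extract(&mut self, source: impl Iterator<Item = ExtractedSprite>) {
        self.sprites.clear();
        self.sprites.extend(source);
    }

    /// Sort in place, back to front, without copying into another list.
    fn sort_back_to_front(&mut self) {
        self.sprites.sort_unstable_by(|a, b| a.z.total_cmp(&b.z));
    }
}

fn main() {
    let mut extracted = ExtractedSprites::default();
    for frame in 0..2u32 {
        // Stand-in for iterating the main world's sprites.
        let world = (0..4).rev().map(|i| ExtractedSprite {
            z: i as f32,
            texture_id: frame,
        });
        extracted.extract(world);
        extracted.sort_back_to_front();
        for sprite in &extracted.sprites {
            println!("frame {}: z = {}, texture = {}", frame, sprite.z, sprite.texture_id);
        }
    }
}
```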
cart committed Nov 4, 2021
1 parent 2f22f5c commit 8548770
Showing 13 changed files with 384 additions and 236 deletions.
21 changes: 16 additions & 5 deletions examples/tools/bevymark_pipelined.rs
@@ -4,13 +4,13 @@ use bevy::{
ecs::prelude::*,
input::Input,
math::Vec3,
prelude::{App, AssetServer, Handle, MouseButton, Transform},
prelude::{info, App, AssetServer, Handle, MouseButton, Transform},
render2::{camera::OrthographicCameraBundle, color::Color, texture::Image},
sprite2::PipelinedSpriteBundle,
sprite2::{PipelinedSpriteBundle, Sprite},
window::WindowDescriptor,
PipelinedDefaultPlugins,
};
use rand::Rng;
use rand::{random, Rng};

const BIRDS_PER_SECOND: u32 = 10000;
const _BASE_COLOR: Color = Color::rgb(5.0, 5.0, 5.0);
@@ -21,6 +21,7 @@ const HALF_BIRD_SIZE: f32 = 256. * BIRD_SCALE * 0.5;

struct BevyCounter {
pub count: u128,
pub color: Color,
}

struct Bird {
@@ -52,7 +53,10 @@ fn main() {
.add_plugin(FrameTimeDiagnosticsPlugin::default())
.add_plugin(LogDiagnosticsPlugin::default())
// .add_plugin(WgpuResourceDiagnosticsPlugin::default())
.insert_resource(BevyCounter { count: 0 })
.insert_resource(BevyCounter {
count: 0,
color: Color::WHITE,
})
// .init_resource::<BirdMaterial>()
.add_startup_system(setup)
.add_system(mouse_handler)
@@ -161,6 +165,9 @@ fn mouse_handler(
// texture: Some(texture_handle),
// });
// }
if mouse_button_input.just_released(MouseButton::Left) {
counter.color = Color::rgb(random(), random(), random());
}

if mouse_button_input.pressed(MouseButton::Left) {
let spawn_count = (BIRDS_PER_SECOND as f64 * time.delta_seconds_f64()) as u128;
@@ -194,6 +201,10 @@ fn spawn_birds(
scale: Vec3::splat(BIRD_SCALE),
..Default::default()
},
sprite: Sprite {
color: counter.color,
..Default::default()
},
..Default::default()
})
.insert(Bird {
@@ -255,7 +266,7 @@ fn counter_system(
counter: Res<BevyCounter>,
) {
if timer.timer.tick(time.delta()).finished() {
println!("counter: {}", counter.count);
info!("counter: {}", counter.count);
}
}

9 changes: 4 additions & 5 deletions pipelined/bevy_core_pipeline/src/lib.rs
@@ -7,7 +7,6 @@ pub use main_pass_3d::*;
pub use main_pass_driver::*;

use bevy_app::{App, Plugin};
use bevy_asset::Handle;
use bevy_core::FloatOrd;
use bevy_ecs::{
prelude::*,
@@ -23,7 +22,7 @@ use bevy_render2::{
},
render_resource::*,
renderer::RenderDevice,
texture::{Image, TextureCache},
texture::TextureCache,
view::{ExtractedView, Msaa, ViewDepthTexture},
RenderApp, RenderStage, RenderWorld,
};
@@ -131,18 +130,18 @@ impl Plugin for CorePipelinePlugin {
}

pub struct Transparent2d {
pub sort_key: Handle<Image>,
pub sort_key: FloatOrd,
pub entity: Entity,
pub pipeline: CachedPipelineId,
pub draw_function: DrawFunctionId,
}

impl PhaseItem for Transparent2d {
type SortKey = Handle<Image>;
type SortKey = FloatOrd;

#[inline]
fn sort_key(&self) -> Self::SortKey {
self.sort_key.clone_weak()
self.sort_key
}

#[inline]
9 changes: 4 additions & 5 deletions pipelined/bevy_pbr2/src/render/light.rs
@@ -383,10 +383,7 @@ pub fn prepare_lights(
point_lights: Query<&ExtractedPointLight>,
directional_lights: Query<&ExtractedDirectionalLight>,
) {
// PERF: view.iter().count() could be views.iter().len() if we implemented ExactSizeIterator for archetype-only filters
light_meta
.view_gpu_lights
.reserve_and_clear(views.iter().count(), &render_device);
light_meta.view_gpu_lights.clear();

let ambient_color = ambient_light.color.as_rgba_linear() * ambient_light.brightness;
// set up light data for each view
@@ -605,7 +602,9 @@ pub fn prepare_lights(
});
}

light_meta.view_gpu_lights.write_buffer(&render_queue);
light_meta
.view_gpu_lights
.write_buffer(&render_device, &render_queue);
}

pub fn queue_shadow_view_bind_group(
4 changes: 4 additions & 0 deletions pipelined/bevy_render2/src/color/colorspace.rs
@@ -5,6 +5,7 @@ pub trait SrgbColorSpace {

// source: https://entropymine.com/imageworsener/srgbformula/
impl SrgbColorSpace for f32 {
#[inline]
fn linear_to_nonlinear_srgb(self) -> f32 {
if self <= 0.0 {
return self;
@@ -17,6 +18,7 @@ impl SrgbColorSpace for f32 {
}
}

#[inline]
fn nonlinear_to_linear_srgb(self) -> f32 {
if self <= 0.0 {
return self;
@@ -32,6 +34,7 @@ impl SrgbColorSpace for f32 {
pub struct HslRepresentation;
impl HslRepresentation {
/// converts a color in HLS space to sRGB space
#[inline]
pub fn hsl_to_nonlinear_srgb(hue: f32, saturation: f32, lightness: f32) -> [f32; 3] {
// https://en.wikipedia.org/wiki/HSL_and_HSV#HSL_to_RGB
let chroma = (1.0 - (2.0 * lightness - 1.0).abs()) * saturation;
@@ -60,6 +63,7 @@ impl HslRepresentation {
}

/// converts a color in sRGB space to HLS space
#[inline]
pub fn nonlinear_srgb_to_hsl([red, green, blue]: [f32; 3]) -> (f32, f32, f32) {
// https://en.wikipedia.org/wiki/HSL_and_HSV#From_RGB
let x_max = red.max(green.max(blue));
1 change: 1 addition & 0 deletions pipelined/bevy_render2/src/color/mod.rs
@@ -416,6 +416,7 @@ impl Color {
}

/// Converts a `Color` to a `[f32; 4]` from linear RBG colorspace
#[inline]
pub fn as_linear_rgba_f32(self: Color) -> [f32; 4] {
match self {
Color::Rgba {
9 changes: 4 additions & 5 deletions pipelined/bevy_render2/src/render_component.rs
@@ -92,10 +92,7 @@ fn prepare_uniform_components<C: Component>(
) where
C: AsStd140 + Clone,
{
let len = components.iter().len();
component_uniforms
.uniforms
.reserve_and_clear(len, &render_device);
component_uniforms.uniforms.clear();
for (entity, component) in components.iter() {
commands
.get_or_spawn(entity)
@@ -105,7 +102,9 @@ fn prepare_uniform_components<C: Component>(
});
}

component_uniforms.uniforms.write_buffer(&render_queue);
component_uniforms
.uniforms
.write_buffer(&render_device, &render_queue);
}

pub struct ExtractComponentPlugin<C, F = ()>(PhantomData<fn() -> (C, F)>);
34 changes: 18 additions & 16 deletions pipelined/bevy_render2/src/render_resource/buffer_vec.rs
@@ -43,17 +43,20 @@ impl<T: Pod> BufferVec<T> {
self.capacity
}

#[inline]
pub fn len(&self) -> usize {
self.values.len()
}

#[inline]
pub fn is_empty(&self) -> bool {
self.values.is_empty()
}

pub fn push(&mut self, value: T) -> usize {
let len = self.values.len();
if len < self.capacity {
self.values.push(value);
len
} else {
panic!(
"Cannot push value because capacity of {} has been reached",
self.capacity
);
}
let index = self.values.len();
self.values.push(value);
index
}

pub fn reserve(&mut self, capacity: usize, device: &RenderDevice) {
@@ -69,12 +72,11 @@ impl<T: Pod> BufferVec<T> {
}
}

pub fn reserve_and_clear(&mut self, capacity: usize, device: &RenderDevice) {
self.clear();
self.reserve(capacity, device);
}

pub fn write_buffer(&mut self, queue: &RenderQueue) {
pub fn write_buffer(&mut self, device: &RenderDevice, queue: &RenderQueue) {
if self.values.is_empty() {
return;
}
self.reserve(self.values.len(), device);
if let Some(buffer) = &self.buffer {
let range = 0..self.item_size * self.values.len();
let bytes: &[u8] = cast_slice(&self.values);
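
The new clearing contract shown in this file (clear at the start, push freely, grow only at write time) can be sketched without any GPU dependency. The following is an assumption-laden illustration, not the Bevy implementation: `FakeGpuBuffer` and `GrowOnWriteVec` are hypothetical names, the element type is fixed to `u32`, and a plain byte vector stands in for a buffer allocated through a `RenderDevice`.

```rust
// Hypothetical stand-in for a GPU buffer; the real code allocates
// through a `RenderDevice` and uploads through a `RenderQueue`.
struct FakeGpuBuffer {
    bytes: Vec<u8>,
}

#[derive(Default)]
struct GrowOnWriteVec {
    values: Vec<u32>,
    capacity: usize,
    buffer: Option<FakeGpuBuffer>,
    // Counts (re)allocations so the growth behavior is observable.
    allocations: usize,
}

impl GrowOnWriteVec {
    /// Clearing no longer needs to know the final size.
    fn clear(&mut self) {
        self.values.clear();
    }

    /// Pushing never panics on capacity; it returns the element's index.
    fn push(&mut self, value: u32) -> usize {
        let index = self.values.len();
        self.values.push(value);
        index
    }

    /// Reallocate only when the requested capacity exceeds the current one.
    /// Returns true if a new buffer was created.
    fn reserve(&mut self, capacity: usize) -> bool {
        if capacity > self.capacity {
            self.capacity = capacity;
            self.buffer = Some(FakeGpuBuffer {
                bytes: vec![0; capacity * std::mem::size_of::<u32>()],
            });
            self.allocations += 1;
            true
        } else {
            false
        }
    }

    /// Grow to fit the pushed values, then copy them into the buffer.
    fn write_buffer(&mut self) {
        if self.values.is_empty() {
            return;
        }
        self.reserve(self.values.len());
        if let Some(buffer) = &mut self.buffer {
            for (chunk, value) in buffer.bytes.chunks_exact_mut(4).zip(&self.values) {
                chunk.copy_from_slice(&value.to_le_bytes());
            }
        }
    }
}

fn main() {
    let mut vec = GrowOnWriteVec::default();
    for frame_len in [3u32, 2, 5] {
        vec.clear();
        for value in 0..frame_len {
            vec.push(value);
        }
        vec.write_buffer();
        println!("len {} -> allocations so far: {}", frame_len, vec.allocations);
    }
}
```

With frame sizes 3, 2, and 5, the sketch allocates on the first frame, reuses the buffer on the second, and reallocates on the third, which is the behavior the removed `reserve_and_clear` call previously required callers to anticipate up front.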
38 changes: 14 additions & 24 deletions pipelined/bevy_render2/src/render_resource/uniform_vec.rs
@@ -58,19 +58,12 @@ impl<T: AsStd140> UniformVec<T> {
}

pub fn push(&mut self, value: T) -> usize {
let len = self.values.len();
if len < self.capacity {
self.values.push(value);
len
} else {
panic!(
"Cannot push value because capacity of {} has been reached",
self.capacity
);
}
let index = self.values.len();
self.values.push(value);
index
}

pub fn reserve(&mut self, capacity: usize, device: &RenderDevice) {
pub fn reserve(&mut self, capacity: usize, device: &RenderDevice) -> bool {
if capacity > self.capacity {
self.capacity = capacity;
let size = self.item_size * capacity;
@@ -81,15 +74,17 @@ impl<T: AsStd140> UniformVec<T> {
usage: BufferUsages::COPY_DST | BufferUsages::UNIFORM,
mapped_at_creation: false,
}));
true
} else {
false
}
}

pub fn reserve_and_clear(&mut self, capacity: usize, device: &RenderDevice) {
self.clear();
self.reserve(capacity, device);
}

pub fn write_buffer(&mut self, queue: &RenderQueue) {
pub fn write_buffer(&mut self, device: &RenderDevice, queue: &RenderQueue) {
if self.values.is_empty() {
return;
}
self.reserve(self.values.len(), device);
if let Some(uniform_buffer) = &self.uniform_buffer {
let range = 0..self.item_size * self.values.len();
let mut writer = std140::Writer::new(&mut self.scratch[range.clone()]);
@@ -152,13 +147,8 @@ impl<T: AsStd140> DynamicUniformVec<T> {
}

#[inline]
pub fn reserve_and_clear(&mut self, capacity: usize, device: &RenderDevice) {
self.uniform_vec.reserve_and_clear(capacity, device);
}

#[inline]
pub fn write_buffer(&mut self, queue: &RenderQueue) {
self.uniform_vec.write_buffer(queue);
pub fn write_buffer(&mut self, device: &RenderDevice, queue: &RenderQueue) {
self.uniform_vec.write_buffer(device, queue);
}

#[inline]
10 changes: 5 additions & 5 deletions pipelined/bevy_render2/src/view/mod.rs
@@ -90,11 +90,9 @@ fn prepare_view_uniforms(
render_device: Res<RenderDevice>,
render_queue: Res<RenderQueue>,
mut view_uniforms: ResMut<ViewUniforms>,
mut views: Query<(Entity, &ExtractedView)>,
views: Query<(Entity, &ExtractedView)>,
) {
view_uniforms
.uniforms
.reserve_and_clear(views.iter_mut().len(), &render_device);
view_uniforms.uniforms.clear();
for (entity, camera) in views.iter() {
let projection = camera.projection;
let view_uniforms = ViewUniformOffset {
@@ -108,7 +106,9 @@ fn prepare_view_uniforms(
commands.entity(entity).insert(view_uniforms);
}

view_uniforms.uniforms.write_buffer(&render_queue);
view_uniforms
.uniforms
.write_buffer(&render_device, &render_queue);
}

fn prepare_view_targets(
9 changes: 7 additions & 2 deletions pipelined/bevy_sprite2/src/lib.rs
@@ -18,7 +18,11 @@ use bevy_app::prelude::*;
use bevy_asset::{AddAsset, Assets, HandleUntyped};
use bevy_core_pipeline::Transparent2d;
use bevy_reflect::TypeUuid;
use bevy_render2::{render_phase::DrawFunctions, render_resource::Shader, RenderApp, RenderStage};
use bevy_render2::{
render_phase::DrawFunctions,
render_resource::{Shader, SpecializedPipelines},
RenderApp, RenderStage,
};

#[derive(Default)]
pub struct SpritePlugin;
@@ -36,8 +40,9 @@ impl Plugin for SpritePlugin {
render_app
.init_resource::<ImageBindGroups>()
.init_resource::<SpritePipeline>()
.init_resource::<SpecializedPipelines<SpritePipeline>>()
.init_resource::<SpriteMeta>()
.add_system_to_stage(RenderStage::Extract, render::extract_atlases)
.init_resource::<ExtractedSprites>()
.add_system_to_stage(RenderStage::Extract, render::extract_sprites)
.add_system_to_stage(RenderStage::Prepare, render::prepare_sprites)
.add_system_to_stage(RenderStage::Queue, queue_sprites);