-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Moving transform data out of Mat4 #596
Conversation
@@ -29,8 +29,8 @@ fn propagate_recursive( | |||
transform_query.get::<Transform>(entity), | |||
transform_query.get_mut::<GlobalTransform>(entity), | |||
) { | |||
*global_transform.value_mut() = parent * *transform.value(); | |||
*global_transform.value() | |||
global_transform.value = *parent * transform.compute_matrix(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires recomputing the local space transform for every decedent of a parent recursively, which is unnecessary. The original impl cached it in LocalToParent
. This also can never work with on-change detection because you are forced to recompute every hierarchy in case a descendant changed. Meaning every run you must recompute local space transforms, and global space transforms for every singe entity in a hierarchy regardless of if they changed or not. Especially in development builds where static objects in a hierarchy haven't yet been backed into a GlobalTransform
, that sounds like a lot performance overhead to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I totally overlooked that. Thanks for pointing that out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the matrix is only recomputed when the transform changed. This still recomputes all the GlobalTransform
s every frame, regardless if any changes occured, but this is how the system worked before and should IMO be for a different PR to fix.
translation: Vec3::zero(), | ||
rotation: Quat::identity(), | ||
scale: Vec3::one(), | ||
matrix_cache: Some(Mat4::identity()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caching the matrix in this transform is going to result in a cache miss for any access of translation, rotation or scale. Rust automatically orders struct fields by alignment and Mat4 is 16 byte aligned due to SIMD so it will go first. It's a 64 byte struct so it will fill an entire cache line by itself. Additionally putting it inside an Option will add 15 bytes of padding to the beginning of the structure. An Option<Mat4>
will start with a discriminant byte that tells Rust if the option is Some
or None
, this byte will be aligned to the same alignment of what the option holds, so 16, thus 15 bytes of padding.
It would be more efficient to keep these cached values in a separate structure. However if you want to keep it in the same structure you would be better off getting rid of the Option
and just use a bool
to flag if there is a cache value or not.
Also glam
has a type which stores transforms as translation, rotation (as Quat
) and scale so you could use that instead of writing it yourself. It does require enabling the transform-types
feature of glam
though.
Edit: see example of size differences of Option
versus bool
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2624654e172aac09955ffb796e73404a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I don't think we should store the "matrix cache" here because the type is now "too big"
But I also want to point out that if we rip out the matrix into a separate component, we now have re-created the very "synchronization problem" that we were trying to solve in the first place with the transform rework. At that point, why not use individual components like @AThilenius's initial implementation used? Then we get the benefits of "better cache perf and easier access to individual parts of a transform".
I think its very hard to justify a Transform { pos/rot/scale } component w/ a matrix cache (either in another component or embedded inside Transform). Unless I'm missing something, we should either eat the "mat4 creation cost" at the call site whenever a full 4x4 matrix is required, or we should just move back to the old system (or something like it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I like @termhn's idea to use Similarities and multiply those directly whenever possible. We only really need a 4x4 when writing data used by shaders, which only needs to happen once per rendered GlobalTransform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that Transform { pos/rot/scale }
(with no matrix cache) is a Similarity (in glam known as a TransformSRT
). But I agree, the transformation of that to a matrix should only happen as @cart described
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are some very good points.
Unless I'm missing something, we should either eat the "mat4 creation cost" at the call site whenever a full 4x4 matrix is required, or we should just move back to the old system (or something like it).
I think that is the main question we have to decide on. I'm personally against separate components, only for the fact that with a transform we can take more work away from the user (the reason being explained in the lower part of my comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I think I would prefer a custom Transform type with the relevant methods/traits implemented instead of new-typing a math-lib type. It makes direct property access nicer and this logic isnt particularly complicated.
Lets wait a day or so for someone to chime in with counterpoints. Otherwise I'm inclined to move forward on
struct Transform {
pub translation: Vec3,
pub rotation: Quat,
pub scale: Vec3,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll revert the last commit so everyone checking in is on the same page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I think the biggest missing piece at this point is implementing Transform x Transform = Transform and using that when calculating GlobalTransform. To get the most out of this, I think we should also make GlobalTransform a similarity (ideally not a newtype to allow convenient direct field access). But that then creates the issue of "how do we convert GlobalTransform to matrix bytes when creating uniform buffers?". I think "data-driven uniforms" need a better story for "this type should be converted to this other byte-convertible type when used as a uniform". The current system assumes the type is directly byte-convertible.
In the short term I think something like this will work:
#[derive(RenderResources, RenderResource)]
#[render_resources(from_self)]
pub struct GlobalTransform {
pub translation: Vec3,
pub rotation: Quat,
pub scale: Vec3,
}
impl Bytes for GlobalTransform {
fn write_bytes(&self, buffer: &mut [u8]) {
// convert self to matrix and write bytes to buffer
}
fn byte_len(&self) -> usize {
return std::mem::size_of::<[f32; 16]>();
}
}
I think it also makes sense to remove things like the "rotate" and "apply_scale" helper methods in favor of direct field access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fully agree with the first part.
pub translation: Vec3,
pub rotation: Quat,
pub scale: Vec3,
I'm not quite sold on the attributes being public though, yet. The way I'm perceiving GlobalTransform
, the developer shouldn't / shouldn't need to mutate it themselves, or am I forgetting something?
I think it also makes sense to remove things like the "rotate" and "apply_scale" helper methods in favor of direct field access.
I think this comes down to the question "how much should the developer know to use the engine". Most people should find it trivial to apply one Quat
to another, but I think the few added lines of code are a worthwhile price for not requiring that sort of knowledge. It could be, however, that I'm reading too much into the priorities of this project, in which case I'd be totally fine with that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sold on the attributes being public though, yet. The way I'm perceiving GlobalTransform, the developer shouldn't / shouldn't need to mutate it themselves, or am I forgetting something?
I see no harm in making them public in this case. First, it provides parity with Transform (which I think is the biggest win). People can access "translation" the same way on both components. Different apis would increase the cognitive load. Additionally, there may be cases where people want to directly modify GlobalTransform (ex: given that the engine internally uses GlobalTransform as the "actual transform of the entity", there might eventually be a case where we have an entity that only has GlobalTransform).
I think this comes down to the question "how much should the developer know to use the engine". Most people should find it trivial to apply one Quat to another, but I think the few added lines of code are a worthwhile price for not requiring that sort of knowledge. It could be, however, that I'm reading too much into the priorities of this project, in which case I'd be totally fine with that change.
I think I'd like to keep it simple for now. Every api we add has a cost. If it turns out that this is a real usability gap, we can always layer it on top.
} | ||
|
||
pub fn from_scale(scale: f32) -> Self { | ||
Transform::new(Mat4::from_scale(Vec3::splat(scale))) | ||
Transform { | ||
scale: Vec3::one() * scale, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could do Vec3::splat(scale) here so it's just assignment with no multiplication.
self | ||
/// Returns transform with the same translation and scale, but rotation so that transform.forward() points at the origin | ||
pub fn looking_at_origin(self) -> Self { | ||
self.looking_at(Vec3::zero(), Vec3::unit_y()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an aside it might pay for bevy to have some constants for UP
, FORWARD
and SIDE
(or RIGHT
) axes, might make it clearer what the coordinate system is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added support for creating constants in glam
but I need to publish a new release. It's a bit tricky because underlying intrinsics aren't const fn
so macros have to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh thats cool / much appreciated 😄
I'd like to include this in the next Bevy release (which should be in the next week or so). This is pretty close to being ready. @MarekLg do you want me to push the last few requested changes needed or would you prefer to do it? Don't feel pressured to do work you don't have time for. You've already done a ton of good work here ❤️ |
Feel free to make the final changes. I don't have much time this weekend, sadly. But thanks for the kind words, looking forward to more contributions! |
This reverts commit 6da85b3.
Alrighty I made GlobalTransform a similarity, implemented math ops (using @termhn's Similarity impl as a reference), and removed some methods from Transform/GlobalTransform to match my personal taste. I'll leave this open for a day or two in case someone wants to comment on my changes, then its merge time! |
Transform and GlobalTransform are now Similarities. This resolves precision errors and simplifies the api Co-authored-by: Carter Anderson <[email protected]>
Concerning #229.
As pointed out @AThilenius, having mutable transform data stored in a matrix results in rotation overlapping with scale and therefore affecting each other over time (see also this article).
This PR aims to resolve only this issue.
Any feedback and propositions for changes are very welcome!