Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

u32::max / u32::min require OpCapability U8 #147

Open
Firestar99 opened this issue Nov 18, 2024 · 6 comments
Open

u32::max / u32::min require OpCapability U8 #147

Firestar99 opened this issue Nov 18, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@Firestar99
Copy link
Member

Firestar99 commented Nov 18, 2024

u32::max / u32::min cannot be used as they require OpCapability U8. They should just work out of the box.

Workaround: use u32::clamp() instead.

It's the Ordering enum. it has #[repr(i8)] (with -1, 0, +1 values)

Maybe we should special case the Ordering enum to be a u32?

@Firestar99 Firestar99 added the bug Something isn't working label Nov 18, 2024
@LegNeato
Copy link
Collaborator

Wouldn't that mean it takes more space on devices that support OpCapability U8?

@Firestar99
Copy link
Member Author

Used within a shader not much would change, the registers are (usually) 32bit anyway and you'd want to do the actual comparison directly afterwards anyway to reduce it to bools, which on many platforms have more compact storage. (eg. on AMD they represent bools by bits in a scalar register instead of using vector registers.)
The bigger issue is when the Ordering enum is written to or read from a buffer, as their representation between CPU and GPU would now differ and any struct containing Ordering as a member would me misaligned.
This would open up the discussion if we should automatically widen u8 and u16 to u32 for arbirary types that are used only within the shader, and whether to bitpack them automagically when loading / storing them to a buffer.
But honestly pretty much all desktop devices support U8 and U16 loads and stores anyway...

@Firestar99
Copy link
Member Author

What about a way simpler solution: Overwrite the method {u8, u16, u32, ..., f32}::max and min to just do what f32::clamp does, which does not use the Ordinal enum at all:

pub fn clamp(mut self, min: f32, max: f32) -> f32 {
    assert!(min <= max, "min > max, or either was NaN. min = {min:?}, max = {max:?}");
    if self < min {
        self = min;
    }
    if self > max {
        self = max;
    }
    self
}

@LegNeato
Copy link
Collaborator

What do we lose with that? I assume nothing?

@Firestar99
Copy link
Member Author

Exactly, it would just make those common methods work without breaking anything else. Now everything using Ordinal directly or implementing Ord / PartialOrd would still fail to compile without OpCapability U8. But I think that's a fine tradeoff, as comparing custom types is rarely done on GPUs anyway, whereas min and maxing int / floats is used everywhere.

@LegNeato
Copy link
Collaborator

I assume the impl with Ordinal is there for a reason, so perhaps we could even switch back to that when OpCapability U8 is used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants