Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed large swaths of unsoundness in nalgebra. #949

Merged
merged 34 commits into from
Aug 4, 2021
Merged

Conversation

vihdzp
Copy link
Contributor

@vihdzp vihdzp commented Jul 18, 2021

Uninitialized data buffers are used throughout the nalgebra codebase. As per the Rust docs, the correct way to use an uninitialized buffer would be to declare a Storage<MaybeUninit<T>, _, _>, initialize all of its entries, and then somehow convert it into a Storage<T, _, _>. However, nalgebra skips the MaybeUninit step altogether. This isn't much of a problem when T is a primitive numeric type (under the current compiler!), but if T has invalid bit combinations or needs to be dropped, this can cause serious trouble. Consequently, most code in nalgebra can currently trigger UB. 😰

This PR aims to solve this issue. It will be marked as a Draft until we rigorously check safety requirements throughout the new code, make the transition from the previous code simpler, polish some negative consequences of this change, and until the other nalgebra crates are rid of these same issues.

Changes

Solving this issue required an extensive rework of some core traits of the library. I explain some of the more important ones:

Scalar trait redefinition

Previously, Scalar served as the monolithic trait that bounded the data type in every single data structure in nalgebra. It had the following traits as sub-traits:

pub trait Scalar: Clone + PartialEq + Debug + Any { /* ... */ }

This was immediately problematic, since MaybeUninit<T> does not implement PartialEq, and only implements Clone when T: Copy.

To get around this, Scalar was demoted from being a universally required trait to merely being a convenient marker for methods with clear numerical purpose. That is, T: Scalar is now only required for numerical methods on matrices, and is no longer required for methods that merely treat matrices as a data storage.

The PartialEq bound was removed, as the vast majority of methods make no use of it (and those that did could simply use a trait bound). This restuls in the current Scalar trait:

pub trait Scalar: 'static + Clone + Debug { /* ... */ }

The 'static bound is currently only used to optimize matrix multiplication on floating point numbers (specifically gemm), via TypeId::of. If we wanted to, we could replace this use by associated methods is_f32 and is_f64. The remaining trait bounds are essential, since there's very little we can numerically operate on a matrix without cloning (unless we use some highly unsafe moves), and since debugging generic code would be almost impossible without the Scalar: Debug bound.

Allocator trait redefinition

The Allocator trait allows for matrix storage allocation to be specialized, depending on whether any of its dimensions are Dynamic. Formerly, it consisted only of a Buffer associated type, and an allocate_from_iterator associated method. This simpler trait has been renamed InnerAllocator.

The new Allocator trait looks like this:

pub trait Allocator<T, R: Dim, C: Dim = U1>:
    InnerAllocator<T, R, C>
    + InnerAllocator<MaybeUninit<T>, R, C>
    + InnerAllocator<ManuallyDrop<T>, R, C>
{ /* ... */ }

Using this more specific trait is almost never an issue, and is in fact encouraged, as it makes code future-proof while not actually changing the types you can use your method with.

Various methods for uninitialized buffers

The following methods have been added or reworked for convenience of working with uninitialized data storage.

  • new_uninitialized_generic: allocates a matrix with an owned uninitialized storage.
  • assume_init: retrieves a matrix with an initialized storage out of an uninitialized one.
  • assume_init_ref / assume_init_mut: same as above, but don't take ownership of the matrix and return a full slice of it instead.

Other minor changes

There's a few other changes I made and methods I implemented, mostly out of convenience.

  • Added various safety remarks on new and existing code.
  • Added methods to take matrix transposes from an owned matrix.
  • Added a convenience method to build a "full slice" out of matrix.
  • Expanded the documentation of certain methods and types.
  • Used unsafe typecasts to get rid of a few clones here and there.
  • Switched a few #[repr(C)] into #[repr(transparent)], got rid of others altogether.

@vihdzp
Copy link
Contributor Author

vihdzp commented Jul 18, 2021

In the current implementation, we've lost the capability to use #[derive(Copy, Clone, Debug)] on any type containing an owned matrix. I'm currently thinking of workarounds, or perhaps a more radical solution to overhaul the Owned type itself.

@sebcrozet
Copy link
Member

sebcrozet commented Aug 2, 2021

Thank you a lot @OfficialURL for all your efforts on this. You have done a fantastic job exploring the various approaches. Since you told me you may no longer have much time for this, I am taking over as agreed.

The single Allocator trait and two storage traits approach

From all the approaches @OfficialURL explored, it became apparent that all the possible solutions have shortcomings. And among these solutions the one that does not have a negative impact on the ergonomics of the API involving initialized matrices is the approach involving a single allocator trait with two buffer associated types. I have been experimenting with this and it works well. It looks like this:

pub trait Allocator<T, R: Dim, C: Dim = U1>: Any + Sized {
    type Buffer: StorageMut<T, R, C> + IsContiguous + Clone + Debug;
    type BufferUninit: RawStorageMut<MaybeUninit<T>, R, C> + IsContiguous;

    fn allocate_uninit(nrows: R, ncols: C) -> Self::BufferUninit;
    unsafe fn assume_init(uninit: Self::BufferUninit) -> Self::Buffer;
    fn allocate_from_iterator<I: IntoIterator<Item = T>>(
        nrows: R,
        ncols: C,
        iter: I,
    ) -> Self::Buffer;
}

We can see a few new things here: RawStorageMut and IsContiguous.
Because non-initialized matrices can’t be cloned, storages containing MaybeUninit<T> elements can’t implement the Storage::into_owned and Storage::clone_owned methods. That’s why I split the Storage trait into two parts:

  • RawStorage that contains everything our existing Storage trait has right now, except for ::clone_owned and into_owned.
  • Storage that inherits from RawStorage and adds the ::clone_owned and ::into_owned methods.

From the en-user point of view, this should not have no impact on pre-existing code: they will continue to use Storage normally. And this allows us to use RawStorage whenever we are using a matrix that may be uninitialized.

The IsContiguous trait replaces what was before ContiguousStorageMut and ContiguousStorage. For example our current ContiguousStorage trait has been removed and replaced by ContiguousStorage + IsContiguous. This may make some pre-existing code more verbose, so we can re-add the ContiguousStorage trait in the future if that can avoid unwanted downstream breakage. @Andlon please let me know what you thing about this.

I have pushed a commit making these changes. I also removed from the previous commits everything that wasn’t directly related to fixing our unsoundness problem. So the Scalar trait is back to the way it was before (with the RelativeEq inheritance), and Scalar is used in more places than it could be. I did this to keep the diff smaller, so we can focus more on the main purpose of this PR.

About blas.rs/blas_uninit.rs

I simplified the blas.rs modification made by the previous commit by:

  • Introducing a way to re-use the same code for the case where output matrix has T or MaybeUninit<T> argument. This avoids lots of code duplication.
  • Implementing the uninit version of gemm (and the other blas operations it uses) only. This is the only blas method we currently used with an uninitialized buffer.

About linalg

The changes needed to use uninit matrices properly wherever we currently use them was too complicated to get right, error prone, and difficult to read. So I ran some benchmark and noticed that:

  • Using uninit matrices for output matrices have a benefit.
  • Using uninit matrices for workspaces have no benefit.

This makes sense because workspaces are read/written a lot, making the initialization cost negligible.

This is great because uninit workspaces are the ones that are very difficult to use properly because they are passed around to multiple functions, are read/written multiple times, etc. So avoiding things like double-initialization is very difficult.
Uninit output matrices on the other hand are very easy to manipulate because we only write to them once.

So in the end, I used zero-initialized matrices for workspaces, and uninit matrices for outputs.

Limitation

The main limitation of this approach is that Allocator::UninitBuffer may not unify correctly with Matrix<MaybeUninit<T>,...>. However, this didn’t prove very problematic in practice to make everything work.

@Andlon
Copy link
Collaborator

Andlon commented Aug 3, 2021

I would like to try out this branch on my code base shortly.

In the meantime, I have a question about linalg outputs: what if you want to store the result in an initialized buffer? I do this all the time to ensure no/fewer allocations when doing repeated operations.

@sebcrozet
Copy link
Member

sebcrozet commented Aug 3, 2021

In the meantime, I have a question about linalg outputs: what if you want to store the result in an initialized buffer? I do this all the time to ensure no/fewer allocations when doing repeated operations.

The public API of the linalg module hasn’t changed so what you are currently doing will continue to work.

@Andlon
Copy link
Collaborator

Andlon commented Aug 3, 2021

In the meantime, I have a question about linalg outputs: what if you want to store the result in an initialized buffer? I do this all the time to ensure no/fewer allocations when doing repeated operations.

The public API of the linalg module hasn’t changed so what you are currently doing will continue to work.

Ah, thanks for clarifying! I tried to quickly zoom through the commit to figure it out, but it was hard to get an overview from a brief glance.

@sebcrozet
Copy link
Member

The CI passes now. The main remaining problem are the uses of ptr::copy and ptr::copy_nonoverlapping which are sometimes problematic for non-copy types (may lead to double-drop or missing drops).

@Andlon
Copy link
Collaborator

Andlon commented Aug 3, 2021

All right, so with some minor fixes only related to the changes to .zip_apply and similar, my code compiles without any changes to trait bounds or similar, so that's great :-)

I looked through the docs a little. Shouldn't methods like Allocator::allocate_uninitialized (in its current form) be removed? Similarly, shouldn't methods like OMatrix::new_uninitialized_generic also be changed to return something like UninitMatrix<T, R, C>?

@Andlon
Copy link
Collaborator

Andlon commented Aug 3, 2021

By the way, one slight "regression" in the linalg APIs. IIRC, before you could have an uninitialized Matrix and call e.g. .gemm() on it with beta == 0, since it's guaranteed that self was never read in that case. But this is no longer possible, is it? So you'd need to fully initialize the output in this case?

EDIT: This isn't necessarily a big deal (for me certainly not), but I thought it worth mentioning.

@sebcrozet
Copy link
Member

@Andlon

I looked through the docs a little. Shouldn't methods like Allocator::allocate_uninitialized (in its current form) be removed? Similarly, shouldn't methods like OMatrix::new_uninitialized_generic also be changed to return something like UninitMatrix<T, R, C>?

Yes, I’ve removed them this morning. I’m finishing cleaning everything up today.

By the way, one slight "regression" in the linalg APIs. IIRC, before you could have an uninitialized Matrix and call e.g. .gemm() on it with beta == 0, since it's guaranteed that self was never read in that case. But this is no longer possible, is it? So you'd need to fully initialize the output in this case?

It’s not really a regression because passing an uninitialized Matrix was unsound in the first place (even though it didn’t cause any trouble in practice). If someone really wants to use an uninitialized matrix, they will be able to use the gemm_uninit function which can take an uninit matrix as the y mutable argument. However, that method is unsafe (because passing an uninit watrix with beta != 0 would be unsound).

@Andlon
Copy link
Collaborator

Andlon commented Aug 3, 2021

That's excellent, thanks for clearing that up :)

@sebcrozet sebcrozet marked this pull request as ready for review August 3, 2021 15:44
@sebcrozet sebcrozet merged commit 71ceb1f into dimforge:dev Aug 4, 2021
@Andlon
Copy link
Collaborator

Andlon commented Aug 4, 2021

Great work once again with spearheading the initiative to remove this source of unsoundness in the design of nalgebra, @OfficialURL! 🥳 This was only made possible by your relentless exploration of the different design candidates. And really awesome that you could take the time to help push it over the finishing line @sebcrozet!

It's been an open issue for such a long time, it's amazing to see this fixed! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants