zeroize: zeroize entire capacity of Vec
#341
zeroize/src/lib.rs
Outdated
// - The total size of the slice must be no larger than `isize::MAX`.
//   This is true, because `Vec` never allocates more than `isize::MAX` bytes.
let extra_capacity = unsafe {
    core::slice::from_raw_parts_mut(
This appears to create a slice of uninitialized memory, to give it to the `Zeroize` impl on `[Z]`, which iterates over it and performs `ptr::write_volatile`.
It might make sense instead to perform `ptr::write_volatile` here directly, followed by `atomic_fence()`, to avoid ever constructing an uninitialized slice.
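A minimal sketch of the direct approach suggested in this comment, not the code from the PR. The helper name `zeroize_spare_capacity` is hypothetical, and `compiler_fence(Ordering::SeqCst)` stands in for the `atomic_fence()` helper mentioned above, on the assumption that it serves the same purpose. Only raw pointers touch the unused capacity, so no slice or reference to uninitialized memory is ever created.

```rust
use std::mem::size_of;
use std::ptr;
use std::sync::atomic::{self, Ordering};

/// Overwrites the unused capacity of `vec` with zero bytes.
/// Hypothetical helper for illustration only.
pub fn zeroize_spare_capacity<T>(vec: &mut Vec<T>) {
    let len = vec.len();
    let spare_elems = vec.capacity().saturating_sub(len);
    // For zero-sized `T` this is 0, so the loop below does nothing.
    let spare_bytes = spare_elems * size_of::<T>();

    // SAFETY: `len <= capacity`, so the offset stays inside the allocation
    // owned by `vec` (or one past its end).
    let start = unsafe { vec.as_mut_ptr().add(len) }.cast::<u8>();

    for i in 0..spare_bytes {
        // SAFETY: `start + i` lies in the spare capacity, which `vec` owns
        // exclusively. `u8` has alignment 1 and accepts every bit pattern.
        unsafe { ptr::write_volatile(start.add(i), 0u8) };
    }

    // Keep the compiler from reordering later accesses before the writes.
    atomic::compiler_fence(Ordering::SeqCst);
}
```

The live elements (`0..len`) are left alone here; in the crate they would be handled by the element type's own `Zeroize` impl.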
Alternatively this was recently stabilized: rust-lang/rust#68234
I don't quite see how constructing a `*mut [T]` helps. Its main intent is for data structures storing a raw pointer or a `NonNull<T>` without lifetime information, but here it would still require manually iterating over the individual element pointers to do a volatile write. In particular, the implementation of `Z::zeroize(&mut self)` still cannot be called without UB.
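To make this point concrete, here is a hedged sketch. It assumes the stabilized item referred to above is `ptr::slice_from_raw_parts_mut`; the function names are hypothetical. Building the `*mut [T]` is safe and needs no initialized memory, but the pointer cannot be turned into `&mut [T]` to call `zeroize` on it, so the zeroing still has to go through element or byte pointers by hand.

```rust
use std::mem::size_of;
use std::ptr;

/// Returns a raw slice pointer over the spare capacity plus its element count.
/// Hypothetical helper for illustration only.
pub fn spare_capacity_raw<T>(vec: &mut Vec<T>) -> (*mut [T], usize) {
    let len = vec.len();
    let spare = vec.capacity() - len;
    // SAFETY: `len <= capacity`, so the result is in bounds or one past the end.
    let start = unsafe { vec.as_mut_ptr().add(len) };
    // No reference is created here, so uninitialized contents are fine.
    (ptr::slice_from_raw_parts_mut(start, spare), spare)
}

/// Zeroes the spare capacity through the raw slice pointer.
pub fn zero_through_raw_slice<T>(vec: &mut Vec<T>) {
    let (raw, spare) = spare_capacity_raw(vec);
    // `&mut *raw` would be a reference to uninitialized `T`s, which is the
    // problem under discussion, so fall back to a thin byte pointer instead.
    let bytes = raw.cast::<u8>();
    for i in 0..spare * size_of::<T>() {
        // SAFETY: every offset lies inside the spare capacity owned by `vec`.
        unsafe { ptr::write_volatile(bytes.add(i), 0u8) };
    }
}
```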
I think this looks ok now, but will run it by a few others before merging
Thanks! By the way, I just noticed that a similar thing might make sense for [...]. Oh, and I just noticed the [...]
Yes, that'd be good.
Sure, but I'd prefer a separate PR so it doesn't block this one.
let extra_capacity_start = unsafe { self.as_mut_ptr().add(self.len()) as *mut u8 };
let extra_capacity_len = self.capacity().saturating_sub(self.len());

for i in 0..(extra_capacity_len * core::mem::size_of::<Z>()) {
I think there's a simpler option than casting to a `*mut u8`, which is using `mem::zeroed` to produce the all-zero byte pattern for `Z`, and `ptr::write_volatile` to write it, e.g.
for i in 0..extra_capacity_len {
    unsafe { ptr::write_volatile(extra_capacity_start.add(i), mem::zeroed()); }
}
I considered that, but it would result in undefined behavior when `Z` is a type for which the all-zero pattern is not a valid value. However, now that I think about it again, it wouldn't make much sense to implement `Zeroize` for such a type. I'll adjust that.
Actually on second thought, if it were implemented that way, it would be possible to invoke undefined behavior from safe code. For example this would then result in undefined behavior:
use core::num::NonZeroU8;
use zeroize::Zeroize;

struct NonZero(NonZeroU8);

impl Zeroize for NonZero {
    fn zeroize(&mut self) {
        self.0 = NonZeroU8::new(1).unwrap();
    }
}

fn main() {
    let mut vec = vec![NonZero(NonZeroU8::new(2).unwrap())];
    vec.clear();
    // undefined behavior: this would create a `NonZeroU8` with a
    // memory-representation of all-zeroes while zeroing the
    // uninitialized memory
    vec.zeroize();
}
This may not be a very useful implementation of `Zeroize`, but it would be a way to invoke undefined behavior in safe code. The other version does not have this problem.
How would the uninitialized memory be exposed to safe code?
Regardless, the net effect is the same: you are writing zeroed bytes to the excess capacity.
> How would the uninitialized memory be exposed to safe code?
It doesn't need to be. According to the docs for `mem::zeroed`:
> There is no guarantee that an all-zero byte-pattern represents a valid value of some type `T`. For example, the all-zero byte-pattern is not a valid value for reference types (`&T` and `&mut T`). Using `zeroed` on such types causes immediate undefined behavior because the Rust compiler assumes that there always is a valid value in a variable it considers initialized.
Which links to a section in the docs about undefined behavior:
> It is the programmer's responsibility when writing `unsafe` code to ensure that any safe code interacting with the `unsafe` code cannot trigger these behaviors. `unsafe` code that satisfies this property for any safe client is called sound; if `unsafe` code can be misused by safe code to exhibit undefined behavior, it is unsound.
The same page also states that
> Rust code is incorrect if it exhibits any of the behaviors in the following list.
> - [...]
> - Invalid values for a type with a custom definition of invalid values. In the standard library, this affects `NonNull<T>` and `NonZero*`.
The example above would therefore be both unsound and incorrect, which is possible from safe code. The values do not need to be accessible by safe code to create the problem here.
With the method which zeroes the memory using `u8` this would not be the case, as `u8` is guaranteed to be able to hold a value of `0`.
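The distinction drawn here (writing zero bytes is fine, producing a zeroed `T` is not) can also be illustrated with `MaybeUninit`. The following is a sketch of an alternative, not what the PR implements: `MaybeUninit::<T>::zeroed()` is safe for any `T` because it never claims to hold a valid `T`, whereas `mem::zeroed::<T>()` does. The function names are hypothetical, and `compiler_fence` is an assumed stand-in for the crate's fence helper. One caveat: for types with padding, the padding bytes of the written value are not guaranteed to be zero, which the byte-wise `u8` loop avoids.

```rust
use std::mem::{size_of, MaybeUninit};
use std::num::NonZeroU8;
use std::ptr;
use std::sync::atomic::{self, Ordering};

/// Zeroes the spare capacity one `MaybeUninit<T>` slot at a time.
/// Hypothetical helper for illustration only.
pub fn zero_spare_capacity_uninit<T>(vec: &mut Vec<T>) {
    // Zero-sized types report a huge capacity and have nothing to wipe.
    if size_of::<T>() == 0 {
        return;
    }
    for slot in vec.spare_capacity_mut() {
        // `slot` is a `&mut MaybeUninit<T>`, which may hold any bytes,
        // so no value of type `T` is ever created by this write.
        // SAFETY: `slot` is a valid, aligned, exclusive reference.
        unsafe { ptr::write_volatile(slot, MaybeUninit::<T>::zeroed()) };
    }
    atomic::compiler_fence(Ordering::SeqCst);
}

/// Usage with a type whose all-zero pattern is invalid: returns the live
/// elements and the bytes found in the spare capacity afterwards.
pub fn spare_bytes_after_zeroing() -> (Vec<u8>, Vec<u8>) {
    let mut v: Vec<NonZeroU8> = Vec::with_capacity(4);
    v.push(NonZeroU8::new(7).unwrap());
    v.push(NonZeroU8::new(9).unwrap());
    v.truncate(1);
    zero_spare_capacity_uninit(&mut v);

    let live = v.iter().map(|n| n.get()).collect();
    let base = v.as_ptr().cast::<u8>();
    // SAFETY: every spare byte was just written, and they are read as `u8`,
    // never as `NonZeroU8`.
    let spare = (v.len()..v.capacity())
        .map(|i| unsafe { base.add(i).read() })
        .collect();
    (live, spare)
}
```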
> What is a better solution?
The implementation, as currently written in this PR.
It still seems like 6 of one, half dozen of another to me, except looping a byte-at-a-time and doing pointer arithmetic around `core::mem::zeroed::<NonNull<T>>()` is more complex and potentially slower.
It seems the crux of this is...
> Memory behind a raw pointer can hold any value without any problem, because Rust does not consider it to be initialized. The problem only arises when an invalid value of type T exists, which Rust considers initialized, which mem::zeroed() does.
...but we're talking about a buffer which is defined as being:
> What does a normal Vec::with_capacity call initialize it to?
> Nothing, it does not create a value of type T though and just leaves the memory completely uninitialized.
The contract of `Vec` is always to initialize this capacity in some way before reading from it.
It seems the claim is it might be considered initialized by the Rust compiler when it is uninitialized, but before its API permits any reads, it will be initialized again.
Can either of you give a concrete example of a tractable problem which is possible given the contract of `Vec` to always (re)initialize this memory which would not occur with pointer-based zeroing?
What you are doing to the excess buffer of the vector is not, and never was, the problem. You are indeed free to do whatever you want with the extra capacity bytes, but not however you want. The code that writes invalid or uninitialized values into it must still be UB-free, and executing `mem::zeroed::<T>` is not.
> Can either of you give a concrete example of a tractable problem which is possible given the contract of Vec to always (re)initialize this memory which would not occur with pointer-based zeroing?
The problem is not the `Vec` or its memory; the problem is simply that the code could call `mem::zeroed::<NonNull<()>>()`, which is UB. The compiler might make optimizations that would crash the program if such a value ever exists. This would be the same problem in every program, even without a `Vec` being involved.
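One well-known optimization of this kind can be observed from safe code. The sketch below uses hypothetical function names and only documented standard-library guarantees. It shows that `Option` adds no tag byte around `NonZeroU8` or `NonNull<()>`, because the forbidden all-zero pattern is reused to encode `None`. A `NonZeroU8` that actually contained zero would therefore be indistinguishable from `None`, which is why such a value must never exist, whether or not safe code can reach it.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;
use std::ptr::{self, NonNull};

/// True when wrapping these types in `Option` costs no extra space,
/// i.e. the invalid all-zero pattern serves as the `None` niche.
pub fn option_uses_zero_niche() -> bool {
    size_of::<Option<NonZeroU8>>() == size_of::<NonZeroU8>()
        && size_of::<Option<NonNull<()>>>() == size_of::<NonNull<()>>()
}

/// True when the safe constructors refuse to build the all-zero value.
pub fn safe_constructors_reject_zero() -> bool {
    NonZeroU8::new(0).is_none() && NonNull::<()>::new(ptr::null_mut()).is_none()
}
```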
Okay, I think I get it now. Thank you for the explanation.
I, Niclas Schwarzlose, hereby agree to license all contributions I make to this project under the terms of the Apache License, Version 2.0.
Re-implement #180, but without any additional required trait bounds; those bounds are why #180 was reverted in #276.