Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri should catch uses of slice::from_raw_parts on uninitialized memory #1240

Open
CAD97 opened this issue Mar 19, 2020 · 16 comments
Open

Miri should catch uses of slice::from_raw_parts on uninitialized memory #1240

CAD97 opened this issue Mar 19, 2020 · 16 comments
Labels
A-interpreter Area: affects the core interpreter C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out

Comments

@CAD97
Copy link

CAD97 commented Mar 19, 2020

[example]

let layout = Layout::array::<u8>(10).unwrap();
let ptr = alloc(layout);
slice::from_raw_parts_mut(ptr, 10);
dealloc(ptr, layout);

This would have prevented an actual issue I had where I accidentally used slice::from_raw_parts instead of ptr::slice_from_raw_parts from a version-aware import.

More generally, this is the "create reference to uninitialized memory" catch, but since these two methods have now-stable sound alternatives, it'd be nice for miri to catch incorrect usage and point at the correct raw pointer version.

@CAD97
Copy link
Author

CAD97 commented Mar 19, 2020

@RalfJung RalfJung added A-interpreter Area: affects the core interpreter C-support Category: Not necessarily a bug, but someone asking for support labels Mar 19, 2020
@RalfJung
Copy link
Member

This is currently deliberate. Quoting from the README:

In particular, Miri does currently not check that integers are initialized or that references point to valid data.

The reasoning is that I think this will complain about tons of stuff and make Miri basically unusable. Additionally, checking recursively below references will be extremely slow. The discussion in the UCG was mostly moving towards "we probably eventually want to accept that code", so I figured I'd make Miri less aggressive here.

I suppose we could offer a flag for more aggressive checking, for those that want it. However, checking transitively below references is expensive, I don't think we can reasonably do it -- and fixing this issue would require traversing below a reference.

@RalfJung RalfJung added C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out and removed C-support Category: Not necessarily a bug, but someone asking for support labels Mar 19, 2020
@oli-obk
Copy link
Contributor

oli-obk commented Mar 19, 2020

We could do it on the result of calls to const fn that take raw pointers and yield references. That should be rare enough and should at some point cover all such constructor functions

@CAD97
Copy link
Author

CAD97 commented Mar 19, 2020

Yeah, this is intended to just be for some small "easier" subset. That could be just slice::from_raw_parts with a special message to use ptr::slice_from_raw_parts instead, or it could be more general like const fn.

@RalfJung
Copy link
Member

That feels like a rather unprincipled hack to me... ad-hoc diagnostics for special cases are fine and useful, but ad-hoc changes to the basic semantics are somewhat disconcerting, IMO.

@cmazakas
Copy link

Being able to form a mutable slice to an unitialized region of memory isn't a bug and is instead a feature.

Consider networking buffers which must first allocate memory which then immediately gets read into. Forcing initialization of these bytes makes erroneous reads "safe" but it does slightly hinder perf.

@bjorn3
Copy link
Member

bjorn3 commented Jul 13, 2020

You should use &[MaybeUninit<u8>] instead. &[u8] requires all u8 elements to be initialized.

@CAD97
Copy link
Author

CAD97 commented Jul 13, 2020

Technically the validity (including initializedness) of anything behind a reference is an open question. Even if constructing a reference to uninitialized data is valid, it's still likely that performing any kind of typed copy of that reference (such as passing it to a function) asserts that the value behind the reference is valid. On top of that, even for data that is trivially dropped by forgetting, it's likely that a reference write still asserts that the value previously there is valid.

TL;DR using references to uninitialized data in a regular type is at the very best unclear if it's sound, and likely to be determined unsound as we nail down the memory and borrowing model for Rust.

Use raw pointers and/or MaybeUninit if you need to write into maybe uninitialized memory. Read::read is definitely not allowed to write into uninitialized memory with the currently understood and defined subset of Rust's memory model, even with a known trait implementation.

@RalfJung
Copy link
Member

Being able to form a mutable slice to an unitialized region of memory isn't a bug and is instead a feature.

It's a feature only if you tell the compiler that you are doing it. Just be explicit, that's all. You can be explicit by using MaybeUninit or raw pointers.

@cmazakas
Copy link

Well, I'm happy to stand corrected!

@saethlin
Copy link
Member

saethlin commented Dec 8, 2021

I think Miri flagged an example of this:
https://github.com/rust-lang/rust/blob/e6b883c74f49f32cb5d1cbad3457f2b8805a4a38/library/alloc/src/str.rs#L180-L181

        let pos = result.len();
        let target = result.get_unchecked_mut(pos..reserved_len);

(reported here: rust-lang/rust#91574)

This is only flagged with -Zmiri-tag-raw-pointers. Is this an unintentional increase in accuracy, or am I misunderstanding the above discussion?

@RalfJung
Copy link
Member

RalfJung commented Dec 9, 2021

I think Miri is flagging an unrelated aliasing problem in that code.

@kpp
Copy link

kpp commented Dec 10, 2021

@RalfJung is slice::from_raw_parts on uninitialized memory for an array of integers an actual UB? Miri does not catch it so I am trying to figure out. This one passes, this one does not.

Should a similar to calloc test with mmap(MAP_PRIVATE | MAP_ANONYMOUS) pass as well?

@RalfJung
Copy link
Member

RalfJung commented Dec 11, 2021

is slice::from_raw_parts on uninitialized memory for an array of integers an actual UB?

You mean a slice of integers?
The rules for this are not finalized yet. The Reference currently says they are UB, to keep our options open. Discussion happens at rust-lang/unsafe-code-guidelines#71 and rust-lang/unsafe-code-guidelines#77. My personal preference is that it should not be UB: integers must be initialized, but references do not recursively require anything about the memory they point to. This is also what Miri implements.

The 2nd one is definitely UB since you are comparing two uninitialized integers.

@kpp
Copy link

kpp commented Dec 11, 2021

The fist one is not UB. How about this one:

let p4 = mmap(length=4, flags=MAP_PRIVATE | MAP_ANONYMOUS, ...);
assert!(!p4.is_null());
let slice = slice::from_raw_parts(p4 as *const u8, 4);
assert_eq!(&slice, &[0_u8; 4]);

I am curious because Miri does not support mmap yet.

@CAD97
Copy link
Author

CAD97 commented Dec 11, 2021

calloc zeroes the provided memory, so it's guaranteed that the memory is initialized to 0, so creating a reference is fine.

Since you've specified MAP_ANONYMOUS, mmap initializes the contents of the memory to zero. Like calloc, this means that accessing the memory via reference is fine, because it's initizlied.

(Note that you've not specified prot, so mmap defaults to prot=PROT_NONE, and the page may not be accessed. This does violate the rules of references, and obviously the read for the equality check.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-interpreter Area: affects the core interpreter C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out
Projects
None yet
Development

No branches or pull requests

7 participants