Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strict Provenance MVP #95241

Merged
merged 15 commits into from
Mar 30, 2022
Merged

Strict Provenance MVP #95241

merged 15 commits into from
Mar 30, 2022

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Mar 23, 2022

This patch series examines the question: how bad would it be if we adopted
an extremely strict pointer provenance model that completely banished all
int<->ptr casts.

The key insight to making this approach even vaguely pallatable is the

ptr.with_addr(addr) -> ptr

function, which takes a pointer and an address and creates a new pointer
with that address and the provenance of the input pointer. In this way
the "chain of custody" is completely and dynamically restored, making the
model suitable even for dynamic checkers like CHERI and Miri.

This is not a formal model, but lots of the docs discussing the model
have been updated to try to the concept of this design in the hopes
that it can be iterated on.

See #95228

@Gankra
Copy link
Contributor Author

Gankra commented Mar 23, 2022

Hmm I probably should remove my changes to ptr examples, they should stick to "official" APIs.

@Gankra Gankra changed the title Cleaned provenance Strict Provenance MVP Mar 23, 2022
@Gankra Gankra force-pushed the cleaned-provenance branch from e172082 to 8c1bdc6 Compare March 23, 2022 15:53
@Gankra
Copy link
Contributor Author

Gankra commented Mar 23, 2022

Note that a lot of the meat of this PR is just an enormous treatise being appended to core::ptr's top-level docs, discussing the high level "concept".

@Gankra
Copy link
Contributor Author

Gankra commented Mar 23, 2022

cc @RalfJung for if you feel this new enormous treatise is acceptable to include given that it prefixed with the caveat that it is extremely overly strict, experimental, and non-normative.

// formatter takes an &Opaque. Rust understandably doesn't think we should compare
// the function pointers if they don't have the same signature, so we cast to
// pointers to convince it that we know what we're doing.
if self.formatter as *mut u8 == USIZE_MARKER as *mut u8 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i should probably revert this to not make AVR sad

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this caused by an address space mismatch in the cast since avr uses AS1 for function pointers? I.e. an invalid bitcast error?

I played around with the datalayout code a few weeks ago to improve handling of address spaces. I can try to upload that as a PR later this week and see what else needs to be done to use an addrspacecast instead of a bitcast so you don't need the inttoptr/ptrtoint workaround. Should hopefully also help for CHERI support (which was my original motivation for looking at that code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: avr-rust#143

And:

// HACK: The intermediate cast as usize is required for AVR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(fwiw validating that these changes help for AVR is probably blocked by https://reviews.llvm.org/D114611)

library/std/src/backtrace.rs Outdated Show resolved Hide resolved
@@ -57,6 +57,9 @@ pub struct DirEntry {
data: c::WIN32_FIND_DATAW,
}

unsafe impl Send for OpenOptions {}
unsafe impl Sync for OpenOptions {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was some argument as to whether this is a serious Robustness To Refactoring regression. If so, I can replace this with a wrapper type for just c::LPSECURITY_ATTRIBUTES -- I don't know if such a wrapper already exists.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Robustness To Refactoring

I searched for this term on Google and there are a total of two results, none of which have any relation to the term itself. What does this mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a specific term, just Capitalization for Emphasis. (i.e. this is a potential regression to correctness in the face of refactoring in the future, such as the addition of another !Send/!Sync field in the future.)

@Gankra
Copy link
Contributor Author

Gankra commented Mar 23, 2022

lmk know what y'all wanna do about the "AVR is weird" situation which might also be a "wasm is weird" situation

library/std/src/sys/unix/thread.rs Outdated Show resolved Hide resolved

// Try to slide in the node at the head of the linked list, making sure
// that another thread didn't just replace the head of the linked list.
let exchange_result = state_and_queue.compare_exchange(
current_state,
me | RUNNING,
me.with_addr(me.addr() | RUNNING),

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah map_addr was added to the set of APIs we want to go with after I wrote all this so it's most only using with_addr, which isn't necessarily a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get really serious with this approach, I also like to imagine further conveniences, like abbreviating p.with_addr(p.addr() | TAG) as p.tag_addr(TAG) and similar for masking, which saves writing a map_addr closure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, let's be exuberantly maximalist about encapsulating these patterns. :)

Copy link
Member

@solson solson Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(My really maximalist thought is wondering how much raw pointer usage that's just for tagging could be replaced with completely safe wrappers like Tagged<'a, T> that consider how much alignment you get from T and handle masking on access automatically. But this can happen outside the standard library, I hope.)

Edit: And I feel like I would have felt like such an interface was vaguely ~dangerous~ in the past, but the strict provenance mental model and with_addr make it way more obvious to me when something is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(My really maximalist thought is wondering how much raw pointer usage that's just for tagging could be replaced with completely safe wrappers like Tagged<'a, T> that consider how much alignment you get from T and handle masking on access automatically. But this can happen outside the standard library, I hope.)

(Is this considered bad practice to link crates which happen to be yours on rust-lang discussion? Anyway, I have a crate that provides tagged pointer support, ptr-union, though it doesn't automatically detect guaranteed alignment yet, due to stable const generic limitations. I'm currently working on making it work under strict provenance / not use inttoptr.)

@workingjubilee
Copy link
Member

workingjubilee commented Mar 23, 2022

We'll want to do a bunch of nonsense with try builds, so:
@bors delegate=Gankra
Grr.

Copy link
Member

@solson solson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more trivial nitpicks. (Well, and a typo that looks like it should fail to compile?)

library/std/src/sys/unix/thread.rs Outdated Show resolved Hide resolved
library/std/src/sys/unix/thread.rs Outdated Show resolved Hide resolved
@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 23, 2022
@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 23, 2022
@workingjubilee
Copy link
Member

Hmm, apparently I don't understand bors syntax as well as I thought.
@bors delegate+

@workingjubilee
Copy link
Member

workingjubilee commented Mar 23, 2022

...Weird. Is the bot dead?
@bors try

@Gankra
Copy link
Contributor Author

Gankra commented Mar 23, 2022

Yeah this PR seems... cursed? the CI isn't even running.

@bstrie
Copy link
Contributor

bstrie commented Mar 23, 2022

Github is having trouble it seems: https://www.githubstatus.com/

@Gankra Gankra force-pushed the cleaned-provenance branch from 8e1015b to b4e12ad Compare March 23, 2022 16:44
}
}

pub(crate) fn is_dangling<T: ?Sized>(ptr: *mut T) -> bool {
let address = ptr as *mut () as usize;
address == usize::MAX
(ptr as *mut ()).addr() == usize::MAX
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a vital question for this PR, but it makes me wonder if .addr() should work on ?Sized directly, making this just ptr.addr() == usize::MAX despite ptr being potentially wide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume historically as usize isn't allowed on DST pointers because it's a footgun in a world where we claim this roundtrips. With a clear addr method it might be fine to do this, but I'd like to be conservative against people just abusing these APIs and roundtripping anyways? Not sure I'm convinced by that argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could definitely live without it, wasn't really sure what to think.

@bors
Copy link
Contributor

bors commented Mar 23, 2022

✌️ @Gankra~~ can now approve this pull request

@bors
Copy link
Contributor

bors commented Mar 23, 2022

✌️ @Gankra can now approve this pull request

@Gankra Gankra mentioned this pull request Mar 23, 2022
13 tasks
@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 30, 2022
@bors
Copy link
Contributor

bors commented Mar 30, 2022

⌛ Testing commit e3a3afe with merge e50ff9b...

@bors
Copy link
Contributor

bors commented Mar 30, 2022

☀️ Test successful - checks-actions
Approved by: workingjubilee
Pushing e50ff9b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 30, 2022
@bors bors merged commit e50ff9b into rust-lang:master Mar 30, 2022
@rustbot rustbot added this to the 1.61.0 milestone Mar 30, 2022
@Gankra
Copy link
Contributor Author

Gankra commented Mar 30, 2022

YESSSSS

ok i will properly post a public announcement for this when it hits nightly and I can link the docs

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e50ff9b): comparison url.

Summary: This benchmark run did not return any relevant results.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

@rustbot rustbot removed the perf-regression Performance regression. label Mar 30, 2022
@Gankra Gankra added the A-strict-provenance Area: Strict provenance for raw pointers label Mar 30, 2022
sunfishcode added a commit to sunfishcode/rust that referenced this pull request Mar 31, 2022
Fix a minor typo from rust-lang#95241 which prevented compilation on x86_64-unknown-openbsd.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Mar 31, 2022
…, r=dtolnay

Fix library/std compilation on openbsd.

Fix a minor typo from rust-lang#95241 which prevented compilation on x86_64-unknown-openbsd.
@Gankra
Copy link
Contributor Author

Gankra commented Apr 5, 2022

I have published an article "properly" announcing this and further discussing the ideas behind the project and "The Tower Of Weakening": https://gankra.github.io/blah/tower-of-weakenings/

(Just wrapping up loose threads on my work)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-strict-provenance Area: Strict provenance for raw pointers merged-by-bors This PR was explicitly merged by bors. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.