Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use default rather than hard-coded 8 for maximum aggregate member alignment #1378

Merged
merged 1 commit into from
Jan 12, 2022

Conversation

jrtc27
Copy link
Contributor

@jrtc27 jrtc27 commented Jan 4, 2022

On CHERI, and thus Arm's Morello prototype, pointers are represented as
hardware capabilities. These capabilities are comprised of not just an
integer address, as is the representation for traditional pointers, but
also bounds, permissions and other metadata, plus a tag bit used as the
validity bit, which provides fine-grained spatial and referential safety
for C and C++ in hardware. This tag bit is not part of the data itself
and is instead kept on the side, flowing with the capability between
registers and the memory subsystem, and any attempt to amplify the
privilege of or corrupt a capability clears this tag (or, in some cases,
traps), rendering them impossible to forge; you can only create
capabilities that are (possibly trivial) subsets of existing ones.

When the capability is stored in memory, this tag bit needs to be
preserved, which is done through the use of tagged memory. Every
capability-sized word gains an additional non-addressable (from the
CPU's perspective; depending on the implementation the tag bits may be
stored in a small block of memory carved out of normal DRAM that the CPU
is blocked from accessing) bit. This means that capabilities can only be
stored to aligned locations; attempting to store them to unaligned
locations will trap with an alignment fault or, if you end up using a
memcpy call, will copy the raw bytes of the capability's representation
but lose the tag, so when it is eventually loaded back as a capability
and dereferenced it will fault.

Since, on 64-bit architectures, our capabilities, used to implement C
language pointers, are 128-bit quantities, this means they need 16-byte
alignment. Currently the various #pragma pack directives, used to work
around (extremely broken and bogus) code that includes jsoncpp in a
context where the maximum alignment has been overridden, hard-code 8 as
the maximum alignment to use, and so do not sufficiently align CHERI /
Morello capabilities on 64-bit architectures. On Windows x64, the
default is also not 8 but 16 (ARM64 is supposedly 8), so this is
slightly dodgy to do there too, but in practice likely not an issue so
long as you don't use any 128-bit types there.

Instead of hard-coding a width, use a directive that resets the packing
back to the default. Unfortunately, whilst GCC and Clang both accept
using #pragma pack(push, 0) as shorthand like for any non-zero value,
MSVC does not, so this needs to be two directives.

…gnment

On CHERI, and thus Arm's Morello prototype, pointers are represented as
hardware capabilities. These capabilities are comprised of not just an
integer address, as is the representation for traditional pointers, but
also bounds, permissions and other metadata, plus a tag bit used as the
validity bit, which provides fine-grained spatial and referential safety
for C and C++ in hardware. This tag bit is not part of the data itself
and is instead kept on the side, flowing with the capability between
registers and the memory subsystem, and any attempt to amplify the
privilege of or corrupt a capability clears this tag (or, in some cases,
traps), rendering them impossible to forge; you can only create
capabilities that are (possibly trivial) subsets of existing ones.

When the capability is stored in memory, this tag bit needs to be
preserved, which is done through the use of tagged memory. Every
capability-sized word gains an additional non-addressable (from the
CPU's perspective; depending on the implementation the tag bits may be
stored in a small block of memory carved out of normal DRAM that the CPU
is blocked from accessing) bit. This means that capabilities can only be
stored to aligned locations; attempting to store them to unaligned
locations will trap with an alignment fault or, if you end up using a
memcpy call, will copy the raw bytes of the capability's representation
but lose the tag, so when it is eventually loaded back as a capability
and dereferenced it will fault.

Since, on 64-bit architectures, our capabilities, used to implement C
language pointers, are 128-bit quantities, this means they need 16-byte
alignment. Currently the various #pragma pack directives, used to work
around (extremely broken and bogus) code that includes jsoncpp in a
context where the maximum alignment has been overridden, hard-code 8 as
the maximum alignment to use, and so do not sufficiently align CHERI /
Morello capabilities on 64-bit architectures. On Windows x64, the
default is also not 8 but 16 (ARM64 is supposedly 8), so this is
slightly dodgy to do there too, but in practice likely not an issue so
long as you don't use any 128-bit types there.

Instead of hard-coding a width, use a directive that resets the packing
back to the default. Unfortunately, whilst GCC and Clang both accept
using #pragma pack(push, 0) as shorthand like for any non-zero value,
MSVC does not, so this needs to be two directives.
@BillyDonahue BillyDonahue self-requested a review January 4, 2022 15:46
@BillyDonahue
Copy link
Contributor

thanks. seems ok on paper. Does the jsoncpp code ever rely on the quantity being exactly 8? with pointer arithmetic or other kludgy tricks?

@jrtc27
Copy link
Contributor Author

jrtc27 commented Jan 4, 2022

It shouldn't do; that'd break on 32-bit architectures, since those would still only use 4 not 8 for the alignment given the 8 here is just a cap, not a value to force. I ran all the tests on Morello and they all pass, too (well, other than the ones that rely on C++ exceptions, something's broken in libunwind/libcxxrt/the toolchain I need to go hunt down...).

@jrtc27
Copy link
Contributor Author

jrtc27 commented Jan 12, 2022

What's this waiting on, please? A review from someone else?

@BillyDonahue
Copy link
Contributor

Yeah, I was giving the other maintainers a chance to look at it before merging it.
Then I forgot about it. We can merge it now.

@BillyDonahue BillyDonahue merged commit 42e892d into open-source-parsers:master Jan 12, 2022
@jrtc27
Copy link
Contributor Author

jrtc27 commented Jan 12, 2022

Great, thanks

@jrtc27 jrtc27 deleted the cheri branch January 12, 2022 21:31
ccwanggl pushed a commit to CLNBG/jsoncpp that referenced this pull request Jul 7, 2023
…gnment (open-source-parsers#1378)

On CHERI, and thus Arm's Morello prototype, pointers are represented as
hardware capabilities. These capabilities are comprised of not just an
integer address, as is the representation for traditional pointers, but
also bounds, permissions and other metadata, plus a tag bit used as the
validity bit, which provides fine-grained spatial and referential safety
for C and C++ in hardware. This tag bit is not part of the data itself
and is instead kept on the side, flowing with the capability between
registers and the memory subsystem, and any attempt to amplify the
privilege of or corrupt a capability clears this tag (or, in some cases,
traps), rendering them impossible to forge; you can only create
capabilities that are (possibly trivial) subsets of existing ones.

When the capability is stored in memory, this tag bit needs to be
preserved, which is done through the use of tagged memory. Every
capability-sized word gains an additional non-addressable (from the
CPU's perspective; depending on the implementation the tag bits may be
stored in a small block of memory carved out of normal DRAM that the CPU
is blocked from accessing) bit. This means that capabilities can only be
stored to aligned locations; attempting to store them to unaligned
locations will trap with an alignment fault or, if you end up using a
memcpy call, will copy the raw bytes of the capability's representation
but lose the tag, so when it is eventually loaded back as a capability
and dereferenced it will fault.

Since, on 64-bit architectures, our capabilities, used to implement C
language pointers, are 128-bit quantities, this means they need 16-byte
alignment. Currently the various #pragma pack directives, used to work
around (extremely broken and bogus) code that includes jsoncpp in a
context where the maximum alignment has been overridden, hard-code 8 as
the maximum alignment to use, and so do not sufficiently align CHERI /
Morello capabilities on 64-bit architectures. On Windows x64, the
default is also not 8 but 16 (ARM64 is supposedly 8), so this is
slightly dodgy to do there too, but in practice likely not an issue so
long as you don't use any 128-bit types there.

Instead of hard-coding a width, use a directive that resets the packing
back to the default. Unfortunately, whilst GCC and Clang both accept
using #pragma pack(push, 0) as shorthand like for any non-zero value,
MSVC does not, so this needs to be two directives.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants