Architecture-dependent sizes in portable code #132

Open
1 of 2 tasks
ForNeVeR opened this issue Jul 16, 2022 · 3 comments
Labels
area:cil-interop Related to CIL (.NET) interop area:compiler Related to code compilation or type checking area:standard-support Related to the C standard support kind:feature New feature or request

Comments

@ForNeVeR
Owner

ForNeVeR commented Jul 16, 2022

There's a problem: if we compile to AnyCPU, we cannot generally determine sizeof(void*), which the standard requires to be a constant expression. The same applies to, say, size_t or ptrdiff_t, and in fact to any architecture-dependent type, most of which are derived from pointers.
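
To illustrate the constraint, here is a minimal sketch of a few places where C needs the value of sizeof(void*) at compile time. All names in it are made up for illustration and are not taken from Cesium.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */

/* An array bound at file scope must be an integer constant expression,
   so the compiler has to know sizeof(void *) when it lays out this object. */
static char pointer_sized_buffer[sizeof(void *)];

/* An enumerator value must be an integer constant expression as well. */
enum { POINTER_SIZE = sizeof(void *) };

/* A C11 static assertion is evaluated entirely at compile time. */
_Static_assert(sizeof(void *) == POINTER_SIZE, "enumerator must match pointer size");

/* Returns the size of the buffer declared above. */
static size_t buffer_size(void) {
    return sizeof pointer_sized_buffer;
}
```

None of these declarations can be emitted without committing to a pointer size, which is exactly what an AnyCPU build cannot do up front.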

There are two possible strategies for dealing with that.

  1. We can forbid using sizeof on pointers and any pointer-sized types in portable code. Our users will have to specify the architecture when compiling a binary that uses these features.

    It is possible that this will effectively prevent most C code from being compiled to AnyCPU.

    Also, using such C libraries from external code would sometimes be problematic (the library author would have to distribute a version built for each architecture, which kinda defeats the point of Cesium).

  2. We can introduce a special architecture-independent compilation mode which will use 8 bytes for any pointer or size_t, and generally will prefer bigger object sizes, but will still allow building to AnyCPU.

    This may be problematic for cases when a C library exposes some .NET interface which operates on pointer types of IntPtr: it's unclear how to properly compile that (or if it is even possible).
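
As a rough sketch of what the second strategy implies for data layout, a pointer could be stored in a slot that is always 8 bytes, regardless of the native pointer width. The wide_ptr type and its helpers below are hypothetical and only show the idea; this is not how Cesium represents pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a pointer slot that always occupies 8 bytes,
   no matter how wide native pointers are on the current platform. */
typedef struct {
    uint64_t bits;
} wide_ptr;

_Static_assert(sizeof(wide_ptr) == 8, "wide_ptr must be exactly 8 bytes");

/* Store a native pointer in the 8-byte slot (zero-extended on 32-bit targets). */
static wide_ptr wide_from_native(void *p) {
    wide_ptr w;
    w.bits = (uint64_t)(uintptr_t)p;
    return w;
}

/* Recover the native pointer; the upper bits are zero on 32-bit targets. */
static void *wide_to_native(wide_ptr w) {
    return (void *)(uintptr_t)w.bits;
}
```

With such a slot, sizeof becomes a constant again, at the cost of wasting 4 bytes per pointer on 32-bit platforms and of a mismatch with anything that expects a native-sized pointer, which is the interop concern raised above.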

For now, I am considering going both ways: add a flag to enable (or disable) architecture-independent compilation, and allow the user to specify the architecture that will be used to calculate sizeof and whatnot.

Depends on:

@ForNeVeR ForNeVeR added kind:feature New feature or request area:cil-interop Related to CIL (.NET) interop area:standard-support Related to the C standard support area:compiler Related to code compilation or type checking labels Jul 16, 2022
@ForNeVeR ForNeVeR self-assigned this Jul 16, 2022
@kant2002
Collaborator

Basically, what I propose is to replace int IType.SizeInBytes with an IExpression IType.GetSizeInBytes() function or property. Paired with constant folding, we get the same IL as today, and we can emit dynamic IL for types like void*[10].
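
The idea can be sketched in C as a tiny size-expression tree with a folding step. This is only a model of the proposal under my own assumptions: size_expr, try_fold and eval_at_runtime are hypothetical names, not Cesium's actual IType or IExpression API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of "size as an expression", not Cesium's real types. */
typedef enum { SIZE_CONST, SIZE_POINTER, SIZE_MUL } size_kind;

typedef struct size_expr {
    size_kind kind;
    size_t value;                 /* used by SIZE_CONST */
    const struct size_expr *lhs;  /* used by SIZE_MUL */
    const struct size_expr *rhs;  /* used by SIZE_MUL */
} size_expr;

/* Try to fold the expression to a constant at compile time.
   known_pointer_size == 0 means the pointer size is not known yet.
   Returns 1 and writes *out on success, 0 if the size must be computed at run time. */
static int try_fold(const size_expr *e, size_t known_pointer_size, size_t *out) {
    size_t l, r;
    switch (e->kind) {
    case SIZE_CONST:
        *out = e->value;
        return 1;
    case SIZE_POINTER:
        if (known_pointer_size == 0) return 0;
        *out = known_pointer_size;
        return 1;
    case SIZE_MUL:
        if (!try_fold(e->lhs, known_pointer_size, &l)) return 0;
        if (!try_fold(e->rhs, known_pointer_size, &r)) return 0;
        *out = l * r;
        return 1;
    }
    return 0;
}

/* Run-time evaluation: always succeeds, using the actual pointer size. */
static size_t eval_at_runtime(const size_expr *e) {
    switch (e->kind) {
    case SIZE_CONST:   return e->value;
    case SIZE_POINTER: return sizeof(void *);
    case SIZE_MUL:     return eval_at_runtime(e->lhs) * eval_at_runtime(e->rhs);
    }
    return 0;
}
```

For a type like void*[10], the tree is a product of a pointer-size node and the constant 10: it folds to 40 or 80 when the pointer size is fixed, and is left for run-time evaluation when it is not.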

@ForNeVeR
Owner Author

ForNeVeR commented Nov 27, 2022

Ok, after some thought, I've decided we'll try to support the following architecture sets in Cesium:

  • 32b: an architecture with 32-bit pointers, gets compiled as an x86 assembly (maybe with a flag to make it compatible with ARM32, though I'm not sure such a flag exists),
  • 64b: an architecture with 64-bit pointers, gets compiled as an x64 assembly (with obligatory ARM64 support),
  • wide: an architecture forcing pointers to be 64-bit even on a 32-bit platform (we'll have a lot of fun with that I reckon),
  • dynamic: an architecture calculating pointer size dynamically at runtime.
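
A minimal sketch of how these four sets could be modeled inside a compiler, assuming the only thing they fix is the pointer size. The enum and function names are hypothetical, and 0 is used here as an arbitrary marker for "known only at run time".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the four architecture sets listed above. */
typedef enum { ARCH_32B, ARCH_64B, ARCH_WIDE, ARCH_DYNAMIC } arch_set;

/* Pointer size in bytes as seen by the compiler,
   or 0 when it is only known at run time (dynamic). */
static size_t static_pointer_size(arch_set a) {
    switch (a) {
    case ARCH_32B:     return 4;
    case ARCH_64B:     return 8;
    case ARCH_WIDE:    return 8; /* forced to 64 bits even on 32-bit platforms */
    case ARCH_DYNAMIC: return 0; /* resolved at run time */
    }
    return 0;
}

/* sizeof(void *) can be a compile-time constant only if the size is fixed. */
static int pointer_size_is_constant(arch_set a) {
    return static_pointer_size(a) != 0;
}
```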

Notes:

  1. I am deliberately calling 32b and 64b architecture sets rather than architectures, and explicitly not calling them x86 and x64, to avoid confusion with the actual x86 and x86_64 architectures. We only impose restrictions on pointer size (and, likely, on memory layout in the future, when we implement offsetof and whatnot), not on the instruction set.

    It's possible, though, that we won't be able to make the produced binaries compatible with both x86 and ARM32, or with both x86_64 and ARM64. In that case, the idea is to introduce four "real" architectures instead (x86, ARM32, x86_64, ARM64). The general scheme won't change: internally, in the compiler code, we'll have a flag for architecture bitness and not for the actual output architecture.

  2. Both wide and dynamic should provide portable Any CPU-targeted binaries.

  3. Ideologically, dynamic will work as @kant2002 proposed.

    Yet for dynamic, I don't want to extend its magic scope too much for now. Certain things will be forbidden in dynamic, such as fixed-size arrays whose size depends on type sizes or offsets (e.g. struct foo { char x[sizeof(void*)]; };). All of these should fail at compile time.

    It is possible to make these types work by abusing runtime dispatch, but their interop story will be very confusing (while codegen will just become mildly messy). We may consider supporting them in the future, if demand arises.

  4. I want to introduce all of these in one go right now, but only the future will tell how fruitful these ideas are; maybe we'll drop some of the more exotic variants, or significantly limit the scope of what's allowed in wide and dynamic.

    I really want at least one architecture to support 100% of the C standard, but I'm ready to make compromises on all the others. This will still allow us to call Cesium a C17-compliant compiler for that architecture, right?
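
To make the restriction from note 3 concrete, here is a small sketch contrasting the forbidden declaration with a use of sizeof(void*) that only feeds a run-time computation. That the second form would be accepted in dynamic is my assumption; the thread does not enumerate what is allowed. The function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Per note 3, a declaration like this is meant to be rejected in dynamic,
   because the struct layout itself would depend on the pointer size:

       struct foo { char x[sizeof(void*)]; };
*/

/* ASSUMPTION: a size used only as a run-time value could still work in
   dynamic, since no type layout depends on it. Allocates a zeroed table
   of `count` pointers; the caller releases it with free(). */
static void **alloc_pointer_table(size_t count) {
    return calloc(count, sizeof(void *));
}
```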

@ForNeVeR
Owner Author

Seems like wide will require a separate version of the standard library.

We could either compile it using #if magic, or have fun with Cecil and post-processing.

ForNeVeR added a commit that referenced this issue Dec 1, 2022
2 participants