Wasm support #71

Open
ratmice opened this issue Feb 24, 2024 · 9 comments

Comments

@ratmice

ratmice commented Feb 24, 2024

wasm-ld now seems to have rudimentary .init_array support.
I posted a rustc patch here: rust-lang/rust#121533

While it does work to just add

diff --git a/src/lib.rs b/src/lib.rs
index de4cdef..599f064 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -451,6 +451,7 @@ macro_rules! __do_submit {
                     target_os = "netbsd",
                     target_os = "openbsd",
                     target_os = "none",
+                    target_os = "wasi",
                 ),
                 link_section = ".init_array",
             )]

This suffers from the limitation of wasm only supporting a single symbol in the link_section.
It would perhaps be nice if there was a submit_many!() macro which would expand to something like

static __CTORS: [unsafe extern "C" fn(); num_submitted] = [ ... ]

I tested that this also works with .init_array, at least on Linux. On other platforms, presumably it could just call submit! for each
arg to submit_many!.
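
A minimal sketch of what such a coalesced array could look like, assuming Rust 1.82 or newer for the `#[unsafe(...)]` attribute form. The names `ctor_a`, `ctor_b`, and `COUNT` are hypothetical. The section attribute is applied only on Linux here, where this was reported to work; on other targets the array is still defined but nothing calls it automatically.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter so the effect of the constructors is observable.
static COUNT: AtomicUsize = AtomicUsize::new(0);

unsafe extern "C" fn ctor_a() {
    COUNT.fetch_add(1, Ordering::SeqCst);
}

unsafe extern "C" fn ctor_b() {
    COUNT.fetch_add(1, Ordering::SeqCst);
}

// One symbol, several constructors: the whole array is placed in
// `.init_array`, so the crate contributes a single symbol to the section.
#[used]
#[cfg_attr(target_os = "linux", unsafe(link_section = ".init_array"))]
static __CTORS: [unsafe extern "C" fn(); 2] = [ctor_a, ctor_b];

fn main() {
    // On Linux the entries are expected to have run before `main` starts.
    println!("constructors run so far: {}", COUNT.load(Ordering::SeqCst));
}
```

A real submit_many! would have to generate the array length and entries from its arguments; that part is not shown.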

Probably not worth implementing until we see the rustc patch go in, but I figured it was worth mentioning since
wasm support has come up in the past. E.g. @matklad asked about it in #3

@dtolnay
Owner

dtolnay commented Feb 24, 2024

With the limitation of having a single submit in the whole program, I think this is not something I would want to make use of in this crate.

@ratmice
Author

ratmice commented Feb 24, 2024

Indeed it is a bummer, but I want to be precise: the limitation is not a single submit in the whole program.
It is a single submit from a library/compilation unit where there is a workaround for submitting multiple items at once.

E.g. you can have a submit in a library, and a submit in a binary which depends on the library.
But you cannot have 2 submits in the library, or 2 submits in the binary. These must be coalesced into the submission of one array at the crate level.

The other limitation is that the library must either export a symbol with #[export_name] or #[no_mangle], or the binary must use
a symbol in the library for the constructor to be called.
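
The two options might look roughly like this. It is only a sketch: the symbol name `mylib_anchor` is hypothetical, library and binary are shown in one file for brevity, and the `#[unsafe(no_mangle)]` spelling assumes Rust 1.82 or newer (older compilers write `#[no_mangle]`).

```rust
// Library side: an exported, unmangled symbol (hypothetical name).
// It does no real work; it only gives the linker something to keep.
#[unsafe(no_mangle)]
pub extern "C" fn mylib_anchor() -> u32 {
    0
}

// Binary side, the alternative: use some symbol from the library so that,
// per the description above, its constructor is kept and called.
fn main() {
    let marker = mylib_anchor();
    println!("library referenced, marker = {marker}");
}
```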

Hopefully wasm-ld can be fixed though, I don't have plans to work on it given there are workarounds.
That said, I totally understand not wanting to support it in this crate until the situation improves.

@dtolnay
Owner

dtolnay commented Feb 25, 2024

Thank you for clarifying the limitation. I missed that it is one per crate, not one per the whole program. It's clear from your PR description.

Maybe it would be reasonable for inventory to use this, then? The reason I am not sure is that this adds build errors where there didn't use to be any. Suppose I have a Wasm project that depends on some library that uses inventory in one part of its implementation, and my project never uses that part. Previously I'd still be able to use all the rest of the library; but if we add .init_array on Wasm, the library would fail to build for Wasm with "only one .init_array section fragment supported". Or maybe the library provides an explicit way to trigger initialization, which you only need to call if you are using the library from Wasm, while on other platforms inventory takes care of it; after adding .init_array on Wasm, that library becomes unusable on Wasm.
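
For illustration, the second scenario (a library with an explicit initialization call) could look roughly like this. The sketch assumes a hypothetical `init` function and `REGISTRY` static and uses only the standard library; it is not taken from any particular crate.

```rust
use std::sync::{Mutex, Once};

// Hypothetical registry that before-main registration would normally fill.
static REGISTRY: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());
static INIT: Once = Once::new();

/// Explicit initialization entry point (hypothetical). Safe to call more
/// than once; the registration body runs a single time.
pub fn init() {
    INIT.call_once(|| {
        let mut registry = REGISTRY.lock().unwrap();
        registry.push("plugin_a");
        registry.push("plugin_b");
    });
}

/// Snapshot of everything registered so far.
pub fn registered() -> Vec<&'static str> {
    REGISTRY.lock().unwrap().clone()
}

fn main() {
    init();
    init(); // the second call does nothing
    println!("{:?}", registered());
}
```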

I am on board with trying it if someone wants to send a PR. We'll see how useful it ends up being in practice.

I probably wouldn't do a submit_many!, and instead rely on the submitted data type to hold a &'static [T] if it wants. Libraries using inventory::collect would already need to be designed to accommodate Wasm anyway — for example something like https://github.com/dtolnay/typetag would not just work out of the box in Wasm as currently written. So if they are accommodating a Wasm limitation anyway, it might as well be like this:

pub struct Thing { ... }

pub struct Things(pub &'static [Thing]);

inventory::collect!(Things);
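
A sketch of how a consumer could flatten such slice-valued submissions. To stay dependency-free, plain statics stand in for what would be one `inventory::submit!` per crate, and an array of references stands in for `inventory::iter::<Things>`. The `name` field, the `CRATE_A`/`CRATE_B` statics, and the `all_things` helper are hypothetical.

```rust
pub struct Thing {
    pub name: &'static str, // hypothetical field; the original elides the body
}

pub struct Things(pub &'static [Thing]);

// One submission per crate, each carrying a whole slice.
static CRATE_A: Things = Things(&[Thing { name: "a1" }, Thing { name: "a2" }]);
static CRATE_B: Things = Things(&[Thing { name: "b1" }]);

/// Flatten every submitted slice into one stream of `Thing`s.
pub fn all_things<'a>(
    submitted: impl IntoIterator<Item = &'a Things>,
) -> impl Iterator<Item = &'a Thing> {
    submitted.into_iter().flat_map(|group| group.0.iter())
}

fn main() {
    // Stand-in for iterating over `inventory::iter::<Things>`.
    for thing in all_things([&CRATE_A, &CRATE_B]) {
        println!("{}", thing.name);
    }
}
```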

@ratmice
Author

ratmice commented Feb 25, 2024

Indeed, the compilation failures are a bit of a pickle. Even worse, just emitting this is going to lead to an MSRV bump on Wasm, and the old compilers are all going to fail in rustc as soon as .init_array sections are emitted.

I would be somewhat inclined to consider having a temporary inventory_wasm crate which people can depend upon,
and/or, if their dependencies align in some magical fashion, they could consider using a [patch] section. Eventually, when things
have stabilized, we can deprecate that crate and have it re-export inventory. That is one idea (not sure it is a good one though).

I was going to wait until something has landed in the compiler before making a PR for this.

@kwhitehouse

Would love to use inventory with Wasm as well! I'm new to both Rust and Wasm, so I'm not quite following the root issue here, but hoping for some advice on the following:

  • It sounds like inventory should be somewhat? usable with Wasm, but I've had trouble with this. Specifically, even though my code compiles, and regardless of how many times I invoke submit, when I iterate through the inventory::iter I don't ever see a single element in the inventory. Q: Is there something I need to add to my code to get this to work with Wasm that isn't officially mentioned in documentation?
  • Q: Are there other existing solutions for Wasm that would do something similar? My use case is the following:
    • I'd like to annotate a number of simple functions, similar to how rust unit tests are annotated with #[test]
    • Then, I want to be able to invoke those functions at runtime and aggregate the results into a vector
    • In other languages I'd use reflection to accomplish this, but from my research that doesn't seem to be an option in Rust.

Thanks in advance for any advice!

@ratmice
Author

ratmice commented Apr 26, 2024

@kwhitehouse Currently this only works if you have all of the following things:

  • a patched Rust compiler (linked in my first comment; now merged).
  • a patched inventory (inline in my initial comment).
  • a new enough wasm-ld to support .init_array.
  • each crate in your dependency chain using inventory calls submit! at most once.
  • each crate in your dependency chain exports a symbol with #[export_name] or #[no_mangle].

These last two things are issues in the underlying wasm-ld implementation and probably the thing keeping the rustc patches from landing. Apart from the current status quo of using a modified rustc and inventory I don't know of any other workarounds to achieve this.

I spent quite a bit of time debugging the wasm-ld implementation at some point, but didn't really succeed in finding any path towards fixing those two things, and haven't found much motivation for further investigation.
As such, it is possible to get working but requires some contortion.

@georgestagg

georgestagg commented Oct 3, 2024

For interested explorers:

I have forked and modified LLVM to support multiple .init_array symbols. After compiling a custom rustc using the patched LLVM, multiple uses of #[link_section = ".init_array"] work, at least when using the wasm32-unknown-emscripten target.

#[used]
#[link_section = ".init_array"]
pub static __CTOR: unsafe extern "C" fn() = __ctor1;
unsafe extern "C" fn __ctor1() {
    VAL = VAL + 1;
}

#[used]
#[link_section = ".init_array"]
pub static __CTOR2: unsafe extern "C" fn() = __ctor2;
unsafe extern "C" fn __ctor2() {
    VAL = VAL + 1;
}

static mut VAL: i32 = 0;

fn main() {
    unsafe {
        let val_ptr = &raw mut VAL;
        println!("val = {}", *val_ptr);
    }
}

$ rustc +stage2 test.rs --target=wasm32-unknown-emscripten -C link-self-contained=no
$ node test.js
val = 2

Using a patch similar to the one in the OP of this thread,

diff --git a/src/lib.rs b/src/lib.rs
index 4ff9d71..93941bf 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -452,6 +452,7 @@ macro_rules! __do_submit {
                     target_os = "netbsd",
                     target_os = "openbsd",
                     target_os = "none",
+                    target_os = "emscripten",
                 ),
                 link_section = ".init_array",
             )]

I can now use the inventory package with the wasm32-unknown-emscripten target. I am currently using this to experiment with building a Python package that uses the multiple-pymethods feature of PyO3, which relies on inventory with multiple submits.

My patch to LLVM is here: llvm/llvm-project@8266ca5.

@bitwalker

@georgestagg Have you submitted your LLVM patch upstream, or is it just in your fork? Sounds like it would be worth upstreaming if possible, but not sure how big of a lift that is from your current patch.

@georgestagg

I'm working towards upstreaming:

llvm/llvm-project#111008
llvm/llvm-project#119127
