Fix cross-compiling i686-pc-windows-gnu from Linux #55444

Closed
wants to merge 1 commit into from

Manishearth
Member

@Manishearth Manishearth commented Oct 28, 2018

This is a patch from @neersighted from the Tor project, which lets Tor
cross-compile for Windows from Linux.

See the commit message for details.

https://trac.torproject.org/projects/tor/ticket/28157

I don't actually understand this patch (see the commit message for an
explanation); I'm just helping upstream it.

r? @alexcrichton @Mark-Simulacrum

@rust-highfive
Collaborator

⚠️ Warning ⚠️

  • These commits modify submodules.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 28, 2018
This is still very rough and serves as a proof-of-concept for fixing
the Linux -> 32-bit MinGW cross-compilation workflow. Currently, clang
and GCC's MinGW targets both only support DW2 (DWARF) or SJLJ (Set Jump
Long Jump) unwinding on 32-bit Windows.

The default for GCC (and the way it is shipped on every major distro) is
to use SJLJ on Windows, as DWARF cannot traverse non-DWARF frames. This
would work fine, except for the fact that libgcc (our C runtime on the
MinGW platform) exports symbols under a different name when configured
to use SJLJ-style unwinding, and uses a preprocessor macro internally to
alias them.

Because of this, we have to detect this scenario and link to the correct
symbols ourselves. Linking has been tested with a full bootstrap on both
x86_64-unknown-linux-gnu and i686-pc-windows-gnu, as well as
cross-compilation of some of my own projects.
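
As a rough illustration of the renaming described above, the sketch below maps a generic unwinder entry point to the name an SJLJ-configured libgcc is assumed to export. The helper `libgcc_symbol` and the list of aliased entry points are illustrative assumptions, not code from this patch; the four names reflect my reading of libgcc's SJLJ aliases and should be checked against the libgcc in use.

```rust
/// Hypothetical helper: return the symbol name to link against for a
/// generic unwinder entry point, depending on whether libgcc was
/// configured for SJLJ-style unwinding.
fn libgcc_symbol(generic: &str, sjlj: bool) -> String {
    // Entry points assumed to be aliased to `_Unwind_SjLj_*` names
    // when libgcc is built for SJLJ (illustrative subset).
    const ALIASED: [&str; 4] = [
        "_Unwind_RaiseException",
        "_Unwind_ForcedUnwind",
        "_Unwind_Resume",
        "_Unwind_Resume_or_Rethrow",
    ];
    if sjlj && ALIASED.contains(&generic) {
        // Only the leading `_Unwind_` prefix is rewritten.
        generic.replacen("_Unwind_", "_Unwind_SjLj_", 1)
    } else {
        generic.to_string()
    }
}
```

Calling `libgcc_symbol("_Unwind_Resume", true)` yields the SJLJ-prefixed name, while names outside the list are returned unchanged.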

Obviously, the detection is a bit unrefined. Right now we
unconditionally use SJLJ when compiling Linux -> MinGW. I'd like to add
feature detection using compiler build flags or autotools-style
compilation and object analysis. Input on the best way to proceed here
is welcome.
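
A minimal sketch of the unconditional detection described above, written as standalone functions so that the host triple and the feature string are explicit parameters. The names `wants_sjlj` and `std_features` are hypothetical and are not bootstrap's actual API; only the condition and the `sjlj_eh` feature name come from the patch.

```rust
/// Hypothetical helper: the patch's temporary heuristic, which assumes
/// any Linux-hosted MinGW compiler for 32-bit Windows uses SJLJ.
fn wants_sjlj(host: &str, target: &str) -> bool {
    host.contains("linux") && target == "i686-pc-windows-gnu"
}

/// Hypothetical helper: build the space-separated feature list passed
/// to the standard library build, appending `sjlj_eh` when needed.
fn std_features(base: &str, host: &str, target: &str) -> String {
    let mut features = base.to_string();
    if wants_sjlj(host, target) {
        features.push_str(" sjlj_eh");
    }
    features
}
```

Because the check depends only on the triples, it cannot tell a DWARF-configured MinGW from an SJLJ-configured one, which is the limitation the paragraph above acknowledges.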

Also, currently there is copy-pasted/duplicated code in libunwind.
Ideally, this could be reduced, but this would likely require a
rethinking of how iOS is special-cased above, to avoid further
duplication. Input on how to best structure this file is requested.
@rust-highfive
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:00fe9e50:start=1540729002113336540,finish=1540729003139284146,duration=1025947606
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
---
[00:02:31]    Compiling bootstrap v0.0.0 (/checkout/src/bootstrap)
[00:02:32] error[E0425]: cannot find value `build` in this scope
[00:02:32]    --> bootstrap/compile.rs:154:8
[00:02:32]     |
[00:02:32] 154 |     if build.build.build.contains("linux") && target == "i686-pc-windows-gnu" {
[00:02:32] help: possible candidates are found in other modules, you can import them into scope
[00:02:32]     |
[00:02:32] 19  | use cmake::build;
[00:02:32]     |
[00:02:32]     |
[00:02:32] 19  | use metadata::build;
[00:02:32]     |
[00:02:32] 
[00:02:32] error[E0425]: cannot find value `features` in this scope
[00:02:32]    --> bootstrap/compile.rs:155:9
[00:02:32]     |
[00:02:32] 155 |         features.push_str(" sjlj_eh");
[00:02:32] 
[00:02:35] error: aborting due to 2 previous errors
[00:02:35] 
[00:02:35] For more information about this error, try `rustc --explain E0425`.
---
travis_time:end:2615dc60:start=1540729197312735905,finish=1540729197319055544,duration=6319639
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:27147480
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:16127650
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:29871adf
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@@ -150,6 +150,11 @@ pub fn std_cargo(builder: &Builder,
cargo.env("MACOSX_DEPLOYMENT_TARGET", target);
}

// FIXME: Temporary detection of SJLJ MinGW compilers.
if build.build.build.contains("linux") && target == "i686-pc-windows-gnu" {
Member

Should this be builder.build.build?

target.contains("linux") ?

like so:
...
766 !target.contains("freebsd") &&
767 !target.contains("windows") &&
768 !target.contains("apple") {
...

@neersighted

@alexcrichton can correct me if I'm wrong, but I originally discontinued this patch since I realized it wouldn't work as-is for officially distributed toolchains; targets seem to be shared between hosts, so this patch will not quite work as it depends on compile-time checks.

Instead a different approach is likely needed; this PR could likely be converted to use runtime detection.

I might take another look at making this change if review finds that it solves the issue (when compiled on 32-bit Linux), as I first researched and tested this issue many months ago and my memory of it is a bit fuzzy.

@alexcrichton
Member

Ah yes indeed @neersighted! In addition to causing a discrepancy across hosts, I think this largely just builds rather than works, right? With this patch, panics won't actually run Rust destructors, because we've configured LLVM to emit DWARF information rather than use SJLJ for exception handling.

@neersighted does that sound right?

@neersighted

I believe so yes. Exceptions appear to work on the surface, but lifecycles are likely incorrect. I think we talked about what would be needed to fix this on IRC at one point, but it was too deep in the compiler for me at the time.

What I believe would be necessary to fix this properly is for the compiler itself to be aware of SJLJ/DWARF2 at project build-time (instead of it being baked in to the standard library), based on the host and target triples. It would need to emit different symbols at compile-time, as well as switching between SJLJ or DWARF2 in the LLVM backend.

I might take another look at this sometime; this was a very naive and experimental approach to solving the issue and wasn't really meant for use as-is; instead I was attempting to iterate on the issue until I fully understood it.

@alexcrichton
Member

Ok! In that case I think we probably shouldn't land this PR, for the primary reason that while this produces linked binaries it doesn't actually produce correct binaries. The incorrectness specifically stems from the fact that LLVM is, for i686-pc-windows-gnu, always generating DWARF unwinding tables. This means that if the SJLJ runtime for throwing exceptions is used instead, no destructors will be run when a panic happens.
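
To make "destructors run on panic" concrete, here is a small sketch of the behavior a correctly configured target should exhibit: a value owned by a panicking frame is dropped while the panic unwinds. The names `Guard`, `DROPPED`, and `destructor_ran_during_panic` are made up for illustration, and the check assumes the default `panic = "unwind"` setting.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether `Guard::drop` ran.
static DROPPED: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

/// Panics inside a frame that owns a `Guard`, catches the panic, and
/// reports whether the destructor ran during unwinding.
fn destructor_ran_during_panic() -> bool {
    DROPPED.store(false, Ordering::SeqCst);
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("unwind through a frame that owns a Guard");
    });
    // The panic must have been caught, and the guard must have dropped.
    result.is_err() && DROPPED.load(Ordering::SeqCst)
}
```

A check along these lines, run on the cross-compiled binary, is one way to tell "links" apart from "unwinds correctly".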

I think there are probably two fixes here, both of which are unfortunately quite difficult:

  • For tor's use case specifically, somehow (I'm not sure how myself) acquire a mingw compiler that uses dwarf for exception handling. This will be compatible with the i686-pc-windows-gnu target as-is today.
  • Otherwise add an entirely new target, like i686-pc-windows-gnu-sjlj (or something like that). This entire target would be geared towards sjlj exception handling, and this would involve changing how we generate code in LLVM as well as including the runtime support changes in this patch itself.

I think we probably have basically no path to getting i686-pc-windows-gnu as-is today working with the mingw toolchains on linux which use sjlj exception handling. Even if we could change LLVM at runtime the binary of the standard library would be in one mode or the other.

@neersighted

I do think there is a third, (slightly) easier path, though I'm not sure that it is in line with Rust's philosophy and compiler design. The compiler could be made to use multiple unwinding formats, and pick from them at build-time. Essentially, pushing the decision on what to link to from the standard library to the compiler (which also would have to emit different unwinding tables).

However, this couples the compiler much more closely to the target environment itself, and could potentially be problematic when building for targets that do not use the standard library (depending on how it was implemented), like ARM uCs.

@malbarbo
Contributor

cross builds an i686-pc-windows-gnu MinGW toolchain that uses DWARF for exception handling.

@alexcrichton
Member

@neersighted oh I think that's basically the same route as taking a new target. We can add a compiler flag but it's not too useful unless the standard library is recompiled, which is effectively a new target anyway.

@malbarbo oh that's perfect! Thanks for the link! I'll send that to the tor folks as well

@neersighted

Well, the key difference is (weak?) linking in the standard library and the compiler selecting the appropriate symbols instead of the standard library's feature detection. Still, it would require intrusive changes to avoid another target, so I concede it is not the best option.

I too was unaware of mingw builds from cross... Are they capable of unwinding across SEH frames (DWARF2 is not capable of crossing SEH as far as I know)? Are SEH frames (built by MSVC most likely) ever present in rust's call stack when cross-compiling to Windows?

@alexcrichton
Member

@neersighted oh but it's not just about the symbols used, right? It's all about how the compiler actually compiles libraries? We ship a binary version of the standard library to users, and while we could dynamically pick the right symbol at runtime the binary itself is only compatible with one strategy (not the other)

(I'm not sure about SEH and mingw, I don't know how the two interact)

@aleksmelnikov

So, globally:

  1. Rust's toolchains support DWARF exceptions for win32 and SEH exceptions for win64.
  2. Rust's toolchains don't ship MinGW compilers, so developers have to install their own MinGW compilers on their systems. These local MinGW compilers can support different exception modes. On Debian/Ubuntu these are: SJLJ for win32 (according to the Debian/Ubuntu maintainer of mingw-w64) and SEH for win64.
  3. To work correctly, the MinGW compiler and Rust's toolchain must use the same exception mode.
  4. malbarbo's patch shows how to rebuild a local MinGW compiler for DWARF exceptions.
  5. torproject's patch shows how to build Rust's toolchains for SJLJ exceptions.

Right? Thx.

So, splitting i686-pc-windows-gnu into two new toolchains, like i686-pc-windows-gnu-dwarf and i686-pc-windows-gnu-sjlj, is a pragmatic idea.
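
The compatibility rule in the summary above can be sketched as a lookup from target triple to unwinding model. This is only an illustration: the `Unwinding` enum and both functions are made up, and `i686-pc-windows-gnu-sjlj` is a hypothetical triple taken from this discussion, not an existing target.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Unwinding {
    Dwarf,
    Sjlj,
    Seh,
}

/// Unwinding model the Rust target is built for, per this thread.
fn unwinding_for(target: &str) -> Option<Unwinding> {
    match target {
        "x86_64-pc-windows-gnu" => Some(Unwinding::Seh),
        "i686-pc-windows-gnu" => Some(Unwinding::Dwarf),
        // Hypothetical triple proposed in this discussion.
        "i686-pc-windows-gnu-sjlj" => Some(Unwinding::Sjlj),
        _ => None,
    }
}

/// A local MinGW compiler is usable only if its exception mode matches
/// the mode the Rust target (and its prebuilt std) was compiled for.
fn compatible(target: &str, mingw: Unwinding) -> bool {
    unwinding_for(target) == Some(mingw)
}
```

Under these assumptions, an SJLJ MinGW paired with today's `i686-pc-windows-gnu` is reported as incompatible, which matches the failure this PR tried to work around.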

@neersighted

@alexcrichton yeah, my idea was to ship a 'fat' version of the standard library essentially, that includes both. But upon digging into the code, it looks like it is significantly easier to add a new target, which should likely be the way forward.

@superriva that's a somewhat accurate summation, but there are additional considerations with compatibility between unwinding formats. As far as I know, 32-bit MinGW GCC uses SJLJ because DWARF2 cannot traverse SEH frames, which are what MSVC outputs. 64-bit MinGW GCC just emits SEH.

My patch merely links against (internal) SJLJ symbols, but does not configure LLVM itself to use SJLJ.

@alexcrichton
Member

Ok I'm gonna close this for now as it sounds like it's not the strategy we want to take, but I believe the concrete step forward for rust, if necessary, is to add a new target which is 32-bit MinGW using sjlj exceptions, leaving the existing target as the one for dwarf exceptions. That addition will require this patch as-is, but will also require support to configure LLVM appropriately and such.

If this is in error though please just let me know!

Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
7 participants