Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: Import LLD for linking wasm objects #48125

Merged
merged 2 commits into from
Mar 4, 2018
Merged

Conversation

alexcrichton
Copy link
Member

This commit imports the LLD project from LLVM to serve as the default linker for
the wasm32-unknown-unknown target. The binaryen submoule is consequently
removed along with "binaryen linker" support in rustc.

Moving to LLD brings with it a number of benefits for wasm code:

  • LLD is itself an actual linker, so there's no need to compile all wasm code
    with LTO any more. As a result builds should be much speedier as LTO is no
    longer forcibly enabled for all builds of the wasm target.
  • LLD is quickly becoming an "official solution" for linking wasm code together.
    This, I believe at least, is intended to be the main supported linker for
    native code and wasm moving forward. Picking up support early on should help
    ensure that we can help LLD identify bugs and otherwise prove that it works
    great for all our use cases!
  • Improvements to the wasm toolchain are currently primarily focused around LLVM
    and LLD (from what I can tell at least), so it's in general much better to be
    on this bandwagon for bugfixes and new features.
  • Historical "hacks" like wasm-gc will soon no longer be necessary, LLD
    will natively implement --gc-sections (better than wasm-gc!) which
    means a postprocessor is no longer needed to show off Rust's "small wasm
    binary size".

LLD is added in a pretty standard way to rustc right now. A new rustbuild target
was defined for building LLD, and this is executed when a compiler's sysroot is
being assembled. LLD is compiled against the LLVM that we've got in tree, which
means we're currently on the release_60 branch, but this may get upgraded in
the near future!

LLD is placed into rustc's sysroot in a bin directory. This is similar to
where gcc.exe can be found on Windows. This directory is automatically added
to PATH whenever rustc executes the linker, allowing us to define a WasmLd
linker which implements the interface that wasm-ld, LLD's frontend, expects.

Like Emscripten the LLD target is currently only enabled for Tier 1 platforms,
notably OSX/Windows/Linux, and will need to be installed manually for compiling
to wasm on other platforms. LLD is by default turned off in rustbuild, and
requires a config.toml option to be enabled to turn it on.

Finally the unstable #![wasm_import_memory] attribute was also removed as LLD
has a native option for controlling this.

@alexcrichton
Copy link
Member Author

cc @yurydelendik I think this may regress the source map support for wasm32-unknown-unknown as I'm not sure if lld has that implemented like binaryen does, but I could be wrong!

// rethought from the ground up once a linker (lld) is available, so this is all
// temporary and should improve in the future.
// There's some trickery below on crate types supported and various defaults
// (aka panic=abort by default), but otherwise this is ing eneral a relatively
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ing eneral -> in general ?

@kennytm kennytm added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 10, 2018
@yurydelendik
Copy link
Contributor

I think this may regress the source map support for wasm32-unknown-unknown as I'm not sure if lld has that implemented like binaryen does, but I could be wrong!

Correct, it will regress the rust source maps support. I was planing to look into exporting source maps from LLVM next.

@alexcrichton alexcrichton force-pushed the lld branch 3 times, most recently from ddf7169 to 8b11f93 Compare February 10, 2018 16:44
@alexcrichton
Copy link
Member Author

r? @Mark-Simulacrum

@japaric
Copy link
Member

japaric commented Feb 10, 2018

This directory is automatically added to PATH whenever rustc executes the linker

Is that the case for all the targets or only for the wasm32-unknown-unknown target? Because I think it would be great to be able to link ARM Cortex-M binaries using the lld shipped with rustc; that way we don't need to install arm-none-eabi-ld on the host.

@alexcrichton
Copy link
Member Author

@japaric it is indeed true for all targets!

I haven't added support for LLD for other targets yet in this PR but I'm hopefully structuring it in such a way that it should be fairly easy. That I'm hoping can be a quick followup :)

@alexcrichton
Copy link
Member Author

alexcrichton commented Feb 10, 2018

It's also important to note that this PR actually regresses the apparent hello size of "hello world" in wasm. Fear not, though, this is intended and not as bad as it seems! Today this program:

#![crate_type = "cdylib"]          
                       
#[no_mangle]            
pub extern fn foo(a: u32) -> u32 { 
    a + 1                          
}                                  

generates a 17k hello world with rustc +nightly foo.rs -O --target wasm32-unknown-unknown. Now note that 17k is also way too big in this case! The 17k, though, is due to a lack of a linker (aka lld) to remove all the extra compiler-rt intrinsics by default. When you run wasm-gc this hello world today reduces to 137 bytes as you'd expect.

With this PR, however, the size of the first compilation is 633k (!) and with wasm-gc it only reduces to 209k (!). Not to fear, though, this is expected! This "bug" is due to the fact that the release_60 branch of LLD does not yet implement --gc-sections for wasm, only for other targets. As a result, we're not invoking LLD with --gc-sections yet. The way to recover this size difference is to actually do what wasm is already doing today, compile with LTO! Compiling with rustc +nightly foo.rs -C lto -O --target wasm32-unknown-unknown. This yield a 149 byte executable without running wasm-gc. The gc step can slim it down by a few dozen bytes, but there's also some bugs for us to fix in rustc.

To recap, here's the size of the program above in various stages:

-O -O -C lto -O + wasm-gc -O -C lto + wasm-gc
today 17403 17403 137 137
this PR 647467 149 213569 89
this PR + LLD tip 209 212 132 132

As always for the smallest binaries it's recommended to compile with -O -C lto and run wasm-gc, and while this PR will drastically change the initial state it'll get better when LLD is upgraded again!

@alexcrichton
Copy link
Member Author

alexcrichton commented Feb 10, 2018

@japaric hm actually I've been meaning to refactor this for quite some time now, so f359356 should do the trick. In theory, once this lands, you can simply specify "linker-flavor": "ld.lld" in target specs and you should be good to go!

(this will likely be buggy when it first starts out as it's never been used before!)

@japaric
Copy link
Member

japaric commented Feb 10, 2018

@alexcrichton awesome! I've also confirmed that the version of LLD that ihis PR will ship can link ARM Cortex-M applications (older versions of LLD had problems with the linker scripts we use)

@badboy
Copy link
Member

badboy commented Feb 10, 2018

Finally the unstable #![wasm_import_memory] attribute was also removed as LLD
has a native option for controlling this.

With this PR, how is memory imported then?

@alexcrichton
Copy link
Member Author

@badboy with LLD you've now got a whole slew of options for configuring memory and how everything gets linked, but I think you'd at least start with --import-memory and go from there (I haven't tested it myself yet)

@badboy
Copy link
Member

badboy commented Feb 10, 2018

Thanks!

@bors
Copy link
Contributor

bors commented Feb 10, 2018

☔ The latest upstream changes (presumably #48113) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 10, 2018
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this in general looks good to me but I'm not familiar with the linker bits in rustc_trans and rustc_back so might be good for someone else to review those. r=me if you feel good about them (they all look relatively fine to me).

.travis.yml Outdated
@@ -81,7 +81,7 @@ matrix:
# OSX 10.7 and `xcode7` is the latest Xcode able to compile LLVM for 10.7.
- env: >
RUST_CHECK_TARGET=dist
RUST_CONFIGURE_ARGS="--build=i686-apple-darwin --enable-extended --enable-profiler --enable-emscripten"
RUST_CONFIGURE_ARGS="--build=i686-apple-darwin --enable-tier1-tools --enable-profiler"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tier1-tools is probably not quite the best name for this -- since you can enable it (presumably) on non-tier1 platforms, and the platforms being "tier1" really has nothing to do with it -- I'd prefer something like full-tools or similar.

lld_install_root: &Path) {
let target = target_compiler.host;

let dst = builder.sysroot_libdir(target_compiler, target)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so I feel like I'd sort of expect this to be in a non-target specific place -- that is, lld would presumably work for both x86_64-unknown-linux-gnu as much as wasm32-unknown-unknown... I don't think we have a good place to put it (since the sysroot/bin directory is probably a bad idea)... but something to think about. We may want to change this down the road to avoid duplicating it across the various platforms that will probably want to default to lld down the road.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so this isn't necessarily target-specific, it's also where we put the codegen backends for example. The target in the directory name is basically "things in here are only relevant for this target", which for the case of binaries just means that platform the binaries run on. We'll use the lld executable here (by placing it in PATH) regardless of the actual target of the compiler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, okay. Seems slightly odd, since presumably that means that we'd have to have the rustlib for, say, x86_64-unknown-linux-gnu installed if that's where we ship lld even if you want to use lld for some windows target. But maybe I'm creating problems where none exist.

fn finalize(&mut self) -> Command {
self.cmd.arg("--threads");

// FIXME we probably shouldn't pass this but instead pass an explicit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth opening an issue about this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly!

@Mark-Simulacrum
Copy link
Member

-O -O -C lto -O + wasm-gc -O -C lto + wasm-gc
today 17403 17403 137 137
this PR 647467 149 213569 89
this PR + LLD tip 209 212 132 132

I find this somewhat suspicious. Did LLD tip regress? It looks like it gc's less out of the wasm files...

@alexcrichton
Copy link
Member Author

@Mark-Simulacrum heh I had to double check as well. LLD didn't regress per se but it did emit a few more annotations/exports for things I didn't quite recognize (but seemed injected in LLVM/LLD/etc). In general LLD-for-wasm I believe is still early-days, so I suspect that this'll get smoothed out over time!

bors added a commit that referenced this pull request Mar 4, 2018
rust: Import LLD for linking wasm objects

This commit imports the LLD project from LLVM to serve as the default linker for
the `wasm32-unknown-unknown` target. The `binaryen` submoule is consequently
removed along with "binaryen linker" support in rustc.

Moving to LLD brings with it a number of benefits for wasm code:

* LLD is itself an actual linker, so there's no need to compile all wasm code
  with LTO any more. As a result builds should be *much* speedier as LTO is no
  longer forcibly enabled for all builds of the wasm target.
* LLD is quickly becoming an "official solution" for linking wasm code together.
  This, I believe at least, is intended to be the main supported linker for
  native code and wasm moving forward. Picking up support early on should help
  ensure that we can help LLD identify bugs and otherwise prove that it works
  great for all our use cases!
* Improvements to the wasm toolchain are currently primarily focused around LLVM
  and LLD (from what I can tell at least), so it's in general much better to be
  on this bandwagon for bugfixes and new features.
* Historical "hacks" like `wasm-gc` will soon no longer be necessary, LLD
  will [natively implement][gc] `--gc-sections` (better than `wasm-gc`!) which
  means a postprocessor is no longer needed to show off Rust's "small wasm
  binary size".

LLD is added in a pretty standard way to rustc right now. A new rustbuild target
was defined for building LLD, and this is executed when a compiler's sysroot is
being assembled. LLD is compiled against the LLVM that we've got in tree, which
means we're currently on the `release_60` branch, but this may get upgraded in
the near future!

LLD is placed into rustc's sysroot in a `bin` directory. This is similar to
where `gcc.exe` can be found on Windows. This directory is automatically added
to `PATH` whenever rustc executes the linker, allowing us to define a `WasmLd`
linker which implements the interface that `wasm-ld`, LLD's frontend, expects.

Like Emscripten the LLD target is currently only enabled for Tier 1 platforms,
notably OSX/Windows/Linux, and will need to be installed manually for compiling
to wasm on other platforms. LLD is by default turned off in rustbuild, and
requires a `config.toml` option to be enabled to turn it on.

Finally the unstable `#![wasm_import_memory]` attribute was also removed as LLD
has a native option for controlling this.

[gc]: https://reviews.llvm.org/D42511
@bors
Copy link
Contributor

bors commented Mar 4, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: Mark-Simulacrum
Pushing 4a72063 to master...

@bors bors merged commit 0129b01 into rust-lang:master Mar 4, 2018
@alexcrichton alexcrichton deleted the lld branch March 4, 2018 11:25
@oooutlk
Copy link

oooutlk commented Aug 23, 2018

Any schedule to turn on lld by default on Tier-2 platform?

@bstrie
Copy link
Contributor

bstrie commented May 17, 2019

[Oops, wrong thread...]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.