-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch aarch64-darwin codegen to JITLink (ObjectLinkingLayer) and small code model #43664
Conversation
8ef4488
to
2eeb68d
Compare
6ff7036
to
3b3d270
Compare
Tentatively marking this for 1.8 (feature freeze is on February 15th) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
@Keno we should chat about CI for M1
jitlink::JITLinkContext &Ctx, | ||
MemoryBufferRef InputObject) override | ||
{ | ||
// Keeping around a full copy of the input object file (and re-parsing it) is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How bad is the performance/memory impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't measured it, but it's the same as before in the RTDyld implementation – apparently even for ELF, where getObjectForDebug
isn't a no-op, a full copy is made internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we still pretend that getObjectForDebug
is meaningfully implemented however?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do you mean? It's RTDyld only, and we do use it there if available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the upcoming plugin support in llvm/lib/ExecutionEngine/Orc/DebuggerSupportPlugin.cpp, it appears good support for this will appear in the future, so perhaps not worth working on now.
Alternatively, could we enable split-debug, and parse the .dwo files instead, instead of making JITLink deal with implementing the extract-dwo objcopy pass itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't use DebuggerSupportPlugin verbatim. I actually planned on using that first – hence the whole LLVM 14 effort –, but it already patches the section addresses to match the runtime memory layout, so we can't re-parse it using llvm::ObjectFile
.
I was about to try patching that out, but in the end, decided to just copy the input buffer as that's what we do on x86_64 MachO and other platforms anyway.
In the long run, particularly as more platforms will move onto JITLink, it will certainly be a good idea to remove the ObjectFile
references from debuginfo
and keep track of the required tables/… at a finer granularity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for split DWARF information, I haven't really thought about how that would interact with OrcJIT at all.
// (with the llvm_orc_registerJITLoaderGDBAllocAction symbol being in either | ||
// libjulia-codegen or yet another shared library for LLVM depending on the build | ||
// flags, etc.). | ||
const auto Addr = ExecutorAddr::fromPtr(&llvm_orc_registerJITLoaderGDBAllocAction); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhames this might be of interest to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. @dnadlinger Can you describe what made the lookup fragile?
If possible we want to make ::Create
smooth enough to use that it's the natural default, even for in-process setups. (Just because it makes switching between in-process and out-of-process modes trivial)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhames I'm not actually sure – I didn't want to touch the existing JITDylib
setup code, as I'm not very familiar with the history of the Julia codebase, and apparently orc::DynamicLibrarySearchGenerator::GetForCurrentProcess
wasn't enough to make it work. I didn't check whether that is supposed to work only for symbols in the main image or also for loaded dylibs, but just based on a quick glance, it appeared to be the former. In the "canonical" configuration, Julia links LLVM as a dylib dependency of an internal dylib used by the main process, so rather than trying to figure out what the correct setup is there, it seemed easier to just rely on the host linker to figure this out. Happy to incorporate any suggestions to avoid duplicating the symbol name, though!
The Buildkite failures appear to be due to #43415. I should also note that I more or less exclusively tested this on LLVM Git main locally. LLVM 13.x should run though |
We have one machine ready to go. We should be able to set that up right away. |
For CI, you'll probably want to make sure that there is a robust timeout mechanism in place that, if necessary, uses Any ideas regarding the tester_linux64 failure? At first glance, it looks like it hits an internal assertion in rr ( |
3b3d270
to
fa35aef
Compare
Fixed the typos, rebased on top of master, and removed the extra commit from #43612 – without it, there will be some stacktrace-related failures in |
Known bug, I'll be looking at it today. |
None of the remaining failures seem to be related to this PR. |
The same interface used to be called two different ways depending on the JIT implementation: via llvm::JITEventListener for use with the old JIT APIs, as well as through our own ORCNotifyObjectEmitted() callback for use with the new ORC-based JIT. Legacy JIT and MCJIT support is long gone, so this removes the unused llvm::JITEventListener interface. Since there was a single instance held in a static global in debuginfo.cpp anyway, I also removed CreateJuliaJITEventListener(), etc. in favour of a single static global instance – this of course still isn't exactly clean lifetime management.
fa35aef
to
6a469ea
Compare
…ll code model This fixes JuliaLang#41440, JuliaLang#43285 and similar issues, which stem from CodeModel::Large not being correctly implemented on MachO/ARM64. Requires LLVM 13.x or Git main (tested: 1dd5e6fed5db with patches from the JuliaLang/llvm-project julia-release/13.x branch, available at https://github.com/dnadlinger/llvm-project/commits/julia-main). Requires an LLVM patch to pass through __eh_frame unwind information, without which backtraces silently won't work (already applied on JuliaLang/llvm-project@julia-release/13.x): llvm/llvm-project#52921
6a469ea
to
955d427
Compare
Rebased on top of latest master and fixed backtraces when linking against LLVM as a shared library (the default). |
Tests now pass on my M1 Max when building with default options (LLVM 13 from BinaryBuilder):
…
This is with an extra hack to avoid the task switching hangs, see #41820. |
Gentle ping – anything I can do to help the review along? It would be nice for M1 to get some more testing before 1.8. Note that there are no (intended) changes to non-aarch64-darwin platforms. |
Just tested this on my M1 Pro. Works great! It solves all of the segfaults I was seeing in my package SymbolicRegression.jl. Julia maintainers, please merge this ASAP. My packages are broken on M1 without this change; with the change, they are working well. To summarize my test:
This lets me run the example code on the README of SymbolicRegression.jl. Before this change, on Julia 1.7.1, I encountered immediate segfaults in both distributed and multithreaded mode (but serial mode seemed to work). |
@MilesCranmer: Glad to hear this does the job for you. I'm actually a bit surprised that upstream LLVM (Let's not rush the people who'll have to review this slight mess of a change, though.) |
Yes, this is using the LLVM I did see some issues with mislinked LLVM libraries but those went away after setting Update: I get similar test numbers (LLVM 14
|
Ah, 2291413554ff does already have my __eh_frame JITLink patch (with Lang's tests), so it's not entirely implausible, but I'm still surprised that the stacktrace-related tests pass without the other patch that unconditionally enables __eh_frame generation. Regarding the miscompilation, I was referring to code generated by Julia – see Keno's commit for details. The original issue is probably linked from the M1 tracking issue. |
Do you have any idea why I may be getting this error when using an llvm installed from homebrew?
If I remove this from the path, I can build this PR. This isn't really a problem I have with this PR, unless errors building with certain libraries like this is related more to it than user error/inexperience on my behalf. About this PR itself: great work! I tried a few tests that consistently segfaulted before. Now, it all works! Will be great when this gets merged. |
@chriselrod Unfortunately, I'm (mostly) not using Homebrew myself these days, so I don't really know. Looking at the LLVM formula, it seems that they are overriding libcxx; not sure if that could be related. |
It seems like a bug that the Julia build process is finding an LLVM from elsewhere on your system. Unless you explicitly pass |
size_t SepPos = Sec.getName().find(','); | ||
if (SepPos >= 16 || (Sec.getName().size() - (SepPos + 1) > 16)) { | ||
LLVM_DEBUG({ | ||
dbgs() << "JLDebuginfoPlugin: Ignoring section '" << Sec.getName() | ||
<< "'\n"; | ||
}); | ||
continue; | ||
} | ||
auto SecName = Sec.getName().substr(SepPos + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size_t SepPos = Sec.getName().find(','); | |
if (SepPos >= 16 || (Sec.getName().size() - (SepPos + 1) > 16)) { | |
LLVM_DEBUG({ | |
dbgs() << "JLDebuginfoPlugin: Ignoring section '" << Sec.getName() | |
<< "'\n"; | |
}); | |
continue; | |
} | |
auto SecName = Sec.getName().substr(SepPos + 1); | |
auto SecName = Sec.getName().split(',')[2]; | |
if (SecName.empty()) { | |
LLVM_DEBUG({ | |
dbgs() << "JLDebuginfoPlugin: Ignoring section '" << Sec.getName() | |
<< "'\n"; | |
}); | |
continue; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this was more complex at first to transparently handle non-MachO formats (where the segment name doesn't appear in the JITLink section name), but I then figured we'd cross that bridge when we get there.
This is fantastic! Thank you all! |
It looks like this PR fixed many segfault issues for M1 users (in CSV, Pkg, MLJ), yayy! Would it be possible to backport this into the 1.7.2 release? |
No: #42295 (comment) |
@vchuravy for CI would it be enough to run it in a docker with a |
Julia CI already runs on aarch64-linux-gnu |
@MilesCranmer: Nice idea, but unfortunately no; apart from all the other platform-dependent issues that running on Linux would miss, we also don't currently use JITLink on Linux, as ELF support is very much WIP for it still. |
Got it, thanks for the explanation |
This fixes #41440, #43285 and similar runtime crashes, which stem from
CodeModel::Large
not being correctly implemented on MachO/ARM64.With it, the only testsuite failures remaining on my M1 Max machine are occasional thread.jl hangs, which appear to be the same as #41820.
Now that master is on LLVM 13, this should work against the default BinaryBuilder LLVM. LLVM 14 (Git main) is still preferred, as debugger integration (
ENABLE_GDBLISTENER
) is not supported on 13.x. I tested dnadlinger/llvm-project@57eb8c1, which is llvm/llvm-project@1dd5e6fed5db with our 13.x patches. Our LLVM patch set (julia-releases/13.x) is absolutely required so that__eh_frame
unwind information is generated and passed through the stack, without which backtraces silently won't work (see llvm/llvm-project#52921).Profiler integration (
ENABLE_JITPROFILING
) is not currently supported on either version; this is an upstream LLVM restriction.As pointed out in the comments, this currently doesn't use the custom Julia RTDyld memory manager, as the API has changed quite significantly. We'll probably want to port the dual-map optimisations to JITLinkMemoryManager soon, but I believe the random segfaults are much more disruptive, and thus the JITLink switch shouldn't be held up by that.