Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store global variables and function addresses as 32bit offsets #22472

Merged
merged 2 commits into from
Jun 23, 2017

Conversation

yuyichao
Copy link
Contributor

Yet another PR split out of #21849 ....

This makes the two tables static data and reduces the number of dynamic relocations by ~100x
(~40k to ~400). Before @JeffBezanson asks, this reduces the size of the .rela.dyn section from 1144776 to 9432 while other sections remains roughly the same. Total size reduction is ~1.25MB. Apparently each pointer takes ~32bytes.

@yuyichao yuyichao added the compiler:codegen Generation of LLVM IR and native code label Jun 22, 2017
@yuyichao yuyichao force-pushed the yyc/codegen/function_offset branch 2 times, most recently from 1b797c4 to 93a7578 Compare June 22, 2017 04:15
@@ -1137,24 +1144,21 @@ void jl_dump_native(const char *bc_fname, const char *unopt_bc_fname, const char
}
}

ValueToValueMapTy VMap;
Module *clone = shadow_output;
Copy link
Member

@vtjnash vtjnash Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pass managers below aren't idempotent immutable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that matter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right nevermind; we've dropped support for versions of LLVM where jl_add_to_shadow wasn't already making the clone.

@yuyichao
Copy link
Contributor Author

Local test shows that the change has a subtle effect on symbol lookup on windows. Not having to reference them for dynamic relocation somehow removes it from whatever table SymFromAddr reads, which basically only contains the first function now and so all functions shows up as that function.

This probably also means that we were just doing a closest match lookup for functions in the sysimg fvars so we can just get the old behavior back by just doing that. Any idea how to make force SymFromAddr to find the functions in the sysimg or what's a better lookup function to use?

@vtjnash
Copy link
Member

vtjnash commented Jun 22, 2017

Any idea how to make force SymFromAddr to find the functions in the sysimg or what's a better lookup function to use?

Just get rid of this chunk of code. It doesn't work right, but the replacement code path (using LLVM to read the symbol table) only started to be available recently in LLVM 3.5.

@yuyichao yuyichao force-pushed the yyc/codegen/function_offset branch from 93a7578 to 812707a Compare June 23, 2017 00:57
@yuyichao
Copy link
Contributor Author

So I believe what's happening is that the first method used on 32bit and 64bit windows never worked as expected since they were only looking for exported function addresses and there was none before this PR. This PR exported one function as an alias so suddenly that broken method starts to return bad result and breaks the tests everywhere.

I fixed this by fixing #17251 and #20798 by adding the method mentioned in #20798 (comment) (Thanks to @vtjnash for telling me that we are already using the necessary classes). Although I just realized that a simpler fix for the symptom is actually simply removing the broken method since it was never doing anything. Oh, well :trollface:

The broken test in cmdlineargs is now passing for me on win32 and win64 locally. I assume it works on FreeBSD too.

@yuyichao
Copy link
Contributor Author

The first commit should be backportable to 0.6 if anyone wants it.

@yuyichao
Copy link
Contributor Author

llvm-config missing error again? https://travis-ci.org/JuliaLang/julia/jobs/246025402#L4155

@yuyichao yuyichao force-pushed the yyc/codegen/function_offset branch from 812707a to a9e60bb Compare June 23, 2017 02:38
@vtjnash
Copy link
Member

vtjnash commented Jun 23, 2017

never worked as expected since they were only looking for exported function addresses and there was none before this PR.

yes and no. I never expected it to work (it was present for getting lookup of functions in ntdll and msvcrt correct), I just also didn't expect it to break the sysimg lookup.

@yuyichao yuyichao force-pushed the yyc/codegen/function_offset branch 2 times, most recently from c5e9f41 to fea7fd1 Compare June 23, 2017 06:17
yuyichao added 2 commits June 23, 2017 15:02
* Add a (slow but cross-platform) fallback lookup method of function name and base address
  using LLVM debug info reader
* Disable windows exported symbol lookup that never worked for sysimg function
  address lookup (since they are never exported). Also move it after
  LLVM debug info reader since it is less accurate unless LLVM couldn't get any
  debug info.

Fix #17251
Fix #20798
* Require small code model for sysimg
* Makes the two tables static data and reduces the number of dynamic relocations by
  ~100x (~40k to ~400).
@yuyichao yuyichao force-pushed the yyc/codegen/function_offset branch from fea7fd1 to 92152aa Compare June 23, 2017 07:02
@@ -423,13 +423,7 @@ end
for precomp in ("yes", "no")
bt = readstring(pipeline(ignorestatus(`$(Base.julia_cmd()) --startup-file=no --precompiled=$precomp
-E 'include("____nonexistent_file")'`), stderr=catcmd))
@test contains(bt, "include_from_node1")
if ((is_windows() && Sys.WORD_SIZE == 32) || (is_bsd() && !is_apple())) && precomp == "yes"
# FIXME: Issue #17251 (Windows), #20798 (FreeBSD)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, thanks for fixing this

@yuyichao yuyichao merged commit d040151 into master Jun 23, 2017
@yuyichao yuyichao deleted the yyc/codegen/function_offset branch June 24, 2017 00:09
@iblislin
Copy link
Member

got following test error after this PR merged (https://julia.iblis.cnmc.tw/#/builders/2/builds/79):

Error During Test
  Got an exception of type LoadError outside of a @test
  LoadError: TypeError: non-boolean (Int8) used in boolean context
  Stacktrace:
   [1] typeinf(::Core.Inference.InferenceState) at ./loading.jl:513
   [2] typeinf(::Core.Inference.InferenceState) at ./sysimg.jl:14
   [3] typeinf(::Core.Inference.InferenceState) at ./boot.jl:236
   [4] filter!(::##11#29, ::Array{String,1}) at ./array.jl:1884
   [5] prepend!(::Array{String,1}, ::Array{String,1}) at ./array.jl:681
  while loading /home/julia/ci/worker/11rel-amd64/build/test/codegen.jl, in expression starting on line 6

Error in testset cmdlineargs:
Test Failed
  Expression: contains(bt, "include_from_node1(::Module, ::String) at $(joinpath(".", "loading.jl"))")

@yuyichao
Copy link
Contributor Author

The first one comes from 0737032 the version in
https://github.com/JuliaLang/julia/pull/22497/files#diff-d14f5feef27442143be917f90c9b9079R6 is correct.

The second one shows that it is still not fixed on freebsd, though the issue seems to be that the dladdr is messing it up. This agrees with the nonsense backtrace for the first issue. It is probably similar to the windows situation. Can you try if the cmdlineargs test pass on the first commit on this pr? If that fix the issue then I can probably tweak the order again so it's not broken by the second commit.

@fredrikekre
Copy link
Member

Same failure as #22472 (comment) on all 5 buildbots in #22503

@fredrikekre
Copy link
Member

@tkelman
Copy link
Contributor

tkelman commented Jun 24, 2017

The problem isn't with this PR but with 0737032#commitcomment-22740108. Fix in #22506

@iblislin
Copy link
Member

@yuyichao I checkouted 8c658c5, and the test-cmdlineargs is still failed.

Worker 1 failed running test cmdlineargs:                                                                                                  [2/1940]
Some tests did not pass: 139 passed, 1 failed, 0 errored, 0 broken.cmdlineargs: Test Failed
  Expression: contains(bt, "include_from_node1(::Module, ::String) at $(joinpath(".", "loading.jl"))")
Stacktrace:
 [1] record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:587
 [2] (::##40#46)() at /usr/home/iblis/git/julia/test/runtests.jl:160
 [3] cd(::##40#46, ::String) at ./file.jl:70
 [4] include_from_node1 at ./loading.jl:549
 [5] include at ./sysimg.jl:14
 [6] process_options at ./client.jl:310
 [7] _start at ./client.jl:378

Test Summary: | Pass  Fail  Total
  Overall     |  139     1    140
    cmdlineargs |  139     1    140
    FAILURE
Error in testset cmdlineargs:
Test Failed
  Expression: contains(bt, "include_from_node1(::Module, ::String) at $(joinpath(".", "loading.jl"))")
ERROR: LoadError: Test run finished with errors
while loading /usr/home/iblis/git/julia/test/runtests.jl, in expression starting on line 29
gmake[1]: *** [Makefile:18: cmdlineargs] Error 1
gmake: *** [Makefile:568: test-cmdlineargs] Error 2
┌─[~/git/julia]
| [Venv(py36)] [-- INSERT --]
└─[iblis@ionic Oops]% g st
HEAD detached at 8c658c5bdf
...

@yuyichao
Copy link
Contributor Author

Oh, well, in that case someone else needs to figure out a fix for #20798 then.

@iblislin
Copy link
Member

@yuyichao I just found another error I overlooked previously:

compile: Test Failed
  Expression: (Base.Test.ismatch_warn)("ERROR: LoadError: Declaring __precompile__(false) is not allowed in files that are
 being precompiled.\nStacktrace:\n [1] __precompile__", (Base.Test.readstring)(#107#fname))
Stacktrace:
 [1] macro expansion at ./test.jl:452 [inlined]
 [2] (::Test67Main_compile.##1#14)() at /usr/home/iblis/git/julia/test/compile.jl:256
compile: Test Failed
  Expression: (Base.Test.ismatch_warn)("ERROR: LoadError: break me\nStacktrace:\n [1] error", (Base.Test.readstring)(#168#
fname))
Stacktrace:
 [1] macro expansion at ./test.jl:452 [inlined]
 [2] (::Test67Main_compile.##1#14)() at /usr/home/iblis/git/julia/test/compile.jl:341
compile: Error During Test 
  Test threw an exception of type Base.Test.TestSetException 
  Expression: compile

@iblislin
Copy link
Member

@yuyichao
Copy link
Contributor Author

yuyichao commented Jun 25, 2017

That's probably the nonsense bt I mentioned above.

@iblislin
Copy link
Member

iblislin commented Jun 25, 2017

For reference,
I manually ran the code from testing
On FreeBSD:

julia> Base.compilecache("Baz")
INFO: Precompiling module Baz.
ERROR: LoadError: Declaring __precompile__(false) is not allowed in files that are being precompiled.
Stacktrace:
 [1] @generated(::LineNumberNode, ::Module, ::ANY) at ./loading.jl:334
 [2] @generated(::LineNumberNode, ::Module, ::ANY) at ./loading.jl:549
 [3] @generated(::LineNumberNode, ::Module, ::ANY) at ./sysimg.jl:14
 [4] anonymous at ./<missing>:2
while loading /usr/home/iblis/git/julia/tmp/Baz.jl, in expression starting on line 1
ERROR: Failed to precompile Baz to /home/iblis/.julia/lib/v0.7/Baz.ji.
Stacktrace:
 [1] @generated(::LineNumberNode, ::Module, ::ANY) at ./loading.jl:689

On Linux:

julia> Base.compilecache("Baz")
INFO: Precompiling module Baz.
ERROR: LoadError: Declaring __precompile__(false) is not allowed in files that are being precompiled.
Stacktrace:
 [1] __precompile__(::Bool) at ./loading.jl:334
 [2] include_from_node1(::Module, ::String) at ./loading.jl:549
 [3] include(::Module, ::String) at ./sysimg.jl:14
 [4] anonymous at ./<missing>:2
while loading /home/iblis/git/julia/tmp/Baz.jl, in expression starting on line 1
ERROR: Failed to precompile Baz to /home/iblis/.julia/lib/v0.7/Baz.ji.
Stacktrace:
 [1] compilecache(::String) at ./loading.jl:689

The bt on FreeBSD become quite useless :/

@tkelman
Copy link
Contributor

tkelman commented Jun 25, 2017

were they any better before this pr?

@iblislin
Copy link
Member

before this PR, it can reveal the exception was raised from compilecache instead of @generated(::LineNumberNode, ::Module, ::ANY)

julia> Base.compilecache("Baz")
INFO: Precompiling module Baz.
ERROR: LoadError: Declaring __precompile__(false) is not allowed in files that are being precompiled.
Stacktrace:
 [1] __precompile__ at ./loading.jl:334
 [2] include_from_node1 at ./loading.jl:549
 [3] include at ./sysimg.jl:14
 [4] anonymous at ./<missing>:2
while loading /usr/home/iblis/git/julia/tmp/Baz.jl, in expression starting on line 1
ERROR: Failed to precompile Baz to /home/iblis/.julia/lib/v0.7/Baz.ji.
Stacktrace:
 [1] compilecache at ./loading.jl:689

julia> versioninfo()
Julia Version 0.7.0-DEV.679
Commit 02804aceef* (2017-06-23 01:13 UTC)
Platform Info:
  OS: FreeBSD (x86_64-unknown-freebsd12.0)
  CPU: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, westmere)
Environment:

@iblislin
Copy link
Member

I just found this make bt work on FreeBSD but do not understand why

diff --git a/src/debuginfo.cpp b/src/debuginfo.cpp
index a2b3012f4a..6f8224c73a 100644
--- a/src/debuginfo.cpp
+++ b/src/debuginfo.cpp
@@ -729,6 +729,10 @@ static void get_function_name_and_base(const object::ObjectFile *object, bool in
 {
     if (!object)
         return;
+#ifdef _OS_FREEBSD_
+    *name = nullptr;
+    *saddr = nullptr;
+#endif
     // Assume we only need base address for sysimg for now
     if (!insysimage || !sysimg_fvars_base)
         saddr = nullptr;

@yuyichao
Copy link
Contributor Author

This is the ordering problem I mentioned. Doing it here is wrong but it seems that there's no way to do this correctly on freebsd.

@iblislin
Copy link
Member

@yuyichao :-o . So... can we simply disable dladdr and rely on get_function_name_and_base on FreeBSD?
Will we lose any useful info from dladdr?

@yuyichao
Copy link
Contributor Author

Not disabling it but try it later. get_function_name_and_base is apparently a no-op on FreeBSD.

@iblislin
Copy link
Member

iblislin commented Jun 25, 2017

Quote from FreeBSD dladdr(3)

BUGS 
     This implementation is bug-compatible with the Solaris implementation.
     In particular, the following bugs are present:
     
     o   If addr lies in the main executable rather than in a shared library,
         the pathname returned in dli_fname may not be correct.  The pathname
         is taken directly from argv[0] of the calling process.  When
         executing a program specified by its full pathname, most shells set
         argv[0] to the pathname.  But this is not required of shells or
         guaranteed by the operating system.
     
     o   If addr is of the form &func, where func is a global function, its
         value may be an unpleasant surprise.  In dynamically linked programs,
         the address of a global function is considered to point to its
         program linkage table entry, rather than to the entry point of the
         function itself.  This causes most global functions to appear to be
         defined within the main executable, rather than in the shared
         libraries where the actual code resides.
                               
     o   Returning 0 as an indication of failure goes against long-standing
         Unix tradition.

Do we hit the bug mentioned in manual?

get_function_name_and_base is apparently a no-op on FreeBSD.

Yes, I just observed that in gdb ( *name and *saddr are non-null usaully)

iblislin added a commit to iblislin/julia that referenced this pull request Jun 26, 2017
…n FreeBSD

There is a known bug in FreeBSD's dladdr(3):
    (Quote from manual dladdr(3))
    In dynamically linked programs,
    the address of a global function is considered to point to its
    program linkage table entry, rather than to the entry point of the
    function itself.  This causes most global functions to appear to be
    defined within the main executable, rather than in the shared
    libraries where the actual code resides.

Function `get_function_name_and_base` implemented in PR JuliaLang#22472 provides
a (slow but cross-platform) way to lookup function name and base address
via LLVM.

@yuyichao propose that getting info from `get_function_name_and_base`
first and making original `dladdr` as fallback.

Fix: JuliaLang#20798
See also: JuliaLang#22472 (comment)
DrTodd13 pushed a commit to IntelLabs/julia that referenced this pull request Jun 26, 2017
ararslan pushed a commit that referenced this pull request Sep 11, 2017
Store global variables and function addresses as 32bit offsets

Ref #22472
(cherry picked from commit d040151)
ararslan pushed a commit that referenced this pull request Sep 13, 2017
* Add a (slow but cross-platform) fallback lookup method of function name and base address
  using LLVM debug info reader
* Disable windows exported symbol lookup that never worked for sysimg function
  address lookup (since they are never exported). Also move it after
  LLVM debug info reader since it is less accurate unless LLVM couldn't get any
  debug info.

Fix #17251
Fix #20798

Ref #22472
(cherry picked from commit 8c658c5)
ararslan pushed a commit that referenced this pull request Sep 13, 2017
* Require small code model for sysimg
* Makes the two tables static data and reduces the number of dynamic relocations by
  ~100x (~40k to ~400).

PR #22472
(cherry picked from commit 92152aa)
@vtjnash vtjnash mentioned this pull request Sep 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants