Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #17997, don't load packages in Main #23579

Merged
merged 2 commits into from
Sep 20, 2017
Merged

fix #17997, don't load packages in Main #23579

merged 2 commits into from
Sep 20, 2017

Conversation

JeffBezanson
Copy link
Member

@JeffBezanson JeffBezanson commented Sep 4, 2017

This refactors loading so that top-level modules are stored in an internal table instead of exposed in Main. It's close to working, but currently still fails a couple of the compile tests.

Each top-level module has an identifying key. Currently, this is just a Symbol of the module name like we use now, but designed to be generalized. This adds a small API to the loader to manage top-level ("root") modules:

  • is_root_module(m): Test whether m is a root module, which means it needs to be identified by a loader key.
  • root_module(key): Get the root module with the given key.
  • root_module_key(m): Get the key for a root module.
  • root_module_exists(key): Test whether a module with the given key has been loaded.
  • register_root_module(key, mod): Record that a given root module is available under a certain key.
  • require(name): In using Foo, this function accepts :Foo and returns the module it should map to. :Foo can either be used directly as a module key, or (in the near future, IIUC) combined with an environment to look up a UUID to use as the key.

This API replaces introspection of the Main module in various places.

close #17997
close #21969

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

src/dump.c Outdated
b->value = v;
jl_gc_wb_binding(b, v);
case 2: {
// top level modules handled by loader
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code wasn't (isn't) for top-level modules, it's present to handle submodules

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't submodules handled just by deserializing bindings and their values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases, yes. But this code also handles conditional-require (the behavior of adding them to Main was just a side-detail).

src/dump.c Outdated
if (ref_only == 1)
m_ref = jl_get_global((jl_module_t*)jl_deserialize_value(s, NULL), mname);
else
m_ref = (jl_value_t*)jl_lookup_root_module((jl_value_t*)mname);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the job of read_verify_mod_list, and should be handled there, not here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What can I return from this function then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, you can't. All external state must be prepared by read_verify_mod_list, once jl_deserialize_value starts in earnest, this can only return existing references into that array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't change any global state --- it just looks up the module by name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just be using an index into mod_array = jl_get_loaded_modules(); (as deserialized by read_verify_mod_list)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try that.

src/toplevel.c Outdated
jl_type_error("import or using", (jl_value_t*)jl_sym_type, (jl_value_t*)var);
if (i == jl_array_len(args)-1)
break;
m = (jl_module_t*)jl_eval_global_var(m, var);
if (!jl_is_module(m))
jl_errorf("invalid import statement");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could use a better error message here, since this sounds like the syntax is invalid: s/statement/path: %s is not a module/

src/toplevel.c Outdated
m = (jl_module_t*)jl_eval_global_var(m, (jl_sym_t*)s);
var = (jl_sym_t*)jl_array_ptr_ref(args, i);
if (!jl_is_symbol(var))
jl_type_error("import or using", (jl_value_t*)jl_sym_type, (jl_value_t*)var);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass the right name (import or using) as an argument? not super important, but seems like it should be trivial enough

src/toplevel.c Outdated
jl_value_t *m = NULL;
if (root_module_func != NULL) {
jl_value_t *rmargs[2] = {root_module_func, key};
m = jl_apply(rmargs, 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the system state will get severely corrupted if you call jl_apply from this context

const loaded_modules = ObjectIdDict()
const module_keys = ObjectIdDict()

function register_root_module(key, m::Module)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a bad idea. Maybe require should set a global flag (haskey(package_locks, m) or perhaps better would be isempty(package_locks)) that disables rooting of the object in Main.m?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is really only meant for internal use. Nobody but the loading code should call it. Does that address your concern, or is there something else?

Copy link
Member

@vtjnash vtjnash Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design appears to assume there will be a 1-to-1 correlation between require and module definition, which usually (but doesn't always) happen now. I think a simpler design would be handle that decision point during construction, rather than post-factum mutation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll be able to handle it either way. If loading something defines 0 modules, we can store nothing here instead of a module (to provide "load once" semantics). If a file defines multiple modules, you won't be able to reference all of them without a distinct using statement for each one you want anyway. But we should definitely discourage or disallow that. @StefanKarpinski even briefly floated the idea of removing the module keyword (or making it optional), and having the loader make modules instead (thus enforcing 1-to-1 file-to-module by construction).

base/loading.jl Outdated
@@ -288,6 +288,7 @@ function reload(name::AbstractString)
error("use `include` instead of `reload` to load source files")
else
# reload("Package") is ok
delete!(loaded_modules, Symbol(name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it can be losing gc-roots. Although in practice, it will perhaps often get rooted in the Tuple type cache, I'm not sure if that'll be entirely sufficient to make sure the contained types can get garbage-collected while there is still instances of their objects around.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we can just keep an old-modules list.

base/loading.jl Outdated
@@ -345,7 +387,7 @@ function _require(mod::Symbol)
if loading !== false
# load already in progress for this module
wait(loading)
Copy link
Member

@vtjnash vtjnash Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return wait(loading)?

base/loading.jl Outdated
@@ -364,7 +406,9 @@ function _require(mod::Symbol)
if JLOptions().use_compilecache != 0
doneprecompile = _require_search_from_serialized(mod, path)
if !isa(doneprecompile, Bool)
return # success
M = doneprecompile[end]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this list isn't sorted by that key (it's sorted by load order), I think you'll need to scan it to find when the module you wanted was defined

base/loading.jl Outdated
# TODO: this is a hack to handle circular module references
tempmod = Module(temp_mod_name)
loaded_modules[mod] = tempmod
M = Base.include_relative(tempmod, path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't currently expected to return a module. We'll need to add a deprecation warning here if we want to change the API.

@JeffBezanson
Copy link
Member Author

JeffBezanson commented Sep 4, 2017

@vtjnash The main thing I'm having trouble with here is the infamous 2.0 === 97.0 failure in the compile tests. I haven't yet figured out what's causing it. Is there an invalidation step I missed that relies on things being in Main?
EDIT: solved.

@JeffBezanson JeffBezanson force-pushed the jb/loading branch 5 times, most recently from b3721f6 to a02bbb6 Compare September 6, 2017 03:31
@JeffBezanson
Copy link
Member Author

After making this change, Documenter no longer worked with just using Documenter. Somehow it requires a hard binding for Documenter in Main, possibly due to calling names. I got around this for now by adding an ad-hoc rule: if you write using X in Main then a hard binding for X is also added. We might want to do that everywhere, which would also allow removing the self binding (Base.Base) that exists in every module.

@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Sep 7, 2017
@JeffBezanson JeffBezanson changed the title WIP: fix #17997, don't load packages in Main fix #17997, don't load packages in Main Sep 7, 2017
@JeffBezanson JeffBezanson force-pushed the jb/loading branch 2 times, most recently from bca2b4d to 5f52653 Compare September 12, 2017 17:45
@JeffBezanson
Copy link
Member Author

I think this is ready to go.

base/loading.jl Outdated
end
end

function _require(mod::Symbol)
if root_module_exists(mod)
return root_module(mod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks weird to require that all exit points to this function call the function root_module(mod) and return the result. Seems like this should be handled in the caller instead.

# make sure the code loads without it.
try
using ZMQ
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is intended to work without ZMQ installed. Previously this was handled by putting a fake empty ZMQ module in Main.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we now always just put a fake module in Base.__toplevel__ instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't encourage that. We would need a better official way to mock out packages. Would probably need more design to fully work with pkg3.

@@ -566,7 +566,7 @@ end
Return an array of methods with an argument of type `typ`.

The optional second argument restricts the search to a particular module or function
(the default is all modules, starting from Main).
(the default is all modules).
Copy link
Member

@vtjnash vtjnash Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New implementation doesn't do this (restricting the search) anymore. I'm not against making it recursive, but it should be consistent, and possibly in a separate commit / PR.

@JeffBezanson
Copy link
Member Author

Comments addressed.

@JeffBezanson JeffBezanson force-pushed the jb/loading branch 2 times, most recently from c5eebf2 to 9fbd631 Compare September 14, 2017 15:22
@JeffBezanson
Copy link
Member Author

@vtjnash You'll probably want to take another look to make sure I didn't make a total hash of dump.c.

src/toplevel.c Outdated
jl_array_t *ht = (jl_array_t*)jl_get_nth_field(loaded_modules, 0);
assert(jl_is_array(ht));
return (jl_module_t*)jl_eqtable_get(ht, key, NULL);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop this function (unused now)

# make sure the code loads without it.
try
using ZMQ
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we now always just put a fake module in Base.__toplevel__ instead?

@@ -288,6 +288,7 @@ function reload(name::AbstractString)
error("use `include` instead of `reload` to load source files")
else
# reload("Package") is ok
unreference_module(Symbol(name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be unnecessary

oldm = loaded_modules[key]
if oldm !== m
name = module_name(oldm)
warn("replacing module $name.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing Uppercase messages? I don't precisely remember what style we've started to settle on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to be totally consistent yet. On a quick search it's close to 50/50. There are some other occurrences of this same message that are lowercase.

base/loading.jl Outdated
end
end

# just load the file normally via include
# for unknown dependencies
local M
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M is unused?

@@ -678,8 +673,7 @@ download(url, filename)
workspace()

Replace the top-level module (`Main`) with a new one, providing a clean workspace. The
previous `Main` module is made available as `LastMain`. A previously-loaded package can be
accessed using a statement such as `using LastMain.Package`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one do this now? / What does workspace() do? / Should we add empty!(Base. loaded_modules), and (re-)register the new modules with register_root_module? / Should we add a new function to do that (Base.reset_require() or argument to workspace) instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the doc string. I think workspace() should just give you a new Main module. So using X in the new Main module will point to the already-loaded copy. To re-load packages there is reload.

JeffBezanson added a commit that referenced this pull request Sep 20, 2017
@JeffBezanson JeffBezanson merged commit 8a96ed2 into master Sep 20, 2017
@JeffBezanson JeffBezanson deleted the jb/loading branch September 20, 2017 21:25
@StefanKarpinski
Copy link
Member

Oh yeah, it's party time now.

@vtjnash
Copy link
Member

vtjnash commented Sep 22, 2017

Can you fix PkgEval ASAP (I think this is the cause of the breakage)? Otherwise, I think we should temporarily revert this tomorrow, so that we prepare a fix for the breakage at a more leisurely pace.

@JeffBezanson
Copy link
Member Author

Do you have a link to a description of the problem?

I really don't think we need to revert this --- PkgEval does not need to work 100% of the time on master. Posting a fix ASAP should be good enough.

@KristofferC
Copy link
Member

KristofferC commented Sep 22, 2017

I think this also broke JLD?ff9fb48#commitcomment-24480592

@ararslan
Copy link
Member

It broke Nanosoldier via the JLD breakage

@ararslan ararslan added the breaking This change will break code label Sep 22, 2017
@timholy
Copy link
Member

timholy commented Sep 24, 2017

Probably the source of this?

julia> module A
       export foo
       foo() = 1
       end
Main.A

julia> using A
ERROR: ArgumentError: Module A not found in current path.
Run `Pkg.add("A")` to install the A package.
Stacktrace:
 [1] _require(::Symbol) at ./loading.jl:417
 [2] require(::Symbol) at ./loading.jl:320

julia> A.foo()
1

julia> foo()
ERROR: UndefVarError: foo not defined

@timholy
Copy link
Member

timholy commented Sep 24, 2017

Also worth mentioning that something (probably this?) broke Revise:

julia> using Revise
init
mod = Main
_module_name(ex) = :Revise
names(Main) = 5-element Array{Symbol,1}:
 :Base     
 :Core     
 :Main     
ERROR: UndefVarError: Revise not defined
Stacktrace:
 [1] parse_module!(::Dict{Module,DataStructures.OrderedSet{Revise.RelocatableExpr}}, ::Expr, ::Symbol, ::Module, ::String) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:521
 [2] parse_expr!(::Dict{Module,DataStructures.OrderedSet{Revise.RelocatableExpr}}, ::Expr, ::Symbol, ::Module, ::Any) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:456
 [3] parse_source!(::Dict{Module,DataStructures.OrderedSet{Revise.RelocatableExpr}}, ::String, ::Symbol, ::Int64, ::Module, ::Any) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:402
 [4] parse_source!(::Dict{Module,DataStructures.OrderedSet{Revise.RelocatableExpr}}, ::String, ::Module, ::String) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:364
 [5] parse_source(::String, ::Module, ::String) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:338
 [6] parse_pkg_files(::Symbol) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:302
 [7] watch_package(::Symbol) at /home/tim/.julia/v0.7/Revise/src/Revise.jl:548
 [8] (::getfield(Base, Symbol("#inner#4")){Array{Any,1},typeof(Revise.watch_package),Tuple{Symbol}})() at ./essentials.jl:660
 [9] require(::Symbol) at ./loading.jl:333

Output is from adding a println("init") at the top of the __init__ function and

@show mod _module_name(ex) names(Main)

just above the line that triggered the error. Yet

julia> using FixedPointNumbers

julia> getfield(Main, :FixedPointNumbers)
FixedPointNumbers

still seems to work fine, so it's a puzzling (scope?) error.

@fredrikekre
Copy link
Member

using .A works. And I kind of like that the dot is needed, since that is what you would have to do if the module was defined in another module, e.g.

julia> module A
       module B
       end
       using .B # . needed here, just like in the REPL now
       end

@nalimilan
Copy link
Member

Related regression: the following now fails.

module A
    module B
    end

    @eval Main using $(Symbol(@__MODULE__)).B
end

It sounds unexpected that @__MODULE__ does not give the correct name allowing to load a module. Is there a better way of doing this? (I need this kind of thing to test the REPL at #23806.)

@yuyichao
Copy link
Contributor

You can still do .A.B or Main.A.B. Note that @__MODULE__ here is just A.

Also note that this is not a safe assumption to make in the test. You should directly eval the whole thing in Main instead.

@yuyichao
Copy link
Contributor

It sounds unexpected that @__MODULE__ does not give the correct name allowing to load a module.

And FWIW, it never was....

@nalimilan
Copy link
Member

@yuyichao My problem is that Base.runtests creates modules with unpredictable names. Hence the hackish code to load a module in the fake REPL for testing purposes. Simpler solution welcome.

@yuyichao
Copy link
Contributor

That's exactly what I said, you shouldn't try to get that name in the first place. You should

directly eval the whole thing in Main instead.

@nalimilan
Copy link
Member

I can do that, though that's not pretty in the context of that REPL test.

@yuyichao
Copy link
Contributor

It's as pretty as it can possibly be (much better than hacking a import path), much closer to what the REPL is actually used for (evaluating code in the REPL and not somewhere else), and is a fundamental property of the REPL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code modules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Precompilation fails with Iterators.jl dependency don't load packages into Main by default?
9 participants