Better __init__ infrastructure #184

Merged
timholy merged 4 commits into master from teh/init
Dec 3, 2014

Conversation

timholy
Member

@timholy timholy commented Dec 2, 2014

This is WIP to address #175. The current problem is that calling register_blosc inside of HDF5's __init__ causes a segfault with include("plain.jl") inside test/. Leaving it out (as is the current state in this PR), the ccall in h5p_set_blosc returns an error.

CC @stevengj

@coveralls

Coverage Status

Coverage remained the same when pulling f8d07da on teh/init into 545c44c on master.

@timholy
Member Author

timholy commented Dec 2, 2014

I should have said that most of this is targeted at getting back to the place where we can precompile HDF5. The error/segfault is something I get only when I put HDF5 in my userimg.jl. (The Travis failure has nothing to do with this PR; it's because the definition of SubArray has changed in julia 0.4.)

@stevengj
Member

stevengj commented Dec 2, 2014

So the segfault only occurs when building the userimg? I'm a little fuzzy on what restrictions apply to code running in that context...

@timholy
Member Author

timholy commented Dec 2, 2014

Sorry, I was kinda rushing to catch a meeting when I submitted this, and I was much less clear than I should have been.

No segfault when I build, but it does segfault when I run the tests IF I call register_blosc inside HDF5's __init__. If I don't call it, then I get this (selected lines copied directly from test/plain.jl):

julia> using HDF5  # "instantaneous" because I precompiled

julia> fn = joinpath(tempdir(),"test.h5")
"/tmp/test.h5"

julia> f = h5open(fn, "w")
HDF5 data file: /tmp/test.h5

julia> g = g_create(f, "mygroup")
HDF5 group: /mygroup (file: /tmp/test.h5)

julia> R = rand(1:20, 20, 40);

julia> g["CompressedA", "chunk", (5,6), "compress", 9] = R
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 140554979055488:
  #000: ../../../src/H5Pocpl.c line 753 in H5Pset_filter(): failed to call private function
    major: Property lists
    minor: Can't set value
  #001: ../../../src/H5Pocpl.c line 814 in H5P__set_filter(): failed to load dynamically loaded plugin
    major: Data filters
    minor: Unable to load metadata into cache
  #002: ../../../src/H5PL.c line 293 in H5PL_load(): search in paths failed
    major: Plugin for dynamically loaded library
    minor: Can't get value
  #003: ../../../src/H5PL.c line 397 in H5PL__find(): can't open directory
    major: Plugin for dynamically loaded library
    minor: Can't open directory or file
ERROR: Error setting blosc compression level
 in h5p_set_blosc at /home/tim/.julia/v0.4/HDF5/src/blosc_filter.jl:145
 in setindex! at /home/tim/.julia/v0.4/HDF5/src/plain.jl:822
 in setindex! at /home/tim/.julia/v0.4/HDF5/src/plain.jl:839

(That error is thrown only because of the 2nd commit in this PR.)

It has nothing to do with the group; you get the same thing with f["CompressedA", ...] = R.

@timholy
Member Author

timholy commented Dec 2, 2014

I should add that I have no real reason to think we have to call register_blosc; that was just my first thought in trying to fix the above error. If there's another way, so much the better.

For reference, here's what the segfault looks like under gdb:

julia> using HDF5

julia> fn = joinpath(tempdir(),"test.h5")
"/tmp/test.h5"

julia> f = h5open(fn, "w")
HDF5 data file: /tmp/test.h5

julia> f["Float64"] = 3.2
3.2

julia> f["Int16"] = int16(4)
4

julia> R = rand(1:20, 20, 40);

julia> f["CompressedA", "chunk", (5,6), "compress", 9] = R

Program received signal SIGSEGV, Segmentation fault.
0x00002b958adfe830 in ?? ()
(gdb) bt
#0  0x00002b958adfe830 in ?? ()
#1  0x00007fffe6f63029 in ?? () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#2  0x00007fffe6f6366c in H5Z_set_local () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#3  0x00007fffe6dc44c1 in H5D__create () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#4  0x00007fffe6dcb771 in ?? () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#5  0x00007fffe6e60df9 in H5O_obj_create () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#6  0x00007fffe6e51682 in ?? () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#7  0x00007fffe6e25f15 in ?? () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#8  0x00007fffe6e267c6 in H5G_traverse () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#9  0x00007fffe6e52d55 in H5L_link_object () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#10 0x00007fffe6dc3fec in H5D__create_named () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#11 0x00007fffe6dad666 in H5Dcreate2 () from /usr/lib/x86_64-linux-gnu/libhdf5.so
#12 0x00007ffff7e93464 in ?? ()
#13 0x000000000a000007 in ?? ()
#14 0x0000000009b59e70 in ?? ()
#15 0x0000000000fa74e0 in ?? ()
#16 0x000000000930b960 in ?? ()
#17 0x000000000000000c in ?? ()
#18 0x00007fffffffbea8 in ?? ()
#19 0x000000000868b270 in ?? ()
#20 0x0000000000000000 in ?? ()
(gdb)

@stevengj
Member

stevengj commented Dec 2, 2014

I don't quite understand the __init__ system (since it is almost completely undocumented at the moment: JuliaLang/julia#8923). Is the Blosc.jl module __init__ function called before the HDF5 __init__?

@timholy
Member Author

timholy commented Dec 2, 2014

Yes. (I double-checked module initialization order by adding a jl_(m->name) here.)

I also tried commenting out the atexit() function in Blosc's __init__, but it didn't seem to change much.

@stevengj
Member

stevengj commented Dec 2, 2014

I wonder if you have to initialize c_blosc_set_local and c_blosc_filter in __init__?

@timholy
Member Author

timholy commented Dec 3, 2014

That did the trick. Thanks!
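
For anyone landing here with the same symptom, the pattern behind that fix can be sketched roughly as follows. This is not the code from this PR: the module name, the callback, and the `Ref` holder are hypothetical stand-ins, and the sketch uses the `@cfunction` macro from current Julia rather than the `cfunction` function that existed in 2014. The idea is only that the C-callable pointer is created in `__init__`, at load time, instead of being stored at the top level of the module.

```julia
module FilterDemo

# Hypothetical callback standing in for a filter function such as blosc_set_local.
set_local_demo(x::Cint)::Cint = x + Cint(1)

# Placeholder that is filled in at load time. A raw pointer stored while the
# module is being compiled may no longer be valid once the module is reloaded
# from a cache or system image.
const c_set_local_demo = Ref{Ptr{Cvoid}}(C_NULL)

function __init__()
    # Runs each time the module is loaded, so the pointer belongs to this session.
    c_set_local_demo[] = @cfunction(set_local_demo, Cint, (Cint,))
end

end # module FilterDemo

# Call through the pointer the way a C library would.
fptr = FilterDemo.c_set_local_demo[]
result = ccall(fptr, Cint, (Cint,), Cint(41))
println(result)
```

Keeping the pointer in a `Ref` lets the binding stay `const` while its contents are refreshed on every load.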

@coveralls

Coverage Status

Coverage remained the same when pulling aa59f01 on teh/init into 545c44c on master.

@stevengj
Member

stevengj commented Dec 3, 2014

@vtjnash, why would we need to run cfunction in __init__?

@timholy
Member Author

timholy commented Dec 3, 2014

I presume it's because the pointer location of the function isn't reproducible across runs?

@stevengj
Member

stevengj commented Dec 3, 2014

The pointer location of an Array presumably isn't reproducible across runs either, and yet you can apparently initialize const arrays outside of __init__. The location of the compiled cfunction machine code is knowable at compile-time, so in principle it shouldn't have to be set in __init__.

(It would be nice to know what the rules are, here.)

@timholy
Member Author

timholy commented Dec 3, 2014

...and with that last commit, we have successful precompilation.

@timholy timholy changed the title from "WIP: better __init__ infrastructure" to "Better __init__ infrastructure" Dec 3, 2014
@timholy
Member Author

timholy commented Dec 3, 2014

Removed the WIP tag. Before merging I'll give anyone who wishes a chance to look it over. But I think it's good to go.

@coveralls

Coverage Status

Coverage increased (+0.06%) when pulling df75bf9 on teh/init into 4696b34 on master.

timholy added a commit that referenced this pull request Dec 3, 2014
Better __init__ infrastructure (closes #175)
@timholy timholy merged commit f6b5589 into master Dec 3, 2014
@timholy timholy deleted the teh/init branch December 3, 2014 13:04
@vtjnash
Contributor

vtjnash commented Dec 3, 2014

> @vtjnash, why would we need to run cfunction in __init__?

It's an implementation issue: currently cfunction is a runtime function that just returns a raw pointer.

> I don't quite understand the __init__ system (since it is almost completely undocumented at the

Indeed, I hadn't realized that. Julia calls __init__ once after closing the module, either after having reloaded it from cache/sysimg or after it has executed all the code inside the module. All __init__ methods are executed in the order in which their modules were encountered, except that submodules are initialized before their parent modules.
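
The ordering rule described above can be checked with a small experiment. This is a sketch in current Julia syntax; the module names `Outer` and `Inner` and the `init_log` array are made up for illustration. Each initializer appends its module's name to a shared log, so the log shows the order in which the `__init__` functions ran.

```julia
module Outer

# Shared log; defined first so both initializers can reach it.
const init_log = String[]

module Inner
# Record when this submodule's initializer runs.
__init__() = push!(parentmodule(@__MODULE__).init_log, "Inner")
end # module Inner

# Record when the enclosing module's initializer runs.
__init__() = push!(init_log, "Outer")

end # module Outer

println(Outer.init_log)
```

If submodules are initialized before their parents, as stated, the log should list `Inner` ahead of `Outer`.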

@stevengj
Copy link
Member

stevengj commented Dec 3, 2014

Thanks @vtjnash, I've posted a PR to document __init__ with this information.
