
deserialization error: Domains not defined #281

Closed
vtjnash opened this issue Dec 4, 2011 · 6 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@vtjnash
Member

vtjnash commented Dec 4, 2011

This text is repeatedly printed to the console after calling remote_call_fetch:
"deserialization error: Domains not defined"

I don't know exactly where this error comes from since it appears slowly and randomly after calling remote_call_fetch. Domains is a type that gets defined using @bcast load() on all nodes.

Where does this error get thrown, and how would I catch it?

Strangely, it appears to work anyway, but I don't know what might have been lost, which matters in situations where I could recover and repeat the action. Eventually Julia also starts to throw more errors (could not connect to 18.68.147.198:9010, errno=113 and could not connect to 18.68.147.215:9009, errno=110) after several minutes, and after that I start seeing DisconnectException(); it crashes if I attempt to do anything parallel from then on. The commands that are still running are able to finish, though, so I typically still get results. (I don't keep track of what fraction of the results I eventually get.)

Perhaps this is related to the other random failure I have seen where after calling @bcast load(), the contents of the file will still not be defined on the local system?

@JeffBezanson
Member

The connection errors are interesting; we seem to have

julia> strerror(110)
"Connection timed out"

julia> strerror(113)
"No route to host"

so maybe some of the needed connections are not happening.

Can this be reproduced with no network, say 2 local processes?

If you send me the code that causes this I can look deeper into it.

@vtjnash
Member Author

vtjnash commented Dec 5, 2011

I doubt I could reproduce it locally; it seems to occur only with specific nodes, and only after some sort of timeout period has passed. I've stopped using some of the nodes, and those connection issues seem to have ended.

I'm still getting the "deserialization error: Domains not defined" errors printed to the console. You can get my code at https://manwe.mit.edu/rhodecode/jqueens/files/7/jqueens.j. However, it is probably not at all clear how to use it; I most likely need to create a reduced test case for this.

@StefanKarpinski
Member

Is that part of a parallel chess search program?

@JeffBezanson
Member

Just a quick note: you shouldn't use sleep since it will block the whole process. Instead there should be a RemoteRef you can wait on.
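[Editor's note: a minimal sketch of the wait-on-a-remote-result pattern Jeff describes, written against the modern `Distributed` stdlib, where the type this thread calls `RemoteRef` became `Future`. The names here postdate this 2011 thread.]

```julia
using Distributed
addprocs(1)  # one local worker process

# remotecall returns a Future immediately; the work runs on the worker.
f = remotecall(sum, workers()[1], 1:1_000_000)

wait(f)           # blocks only this task, not the whole process (unlike sleep)
result = fetch(f) # retrieve the computed value
```
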

@vtjnash
Member Author

vtjnash commented Dec 7, 2011

Can I wait simultaneously on any RemoteRef result? I only want results from the first one to finish, and then to kill the rest.
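[Editor's note: this API did not exist in 2011, but in modern Julia one way to take whichever result finishes first is to race the `fetch` calls through a `Channel`; a sketch under that assumption:]

```julia
using Distributed
addprocs(2)

# Start the same job on every worker; each Future completes independently.
# The dummy job just sleeps a random amount and reports its worker id.
futures = [remotecall(() -> (sleep(rand()); myid()), p) for p in workers()]

# Funnel finished results into a Channel and take the first arrival.
results = Channel{Int}(length(futures))
for f in futures
    @async put!(results, fetch(f))
end
first_done = take!(results)
```

Cancelling the losers is harder: `Distributed` has a best-effort `interrupt(pids)`, but there is no clean way to kill an in-flight remote computation.
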

@JeffBezanson
Member

I'll consider this a general fault tolerance issue (#217).

StefanKarpinski pushed a commit that referenced this issue Feb 8, 2018
* Add @__DIR__ macro.

* Correct version number.
KristofferC pushed a commit that referenced this issue May 9, 2018
* Download archives by tree hash instead of tag

Since we don't do any validation of the archives, downloading based on tag presents a potential security hole whereby a compromised repository retags a version.

This should fix that by downloading the archive for the tree directly. Note that the documentation (https://developer.github.com/v3/repos/contents/#get-archive-link) says that it should be a valid git reference, but using hashes seems to work as well.

* add note to method
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Oct 11, 2021

3 participants