-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Golang Compatibility #46
Comments
Hello @bbengfort! Thank you very much for giving me a lot of examples (I checked your Stack Overflow question), which provided important clues for me to locate the problem. Short answerOn little-endian machines, murmur3 in Go returns a 128-bit byte sequence that is incompatible with the original MurmurHash3 code and Python mmh3. Specifically, the To solve the issue, I wrote a Go gist here, whose input is the result of This is the first time I ever wrote a Go code, so feel free to modify it so that it suits your code style! Long answerThe original MurmurHash3 code written by Austin Appleby is endian-sensitive. The Let's assume the result was On little-endian machines, least significant bits (LSB) are written first, so their actual representations on memory is
This is what However,
which differs from the result in the original C++ and Python on little-endian platforms. My gist in the above is basically the little-endian version of Go murmur3 Edit: forgive me for "Short answer" is not that short haha |
Thank you so much for taking the time to respond to my question. Your Go snippet is well written; thank you very much! I will contact the authors of the Go Library to see their thoughts on updating their package with your suggestion. Thank you also for posting the temporary notice about compatibility -- I think that will be very helpful to other Go/Python users! I also wanted to add for future readers of this issue that the Python workaround similarly flips the low and high address: hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='big', signed=False)
hash = hash[8:] + hash[:8] They can use the Go snippet or the Python snippet depending on which direction they want their hash compatibility to go. |
Hello @hajimes thank you so much for providing this murmur3 implementation in Python and for all the work you do in open source; we really appreciate being able to use your library!
I was recently investigating some compatibility issues with the output produced by mmh3 and a Go library we were using internally. I asked the question on Stackoverflow and got a response back:
https://stackoverflow.com/questions/75921577/murmur3-hash-compatibility-between-go-and-python
It looks like the order of the two uint64s returned by the 128-bit algorithm is reversed between the two libraries; but it's simple enough to modify the returned results in either Go or Python to produce compatible hashes.
I was wondering; would you like me to open a PR to update the README with the compatibility information? Is there any other docs I should update in the PR?
Additionally, if there is any way to reverse the order order of the uint64s returned by murmur3 (e.g. with an argument to
hash128
orhash_bytes
) I'd be happy to open a PR for that as well. Let me know how you'd like to proceed!The text was updated successfully, but these errors were encountered: