Add a more idiomatic way to define hashing #202
We should definitely do something to make the hashing behavior clearer. I'm currently working on a PR to "fix" slots hashing behavior, and The docs say nothing about what should happen when My personal understanding would be for attrs to do exactly nothing, we're saying do not touch Assume a very simple class, with hashing disabled:
The equivalent of this class is: https://gist.github.com/Tinche/d8c1dadfaa3aa4618b736a61e752b3e1 Try pasting them both. The attrs one will be hashable, due to falling back to This is because in the normal class, the interpreter will see the This is the root cause of why our slots behavior is different. |
What functionalities are we aiming for exactly? I'm guessing:
(edit) Also:
|
I don't really understand the original post, but regarding the cases that need to be supported: I think it helps narrow things down to remember that for any given So if attrs is generating its own And if attrs isn't generating its own |
That said, I can see the case for having nicer shorthands for controlling which attribs go into the |
Actually, you specifically asked for being able to fall back to hashing by ID – that’s why it’s in there. :) And I think that’s a great and useful feature, so I’m not trying to push the blame in your direction – just reminding. :) The point of this ticket isn’t really to invent new behaviors, it’s just making the existing ones more understandable. Especially because one of them happened by accident (see also #205). But maybe…how about this:

    class Hashing(Enum):
        SMART = "smart"  # None → new default
        UNHASHABLE = "unhashable"  # same as None for unfrozen
        BY_ID = "by_id"  # False
        BY_ATTRS = "opt_out"  # True
        BY_ATTRS_OPT_IN = "opt_in"  # True, but the default on attr.ibs is False
|
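For anyone who wants to play with the proposal: below is a rough sketch of how the legacy `True`/`False`/`None` values could be translated to the enum during a deprecation period. The mapping follows the comments in the enum above. The helper name `normalize_hash_arg` is hypothetical and not part of attrs, and `UNHASHABLE` and `BY_ATTRS_OPT_IN` have no legacy spelling here, so they can only be passed as enum members.

```python
from enum import Enum


class Hashing(Enum):
    SMART = "smart"                # legacy: None
    UNHASHABLE = "unhashable"
    BY_ID = "by_id"                # legacy: False
    BY_ATTRS = "opt_out"           # legacy: True
    BY_ATTRS_OPT_IN = "opt_in"


def normalize_hash_arg(value):
    """Hypothetical helper: accept an enum member or a legacy bool/None."""
    if isinstance(value, Hashing):
        return value
    # Identity checks on purpose: 0 == False and 1 == True in Python,
    # and we do not want to silently accept integers.
    if value is None:
        return Hashing.SMART
    if value is False:
        return Hashing.BY_ID
    if value is True:
        return Hashing.BY_ATTRS
    raise ValueError("expected a Hashing member, True, False, or None; got %r" % (value,))


print(normalize_hash_arg(None))
print(normalize_hash_arg(Hashing.UNHASHABLE))
```

A real implementation would presumably also emit a `DeprecationWarning` for the legacy values; that part is left out to keep the sketch short.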
Falling back to hashing by id is definitely useful! If, and only if we are also falling back to doing |
Yeah so the fallback kind of happened by accident (cf. Tin’s explanations) but this time we have the chance to fix it properly including the deprecation year if we switch to enums. |
The thing where Beyond that I'm mostly just kinda lost about what all these enums are supposed to be or what the overall goal is here. I think I showed up in the middle of the conversation :-) |
So I guess we have to break bw-compat again and set |
Woah there, Tex. Ease your finger up off that trigger :-). In this case I'm not convinced a breaking change is necessary, because there's no alternate behavior to suggest: a warning would be sufficient. The user has explicitly asked for nonsense, and if the nonsense appears to work today then it's fine to deprecate it for a while before raising an error. |
Well the problem (which Tin explained elsewhere, I'm gonna add here for posterity) is: normal Python class behavior is setting dunder hash to None if there's a dunder eq. If not, it inherits object's by-id semantics. Now our eq is added after Python makes this decision (except when using slots), so object's hash remains. If slots is true it behaves differently because we build the class differently and it becomes unhashable. IOW
Dunno. :| |
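To make the explanation above concrete, here is a minimal sketch for Python 3 using only plain classes (no attrs involved). The names `InBody`, `Patched`, `Rebuilt`, and `is_hashable` are made up for illustration, and rebuilding through `type()` is only a stand-in for however attrs actually constructs slots classes.

```python
def _init(self, x):
    self.x = x


def _eq(self, other):
    if other.__class__ is not self.__class__:
        return NotImplemented
    return self.x == other.x


# Case 1: __eq__ is in the class body, so Python 3 sets __hash__ to None
# when the class is created.
class InBody:
    __init__ = _init
    __eq__ = _eq


# Case 2: __eq__ is attached after the class exists. The decision about
# __hash__ was already made, so object's by-id hash stays in place.
class Patched:
    __init__ = _init


Patched.__eq__ = _eq

# Case 3: a fresh class is built with __eq__ already in its namespace,
# so it ends up unhashable just like case 1.
Rebuilt = type("Rebuilt", (object,), {"__init__": _init, "__eq__": _eq})


def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True


for cls in (InBody, Patched, Rebuilt):
    print(cls.__name__, "hashable:", is_hashable(cls(1)))
```

Note that in case 2 two instances can compare equal while hashing by identity, which is exactly the inconsistency being discussed.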
Oh, that's deeply unfortunate. I was wondering if the modifying-the-class-after-the-fact thing would come to bite us at some point; looks like it has. Perhaps we should be throwing away the first type and replacing it with a fresh one? (Although that's probably some even worse / more horrible compatibility break than this...) |
Focusing first on getting the exact right behavior with the new, good enum would be the right way to prioritize this. If we need another compat break (and gosh I hope not) then that's a separate thing. |
JFTR, I started to work on this and ran into another “interesting” fact: the behavior Nathaniel described is specific to Python 3. Python 2 will happily hash by ID if there's a |
The current way of using True/False/None is confusing even to me.
I propose we use an Enum (probably just a plain class on legacy Python) a la:
The last two names are terrible of course.
I probably could be talked into depending on https://pypi.org/project/enum34/ on legacy Python.
Opinions?
P.S. Now while typing this out I realized that this could solve a long-standing complaint both I and many users have: assuming you want an attrs-generated method based on a small subset of attr.ibs, you have to write a lot of hash|init|repr=False. Ditching bools for enums would allow for that. 🤔 But let’s talk about that in a different ticket.