Fix and accelerate generated __hash__ methods #296

Merged 6 commits on Nov 30, 2017
Changes from 5 commits
4 changes: 4 additions & 0 deletions changelog.d/261.change.rst
@@ -0,0 +1,4 @@
Generated ``__hash__`` methods now hash the class type along with the attribute values.
Until now, two instances of different classes with the same attribute values had identical hashes, which was a bug.

The generated method is also *much* faster now.
4 changes: 4 additions & 0 deletions changelog.d/295.change.rst
@@ -0,0 +1,4 @@
Generated ``__hash__`` methods now hash the class type along with the attribute values.
Until now, two instances of different classes with the same attribute values had identical hashes, which was a bug.

The generated method is also *much* faster now.
4 changes: 4 additions & 0 deletions changelog.d/296.change.rst
@@ -0,0 +1,4 @@
Generated ``__hash__`` methods now hash the class type along with the attribute values.
Until now, two instances of different classes with the same attribute values had identical hashes, which was a bug.

The generated method is also *much* faster now.
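The changelog claim can be illustrated with a minimal hand-written sketch (plain Python, not the attrs-generated code; the class and attribute names here are made up): mixing a per-class constant into the hashed tuple makes equal-valued instances of different classes hash differently.

```python
# Sketch of the fix: hash a per-class "type hash" constant together
# with the attribute values, instead of the values alone.

class Point:
    _type_hash = hash("<attrs generated hash Point>")

    def __init__(self, x):
        self.x = x

    def __hash__(self):
        # The old, buggy behavior was effectively: hash((self.x,))
        return hash((self._type_hash, self.x))


class Vector:
    _type_hash = hash("<attrs generated hash Vector>")

    def __init__(self, x):
        self.x = x

    def __hash__(self):
        return hash((self._type_hash, self.x))
```

With the old value-only scheme, `Point(1)` and `Vector(1)` collided by construction; with the type hash mixed in, they almost certainly do not.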
45 changes: 34 additions & 11 deletions src/attr/_make.py
@@ -709,13 +709,37 @@ def _make_hash(attrs):
         if a.hash is True or (a.hash is None and a.cmp is True)
     )
 
-    def hash_(self):
-        """
-        Automatically created by attrs.
-        """
-        return hash(_attrs_to_tuple(self, attrs))
+    # We cache the generated hash methods for the same kinds of attributes.
+    sha1 = hashlib.sha1()
+    sha1.update(repr(attrs).encode("utf-8"))
+    unique_filename = "<attrs generated hash %s>" % (sha1.hexdigest(),)
+    type_hash = hash(unique_filename)
+    lines = [
+        "def __hash__(self):",
+        "    return hash((",
+        "        %d," % (type_hash,),
+    ]
+    for a in attrs:
+        lines.append("        self.%s," % (a.name,))
+
+    lines.append("    ))")
+
+    script = "\n".join(lines)
+    globs = {}
+    locs = {}
+    bytecode = compile(script, unique_filename, "exec")
+    eval(bytecode, globs, locs)
+
+    # In order for debuggers like PDB to be able to step through the code,
+    # we add a fake linecache entry.
+    linecache.cache[unique_filename] = (
+        len(script),
+        None,
+        script.splitlines(True),
+        unique_filename,
+    )
 
-    return hash_
+    return locs["__hash__"]
 
 
 def _add_hash(cls, attrs):
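The compile/eval pattern in the hunk above can be exercised standalone. The helper below is a simplified sketch, not attrs internals (`make_hash_for`, the filename, and the `Demo` class are invented for illustration): build the source of `__hash__` as a string, compile it, and pull the function object out of the locals dict.

```python
import linecache


def make_hash_for(attr_names, unique_filename="<generated hash demo>"):
    # Same pattern as the generated __hash__ in the diff above:
    # assemble source lines, compile, eval, extract from locs.
    type_hash = hash(unique_filename)
    lines = [
        "def __hash__(self):",
        "    return hash((",
        "        %d," % (type_hash,),
    ]
    for name in attr_names:
        lines.append("        self.%s," % (name,))
    lines.append("    ))")

    script = "\n".join(lines)
    globs, locs = {}, {}
    eval(compile(script, unique_filename, "exec"), globs, locs)

    # Fake linecache entry so debuggers and tracebacks can display the
    # generated source, which never existed as a real file.
    linecache.cache[unique_filename] = (
        len(script), None, script.splitlines(True), unique_filename,
    )
    return locs["__hash__"]


class Demo:
    def __init__(self, x, y):
        self.x, self.y = x, y


Demo.__hash__ = make_hash_for(["x", "y"])
```

Because the filename embeds a digest of the attribute definitions, each distinct shape of class gets its own `type_hash`, which is what fixes the cross-class collision.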
@@ -866,7 +890,7 @@ def _make_init(attrs, post_init, frozen):
         sha1.hexdigest()
     )
 
-    script, globs = _attrs_to_script(
+    script, globs = _attrs_to_init_script(
         attrs,
         frozen,
         post_init,
@@ -883,18 +907,17 @@ def _make_init(attrs, post_init, frozen):
     # immutability.
     globs["_cached_setattr"] = _obj_setattr
     eval(bytecode, globs, locs)
-    init = locs["__init__"]
 
     # In order for debuggers like PDB to be able to step through the code,
     # we add a fake linecache entry.
     linecache.cache[unique_filename] = (
         len(script),
         None,
         script.splitlines(True),
-        unique_filename
+        unique_filename,
     )
 
-    return init
+    return locs["__init__"]
 
 
 def _add_init(cls, frozen):
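The fake linecache entry is what lets PDB and the `traceback` module show source for code compiled from a string. A minimal standalone sketch (the filename and function below are made up):

```python
import linecache

filename = "<linecache demo>"
script = "def answer():\n    return 42\n"

# Entry format: (size, mtime, list_of_lines, fullname). An mtime of
# None marks the entry so linecache.checkcache() never tries to
# re-read it from the filesystem.
linecache.cache[filename] = (
    len(script), None, script.splitlines(True), filename,
)

# Debuggers and tracebacks fetch source lines via getline().
line2 = linecache.getline(filename, 2)
```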
@@ -954,7 +977,7 @@ def validate(inst):
             v(inst, a, getattr(inst, a.name))
 
 
-def _attrs_to_script(attrs, frozen, post_init):
+def _attrs_to_init_script(attrs, frozen, post_init):
     """
     Return a script of an initializer for *attrs* and a dict of globals.
 