Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(WIP) Experiment with type checking and fix bug #138

Closed
wants to merge 4 commits into from

Conversation

tovrstra
Copy link
Member

This fixes a small bug found when working on theochem/gbasis#78.

The main purpose of this PR is discuss how we can introduce more type checking into IOData. If we had type checking as implemented in this PR, it would have prevented the fixed bug, so it seems the right thing to do. We can apply the same technique to other parts of IOData, but I first want to make sure this will not cause problematic side effects. Such changes would more or less involve the following:

  • Replace NamedTuple classes by regular classes decorated with attr machinery.
  • Add validators to some fields to impose types or other requirements where it makes sense.
  • Possibly fix bugs detected by the validators.

This commit improves code re-use between different classes that
need the same type of validator.
FarnazH
FarnazH previously approved these changes Jan 30, 2020
Copy link
Member

@FarnazH FarnazH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attrs module was new for me, so I spent some time reading about it, and I think it's a great choice and worth giving a try.

iodata/test/test_basis.py Outdated Show resolved Hide resolved
iodata/attrutils.py Outdated Show resolved Hide resolved
@tovrstra
Copy link
Member Author

@FarnazH Fixes for your comments are included.

I've also experimented a little further and learned that attr only runs its converters and validators when constructing a new object. (See new utit tests.) Converters and validators do not get triggered when assigning attributes. This means it remains easy to create objects with invalid attributes: first create it and then modify it. 😦

To avoid this issue, the frozen=True option can be used:

@attr.s(slots=True, frozen=True)
class Wacko:
    """Just a silly class for testing convert_array_to."""

    flip: np.ndarray = attr.ib(converter=convert_array_to(float))

This would fail the following test on the last line with an attr.exceptions.FrozenInstanceError:

def test_convert_array_to_assign():
    wacko = Wacko(flip=None)
    assert wacko.flip is None
    wacko.flip = np.array([1.0, 3.0, -1.0])

That is a solution, but not a great one, because it also prohibits potentially legitimate code. That said, we had the same limitation with NamedTuple.

(I believe we ran into a similar issue before.)

@tovrstra
Copy link
Member Author

tovrstra commented Jan 31, 2020

The lack of assignment validation seems to be an often-discussed issue of attr. There are some alternatives or extensions to consider:

Alternatively, one can also write a __setattr__ method. The main catch seems to be the performance hit caused by the type checking (on the order of microseconds). To avoid performance issues, frozen=True can be used in combination with attr.evolve, see https://www.attrs.org/en/stable/api.html#attr.evolve

Even without frozen=True, something like attr.evolve is needed option to change two or more attributes who need to remain consistent. For example, assignment validation would make the following code fail, when attributes a and b are enforced to have the same shape

o = Example(a=np.array([1, 2]), b=np.array([3, 4]))
o.a = np.array([1, 2, 3])  # This line would fail because it makes a inconsistent with b.
o.b = np.array([4, 5, 6])

The following would still work (and only requires validation in the constructor):

o = Example(a=np.array([1, 2]), b=np.array([3, 4]))
o2 = attr.evolve(o, a=np.array([1, 2, 3]), b=np.array([4, 5, 6]))

@tovrstra
Copy link
Member Author

tovrstra commented Jul 8, 2020

After reading and comparing some alternatives, it seems that the only suitable alternative to Attrs is pyfields (and related libraries like autoclass and vtypes, see https://github.com/smarie). Pyfields is not as polished and popular and it seems to have a slightly larger performance hit compared to attrs. Still, pyfields does trivially support validation upon attribute assignment. A comparable feature is being proposed by the attrs developer, see python-attrs/attrs#645, which would also address the concerns above.

The following would be a good temporary solution:

Main advantages:

  • This can be implemented easily without adding or changing dependencies.
  • Negligible impact on performance.
  • No API change.
  • Users are guaranteed to get consistent data when loading from a file, e.g. the bug fixed in this PR would be detected without adding extra tests.
  • Dump functions can safely assume they receive valid IOData objects.
  • Users can still write pythonic code in which an IOData object is temporarily invalid. It only needs to be valid when interacting with the iodata library, which makes life easier for end users. (This "advantage" could also cause confusion though.)

Limitations:

  • The calls to attr.validate could be redundant and do not automatically validate attributes of attributes. For that, we'll need to write a helper function. This is not such a big deal because the overhead is negligible.

As soon as attr supports validation upon assignment, we can still decide to start using it. The main disadvantage will be that the last advantage in the list above disappears.

@tovrstra tovrstra marked this pull request as draft July 9, 2020 11:11
@tovrstra tovrstra changed the title Experiment with type checking and fix bug (WIP) Experiment with type checking and fix bug Jul 9, 2020
@tovrstra
Copy link
Member Author

tovrstra commented Jul 9, 2020

Superseded by #157.

@tovrstra tovrstra closed this Jul 9, 2020
tovrstra added a commit to tovrstra/iodata that referenced this pull request Sep 3, 2020
Fixes theochem#201

Related to theochem#138

This PR includes:
- Attribute validation (to large extent, not every detail)
- attrutil module to facilitate validation of array attributes
- Documentation of how attrs is used in IOData
- Bug fix in CP2K loader
- Minor fixes elsewhere
tovrstra added a commit to tovrstra/iodata that referenced this pull request Sep 3, 2020
Fixes theochem#201

Related to theochem#138

This PR includes:
- Attribute validation (to large extent, not every detail)
- attrutil module to facilitate validation of array attributes
- Documentation of how attrs is used in IOData
- Bug fix in CP2K loader
- Minor fixes elsewhere
tovrstra added a commit to tovrstra/iodata that referenced this pull request Sep 3, 2020
Fixes theochem#201

Related to theochem#138 and theochem#157 (which were earlier attempts)

This PR includes:
- Attribute validation (to large extent, not every detail)
- attrutil module to facilitate validation of array attributes
- Documentation of how attrs is used in IOData
- Bug fix in CP2K loader
- Minor fixes elsewhere
tovrstra added a commit to tovrstra/iodata that referenced this pull request Sep 3, 2020
Fixes theochem#201

Related to theochem#138 and theochem#157 (which were earlier attempts)

This PR includes:
- Attribute validation (to large extent, not every detail)
- attrutil module to facilitate validation of array attributes
- Documentation of how attrs is used in IOData
- Bug fix in CP2K loader
- Minor fixes elsewhere
tovrstra added a commit to tovrstra/iodata that referenced this pull request Sep 3, 2020
Fixes theochem#201

Related to theochem#138 and theochem#157 (which were earlier attempts)

This PR includes:
- Attribute validation (to large extent, not every detail)
- attrutil module to facilitate validation of array attributes
- Documentation of how attrs is used in IOData
- Bug fix in CP2K loader, related to theochem/gbasis#78
- Minor fixes elsewhere to satisfy attribute validators
@tovrstra tovrstra deleted the more_type_checking branch June 5, 2024 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants