differences / compatibility with attrs project #60

chadrik · 2017-11-01T21:21:30Z

It would be helpful to have a list of functional differences between dataclasses and attrs, broken down by @dataclass vs @attr.s and field vs attr.ib.

This would be useful and illuminating for a few reasons:

It would make it easier to vet the logic behind, and need for, each of the proposed differences.

@hynek and @Tinche have invested years of thought into the current design: deviating from it without fully understanding the history and reasoning behind each decision might lead to this project needlessly repeating mistakes. I'm glad to see that the attrs devs have already been brought into several issues. My hope is we can get a bird's eye view so that nothing slips through the cracks.

If the differences aren't too great (and ideally they will not be, see above) I'd like to see a dataclass compatibility mode for attrs (e.g. from attrs import dataclass, field).

I'm glad that this badly-needed feature is being worked on, but sadly I'm stuck in python 2 for at least another 2 years, so it's important to me, and surely many attrs-users, to have an easy path to adoption once this becomes part of stdlib.


chadrik · 2017-11-02T06:34:42Z

First off, I found and read #19, which is a good read for anyone wondering whether attrs should be added to the stdlib (spoiler: it should not).

Here is my first attempt at an overview of the differences, starting with function arguments:

| `attr.attr` | `dataclasses.field` |
| --- | --- |
| `default` | `default` or `default_factory` |
| `validator` | not present |
| `repr` | `repr` |
| `cmp` | `cmp` |
| `hash` | `hash` |
| `init` | `init` |
| `convert` | not present |
| `metadata` | not present |
| `type` | not applicable (uses annotations) |

| `attr.attributes` | `dataclasses.dataclass` |
| --- | --- |
| `these` | not present |
| `repr_ns` | not applicable in python 3.x |
| `repr` | `repr` |
| `cmp` | `compare`, and/or `eq` |
| `hash` | `hash` |
| `init` | `init` |
| `slots` | not present |
| `frozen` | `frozen` |
| `str` | not present |

Notes / Observations:

- The absence of metadata and validator from dataclasses.field is concerning to me. These are pretty crucial to my use of attrs. I could see an argument for convert and validator being merged into a single entity, but I definitely would not want to see them both missing.
- Slots were covered in "Support __slots__? #28", and the consensus was "punt this down the road. If people want slots they can manually add __slots__ = ('x', 'y', 'z') to their class".
- cmp vs compare/eq was covered in "Implements #46: Specify eq separately from compare, for unorderable types. #48": compare=False, eq=True generates just __eq__ and __ne__ and is used for "unorderable types". I'm still a little hazy on why this is necessary.
- default_factory vs default was covered in "How to specify factory functions #24". dataclasses splits default_factory from default so that an arbitrary callable can be provided as a data factory, whereas attrs requires factories to be an attr.Factory instance.
- Gathering fields from annotations will soon be supported in attrs with "Add option to collect annotated fields python-attrs/attrs#262" via auto_attribs=True, which removes one of the remaining differences.
- At a surface level, attrs has almost a superset of the functionality, which gives me hope that a compatibility layer could be provided.
  - The only dataclasses feature missing from attrs is eq (covered above).

If anyone is aware of deeper functional differences, I'd love to hear them. Thanks!

edit1: added notes on eq
edit2: clarified default_factory difference
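
For later readers: in the `dataclasses` module as it shipped in Python 3.7, the decorator parameters ended up being spelled `eq` and `order` rather than the `compare`/`eq` pair discussed above. A minimal sketch of the "unorderable type" case under that released spelling (the `Tag` class is made up for illustration):

```python
from dataclasses import dataclass


# eq=True, order=False: generate __eq__ but no ordering methods.
@dataclass(eq=True, order=False)
class Tag:  # hypothetical example class
    name: str


a, b = Tag("x"), Tag("x")
equal = a == b  # field-by-field comparison from the generated __eq__

try:
    a < b  # no __lt__ was generated, so Python raises TypeError
    orderable = True
except TypeError:
    orderable = False
```

This only illustrates the released behaviour; it does not reflect the draft API at the time of the comment.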

ericvsmith · 2017-11-04T18:50:10Z

I think this is a useful exercise, thanks. I agree that it would be a shame to inadvertently miss something that's in attrs, especially if that locks us in to an API that we regret. I'll spend some time reviewing your table one-by-one, and comment as I go.

ericvsmith · 2017-11-04T18:51:33Z

As far as conversion functions and validators, I'd like to not support these. I'm hoping that static type checking gets us most of the way there.

ericvsmith · 2017-11-04T18:58:14Z

default / default_factory is mostly covered in issue #24. default is used to specify a default value, and default_factory is used to specify a callable that generates a default value. They need to be separate, because otherwise you'd have to do something like initial_value = default() if callable(default) else default, which precludes you from having a default value which is itself a callable. It's an error to specify both default and default_factory.
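
A small sketch of the distinction described above, using the released `dataclasses` API. The `Job` class and its field names are invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class Job:  # hypothetical example class
    retries: int = 3                          # plain default value
    tags: list = field(default_factory=list)  # factory is called once per instance
    callback: object = print                  # a callable used as a value; it is never called


j1, j2 = Job(), Job()
j1.tags.append("a")  # j2.tags is a separate list, so it stays empty

# Supplying both default and default_factory is rejected.
try:
    field(default=0, default_factory=int)
    both_allowed = True
except ValueError:
    both_allowed = False
```

Because `default` is never invoked, `callback` holds `print` itself, which is the case a single "call it if callable" parameter could not express.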

chadrik · 2017-11-05T21:44:21Z

> default / default_factory is mostly covered in issue #24.

Thanks, that conversation cleared it up for me. I updated my post above with the new info.

> As far as conversion functions and validators, I'd like to not support these. I'm hoping that static type checking gets us most of the way there.

I don't think that static type checking has much impact on the need for converters. Take something like this for instance:

import attr

@attr.s
class C:
    x: int = attr.ib(default=0, converter=int)
    y: int = attr.ib(default=0, converter=int)

c = C('1', 1.1)

This pattern is very common. A hypothetical mypy plugin for attrs or dataclasses could make C('1', 1.1) valid by using the converter's argument type for __init__ if present.

Without converters, the best we can do is this:

from dataclasses import dataclass

@dataclass
class C:
    x: int = 0
    y: int = 0

c = C(int('1'), int(1.1))

Static type checking doesn't really have much to offer here in terms of ease of use: the best it can do is nag us to cast everything to int. That does not alleviate the inconvenience of having to do that throughout your codebase, whereas a converter defined on the field does. Moreover, conversions cannot be accomplished post-init, because the converter's type needs to be understood by the static type-check plugin. Bottom line: converters are a convenience without a valid workaround, and their absence will be frustrating to users.

As for validators, static type checking gets us part of the way there, but certainly not most of the way there. Here are some example validations:

x in y
x in range(y, z)
re.match(y, x)
len(x) < y
isinstance(x, Y)

All of these require runtime validation except the last. That said, validation can be performed in post_init, so unlike converters, at least there is a workaround.
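
A rough sketch of that workaround, assuming the hook name `__post_init__` that the released `dataclasses` module uses. The `User` class and its rules are hypothetical:

```python
import re
from dataclasses import dataclass


@dataclass
class User:  # hypothetical example class
    name: str
    age: int

    def __post_init__(self):
        # Runtime checks that a static type checker cannot express.
        if not re.match(r"^[a-z]+$", self.name):
            raise ValueError(f"invalid name: {self.name!r}")
        if self.age not in range(0, 150):
            raise ValueError(f"age out of range: {self.age!r}")


user = User("bob", 30)
```

Note that this runs only at construction time; later assignments to `user.age` are not re-validated.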

Is there an argument against adding metadata? It's hard to overstate how important this one is. It's a catchall for anything and everything that dataclasses cannot or should not have first class support for. In other words, it is the foundation for third-party utilities built up around dataclasses, for things such as UI presentation, database ORMs, serialization, and yes, even validation.

ilevkivskyi · 2017-11-05T22:05:29Z

I think the fact that static type checkers prohibit something like:

class C:
    x: int = ...
    y: int = ...

c = C('1', 1.1)

is rather good, not bad. What are the use cases for converters (apart from being temporary workarounds themselves)? As for validators, they can be added to __dataclass_post_init__ (I hope we will find a better name). Moreover, the latter can perform cross-field validation, so I agree with @ericvsmith here, we probably don't need validators and converters.

As for metadata, I don't have a strong opinion, but could imagine that it is indeed useful.

ericvsmith · 2017-12-01T15:44:42Z

metadata has been added.
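
For reference, a minimal sketch of how the `metadata` parameter can be used in the released `dataclasses` module. The `Product` class and the metadata keys are invented; dataclasses itself attaches no meaning to them:

```python
from dataclasses import dataclass, field, fields


@dataclass
class Product:  # hypothetical example class
    name: str = field(metadata={"label": "Product name"})
    price: float = field(default=0.0, metadata={"unit": "USD"})


# Third-party code can read the metadata back through fields().
# Each Field.metadata is a read-only mapping, copied to a dict here.
labels = {f.name: dict(f.metadata) for f in fields(Product)}
```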

@chadrik: where do you propose this documentation should go? Or is this just an exercise for the design phase, which I think has ended? It's not appropriate for this to go in the stdlib documentation.

chadrik · 2017-12-02T17:36:28Z

I think that attrs users are most definitely going to want this information once this project makes it into the stdlib. How about adding it to the wiki for now? I’ll gladly keep it up to date. I also want to use it to lobby for certain changes to attrs to increase compatibility (e.g. order vs cmp behavior).


ericvsmith · 2017-12-02T18:15:28Z

Either the Wiki (which I have no access to) or maybe under attrs' documentation (ditto).

And note that you can use dataclasses today, from PyPI, on 3.6. So let the lobbying begin, once the PEP is accepted.

ericvsmith · 2017-12-02T18:33:08Z

Also, note that attrs' these parameter is roughly equivalent to the dataclasses.make_dataclass() function. So I think the only real differences in your table are __slots__, validator, and convert. I deliberately don't want to support validation and conversion, instead leaving that to static type checkers (see #60 (comment) above).
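
As a quick illustration of `dataclasses.make_dataclass()`, here is a sketch that builds a class from an explicit field list instead of annotations (the `Point` name and fields are just an example):

```python
from dataclasses import field, fields, make_dataclass

# Each entry is (name, type) or (name, type, Field).
Point = make_dataclass(
    "Point",
    [("x", int), ("y", int, field(default=0))],
)

p = Point(1)
field_names = [f.name for f in fields(Point)]
```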

As for __slots__, that's a deliberate decision. Although I have another decorator which I'm not including in the PEP that adds __slots__ and returns a new class. See add_slots() in dataclass_tools.py in this repo. Because it's the only parameter that causes dataclass() to return a new class, I thought it was best to leave it out, at least for now. I'd like to make sure dataclass() is seen as something that just adds methods to a class, not returns a new class. Maybe that will change over time.
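
To show why a slots decorator has to return a new class, here is a rough sketch of the general technique. It is not the `add_slots()` from `dataclass_tools.py`; the `with_slots` helper below is a hypothetical stand-in written for this illustration:

```python
from dataclasses import dataclass, fields


def with_slots(cls):
    """Hypothetical helper: rebuild a dataclass with __slots__ set."""
    if "__slots__" in cls.__dict__:
        raise TypeError(f"{cls.__name__} already defines __slots__")
    namespace = dict(cls.__dict__)
    names = tuple(f.name for f in fields(cls))
    namespace["__slots__"] = names
    for name in names:
        # Class-level defaults would conflict with the slot descriptors.
        namespace.pop(name, None)
    # These descriptors belong to the old class layout.
    namespace.pop("__dict__", None)
    namespace.pop("__weakref__", None)
    # __slots__ only takes effect at class creation, so build a new class.
    new_cls = type(cls)(cls.__name__, cls.__bases__, namespace)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int = 0


SlottedPoint = with_slots(Point)
p = SlottedPoint(1)
```

`SlottedPoint` is a different object from `Point`, which is the behaviour that sets this apart from the other `dataclass()` parameters. A sketch like this also breaks methods that rely on zero-argument `super()`, since they still refer to the original class.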

Tinche · 2017-12-02T20:16:57Z

I think that the "return a new class" approach is fundamentally incompatible with metaclasses and especially PEP 487. Since there is no way to add slots to an existing class, I'm considering a different API for slot classes in attrs too. Or, you know, Python could grow a better __slots__ interface itself, but I'm not holding my breath.

gvanrossum · 2017-12-02T20:47:45Z

Actually we should design a new slots interface. The original was designed before we had class decorators.


Tinche · 2017-12-02T20:59:45Z

> Actually we should design a new slots interface. The original was designed before we had class decorators.

Yes please!

gvanrossum · 2017-12-02T23:08:12Z

That won't be easy though -- it means that the instance layout has to be made changeable after the class object has been created (which happens when the metaclass creates it -- before the class decorator runs). Maybe there are some folks on python-ideas interested in brainstorming on how to do this.

chadrik · 2017-12-05T06:25:54Z

One last effort on this topic:

> I think the fact that static type checkers prohibit something like:
>
>     class C:
>         x: int = ...
>         y: int = ...
>
>     c = C('1', 1.1)
>
> is rather good, not bad.

What if say, over half of the uses of C required converting a variable to int, and what if that conversion was not as simple as calling a builtin but also required an import from some other module? This doesn't seem like a question of correctness to me, but rather one of convenience. Very many classes in the real world perform some conversion of arguments within their __init__ methods, and unlike validators I don't see a good alternative for those who don't want to perform conversions all over their code instead of in one place. There's the possibility of casting and re-binding the attributes in __post_init__, but that would break static type-checking: for that to work the mypy plugin needs to integrate converter annotations into the __init__ annotations, which means dataclasses needs first class support for converters.

ilevkivskyi · 2017-12-05T14:22:14Z

@chadrik

> What if say, over half of the uses of C required converting a variable to int

I think such situations are relatively rare (like legacy API or similar). And IIUC this use case is covered by a combination of InitVar and __post_init__:

from dataclasses import InitVar, dataclass, field

@dataclass
class C:
    a: str
    b: str = field(init=False)
    _b: InitVar[bytes]
    def __post_init__(self, _b) -> None:
        self.b = convert_from_legacy_api(_b)

aa: str = 'a test'
bb: bytes = b'b test'

c = C(aa, bb)  # OK

And this will work well with static type checkers.

ilevkivskyi · 2017-12-05T16:40:49Z

> (I think you started with a/b/_b and then continued with x/y/_y?)

Indeed :-) Fixed!

ericvsmith · 2018-05-18T11:18:00Z

I think there's nothing else to add here. Closing this issue.

EhsanKia · 2021-06-18T05:25:33Z

I honestly don't see how the dummy InitVar + extra var + post_init is a Pythonic replacement for the simple and clean converter. And it's also, as far as I can tell, not a solution for frozen dataclasses.

Take this very simple and common dataclass:

@dataclasses.dataclass(frozen=True)
class Group:
    names: Sequence[str]

How do you ensure names is not mutable itself? Normally, a simple converter=tuple would do the job, but now, you have to do all sorts of hacks and object.__setattr__ and so on. None of it is pythonic, clean or user-friendly.
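
For concreteness, this is roughly what the `object.__setattr__` workaround mentioned above looks like. It is a sketch of one common pattern, not an endorsed replacement for a converter:

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Group:
    names: Sequence[str]

    def __post_init__(self):
        # frozen=True makes normal assignment raise, so the conversion
        # has to bypass the generated __setattr__.
        object.__setattr__(self, "names", tuple(self.names))


group = Group(["ann", "bob"])
```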

gvanrossum · 2021-06-18T14:44:36Z

It’s unpythonic to expect “deep” frozen-ness. A frozen object disallows attribute assignment but doesn’t care about modifying attribute values.
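
A short sketch of what shallow frozenness means in practice (the `Group` class here deliberately stores a list to make the point):

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Group:
    names: list


g = Group(["a"])
g.names.append("b")  # allowed: the list itself is still mutable

try:
    g.names = []  # rebinding the attribute is what frozen=True blocks
    rebound = True
except FrozenInstanceError:
    rebound = False
```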

ericvsmith closed this as completed May 18, 2018

kaapstorm mentioned this issue Nov 22, 2019

Functional refactoring of ValueSource dimagi/commcare-hq#25961

Closed

gegnew mentioned this issue Oct 13, 2020

Draft Release 0.1.0 cellengine/cellengine-python-toolkit#56

Merged
