Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify hard to maintain copyright notice #1441

Merged
merged 21 commits into from
Mar 12, 2022

Conversation

Pierre-Sassoulas
Copy link
Member

Description

git is the source of truth for the copyright, copyrite (the tool) was taking exponentially longer with each release, and it's
polluting the code with sometime as much as 50 lines of names. I replaced it with a single line linking to the contributors shown by github.

Also added a pre-commit hook to check and fix the copyright notice, and fixed the existing files so they have a notice

Type of Changes

Type
✨ New feature
🔨 Refactoring
📜 Docs

@Pierre-Sassoulas Pierre-Sassoulas added the Maintenance Discussion or action around maintaining astroid or the dev workflow label Mar 3, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.11.0 milestone Mar 3, 2022
@Pierre-Sassoulas Pierre-Sassoulas force-pushed the better-handling-of-copyright-notice branch 2 times, most recently from 25639a5 to d1facf2 Compare March 3, 2022 18:34
Copy link
Member Author

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some comments so it's easier to see what need to be reviewed with more care

tbump.toml Outdated Show resolved Hide resolved
doc/release.md Show resolved Hide resolved
.pre-commit-config.yaml Outdated Show resolved Hide resolved
@cdce8p cdce8p self-requested a review March 3, 2022 22:45
@DanielNoord DanielNoord self-requested a review March 4, 2022 07:52
@DanielNoord
Copy link
Collaborator

@Pierre-Sassoulas Could you fix the pre-commit?

@Pierre-Sassoulas Pierre-Sassoulas force-pushed the better-handling-of-copyright-notice branch from d1facf2 to 57d6a85 Compare March 4, 2022 08:20
@Pierre-Sassoulas Pierre-Sassoulas added the Blocked 🚧 A PR or issue blocked by another PR or issue label Mar 4, 2022
@Pierre-Sassoulas
Copy link
Member Author

Blocked because we need to merge #1443 first.

Pierre-Sassoulas added a commit that referenced this pull request Mar 4, 2022
Some file were formatted for Windows, which was a problem for #1441
@Pierre-Sassoulas Pierre-Sassoulas force-pushed the better-handling-of-copyright-notice branch from 57d6a85 to e5e1f02 Compare March 4, 2022 08:57
@Pierre-Sassoulas Pierre-Sassoulas removed the Blocked 🚧 A PR or issue blocked by another PR or issue label Mar 4, 2022
astroid/_ast.py Outdated Show resolved Hide resolved
Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to remove the list of contributors from each file. Just not sure a link to Github is all that useful. The contributor graph probably doesn't show everyone.

In pylint, we have a CONTRIBUTORS file. I think that would make sense a bit more sense. Tbh I don't know how much work it would be to gather all individual contributors and generate that file once. After that we could ask new ones to add their names to it themselves.

In contrast to pylint, I do however think that only name / username + email (optional) should be stored. Not necessary to mention individual contributions there.

If we do decide to go that route, we should make sure to add the filename to license_files in setup.cfg so it can be included in the distribution package.

@Pierre-Sassoulas
Copy link
Member Author

Pierre-Sassoulas commented Mar 5, 2022

The contributor graph probably doesn't show everyone.

True. I just checked. That's a problem.

Tbh I don't know how much work it would be to gather all individual contributors and generate that file once.

Not much, see git shortlog --summary --numbered --email. We can use the copyrite config for aliases and we're good to go. I'll do that.

@Pierre-Sassoulas
Copy link
Member Author

Not much

I can't believe how much in underestimated the number of person who did 2 commits total and managed to still use a different email each time 😄

@Pierre-Sassoulas Pierre-Sassoulas force-pushed the better-handling-of-copyright-notice branch from e1cdd37 to 253852e Compare March 5, 2022 10:22
.pre-commit-config.yaml Show resolved Hide resolved
astroid/nodes/const.py Show resolved Hide resolved
script/.copyrigth_aliases.json Outdated Show resolved Hide resolved
script/.copyrigth_aliases.json Outdated Show resolved Hide resolved
script/create_contributor_list.py Outdated Show resolved Hide resolved
script/create_contributor_list.py Outdated Show resolved Hide resolved
script/create_contributor_list.py Outdated Show resolved Hide resolved
script/create_contributor_list.py Outdated Show resolved Hide resolved
script/create_contributor_list.py Outdated Show resolved Hide resolved
@Pierre-Sassoulas Pierre-Sassoulas force-pushed the better-handling-of-copyright-notice branch 3 times, most recently from aec3e5c to e8652b6 Compare March 5, 2022 19:13
Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add CONTRIBUTORS.txt to setup.cfg so it's included in the distribution package.
This is how pylint does it. pylint -> setup.cfg

.pre-commit-config.yaml Outdated Show resolved Hide resolved
CONTRIBUTORS.txt Outdated Show resolved Hide resolved
copyright.txt Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
Comment on lines 1 to +3
# Licensed under the LGPL: https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html
# For details: https://github.com/PyCQA/astroid/blob/main/LICENSE
# Copyright (c) https://github.com/PyCQA/astroid/blob/main/CONTRIBUTORS.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest we remove the license header from all test files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It send the message that the test files are not copyrighted, no ? And they were originally. How about astroid.* and tests* but not custom scripts ?

Copy link
Member

@cdce8p cdce8p Mar 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wonder what the benefit of having it in test files would be. IMO it's more or less useless. Those files aren't part of the distribution package and mostly boilerplate. In the end, there is still the LICENSE file which applies to all files, regardless if a file has a license header or not.

Standard disclaimer: I'm not a lawyer, this is not legal advice.

Copy link
Member Author

@Pierre-Sassoulas Pierre-Sassoulas Mar 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also not a lawyer 😄 To be frank I also wondered what is the use of having it in any files at all. Apparently the need to do that was removed in march 1989 when the US joined the Berne Convention (before that anything not explicitly copyrighted could be stolen !). When pylint was created the remnant of the practice of putting copyright everywhere were still very much alive, probably the reason why it's done that way. That being said, I'd like to be pretty conservative and careful here as I'd prefer to be safe than sorry. (Fun fact an actual law person at my job told us developer to add a disclaimer in every files of internal unpublished code in 2021. Guess no lawyers ever got sacked for asking to put copyright and trademark everywhere)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't add too much usually 😄 Some projects include the whole license text in every file. Others, like Python, don't add any headers.

In our case it's much more a question what is practical. My argument would be that it's quite unlikely anyone will ever pursue the copyright or GPL license for pylint / astroid code. Much less so for test code. It's already incredibly difficult to do these things for linux, a much bigger project.

Anyway, I guess a few lines more or less don't really hurt.
To get back to pylint. There we have added header to general test files, but not the individual functional tests. I think we should keep it that way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should not touch pylint's functional files especially as it would modify all the result txt files and do a massive diff for no reason.

@Pierre-Sassoulas
Copy link
Member Author

Please add CONTRIBUTORS.txt to setup.cfg so it's included in the distribution package.

👍 db0bff2

Upgrade contributors list
@Pierre-Sassoulas Pierre-Sassoulas requested a review from cdce8p March 11, 2022 21:46
CONTRIBUTORS.txt Outdated Show resolved Hide resolved
CONTRIBUTORS.txt Outdated
Comment on lines 6 to 12
- Claudiu Popa <[email protected]>
- LOGILAB S.A. (Paris, FRANCE) <[email protected]>
- Sylvain Thénault <[email protected]>
- Pierre Sassoulas <[email protected]>
- hippo91 <[email protected]>
- Marc Mueller <[email protected]>
- Daniël van Noord <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering about the sorting. Does it make sense to sort by number of commits? That might cause a more extensive diff.

Maybe just alphabetical would be better? Or two different sections Team and Contributors? Section names can be improved probably.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind the little diff that will happen when Daniel surpass us all 😄

two different sections Team and Contributors?

This is what is done for pylint more or less (just really out of date), but we'd need to have some configuration with the list of admin though. Is it worth it ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're ok with the diff, then it's fine by me 🤷🏻‍♂️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI I might try to do something a little fancier for pylint to preserves the comments about what was done by whom but I did not do anything yet. I'm trying to release astroid first 😄

"authoritative_mail": "[email protected]",
"mails": ["[email protected]", "[email protected]"]
},
"Ville Skytt\u00e4": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be normalised? It is in CONTRIBUTORS.txt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTRIBUTORS.txt Outdated
# please do not modify manually

Contributors
------------
Mainteners
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Mainteners
Maintainers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urgh, costly typo. 2 commits a full release and a regeneration 😄

"Areveny": {
"authoritative_mail": "[email protected]",
"mails": ["[email protected]", "[email protected]"],
"team": "true"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A true / false value here is quite inflexible. What about the actual team / group name? That would allow for more than two groups if we ever want that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pierre-Sassoulas Pierre-Sassoulas merged commit 1622459 into main Mar 12, 2022
@Pierre-Sassoulas Pierre-Sassoulas deleted the better-handling-of-copyright-notice branch March 12, 2022 07:45
@sandrotosi
Copy link

git is the source of truth for the copyright,

I'm sorry i'm late to the party, and please note i'm coming from Debian's perspective, but this statement is not correct.

copyright is the list of people who can apply a license to their work (ie they will need to agree to change the license)

Now it looks like every single person in https://github.com/PyCQA/astroid/blob/main/CONTRIBUTORS.txt bears equal copyright claims to every single file in the astroid project (ie, all of them would need to agree to change the license of a single file, even tho they never contributed to it)

this is a worst situation than before and, while IANAL it may not even be legally appropriate.

Take my notes with caution, i just package astroid for debian and i have no standing on how astroid is developed and maintained.

@Pierre-Sassoulas
Copy link
Member Author

Thank you for the insight @sandrotosi, I'm not a lawyer either and I know this subject can be touchy ! (But a list of 20 peoples on top of file that we had to update in quadrinomial time was a lot to deal with). Would you think it's better if we specify the actual git command to get the contributor for a file for each file, and remove the misleading link to the full list of contributor ?

I.e. a light version of that would be simply the github link to the file as in https://github.com/PyCQA/astroid/blob/main/astroid/_ast.py you have a list of 4 contributors in interface, Maybe there's an API I can find for this so we only get the actual list.

As an aside I think changing the license is never going to happen (even if I'd prefer a MIT license myself) as asking everyone is near impossible, but there's probably other use cases too ?

@sandrotosi
Copy link

sorry i didnt reply sooner

Thank you for the insight @sandrotosi, I'm not a lawyer either and I know this subject can be touchy !

IANAL either

(But a list of 20 peoples on top of file that we had to update in quadrinomial time was a lot to deal with).

i simply cant gauge the level of pain this caused, from an outside POV, this list could be updated just once before a release: would that help?

Would you think it's better if we specify the actual git command to get the contributor for a file for each file, and remove the misleading link to the full list of contributor ?

i'm afraid that's not enough (the git repo is not what get's distributed, which is what matters here): the way i use to clarify copyright&license is that the copyright holders are the only legal entity that can set a given license for the piece of code they have copyright rights over.

now, if you simply state

# Copyright (c) https://github.com/PyCQA/astroid/blob/main/CONTRIBUTORS.txt

that has simply no legal value, because it's an ever changing file, and it does not specify a legal entity able to hold a copyright over code (a URL in this case).

But even if we would consider that file a list of contributors and ship it with the code tarball (which we do for debian, since we fetch it from the github tags, and they contain all the files in the repo), do really all the people listed in that file have legal claim to change the license of every single line of code in astroid? i think not.

Other projects i think use CLA, and/or other mechanisms to handle copyright claims, and from a personal perspective i always liked the complete list of copyright holders for each files (it makes my debian work so much easier), so i may be more than biased in this discussion.

As an aside I think changing the license is never going to happen (even if I'd prefer a MIT license myself) as asking everyone is near impossible, but there's probably other use cases too ?

a given set of copyright holders can decided to change the license of their code, as long as it remains compatible with LGPL2.1+ (which is the global license for the whole project). It's a bit of a stretch, but legally possible

@Pierre-Sassoulas
Copy link
Member Author

that has simply no legal value, because it's an ever changing file, and it does not specify a legal entity able to hold a copyright over code (a URL in this case).

Say we specify the commit like this for each file so it's not changing anymore (https://github.com/PyCQA/astroid/blob/e5f7488f9370b8f69c3605aa1be8650c7a3d76e4/astroid/_ast.py). Would it work for you ?

Result at url:
copyrightholfder

We could maybe find an API to have a json directly.

CLA, and/or other mechanisms to handle copyright claims,

This could work, but for me working on the copyright does not make pylint better so I really don't want to spend time on this (unless paid).

The license has been LGPL for years, it's contaminating, so whatever is done is under LGPL. Having very detailed and exact copyright feel like law theater to be honest. And I have also a feeling it's only benefiting corporation, more precisely some over-cautious corporate lawyer that do not pay for pylint because everyone working on it do it for free anyway. I'm not going to sue some multinational company if they make a private fork and do not share the result, I won't even know about it. This is not going to be prevented by me painfully upgrading every file with a 2 screen long list of copyright holder that everyone will then have to scroll past every time they open a file to edit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Maintenance Discussion or action around maintaining astroid or the dev workflow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants