Validate PEP-8 in docstring examples #23154
Comments
I'll make an attempt at this |
As flake8 is already being used, wouldn't it make sense to use flake8 plugins to check for those things?
|
Using a plugin makes sense, better not to reinvent the wheel. But it'd be good if the plugin is called from the validator script, so devs working on docstrings don't need to run more than one command to validate that the docstring is all right. |
I'm currently including it in ./ci/code_checks.sh under the lint section. |
Feel free to open issues for the PEP-8 problems if in your local branch you can already get the list of problems. If there are many I'd have more than one issue, so different people can work on it at the same time. |
commit so it would get run with Many mistakes are:
Also it doesn't catch all occurrences of python only
|
@FHaase thanks for looking into it!
Where is this one coming from? |
The problem is that in the context of documentation it can make sense to use names that are not defined, but the plugin checks the code block as if it were a normal Python file. Forcing the documentation to define all names would, in my opinion, lead to cluttered documentation, and therefore I would ignore F821 completely. |
I think we should import |
Short answer: Yes
Bootstrapping pandas as pd and numpy as np already reduces the output. The question that arises is: does a reader of the documentation need all these names to be defined in order to understand the example? In my opinion: no. And the boilerplate code around every example in the docs makes it even more unreadable and confusing. Documented code blocks are not meant to be understood by a computer, but by a human. Most of the variables explain themselves by what they are called: I think from all the output above So as a conclusion: forcing the documentation to have all names defined would force a kind of documentation that is less readable without being clearer. |
I disagree. I think all examples should be explicit in defining or importing all objects, so they run, and users can reproduce them exactly as they are presented in the documentation. If I'm new to pandas and I see in the documentation something like:
I don't see how this can be useful to me. I'd rather have:
Which provides all the required information to understand, run, and edit and play with what is being shown. We need to ignore |
Okay I see what you mean. My point is forcing every use of from
If I read that, it's absolutely clear what is meant, although it's invalid syntax [E999]. If I imagine a class, imports, and something written within the function itself, it no longer correlates to Having to use invalid syntax is less common. from
What exactly obj, key, key1, key2 are is not important at this point of the documentation. And I think it should be the choice of the writer how abstract the code should be to be most clear. After writing a section with mostly comments like this explaining how it works, someone could include a complete working example in a regular .py file included with .. literalinclude:: That way it's possible to run tests on the code so it can get updated when the API changes. And it's flake8 and not flake8-rst that checks the code style. |
Note that a bunch of the warnings you listed above in #23154 (comment) are due to not properly importing the used packages / not properly using the imported name (eg all the But, that said, I agree with @FHaase that not all code blocks should be exactly running code. Often, it can be more educational to show a certain snippet. But other things:
|
Yes, currently not everything gets checked. The regex to find the parts is
I've marked the PR to close this issue, because once flake8-rst finds some python-code it gets checked by flake8 just like any other .py file. |
Fixed in #23399 |
At the moment we are automatically validating PEP-8 in our code (using flake8 . in the CI), and we also have the script scripts/validate_docstrings.py that reports different formatting errors in our docstrings, and it also runs the tests to see if they work and generate the expected output. But we are not validating whether the examples follow PEP-8.
In the next example:
Everything will be reported as correct, because the formatting of the docstring is as expected, and when running the examples, they work and generate the presented output.
According to PEP-8, there should be spaces around the = in an assignment. So, we should have s = pd.Series([1, 2, 3, 4]) instead of s=pd.Series([1, 2, 3, 4]). So, the validation should report it as an error.
What we need to do is:
- A function in scripts/validate_docstrings.py that obtains the code in the examples (and, if possible, their line in the code, to be presented to the user if there are errors). As far as I know, the module doctest in the Python standard library provides a way to extract the code.
- Validation of the extracted code. We seem to have pyflakes as a dependency, so using it is probably the best option. But if it's better to use pycodestyle or something else, that could be an option.
- Tests in scripts/tests/test_validate_docstrings.py
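As a rough illustration of the extraction step described above, here is a minimal sketch using doctest.DocTestParser from the standard library. The function name extract_examples and the sample docstring are made up for this sketch and are not taken from scripts/validate_docstrings.py; the sample reuses the s=pd.Series([1, 2, 3, 4]) line from the description. Only extraction is shown; passing the extracted source to flake8, pycodestyle or another checker is left out.

```python
import doctest

# Hypothetical docstring, used only for illustration.
DOCSTRING = """
Return the first n rows.

Examples
--------
>>> s=pd.Series([1, 2, 3, 4])
>>> s.head(2)
0    1
1    2
dtype: int64
"""


def extract_examples(docstring):
    """Return (lineno, source) pairs for each doctest example.

    lineno is zero-based relative to the start of the docstring,
    as documented for doctest.Example.lineno.
    """
    parser = doctest.DocTestParser()
    return [(ex.lineno, ex.source) for ex in parser.get_examples(docstring)]


examples = extract_examples(DOCSTRING)
for lineno, source in examples:
    # source already ends with a newline
    print(lineno, source, end="")
```

The line numbers are relative to the docstring, so a validator would presumably need to add the docstring's own starting line in the file before reporting a location to the user.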