Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: General ignore F821 within documentation #23508

Closed
FHaase opened this issue Nov 5, 2018 · 4 comments · Fixed by #23847
Closed

Discussion: General ignore F821 within documentation #23508

FHaase opened this issue Nov 5, 2018 · 4 comments · Fixed by #23847
Labels
Code Style Code style, linting, code_checks Docs
Milestone

Comments

@FHaase
Copy link
Contributor

FHaase commented Nov 5, 2018

With PR #23381 flake8_rst will get merged allowing to have the flake8 checks being run on code inside *.rst files within doc/source subfolder.

In order to make the ./ci/code_shecks.sh script used by travis working, I had to ignore all F821 undefined name errors. @datapythonista and I had a brief discussion #23154 whether there should be raised an issue to fix all those errors ( @datapythonista ) or leave F821 ignored, relying on documentation editors to design meaningful documentation without constraining F821 issues to be fixed ( @FHaase ).

Before I open an issue to fix all F821 issues, I'd like to ask for a general opinion in respect of how the documentation would look like when the fixes are finished.

Forcing F821 issues to be fixed:

  • Pro:
    • Code can be copied and pasted straight from the docs.
  • Contra:
    • Adds noise to the code-snippets and if overdone might lead to a more confusing documentation than without.

"Pro" Examples

In these examples the documentation would benefit of not general ignoring F821 issues:

Finding typos within code:

.. code-block:: python
   df = pd.DataFrame(np.random.randn(10, 4))
   dt['F'] = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range('20130102', periods=6))

Adding clearness to the code:

.. code-block:: python
   left = pd.DataFrame({'key': ['foo', 'bar'], 'lval': [1, 2]})
   right = pd.DataFrame({'key': ['foo', 'bar'], 'rval': [4, 5]})
   pd.merge(left, right, on='key')

instead of

.. code-block:: python
   pd.merge(left_data, right_data, on='key')

"Contra" Examples

These examples would have to manually ignoring F821 issues by adding # noqa: F821 to the code what is beeing rendered in the documentation.

Code with semantical clear variablenames

By default the memory usage of the ``DataFrame``'s index is shown in the
returned ``Series``, the memory usage of the index can be suppressed by passing
the ``index=False`` argument:

.. ipython:: python
    df.memory_usage(index=False)

or

.. code-block:: python
   with connect_to_url(url) as connection:
       pass

Intermediate explanations [a better example would be when s changed in block 1]

Using the Python ``in`` operator on a ``Series`` tests for membership in the
index, not membership among the values.

.. ipython:: python
    s = pd.Series(range(5), index=list('abcde'))
    2 in s
    'b' in s

If this behavior is surprising, keep in mind that using ``in`` on a Python
dictionary tests keys, not values, and ``Series`` are dict-like.
To test for membership in the values, use the method :meth:`~pandas.Series.isin`:

.. ipython:: python
    s.isin([2])
    s.isin([2]).any()

Current technical limitations

  • As flake8_rst calls flake8 with each of the .. code-blocks:: or .. ipython:: blocks individually, needing to define all variables in each block would complicate working with this directive.

  • pep8 requires 2 blank lines between functions in top-level scope. ipython refuses to run properly with them. So [E302, E305] will become ignored as well.

Possible compromise

Having a dedicated folder with code-snippets in *.py files which can be included via .. includeliteral::(?) As they represent complete examples they would be checked via flake8 and it's possible to run tests on them ensuring up to date examples even when apis change.

@gfyoung gfyoung added Docs Code Style Code style, linting, code_checks labels Nov 5, 2018
@gfyoung
Copy link
Member

gfyoung commented Nov 5, 2018

For the time being, I would prefer that we disable F821 for docs checking. Regardless of how we go about it, it seems from your description that we would have to do some substantive retooling / rewriting to get the docs to play nicely with flake8_rst on this rule.

Perhaps another to go about it is to create a list of "expected errors" (ones that we know we can safely ignore) and compare it against what flake8_rst outputs for that rule on a build. Any diff is cause to alert the end developer of a mistake.

@datapythonista
Copy link
Member

My opinion is that we should be explicit in all the examples, and not make the users guess anything (and that includes the import pandas).

As this is not shared by the rest of the team, my next favorite option is at some point stop ignoring F821 in setup.cfg, and instead ignore the specific cases where someone considers having a variable defined is worse with # noqa: F821.

But surely ok to have F821 fully ignored for now, and forget about this until we fix more easy/obvious problems in the docs.

Similar discussion in: #22900

@FHaase
Copy link
Contributor Author

FHaase commented Nov 21, 2018

I'm currently waiting for 2 PRs to be merged into flake8-rst.
After they are merged we can upgrade the flake8-version. Then it's possible to check for F821 as all code-blocks are merged into one big one and therefore names defined in previous blocks remain known.

This however raises another issue as imports may occur in the middle of a block, although they are at the top of an subblock:

.. code-block:: python
 
   filename = 'something'

.. code-block:: python
  
   import os  # <--- raises E402
   os.path.exists(filename) # <--- no longer raises F821

I think it might be best to require all pandas and numpy related stuff to be referred to from pd (np).
Like df = pd.DataFrame(...) instead of df = DataFrame(...)

I'm unsure what to do with imports from pandas unrelated stuff.

@datapythonista
Copy link
Member

I'm happy to ignore E402 in flake8-rst. In code it makes sense, but in the docs I don't expect to see many imports in the middle of the block, and I don't think it's a big issue anyway.

What we've been doing with pd and np is to "inject" the imports at the beginning (this is done for example in validate_docstrings.py or in conftest.py for the doctests). I think in the .rst files we usually have the imports in a supress block at the beginning, so we may not even need to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks Docs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants