DOC/STYLE: Fix "F821 undefined name 'pd'" errors #22900

Open
datapythonista opened this issue Sep 29, 2018 · 7 comments
Labels
Code Style Code style, linting, code_checks Docs Needs Discussion Requires discussion from core team before further action

Comments

@datapythonista
Member

It'd be great to start linting the doctests (validate the docstring examples for pep8 issues).

To start with core/generic.py, core/series.py and core/frame.py, I created the following issues to address the current errors: #22892, #22893, #22894, #22895, #22896, #22897 and #22898.

After those are fixed, the only error that will remain in these files is F821 undefined name 'pd'. I checked for simple ways to avoid it, but I don't think we can prepend code for flake8 (like what we do in sphinx). So, the solutions I see are:

  1. Add import pandas as pd (or even better import pandas) to every example
  2. Do not lint the doctests
  3. Ignore F821 from the doctest linting (this will also ignore errors like df + 1 where df is not defined, as it's also an F821 error).
  4. Implement a way to grep -v those exact errors, and change the exit code of flake8 based on that, so the linting passes when the output is clean after filtering these errors
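As a rough illustration of option 4, the filtering step could be a small script instead of grep -v. This is only a sketch: the function name filter_flake8_output and the allowed_names parameter are hypothetical, and it assumes flake8's default path:line:col: CODE message output format.

```python
import re

# Assumes flake8's default output format: path:line:col: CODE message
F821_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"F821 undefined name '(?P<name>[^']+)'$"
)


def filter_flake8_output(lines, allowed_names=("pd",)):
    """Drop tolerated F821 errors and derive an exit code from the rest.

    Hypothetical helper: returns (remaining_lines, exit_code), where
    exit_code is 0 only if nothing is left after filtering.
    """
    remaining = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        match = F821_RE.match(line)
        if match and match.group("name") in allowed_names:
            continue  # tolerated: an allowed name such as 'pd'
        remaining.append(line)
    return remaining, (1 if remaining else 0)


if __name__ == "__main__":
    sample = [
        "./plotting/_core.py:595:18: F821 undefined name 'pd'",
        "pandas/core/generic.py:2247:17: F821 undefined name 'df'",
    ]
    remaining, exit_code = filter_flake8_output(sample)
    print("\n".join(remaining))
    print("exit code:", exit_code)
```

Unlike ignoring F821 outright (option 3), this keeps errors such as an undefined df visible while letting the allowed names through.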

I know this has been discussed before, and I'm not sure whether anyone besides me is in favor of option 1. But I think it's worth considering again in the context of linting doctests. Option 1 has the advantage that the examples are self-contained (they can be copy-pasted, or in the future we could even add a button to open them in Binder, or something similar).

@datapythonista datapythonista added Docs Code Style Code style, linting, code_checks Needs Discussion Requires discussion from core team before further action labels Sep 29, 2018
@WillAyd
Member

WillAyd commented Oct 3, 2018

I think option 1 would just be a lot of churn and add a lot of repetition to the code samples, so I'm -1 on that. Option 4 seems like the most logical at first glance - do you see a downside to it beyond the added grep call?

@jreback
Contributor

jreback commented Oct 3, 2018

we always ‘import pandas as pd’

option 1 seems pretty verbose as well

@TomAugspurger
Contributor

As long as we're running the examples, option 3 should be fine, right? The example df + 1 will fail the doctest.

So my vote is for option 3.
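To make that point concrete, here is a minimal sketch using only the standard library doctest module. The helper run_examples is hypothetical, and df is bound to a plain integer as a stand-in, since the aim is only to show that executing an example with an undefined name counts as a failure.

```python
import doctest


def run_examples(docstring, globs):
    """Execute the doctest examples in ``docstring`` with the given globals.

    Hypothetical helper: returns (failed, attempted).
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        docstring, globs=dict(globs), name="example", filename=None, lineno=0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the textual failure report; only the counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


EXAMPLE = """
>>> df + 1
2
"""

if __name__ == "__main__":
    # 'df' is undefined: the example raises NameError and is counted as failed.
    print(run_examples(EXAMPLE, globs={}))
    # With 'df' available in the namespace (an int stand-in), it passes.
    print(run_examples(EXAMPLE, globs={"df": 1}))
```

So even with F821 ignored by the linter, an example that uses a name missing from the doctest namespace would still be caught when the examples are executed.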

@jbrockmendel
Member

@datapythonista how do I check if this is fixed?

@datapythonista
Member Author

This is not yet fixed. What we'd like is to have this passing:

$ flake8 --doctests pandas/

This would prevent errors in the examples in our docstrings. But if you execute that, you'll see we've got plenty of:

./plotting/_core.py:595:18: F821 undefined name 'pd'
./plotting/_core.py:1286:24: F821 undefined name 'np'

I guess we could start by adding F821 to the list of flake8 errors we ignore (option 3 in the description), but there are some other errors we need to have a look at before we have flake8 validation in doctests passing:

pandas/core/generic.py:3343:33: F721 syntax error in doctest
pandas/errors/__init__.py:458:17: F721 syntax error in doctest
pandas/errors/__init__.py:461:17: F721 syntax error in doctest
pandas/errors/__init__.py:568:9: F401 'pandas.io.stata.StataReader' imported but unused
pandas/errors/__init__.py:571:30: F721 syntax error in doctest
pandas/io/formats/style.py:966:19: F721 syntax error in doctest
pandas/io/formats/style.py:987:19: F721 syntax error in doctest
pandas/io/formats/style.py:1874:23: F721 syntax error in doctest
pandas/io/formats/style.py:1882:30: F721 syntax error in doctest
pandas/io/formats/style.py:1884:23: F721 syntax error in doctest
pandas/io/formats/style.py:2787:23: F721 syntax error in doctest
pandas/io/formats/style.py:2793:23: F721 syntax error in doctest
pandas/io/formats/style.py:2799:23: F721 syntax error in doctest
pandas/io/formats/style.py:2805:23: F721 syntax error in doctest
pandas/io/formats/style.py:2811:23: F721 syntax error in doctest
pandas/io/formats/style.py:2820:23: F721 syntax error in doctest
pandas/io/formats/style.py:3508:21: F721 syntax error in doctest
pandas/io/formats/style_render.py:1081:17: F721 syntax error in doctest

@jbrockmendel
Member

Looks like a lot of the syntax errors are from multi-line statements missing either parens or backslashes
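A small sketch of how such an error can arise, using only the standard library. The helper doctest_syntax_errors and the two sample docstrings are hypothetical. It relies on the fact that a line without the "..." continuation prompt is parsed as expected output, so the statement before it is cut short.

```python
import ast
import doctest


def doctest_syntax_errors(docstring):
    """Return the source of every doctest example that fails to parse."""
    broken = []
    for example in doctest.DocTestParser().get_examples(docstring):
        try:
            ast.parse(example.source)
        except SyntaxError:
            broken.append(example.source)
    return broken


# The second line lacks the "..." prompt, so it is treated as expected
# output and the statement ends at the unclosed parenthesis.
MISSING_CONTINUATION = """
>>> total = (1 +
             2)
"""

# With the continuation prompt, both lines belong to the same statement.
WITH_CONTINUATION = """
>>> total = (1 +
...          2)
>>> total
3
"""

if __name__ == "__main__":
    print(doctest_syntax_errors(MISSING_CONTINUATION))
    print(doctest_syntax_errors(WITH_CONTINUATION))
```

In this sketch the first docstring yields one broken example and the second yields none.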

@datapythonista
Member Author

I created a "good first issue" for the errors other than the import. When those are fixed, we can decide whether we really want to skip F821, or whether to add a validation to the CI that makes sure only pd and np can be missing. Something like the following command, which shows the other variable names we'd miss if we simply ignored F821:

$ flake8 --doctests --ignore="" --select=F821 pandas/ | grep -v "F821 undefined name 'pd'" | grep -v "F821 undefined name 'np'"
pandas/core/apply.py:1339:9: F821 undefined name '_relabel_result'
pandas/core/apply.py:1339:33: F821 undefined name 'func'
pandas/core/arrays/datetimelike.py:262:13: F821 undefined name 'self'
pandas/core/dtypes/base.py:266:27: F821 undefined name 're'
pandas/core/generic.py:2247:17: F821 undefined name 'df'
pandas/core/generic.py:5890:13: F821 undefined name 'func'
pandas/core/generic.py:5890:18: F821 undefined name 'g'
pandas/core/generic.py:5890:20: F821 undefined name 'h'
pandas/core/generic.py:5890:22: F821 undefined name 'df'
pandas/core/generic.py:5890:32: F821 undefined name 'a'
pandas/core/generic.py:5890:41: F821 undefined name 'b'
pandas/core/generic.py:5890:49: F821 undefined name 'c'
pandas/core/generic.py:5894:14: F821 undefined name 'df'
pandas/core/generic.py:5894:22: F821 undefined name 'h'
pandas/core/generic.py:5895:22: F821 undefined name 'g'
pandas/core/generic.py:5895:30: F821 undefined name 'a'
pandas/core/generic.py:5896:22: F821 undefined name 'func'
pandas/core/generic.py:5896:33: F821 undefined name 'b'
pandas/core/generic.py:5896:41: F821 undefined name 'c'
pandas/core/generic.py:5903:14: F821 undefined name 'df'
pandas/core/generic.py:5903:22: F821 undefined name 'h'
pandas/core/generic.py:5904:22: F821 undefined name 'g'
pandas/core/generic.py:5904:30: F821 undefined name 'a'
pandas/core/generic.py:5905:23: F821 undefined name 'func'
pandas/core/generic.py:5905:43: F821 undefined name 'a'
pandas/core/generic.py:5905:51: F821 undefined name 'c'
pandas/io/formats/style.py:1873:25: F821 undefined name 'ret'
pandas/io/formats/style_render.py:1279:49: F821 undefined name 'upper'


5 participants