DOC: Fix flake8 issues in doc/source/groupby.rst #24178

datapythonista · 2018-12-09T18:53:42Z

We only recently started validating PEP8 and other code standards in the documentation examples. Some files still have errors, so we currently skip them. We should fix those errors so we can start validating these files too.

The first step for this issue is to edit setup.cfg in the pandas root directory and, in the flake8-rst section, remove the file doc/source/groupby.rst from the exclude list.
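As a rough illustration of that step, here is a minimal sketch that reads an exclude list from a flake8-rst section and drops one entry. The config excerpt and the name `doc/source/another_page.rst` are hypothetical, since the real exclude list is not shown in this issue. In practice the line is simply deleted by hand in setup.cfg.

```python
import configparser
import re

# Hypothetical excerpt of setup.cfg. The real exclude list in pandas is
# longer, and its other entries are not shown in this issue.
SAMPLE_SETUP_CFG = """
[flake8-rst]
exclude =
    doc/source/groupby.rst
    doc/source/another_page.rst
"""


def remove_from_exclude(cfg_text, path):
    """Return the exclude entries of the flake8-rst section without `path`."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw_value = parser["flake8-rst"]["exclude"]
    # Entries may be separated by newlines and/or commas.
    entries = [e for e in re.split(r"[,\s]+", raw_value.strip()) if e]
    return [entry for entry in entries if entry != path]


remaining = remove_from_exclude(SAMPLE_SETUP_CFG, "doc/source/groupby.rst")
print(remaining)
```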

After that, running the following command will report the errors in the file. Note that a syntax error usually prevents other errors from being detected, so the list of errors to fix can become much longer once the syntax error is fixed. Please make sure that you are using flake8-rst version 0.7.0 or higher:

$ flake8-rst doc/source/groupby.rst 
doc/source/groupby.rst:242:15: E225 missing whitespace around operator
doc/source/groupby.rst:242:15: E999 SyntaxError: invalid syntax
doc/source/groupby.rst:242:19: E225 missing whitespace around operator

Once all the errors are addressed, please open a pull request with the fixes to the file and the removal of the file from the exclude list in setup.cfg. If fixing an error requires something that feels wrong, please ask in a comment on this issue. Please avoid other unrelated changes, which can be addressed in a separate pull request.

LJArendse · 2018-12-09T19:17:06Z

@datapythonista I would like to give this issue a try :)

datapythonista · 2018-12-09T19:20:54Z

Please do, and let me know if you have questions or need help. Thanks @LJArendse

LJArendse · 2018-12-11T18:09:28Z

@datapythonista
I have a question about the following 'syntax error' found on line 242:

doc/source/groupby.rst:242:15: E225 missing whitespace around operator
doc/source/groupby.rst:242:15: E999 SyntaxError: invalid syntax
doc/source/groupby.rst:242:19: E225 missing whitespace around operator

Line 242 looks as follows:

239		.. ipython::
240
241		@verbatim
242		In [1]: gb.<TAB>
243		gb.agg        gb.boxplot    gb.cummin     gb.describe   gb.filter     gb.get_group  gb.height     gb.last       gb.median     gb.ngroups    gb.plot       gb.rank       gb.std        gb.transform
244		gb.aggregate  gb.count      gb.cumprod    gb.dtype      gb.first      gb.groups     gb.hist       gb.max        gb.min        gb.nth        gb.prod       gb.resample   gb.sum        gb.var
245		gb.apply      gb.cummax     gb.cumsum     gb.fillna     gb.gender     gb.head       gb.indices    gb.mean       gb.name       gb.ohlc       gb.quantile   gb.size       gb.tail       gb.weight

The <TAB> is throwing the SyntaxError. My fix is the following:

239		.. ipython::
240
241		@verbatim
242		# After typing "gb." in the IPython terminal, press the <Tab> key on your keyboard, which will let you tab-complete any of the commands below.
243		In [1]: gb.ClickTab
244		gb.agg        gb.boxplot    gb.cummin     gb.describe   gb.filter     gb.get_group  gb.height     gb.last       gb.median     gb.ngroups    gb.plot       gb.rank       gb.std        gb.transform
245		gb.aggregate  gb.count      gb.cumprod    gb.dtype      gb.first      gb.groups     gb.hist       gb.max        gb.min        gb.nth        gb.prod       gb.resample   gb.sum        gb.var
246		gb.apply      gb.cummax     gb.cumsum     gb.fillna     gb.gender     gb.head       gb.indices    gb.mean       gb.name       gb.ohlc       gb.quantile   gb.size       gb.tail       gb.weight

Do you have any suggestions for how we can address this better? I don't think gb.ClickTab is an ideal way to practically show the tab completion...
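The syntax error itself can be reproduced without flake8. The sketch below only uses the standard library `ast` module and assumes that the `In [1]:` prompt is stripped before the line is checked; the helper name `parses` is made up for illustration.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# "<TAB>" is not a valid attribute name, so the parser rejects the line.
print(parses("gb.<TAB>"))
# The placeholder attribute is valid syntax, even though it is misleading.
print(parses("gb.ClickTab"))
```

This shows why the placeholder silences the error: it turns the line into valid Python, at the cost of documenting an attribute that does not exist.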

datapythonista · 2018-12-11T23:05:29Z

I think we already fixed it with a noqa somewhere else. A grep "<TAB>" *.rst should find it quickly.

LJArendse · 2018-12-12T11:27:48Z

Thanks for the help, that's awesome. I will give it a try.

LJArendse · 2018-12-13T13:28:22Z

@datapythonista thanks for the help, I found it in computation.rst.
Could you explain what the noqa does in:

In [14]: r.<TAB>  # noqa: E225, E999

datapythonista · 2018-12-13T14:00:17Z

When flake8 finds a noqa comment on a line, it does not report the error codes listed in that comment for that line.
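To make the mechanism concrete, here is a simplified sketch of that filtering. It is not flake8's actual implementation: the pattern and the helper `is_suppressed` are assumptions for illustration, and error codes are matched exactly.

```python
import re

# Simplified pattern: "# noqa" optionally followed by ": CODE, CODE, ...".
NOQA_PATTERN = re.compile(
    r"#\s*noqa(?::\s*(?P<codes>[A-Z]\d+(?:[,\s]+[A-Z]\d+)*))?",
    re.IGNORECASE,
)


def is_suppressed(line, code):
    """Return True if a noqa comment on `line` silences the error `code`."""
    match = NOQA_PATTERN.search(line)
    if match is None:
        return False  # no noqa comment, so nothing is silenced
    codes = match.group("codes")
    if codes is None:
        return True  # a bare "# noqa" silences every error on the line
    listed = {c.upper() for c in re.split(r"[,\s]+", codes)}
    return code in listed


line = "In [14]: r.<TAB>  # noqa: E225, E999"
print(is_suppressed(line, "E225"))
print(is_suppressed(line, "F821"))
```

With this line, E225 and E999 are silenced, while any other code, such as F821, would still be reported.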

LJArendse · 2018-12-15T08:51:38Z

@datapythonista What do you suggest is the best way to fix the following errors:

flake8-rst doc/source/groupby.rst
doc/source/groupby.rst:72:18: F821 undefined name 'obj'
doc/source/groupby.rst:72:30: F821 undefined name 'key'
doc/source/groupby.rst:73:18: F821 undefined name 'obj'
doc/source/groupby.rst:73:30: F821 undefined name 'key'
doc/source/groupby.rst:74:18: F821 undefined name 'obj'
doc/source/groupby.rst:74:31: F821 undefined name 'key1'
doc/source/groupby.rst:74:37: F821 undefined name 'key2'

The lines in question are:

   >>> grouped = obj.groupby(key)
   >>> grouped = obj.groupby(key, axis=1)
   >>> grouped = obj.groupby([key1, key2])

It is a very good generic explanation of how to apply groupby to an object. Should I keep it as is, or should I show the same example with an actual example DataFrame and dummy data?

LJArendse · 2018-12-15T09:09:53Z

@datapythonista A suggested actual example would be something like this:

A groupby can be applied to a pandas object in the following ways:

grouped = obj.groupby(key)
grouped = obj.groupby(key, axis=1)
grouped = obj.groupby([key1, key2])

Below you can see groupby applied to a DataFrame object:

import pandas as pd
import numpy as np

n = 1000

df = pd.DataFrame({'Store': np.random.choice(['Store_1', 'Store_2'], n),
                   'Product': np.random.choice(['Product_1',
                                                'Product_2'], n),
                   'Revenue': (np.random.random(n) * 50 + 10).round(2),
                   'Quantity': np.random.randint(1, 10, size=n)})
key = 'Product'
key1 = 'Store'
key2 = 'Product'

grouped = df.groupby(key)
grouped = df.groupby(key, axis=1)
grouped = df.groupby([key1, key2])

datapythonista · 2018-12-18T11:12:44Z

Looks good, but I'm not a big fan of random data. Maybe you can use the same example as in: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reset_index.html

Also:

- I wouldn't define key, key1..., just use the values directly
- axis='columns' is more explicit than axis=1
- You don't need the imports, they are already defined at the beginning of the file
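A minimal sketch of what an example along these lines could look like, using small fixed data in the style of the reset_index documentation. The exact rows and the extra 'order' column are assumptions made for illustration and not necessarily what was merged. The imports are included only so the snippet runs on its own. The axis='columns' variant is left out because newer pandas releases deprecate the axis argument of groupby.

```python
import numpy as np
import pandas as pd

# Small, deterministic data instead of random values (illustrative only).
df = pd.DataFrame(
    [("bird", "Falconiformes", 389.0),
     ("bird", "Psittaciformes", 24.0),
     ("mammal", "Carnivora", 80.2),
     ("mammal", "Primates", np.nan)],
    index=["falcon", "parrot", "lion", "monkey"],
    columns=["class", "order", "max_speed"],
)

# Pass the column names directly instead of defining key, key1, key2.
grouped_by_class = df.groupby("class")
grouped_by_both = df.groupby(["class", "order"])

print(grouped_by_class["max_speed"].max())
```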

LJArendse · 2018-12-19T20:54:56Z

@datapythonista Done, flake8-rst doc/source/groupby.rst is not reporting any more errors. I will open a pull request for you to review my changes.

datapythonista · 2018-12-29T23:35:08Z

Closed by #24363

datapythonista added Docs Effort Low Clean good first issue labels Dec 9, 2018

datapythonista mentioned this issue Dec 9, 2018

DOC: Fix all flake8 issues and warning in the documentation pages #24173

Closed

LJArendse added a commit to LJArendse/pandas that referenced this issue Dec 10, 2018

DOC: Remove doc/source/groupby.rst from flake8-rst exclude section in…

e6c2d59

… setup.cfg (pandas-dev#24178)

LJArendse added a commit to LJArendse/pandas that referenced this issue Dec 13, 2018

DOC: Fix doc/source/groupby.rst:1306:42: F821 undefined name 'report_…

9209528

…func' (pandas-dev#24178)

LJArendse mentioned this issue Dec 19, 2018

DOC: fix flake8 issue in groupby.rst #24363

Merged

LJArendse added a commit to LJArendse/pandas that referenced this issue Dec 27, 2018

Revert "DOC: Fix the following 'errors' (pandas-dev#24178):"

b80ff17

This reverts commit 41a2e47.

LJArendse added a commit to LJArendse/pandas that referenced this issue Dec 27, 2018

Merge branch 'master' into doc-fix-flake8-issue-in-groupby.rst-pandas…

fc91118

…-dev#24178

datapythonista closed this as completed Dec 29, 2018
