Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC GH22897 Fix docstring of join in pandas/core/frame.py #22904

Closed
wants to merge 8 commits into from
Closed

DOC GH22897 Fix docstring of join in pandas/core/frame.py #22904

wants to merge 8 commits into from

Conversation

JustinZhengBC
Copy link
Contributor

@JustinZhengBC JustinZhengBC commented Sep 30, 2018

#22900 was dealt with by adding an "import pandas as pd" line to the start of the examples section.

@pep8speaks
Copy link

Hello @JustinZhengBC! Thanks for submitting the PR.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great changes, thanks a lot @JustinZhengBC. I added some comments mainly about the formatting of orignial the docstring. If you can fix those too, that would be great.

@@ -6440,6 +6440,8 @@ def append(self, other, ignore_index=False,
def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
sort=False):
"""
Append columns of another DataFrame.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace append by join? I think in general they mean the same, but in this context may give the idea that the they are being added at the end of the DataFrame, or that they are not aligned.

@@ -6449,31 +6451,31 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
other : DataFrame, Series with name field set, or list of DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind removing the with name field set from the types, and have it in the description instead?

@@ -6449,31 +6451,31 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
other : DataFrame, Series with name field set, or list of DataFrame
Index should be similar to one of the columns in this one. If a
Series is passed, its name attribute must be set, and that will be
used as the column name in the resulting joined DataFrame
used as the column name in the resulting joined DataFrame.
on : name, tuple/list of names, or array-like
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the types, we try to have just the Python (or numpy...) types, and with a specific format that we can at some point parse and validate. Could you replace it by something like str, list of str or array-like please?

rsuffix : string
Suffix to use from right frame's overlapping columns
Suffix to use from right frame's overlapping columns.
sort : boolean, default False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace the type string by str and boolean by bool

pandas/core/frame.py Show resolved Hide resolved
@@ -6485,70 +6487,67 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',

Examples
--------
>>> import pandas as pd

>>> caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind replacing the name caller by df? we use this convention everywhere, I think it makes sense here too.

2 K2 A2 B2
3 K3 A3 NaN
4 K4 A4 NaN
5 K5 A5 NaN

See also
--------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the See Also section before the examples, use a capital A for Also, and add a period at the end of the merge description.

@datapythonista datapythonista added Docs Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Sep 30, 2018
@codecov
Copy link

codecov bot commented Sep 30, 2018

Codecov Report

Merging #22904 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #22904   +/-   ##
=======================================
  Coverage   92.18%   92.18%           
=======================================
  Files         169      169           
  Lines       50830    50830           
=======================================
  Hits        46860    46860           
  Misses       3970     3970
Flag Coverage Δ
#multiple 90.6% <ø> (ø) ⬆️
#single 42.37% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 97.2% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 14598c6...05988d4. Read the comment docs.

@JustinZhengBC
Copy link
Contributor Author

Alright, I believe I have taken care of all the other issues now.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just couple of small things that I didn't see in the previous review, and I think we can merge this. Thanks!

used as the column name in the resulting joined DataFrame
on : name, tuple/list of names, or array-like
used as the column name in the resulting joined DataFrame.
on : str, list of str, or array-like
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add that it's optional: on : str, list of str, or array-like, optional

of the calling's one.
lsuffix : str
Suffix to use from left frame's overlapping columns.
rsuffix : str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the default: str, default ''

@@ -6483,76 +6485,72 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
Support for specifying index levels as the `on` parameter was added
in version 0.23.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind capitalizing the first letter of each sentence, and ending them in period in the Notes. For the First paragraph, if you can add backticks around the parameters on, lsuffix... that would be great too. So: "Options `on`, `lsuffix` and `rsuffix` are not supported [...]"

2 K2 A2 B2
3 K3 A3 NaN
4 K4 A4 NaN
5 K5 A5 NaN

Returns
-------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed before that the Returns was here. Can you move it after the parameters, and use this format:

Returns
-------
DataFrame
    A short description on what is being returned.

@JustinZhengBC
Copy link
Contributor Author

I fixed those issues. I also added some backticks around parameter names in other sections, as that was what other docstrings seemed to be doing.


Another option to join using the key columns is to use the `on`
parameter. DataFrame.join always uses `other`'s index but we can use
any column in ``df``. This method preserves the original DataFrame's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The double backticks is for code, and the single backticks for objects that are being referenced. We don't have an exact convention for things like the parameters or the variables. Can you leave them with single backticks for now, as it's inconsistent to have df with double, and the rest like other with single. Other than that, the PR looks great.

@JustinZhengBC
Copy link
Contributor Author

I changed the backticks around the df's to be single again.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for the PR @JustinZhengBC

We'll merge it once the CI checks are on green. Hope to see more of your contributions.

@datapythonista
Copy link
Member

Superseded by #23471

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC: Fix the docstring of join in pandas/core/frame.py
3 participants