-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC GH22897 Fix docstring of join in pandas/core/frame.py #22904
Conversation
Hello @JustinZhengBC! Thanks for submitting the PR.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great changes, thanks a lot @JustinZhengBC. I added some comments mainly about the formatting of orignial the docstring. If you can fix those too, that would be great.
pandas/core/frame.py
Outdated
@@ -6440,6 +6440,8 @@ def append(self, other, ignore_index=False, | |||
def join(self, other, on=None, how='left', lsuffix='', rsuffix='', | |||
sort=False): | |||
""" | |||
Append columns of another DataFrame. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you replace append
by join
? I think in general they mean the same, but in this context may give the idea that the they are being added at the end of the DataFrame, or that they are not aligned.
pandas/core/frame.py
Outdated
@@ -6449,31 +6451,31 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='', | |||
other : DataFrame, Series with name field set, or list of DataFrame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind removing the with name field set
from the types, and have it in the description instead?
pandas/core/frame.py
Outdated
@@ -6449,31 +6451,31 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='', | |||
other : DataFrame, Series with name field set, or list of DataFrame | |||
Index should be similar to one of the columns in this one. If a | |||
Series is passed, its name attribute must be set, and that will be | |||
used as the column name in the resulting joined DataFrame | |||
used as the column name in the resulting joined DataFrame. | |||
on : name, tuple/list of names, or array-like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the types, we try to have just the Python (or numpy...) types, and with a specific format that we can at some point parse and validate. Could you replace it by something like str, list of str or array-like
please?
pandas/core/frame.py
Outdated
rsuffix : string | ||
Suffix to use from right frame's overlapping columns | ||
Suffix to use from right frame's overlapping columns. | ||
sort : boolean, default False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you replace the type string
by str
and boolean
by bool
pandas/core/frame.py
Outdated
@@ -6485,70 +6487,67 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='', | |||
|
|||
Examples | |||
-------- | |||
>>> import pandas as pd | |||
|
|||
>>> caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mind replacing the name caller
by df
? we use this convention everywhere, I think it makes sense here too.
pandas/core/frame.py
Outdated
2 K2 A2 B2 | ||
3 K3 A3 NaN | ||
4 K4 A4 NaN | ||
5 K5 A5 NaN | ||
|
||
See also | ||
-------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move the See Also
section before the examples, use a capital A
for Also
, and add a period at the end of the merge
description.
Codecov Report
@@ Coverage Diff @@
## master #22904 +/- ##
=======================================
Coverage 92.18% 92.18%
=======================================
Files 169 169
Lines 50830 50830
=======================================
Hits 46860 46860
Misses 3970 3970
Continue to review full report at Codecov.
|
Alright, I believe I have taken care of all the other issues now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just couple of small things that I didn't see in the previous review, and I think we can merge this. Thanks!
pandas/core/frame.py
Outdated
used as the column name in the resulting joined DataFrame | ||
on : name, tuple/list of names, or array-like | ||
used as the column name in the resulting joined DataFrame. | ||
on : str, list of str, or array-like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add that it's optional: on : str, list of str, or array-like, optional
pandas/core/frame.py
Outdated
of the calling's one. | ||
lsuffix : str | ||
Suffix to use from left frame's overlapping columns. | ||
rsuffix : str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the default: str, default ''
pandas/core/frame.py
Outdated
@@ -6483,76 +6485,72 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='', | |||
Support for specifying index levels as the `on` parameter was added | |||
in version 0.23.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind capitalizing the first letter of each sentence, and ending them in period in the Notes. For the First paragraph, if you can add backticks around the parameters on
, lsuffix
... that would be great too. So: "Options `on`, `lsuffix` and `rsuffix` are not supported [...]"
pandas/core/frame.py
Outdated
2 K2 A2 B2 | ||
3 K3 A3 NaN | ||
4 K4 A4 NaN | ||
5 K5 A5 NaN | ||
|
||
Returns | ||
------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed before that the Returns was here. Can you move it after the parameters, and use this format:
Returns
-------
DataFrame
A short description on what is being returned.
I fixed those issues. I also added some backticks around parameter names in other sections, as that was what other docstrings seemed to be doing. |
pandas/core/frame.py
Outdated
|
||
Another option to join using the key columns is to use the `on` | ||
parameter. DataFrame.join always uses `other`'s index but we can use | ||
any column in ``df``. This method preserves the original DataFrame's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The double backticks is for code, and the single backticks for objects that are being referenced. We don't have an exact convention for things like the parameters or the variables. Can you leave them with single backticks for now, as it's inconsistent to have df with double, and the rest like other with single. Other than that, the PR looks great.
I changed the backticks around the df's to be single again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for the PR @JustinZhengBC
We'll merge it once the CI checks are on green. Hope to see more of your contributions.
Superseded by #23471 |
git diff upstream/master -u -- "*.py" | flake8 --diff
#22900 was dealt with by adding an "import pandas as pd" line to the start of the examples section.