Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: get PR description from googleapis commits #2531

Merged
merged 42 commits into from
Mar 7, 2024

Conversation

JoeWang1127
Copy link
Collaborator

@JoeWang1127 JoeWang1127 commented Mar 1, 2024

In this PR:

  • Add generate_pr_description.py to get commit messages from googleapis repository.
    • Parse configuration to get latest googleapis commit (C) and proto_path (P).
    • Combine commit message from C to a baseline commit (B), exclusively. Only commits that change files in P are considered.
  • Add unit tests for utility functions.
  • Add integration tests for generate_pr_description.py.

This is step 5 in milestone 2 of hermetic build project.

@product-auto-label product-auto-label bot added the size: s Pull request size is small. label Mar 1, 2024
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: s Pull request size is small. labels Mar 2, 2024
@@ -45,6 +47,24 @@


class IntegrationTest(unittest.TestCase):
def test_get_commit_message_success(self):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this as an integration test because it required network connection for git clone and checkout of googleapis repository.

@JoeWang1127 JoeWang1127 added the owlbot:run Add this label to trigger the Owlbot post processor. label Mar 2, 2024
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Mar 2, 2024
@JoeWang1127 JoeWang1127 added the owlbot:run Add this label to trigger the Owlbot post processor. label Mar 3, 2024
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Mar 3, 2024
@JoeWang1127 JoeWang1127 marked this pull request as ready for review March 4, 2024 14:54
@JoeWang1127 JoeWang1127 requested a review from a team as a code owner March 4, 2024 14:54
Copy link
Contributor

@diegomarquezp diegomarquezp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one comment on the IT. Otherwise LGTM.

library_generation/test/integration_tests.py Outdated Show resolved Hide resolved
@JoeWang1127 JoeWang1127 requested a review from blakeli0 March 4, 2024 17:15
@@ -0,0 +1,10 @@
7a9a855287b5042410c93e5a510f40efd4ce6cb1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we don't have the logic for telling exactly which library is affected, but maybe in the meantime we can format this commit hash to be an actual link to the commit in googleapis for further inspection. WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, that's a good idea.

I'll create a follow up PR to format the commit message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea, but we don't want to include them in the release notes, so please put it outside of BEGIN_COMMIT_OVERRIDE and END_COMMIT_OVERRIDE if you want to include them in the PR description.

Copy link
Collaborator Author

@JoeWang1127 JoeWang1127 Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. The commit link is outside of BEGIN_NESTED_COMMIT and END_NESTED_COMMIT.

Comment on lines 9 to 23
BEGIN_NESTED_COMMIT

feat: [document-ai] expose model_type in v1 processor, so that user can see the model_type after get or list processor version

PiperOrigin-RevId: 603727585


feat: [document-ai] add model_type in v1beta3 processor proto

PiperOrigin-RevId: 603726122


feat: Regenerate with the Java code generator (gapic-generator-java) v2.34.0

END_NESTED_COMMIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be coming from 3 commits? If so, the nested commit usage should be something like:

Suggested change
BEGIN_NESTED_COMMIT
feat: [document-ai] expose model_type in v1 processor, so that user can see the model_type after get or list processor version
PiperOrigin-RevId: 603727585
feat: [document-ai] add model_type in v1beta3 processor proto
PiperOrigin-RevId: 603726122
feat: Regenerate with the Java code generator (gapic-generator-java) v2.34.0
END_NESTED_COMMIT
BEGIN_NESTED_COMMIT
feat: [document-ai] expose model_type in v1 processor, so that user can see the model_type after get or list processor version
PiperOrigin-RevId: 603727585
END_NESTED_COMMIT
BEGIN_NESTED_COMMIT
feat: [document-ai] add model_type in v1beta3 processor proto
PiperOrigin-RevId: 603726122
END_NESTED_COMMIT
BEGIN_NESTED_COMMIT
feat: Regenerate with the Java code generator (gapic-generator-java) v2.34.0
END_NESTED_COMMIT

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've change the format, PTAL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chingor13 What is the advantage of using three BEGIN_NESTED_COMMIT over one BEGIN_COMMIT_OVERRIDE other than keeping the extended description(which would be ignored in the release notes anyway)?

[googleapis/googleapis@7a9a855](https://github.com/googleapis/googleapis/commit/7a9a855287b5042410c93e5a510f40efd4ce6cb1)
[googleapis/googleapis@c7fd8bd](https://github.com/googleapis/googleapis/commit/c7fd8bd652ac690ca84f485014f70b52eef7cb9e)
BEGIN_NESTED_COMMIT
feat: [document-ai] expose model_type in v1 processor, so that user can see the model_type after get or list processor version
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test case for a commit that contains multiple conventional commit messages? Like this one? I think the current logic that adds library name may not take this scenario into consideration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

)

for commit, library_name in commits.items():
convention, _, summary = commit.message.partition(":")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to consider a scenario that the commit message may include multiple conventional commits.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, verified in unit test.

),
]
)
def test_format_commit_message_success(self, message, expected, is_monorepo):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameterized test can verify multi-line commit message can be formatted correctly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure parameterized tests is the best choice here, because they are not intuit to read. Another developer has to count the position of the input to know which parameter is it.
In addition, it groups all the test cases together, which make us harder to troubleshoot the unit tests if one particular test case fails, see internal doc for a few best practices for unit tests.

Ideally, we should have multiple test cases like

test_format_commit_message_should_add_library_name_if_regex_matches():

test_format_commit_message_should_not_add_library_name_if_regex_does_not_match():

test_format_commit_message_should_not_add_library_name_if_monorepo():

...
...
...


Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a parameterized test is suitable here because

  1. We need common setup for a mock commit object.
  2. All different tests are duplicated in most parts.

I found python tips about parameterized tests saying we should parameterize test in this case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Common setup or duplicated logics could be extracted to helper methods or ideally in a setUp(). This is actually another code smell that this unit test and/or the utilities.py class is doing too much things. e.g. We could have a dedicated CommitMessageFormater.
In addition, some duplication in unit tests is fine if it is simpler and more readable. See Prefer simplicity in tests even when it might lead to some duplication in go/unit-testing-practices#logic-dos-and-donts.

That being said, I noticed Advanced Parameterization in go/python-tips/011#advanced, which you can provide a proper test name and tell which parameter it is for, which would solves my concern regarding readability in previous comments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed Advanced Parameterization in go/python-tips/011#advanced, which you can provide a proper test name and tell which parameter it is for

This feature is provided in google3.

I'll create a separate file to host the formatting functions and remove the parameterized test.

@JoeWang1127 JoeWang1127 requested a review from blakeli0 March 7, 2024 00:40
Copy link

sonarqubecloud bot commented Mar 7, 2024

Quality Gate Passed Quality Gate passed for 'gapic-generator-java-root'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

Copy link

sonarqubecloud bot commented Mar 7, 2024

Quality Gate Passed Quality Gate passed for 'java_showcase_integration_tests'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metrics tracker links the release PRs with the source PRs either via the commit SHA or the PR# reference

from git import Commit


def format_commit_message(commits: Dict[Commit, str], is_monorepo: bool) -> List[str]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: with a dict, you may not be guaranteed on the order that the commits are processed. it may not be a big deal for you, but something to consider if you potentially get inconsistent behavior for the "same" input.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to dictionary doc, the order of items is insertion order by default.

@JoeWang1127 JoeWang1127 enabled auto-merge (squash) March 7, 2024 21:39
@JoeWang1127 JoeWang1127 merged commit c2ea697 into main Mar 7, 2024
45 checks passed
@JoeWang1127 JoeWang1127 deleted the feat/get-pr-description branch March 7, 2024 21:56
JoeWang1127 added a commit that referenced this pull request Mar 8, 2024
In this PR:
- Write PR description into a file which lives the same directory with
the configuration.

Context:
After #2531, we can generate pull request description from generation
config and a baseline commit.

However, the return value (string) can not be retrieved when running in
docker environment. Therefore, write the description to a file. Put
multi-line description to a file also helps editing the pr body
([link](cli/cli#595 (comment))).
diegomarquezp pushed a commit that referenced this pull request Mar 15, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>2.38.0</summary>

##
[2.38.0](v2.37.0...v2.38.0)
(2024-03-15)


### Features

* [common-protos] add `api_version` extension to `ServiceOptions`, for
collaborative versioning
([d343be9](d343be9))
* [common-protos] add `api_version` extension to `ServiceOptions`, for
collaborative versioning
([#2551](#2551))
([d343be9](d343be9))
* add `ErrorReason.LOCATION_POLICY_VIOLATED` enum value
([d343be9](d343be9))
* add `ErrorReason.LOCATION_POLICY_VIOLATED` enum value
([d343be9](d343be9))
* add `Publishing.rest_reference_documentation_uri` to aid client
library publication
([d343be9](d343be9))
* add `Publishing.rest_reference_documentation_uri` to aid client
library publication
([d343be9](d343be9))
* Add shopping and chat common protos.
([#2553](#2553))
([5f2d4e7](5f2d4e7)),
closes
[#2018](#2018)
* get PR description from googleapis commits
([#2531](#2531))
([c2ea697](c2ea697))
* Introduce OpenTelemetry Metrics Recording
([#2500](#2500))
([b936580](b936580))
* skip build only commit
([#2555](#2555))
([180c8a9](180c8a9))
* Universe Domain Environment Variable Support
([#2485](#2485))
([1463d64](1463d64))


### Dependencies

* normalize dependencies
([#2574](#2574))
([6622238](6622238))
* update arrow.version to v15.0.1
([#2565](#2565))
([b2c3f6a](b2c3f6a))
* update dependency com.fasterxml.jackson:jackson-bom to v2.17.0
([#2564](#2564))
([40ae7f9](40ae7f9))
* update dependency com.google.api-client:google-api-client-bom to
v2.4.0
([#2570](#2570))
([f60441f](f60441f))
* update dependency com.google.errorprone:error_prone_annotations to
v2.26.1
([#2530](#2530))
([7c1aaab](7c1aaab))
* update dependency com.google.errorprone:error_prone_annotations to
v2.26.1
([#2532](#2532))
([447b4e1](447b4e1))
* update dependency io.netty:netty-tcnative-boringssl-static to
v2.0.65.final
([#2547](#2547))
([46e0e0f](46e0e0f))
* update dependency net.bytebuddy:byte-buddy to v1.14.12
([#2522](#2522))
([edfec32](edfec32))
* update google api dependencies
([#2484](#2484))
([92e91bc](92e91bc))
* update google api dependencies
([#2538](#2538))
([d9355cf](d9355cf))
* update googleapis/java-cloud-bom digest to 3f93d58
([#2499](#2499))
([5fd4d5e](5fd4d5e))
* update googleapis/java-cloud-bom digest to 659764f
([#2545](#2545))
([d6c8be6](d6c8be6))
* update netty dependencies
([#2480](#2480))
([40753c3](40753c3))
* update opentelemetry-java monorepo to v1.35.0
([#2477](#2477))
([3ecefff](3ecefff))
* update opentelemetry-java monorepo to v1.36.0
([#2550](#2550))
([9669c21](9669c21))
* update opentelemetry-java monorepo to v1.36.0
([#2573](#2573))
([f5f201e](f5f201e))
* update slf4j monorepo to v2.0.12
([#2481](#2481))
([363a354](363a354))


### Documentation

* minor tweaks to various comments
([d343be9](d343be9))
* minor tweaks to various comments
([d343be9](d343be9))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants