The PostCommit Java IO Performance Tests job is flaky #30527

Open
github-actions bot opened this issue Mar 5, 2024 · 9 comments · Fixed by #33338

Comments

@github-actions
Contributor

github-actions bot commented Mar 5, 2024

The PostCommit Java IO Performance Tests job is failing over 50% of the time.
Please visit https://github.com/apache/beam/actions/workflows/beam_PostCommit_Java_IO_Performance_Tests.yml?query=is%3Afailure+branch%3Amaster to see the logs.

@kennknowles
Member

Also perma-red

@github-actions github-actions bot added this to the 2.59.0 Release milestone Aug 20, 2024
@github-actions github-actions bot reopened this Aug 21, 2024
Contributor Author

Reopening since the workflow is still flaky

@ahmedabu98
Contributor

@Abacn another test that is using the wrong secret

@damondouglas
Contributor

damondouglas commented Dec 10, 2024

Situation

The Gradle task :it:google-cloud-platform:GCSPerformanceTest repeatedly fails with the following error:

com.google.auth.oauth2.GoogleAuthException at FileBasedIOLT.java
org.apache.beam.it.gcp.storage.FileBasedIOLT > testTextIOWriteThenRead FAILED	
    java.lang.IllegalArgumentException at FileBasedIOLT.java:191	
        Caused by: java.lang.RuntimeException at FileBasedIOLT.java:191	
            Caused by: java.io.IOException at FileBasedIOLT.java:191	
                Caused by: com.google.auth.oauth2.GoogleAuthException at FileBasedIOLT.java:191	
                    Caused by: com.google.api.client.http.HttpResponseException at FileBasedIOLT.java:191	
org.apache.beam.it.gcp.storage.FileBasedIOLT > classMethod FAILED	
    java.lang.RuntimeException at FileBasedIOLT.java:128	
        Caused by: com.google.cloud.storage.StorageException at FileBasedIOLT.java:128	
            Caused by: com.google.auth.oauth2.GoogleAuthException at FileBasedIOLT.java:128	
                Caused by: com.google.api.client.http.HttpResponseException at FileBasedIOLT.java:128

Background

The history of failures goes back as far as 9 months; the most recent failures relate to an error in GCP credentials. Notably, beam_PostCommit_Java_IO_Performance_Tests.yml has an authentication step that uses static credentials. Searching the repository, we see this google-github-actions/auth@v1 pattern used in only a few places, while numerous tests and workflows require GCP authentication.

Assessment

The test may be misconfigured with incorrect credentials. More importantly, it relies on an outdated pattern, static credentials, to authenticate to Google Cloud.

Recommendation

Remove the use of google-github-actions/auth@v1.

@damondouglas
Contributor

damondouglas commented Dec 11, 2024

#33338 fixed the GCS performance test, which allowed BigQueryStorageApiStreamingPerformanceTest to run, but that test is now failing: https://github.com/apache/beam/actions/runs/12262712166/job/34225821488

org.apache.beam.it.gcp.bigquery.BigQueryStreamingLT > testExactlyOnceStreaming FAILED
    dev.failsafe.FailsafeException at BigQueryStreamingLT.java:374
        Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException at BigQueryStreamingLT.java:374

@damondouglas damondouglas reopened this Dec 11, 2024
@damondouglas
Contributor

I'm keeping this ticket assigned to me. I'm catching up on other tasks in my queue and will then research what went wrong with testExactlyOnceStreaming. I feel we are close to fixing this flakiness, so it is worth investing the time when available.

@damondouglas
Contributor

damondouglas commented Dec 13, 2024

I see that the BigQuery performance test is failing with this message:

Access Denied: Project <external project>: User does not have bigquery.datasets.create permission in project <external project>.

I'm not naming the external project in this comment, but the point is that it is not apache-beam-testing, the project in which IAM roles are bound to the GitHub Actions runner node service account.

The solution is either:

  • Add the GitHub action runner node service account as an IAM member of the BigQuery dataset/project
  • Reconfigure the test to use BigQuery in apache-beam-testing

@damondouglas
Contributor

@ahmedabu98 found that BigQueryStreamingLT has a hard-coded project name. We just need to configure the test to acquire the GCP project from a Gradle property.
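
For illustration only, here is a minimal sketch of reading the project from configuration instead of hard-coding it. It assumes the Gradle task forwards the value to the test JVM as a system property named `project` (for example via `-Dproject=apache-beam-testing`). The class `GcpProjectResolver`, the method `resolveProject`, and the property name are hypothetical and are not taken from BigQueryStreamingLT or the Beam test framework.

```java
import java.util.Optional;

/**
 * Hypothetical helper (not part of Beam): resolves the GCP project for a
 * load test from a JVM system property rather than a hard-coded string.
 */
final class GcpProjectResolver {
  // Assumed property name; the real test may use a different key.
  static final String PROJECT_PROPERTY = "project";

  private GcpProjectResolver() {}

  /**
   * Returns the trimmed value of the "project" system property.
   * Throws IllegalStateException if the property is unset or blank, so a
   * misconfigured run fails early instead of targeting an unintended project.
   */
  static String resolveProject() {
    return Optional.ofNullable(System.getProperty(PROJECT_PROPERTY))
        .map(String::trim)
        .filter(value -> !value.isEmpty())
        .orElseThrow(
            () ->
                new IllegalStateException(
                    "System property '" + PROJECT_PROPERTY + "' must be set to a GCP project id"));
  }
}
```

Failing fast on a missing property, rather than falling back to a default, is a design choice in this sketch: it would surface a configuration gap at startup instead of as an Access Denied error from an external project.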
