The PostCommit Java PVR Spark Batch job is flaky #30512
Comments
Random tests in this test suite fail because a tmp file gets deleted partway through the run, likely a race condition. This has been recurring for a long time. |
Reopening since the workflow is still flaky |
Taking a closer look, the tests fail while initializing TestPipeline:
The failing call is "setDefaultPipelineOptions", which loads the filesystem registrars that are registered via AutoService. It sounds similar to google/auto#718. Another observation is that the test is trying to read from |
Ran the tests locally (macOS). This log appears many times:
and there are also intermittent test failures, but now the FileNotFoundException has a hint (Too many files open) |
Reopening since the workflow is still flaky |
Likely due to the worker being upgraded to Java 11; however, the actual error is not surfaced. A first step would be to make "FnHarness Startup failed" report the underlying exception so that it becomes debuggable. This test has been a troublemaker, as the Spark in-memory runner does dirty operations on the class loader, which have often caused conflicts. Removing the milestone for now until we have bandwidth to look into it.
UPDATE: added a log to surface the error: 76a320e. It reveals the root cause:
This is very similar to #30512 (comment); the same ClassLoader issue now happens in |
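For anyone following along, the "report the underlying exception" step mentioned in the comment above can be sketched roughly as below. This is only a minimal illustration, not the code from the referenced commit: the class `StartupFailureDemo`, the methods `startHarness`, `wrap`, and `describe`, and the simulated failure message are all hypothetical names made up for this example. The only idea shown is keeping the original exception as the cause instead of dropping it.

```java
public class StartupFailureDemo {

    // Hypothetical stand-in for the harness startup step; NOT Beam's actual code.
    // The failure here is simulated purely for illustration.
    public static void startHarness() {
        throw new IllegalStateException("simulated class loader failure");
    }

    // Wrap the failure but pass the original exception as the cause,
    // so the stack trace and message of the real error are preserved.
    public static RuntimeException wrap(Exception underlying) {
        return new RuntimeException("FnHarness Startup failed", underlying);
    }

    // Walk the cause chain and render it on one line for logging.
    public static String describe(Throwable top) {
        StringBuilder sb = new StringBuilder();
        for (Throwable t = top; t != null; t = t.getCause()) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(t.getClass().getSimpleName())
              .append(": ")
              .append(t.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        try {
            startHarness();
        } catch (Exception e) {
            // Log the whole chain rather than only the generic wrapper message.
            System.out.println(describe(wrap(e)));
        }
    }
}
```

With the cause attached, the generic startup message and the underlying error show up together in the log, which is what makes a failure like this debuggable from the CI output alone.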
The PostCommit Java PVR Spark Batch is failing over 50% of the time
Please visit https://github.com/apache/beam/actions/workflows/beam_PostCommit_Java_PVR_Spark_Batch.yml?query=is%3Afailure+branch%3Amaster to see the logs.