Downgrade Scala version in Spark job-server to prevent Scala serialization bug #23522

Merged


mosche
Member

@mosche mosche commented Oct 6, 2022

Downgrade the Scala version of the job-server to match the Scala version of a Spark 3.1.2 cluster. This prevents a Scala serialization bug (InvalidClassException when deserializing WrappedArray).

(fixes #21092)
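
One way to confirm this kind of mismatch is to compare the serialVersionUID that each classpath (job-server vs. cluster) assigns to the affected class. The following is a minimal diagnostic sketch, not part of this change: the class name `SerialUidProbe` and its helpers are hypothetical, and it uses only the JDK's `java.io.ObjectStreamClass`. Running it on both sides with the same class name (for example a WrappedArray implementation class, whose exact name is an assumption here) should print different IDs if the Scala versions differ in a serialization-incompatible way.

```java
import java.io.ObjectStreamClass;
import java.security.CodeSource;

// Hypothetical helper: reports how the current classpath sees a serializable class.
final class SerialUidProbe {

  /** Returns the serialVersionUID of the named class, or null if it is not serializable. */
  static Long serialVersionUidOf(String className) throws ClassNotFoundException {
    Class<?> cls = Class.forName(className);
    ObjectStreamClass descriptor = ObjectStreamClass.lookup(cls);
    return descriptor == null ? null : descriptor.getSerialVersionUID();
  }

  /** Returns the jar or directory the class was loaded from, if the JVM exposes it. */
  static String locationOf(String className) throws ClassNotFoundException {
    CodeSource source = Class.forName(className).getProtectionDomain().getCodeSource();
    return source == null ? "<no code source>" : String.valueOf(source.getLocation());
  }

  public static void main(String[] args) throws Exception {
    // Pass the class to inspect as the first argument; defaults to a JDK class.
    String name = args.length > 0 ? args[0] : "java.util.ArrayList";
    System.out.println(
        name + " serialVersionUID=" + serialVersionUidOf(name) + " loaded from " + locationOf(name));
  }
}
```

If the two environments report different values for the same class, Java deserialization rejects the stream with an InvalidClassException, which matches the symptom described above.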



@mosche
Member Author

mosche commented Oct 6, 2022

R: @aromanenko-dev
R: @ibzib

@github-actions
Contributor

github-actions bot commented Oct 6, 2022

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@mosche
Member Author

mosche commented Oct 11, 2022

@aromanenko-dev ping, should be quick to review ... just a small change to force the right Scala version

@aromanenko-dev
Contributor

Run Python Spark ValidatesRunner

@mosche
Member Author

mosche commented Oct 11, 2022

@aromanenko-dev Unfortunately there are no tests that would cover this ... such classpath issues can only be detected when NOT running Spark in local mode.

I just noticed that the Spark 2 job-server / runner is broken as well, what a mess 😭 😡

java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.type.TypeBindings.emptyBindings()Lcom/fasterxml/jackson/databind/type/TypeBindings;
	at org.apache.beam.sdk.options.PipelineOptionsFactory.createBeanProperty(PipelineOptionsFactory.java:1708)
	at org.apache.beam.sdk.options.PipelineOptionsFactory.computeDeserializerForMethod(PipelineOptionsFactory.java:1732)
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
	at org.apache.beam.sdk.options.PipelineOptionsFactory.getDeserializerForMethod(PipelineOptionsFactory.java:1782)
	at org.apache.beam.sdk.options.PipelineOptionsFactory.deserializeNode(PipelineOptionsFactory.java:1806)

That issue is even worse: I'm pretty convinced it doesn't only affect the portable runner. Attempting to deserialize pipeline options will fail on the cluster with either PortableRunner or SparkRunner, because the expected Jackson API is newer than the one available on the Spark cluster.
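
A NoSuchMethodError like the one above means the class was found but an older version of it was loaded. As a rough sketch (the class `ApiProbe` and its method are hypothetical, and only JDK reflection is used), a small probe run on the cluster classpath can check whether the method named in the stack trace is actually present there:

```java
// Hypothetical helper: checks whether a public method exists on the current classpath.
final class ApiProbe {

  /** Returns true if the class can be loaded and exposes a public method with this signature. */
  static boolean hasPublicMethod(String className, String methodName, Class<?>... parameterTypes) {
    try {
      Class.forName(className).getMethod(methodName, parameterTypes);
      return true;
    } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
      // Missing class, missing method, or a class that fails to link all count as "not available".
      return false;
    }
  }

  public static void main(String[] args) {
    // Class and method names are taken from the stack trace above.
    String cls = "com.fasterxml.jackson.databind.type.TypeBindings";
    System.out.println(cls + ".emptyBindings() available: " + hasPublicMethod(cls, "emptyBindings"));
  }
}
```

A result of false on the cluster but true in the job-server would be consistent with the cluster shipping an older Jackson than the one the pipeline options code was compiled against.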

@aromanenko-dev
Contributor

@mosche Do we still support the Spark 2 runner? I believe it was deprecated and will eventually be dropped.

@mosche
Member Author

mosche commented Oct 11, 2022

Yes, we deprecated it ... but apparently it has been unusable for a long time already.

Contributor

@aromanenko-dev aromanenko-dev left a comment


LGTM, sorry for the delay

@aromanenko-dev aromanenko-dev merged commit d2b0a26 into apache:master Oct 11, 2022
@mosche mosche deleted the 21092_fix_spark_job_server_classpath branch October 11, 2022 08:45
Development

Successfully merging this pull request may close these issues.

java.io.InvalidClassException with Spark 3.1.2