Implement Calcite native library that parses sql and returns a binary substrait representation #21
Comments
In the DataFusion CLI we chose to support the Hive
Some updates here. I've made some progress and will post a WIP PR soon (hopefully). It is proving non-trivial to get Calcite to compile within a GraalVM native image. I've opened CALCITE-4786 to track work that should make that easier. The good news is that when I've gotten things to work, I've seen sub-millisecond SQL > rel > Substrait conversions for very simple plans.
Can this be closed since we have |
Moved to substrait-io/substrait-java. Closing here.
@cpcloud, @jacques-n: Could you elaborate a bit on what substrait-java does and how it addresses the use cases defined in CALCITE-4786? (I'd love to have a well-designed SQL parser for standard SQL, like Calcite, but without requiring a JVM on the user's side, which Calcite only avoids through a native image. I was very hopeful at the beginning of this issue but don't understand the connection to substrait-java.)
The idea here is to provide a reasonable way for people to give users immediate access to Calcite's high-quality SQL parsing with minimal effort. We'd use GraalVM for AOT compilation and start with a fairly simple function similar to:
substrait parse(string)
It would be nice if part of this effort was to expose this library through a command-line tool whose output could be piped to other future tools. (For example, an additional CLI could take a plan and return the results using DataFusion.)
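To make the piping idea concrete, here is a minimal sketch of what such a tool's shell could look like: read SQL from stdin, hand it to a parse function, and write the raw plan bytes to stdout. Everything named here is an assumption for illustration. `SqlToPlanCli`, `run`, and `placeholderParse` are hypothetical, and `placeholderParse` does not call Calcite or produce Substrait; it only echoes the trimmed SQL as bytes so that the plumbing can be exercised on its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class SqlToPlanCli {

    // Hypothetical stand-in for the proposed `substrait parse(string)` function.
    // A real implementation would parse with Calcite and serialize a Substrait
    // plan; this placeholder just returns the trimmed SQL as UTF-8 bytes.
    static byte[] placeholderParse(String sql) {
        return sql.trim().getBytes(StandardCharsets.UTF_8);
    }

    // Reads all SQL text from `in`, converts it with `parser`, and writes the
    // resulting bytes to `out` unmodified so a downstream tool can consume them.
    static void run(InputStream in, OutputStream out, Function<String, byte[]> parser)
            throws IOException {
        String sql = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        out.write(parser.apply(sql));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        run(System.in, System.out, SqlToPlanCli::placeholderParse);
    }
}
```

Keeping the parser behind a `Function<String, byte[]>` means the stdin/stdout wrapper would stay the same whether the conversion runs on a JVM or inside a native image; only the function passed to `run` would change.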
A big question is what catalog to expose in an example CLI. Some ideas:
A second fun thing to add would be a separate library that takes a plan in, returns a plan out, and applies a list of optimization rules using one of the existing Calcite optimizers. This is lower priority than the SQL parser initially, but it could be interesting for evaluating different optimization patterns and for starting to expose nice Calcite interfaces to languages like Python, Rust, etc.