-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EPIC] Support LogicalPlan --> SQL String
#8661
Comments
Hi @tempbottle -- no sadly there is no way that I know if to go from logical plan --> SQL I think it would be possible in theory, though probably a non trivial amount of code A required step would be to go from Expr --> SQL expression, which might be a good place to start |
I looked into this myself recently and also concluded that DataFusion has no way currently to recover the SQL strings that generated a LogicalPlan or parts of it (Exprs). Likewise, I don't think sqlparser-rs has any mechanism to go from AST back to raw SQL. Either case would be non trivial to implement, and raises the question of what SQL dialect you want to support. I'm not sure what your use case is @tempbottle, but in my case I am looking at implementing TableProviders over top of other engines which accept SQL. For this use case, it may be more fruitful to look at https://substrait.io/. DataFusion does have the capability to convert a LogicalPlan to a Substrait plan. If the engine of interest does not consume Substrait, you are probably stuck writing custom glue code to map a LogicalPlan or Exprs into whatever SQL dialect or DSL that the target engine supports. I'll also point out that GlareDb has some examples of mapping Exprs to SQL strings for the purposes of implementing a TableProvider. See e.g. https://github.com/GlareDB/glaredb/blob/ee4b440d9a594150926ce0a5fd96db400185c881/crates/datasources/src/postgres/mod.rs#L1242 |
I am pretty sure the I agree substrait.io is a good idea for connecting other databases. |
@tempbottle @devinjdangelo FYI Ibis has a to_sql method that can be used to generate SQL and DataFusion is a backend for Ibis however I'm not sure if it has all the features required for |
That's very interesting i'll have to look more closely at that. Thanks @alamb ! |
SQL String
FYI I filed #9495 for the Expr --> String feature. That would likely be a necesary first step for a full LogicalPlan --> SQL conversion |
I split out datafusion-federation's SQL Writer code into its own package and added an example for expressions and plans. I intend to mature it over time. It's open to contributions and/or moving to a more canonical place. |
For anyone else following along, there is substantial discussion on #9495 (comment) worth reading |
Does it make sense for me to lay the same 'groundwork' for the
Any early thoughts or feedback on the datafusion-federation code sql-write code is appreciated, so I can address it in the process. |
I am in support of this. We should be cautious about what we expose via public interfaces at first, since I think it is likely we will require many breaking changes to support all possible logical plans. The top level
I am a bit surprised there isn't an ergonomic way to build an AST already in If the AST builder functionality is not particularly useful in |
I agree with @devinjdangelo that I think it does.
I am not sure, to be honest. I think if we upstreamed a AST builder style thing in sqlparser, we would also want to migrate the existing parser to use it (and therefore take advantage of the existing tests). I think as long as it were backwards compatible it would make a nice contribution. However, I would also say there is no compelling need to upstream it either. Having it start life in DataFusion would also be fine and then maybe upstream it in sqlparser |
Independently I have spent a fair bit of time on this problem. I think this current design of going to the sqlparser AST is a better approach than trying to go to SQL directly. One thing that really helped was that I went and copied ~150ish queries from https://www.w3resource.com/sql-exercises/adventureworks/adventureworks-exercises.php to do a roundtrip test against. These are licensed https://creativecommons.org/licenses/by-nc-sa/3.0/deed.en so could be added to the test suite. As you start to add more queries you start to see a lot of edge cases (particularly relating to how aggregations work) that need to be dealt with. |
SQL String
SQL String
I think we can claim this is complete with all the great work from @devinjdangelo @backkem @goldmedal @phillipleblanc and others. Thank you |
Part of #9494
Is your feature request related to a problem or challenge?
No response
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response
Sub tasks
Using
constraint to SQL String #10652LogicalPlan::Distinct
toDISTINCT
#10663LogicalPlan::Window
to SQL #10664The text was updated successfully, but these errors were encountered: