fix: unparser generates wrong sql for derived table with columns #11505

Merged


@y-f-u y-f-u commented Jul 17, 2024

Which issue does this PR close?

The unparser generates invalid SQL for a LogicalPlan built from a query that uses a derived table with column aliases.

Rationale for this change

See "Which issue does this PR close?" above and the comment below.

Copied from the code comments:

// A roundtrip example for table alias with columns
//
// query: SELECT id FROM (SELECT j1_id from j1) AS c (id)
//
// LogicalPlan:
// Projection: c.id
//   SubqueryAlias: c
//     Projection: j1.j1_id AS id
//       Projection: j1.j1_id
//         TableScan: j1
//
// Before introducing this logic, the unparsed query would be `SELECT c.id FROM (SELECT j1.j1_id AS
// id FROM (SELECT j1.j1_id FROM j1)) AS c`.
// The query is invalid as `j1.j1_id` is not a valid identifier in the derived table
// `(SELECT j1.j1_id FROM j1)`
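
For comparison, a valid rendering keeps the column list on the table alias, as the original query does. Going by the description of this change, the fixed output would presumably look like `SELECT c.id FROM (SELECT j1.j1_id FROM j1) AS c (id)`. The sketch below only formats strings to show that shape; `derived_table_sql` is a hypothetical helper, not a DataFusion function.

```rust
/// Render a derived table with an optional column list on its alias.
/// Hypothetical helper for illustration only: it does no identifier
/// quoting or validation.
fn derived_table_sql(subquery: &str, alias: &str, columns: &[&str]) -> String {
    if columns.is_empty() {
        // No column aliases: plain `(<subquery>) AS <alias>`.
        format!("({subquery}) AS {alias}")
    } else {
        // Column aliases live on the table alias: `AS <alias> (<c1>, <c2>)`.
        format!("({subquery}) AS {alias} ({})", columns.join(", "))
    }
}
```

Calling `derived_table_sql("SELECT j1.j1_id FROM j1", "c", &["id"])` yields the `FROM` clause body of the query above, with `id` attached to `c` rather than introduced by a second nested projection.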

What changes are included in this PR?

Introduce a method that probes the SubqueryAlias plan to see whether it contains two stacked projections. If so, extract the column aliases from the outer projection into the table alias and use the inner projection to construct the relation.
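
The probe can be pictured on a toy plan type. This is a minimal sketch, not the code in this PR: the `Plan` and `Expr` enums and the function `derived_table_columns` are hypothetical stand-ins for DataFusion's `LogicalPlan`, `Expr`, and the new method. It assumes the outer projection consists only of aliases over the inner projection's expressions, in the same order.

```rust
// Toy stand-ins for DataFusion's expression and plan types
// (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Alias(Box<Expr>, String),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    TableScan(String),
    Projection(Vec<Expr>, Box<Plan>),
    SubqueryAlias(String, Box<Plan>),
}

/// If `plan` has the shape
/// `SubqueryAlias -> Projection(aliases) -> Projection(exprs)`,
/// return the inner projection together with the alias names that
/// should move onto the table alias. Otherwise return `None`.
fn derived_table_columns(plan: &Plan) -> Option<(&Plan, Vec<String>)> {
    let Plan::SubqueryAlias(_, input) = plan else { return None };
    let Plan::Projection(outer_exprs, inner) = &**input else { return None };
    let Plan::Projection(inner_exprs, _) = &**inner else { return None };

    // Both layers must project the same number of expressions.
    if outer_exprs.len() != inner_exprs.len() {
        return None;
    }

    let mut names = Vec::with_capacity(outer_exprs.len());
    for (outer, inner_expr) in outer_exprs.iter().zip(inner_exprs) {
        match outer {
            // Each outer expression must alias exactly the matching inner one.
            Expr::Alias(aliased, name) if **aliased == *inner_expr => {
                names.push(name.clone());
            }
            _ => return None,
        }
    }
    Some((&**inner, names))
}
```

For the plan in the example above, this sketch would return the inner `Projection: j1.j1_id` and the column list `["id"]`; any other plan shape falls through to `None`, leaving the existing unparsing path unchanged.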

Are these changes tested?

Yes

Are there any user-facing changes?

No

* fix unparser for derived table with columns

* refactoring

* renaming

* case in tests
@github-actions github-actions bot added the sql SQL Planner label Jul 17, 2024

@alamb alamb left a comment


Thank you @y-f-u -- this seems like an improvement to me.

I left a suggestion for an additional test, but I think we could also do that as a follow-on.

cc @goldmedal

//
// Caveat: this won't handle a case like `select * from (select 1, 2) AS a (b, c)`,
// as the parser produces a wrong plan in which `Int(1)` has mismatched types, Literal and
// Column, in the Projections. Once the parser side is fixed, this logic should work

Thank you for the comments

// more tests around subquery/derived table roundtrip
TestStatementWithDialect {
sql: "SELECT string_count FROM (
SELECT

I recommend adding another test query that selects more than one column and includes an expression,

something like

select string_count * 100, id FROM ...


Thanks for the suggestion. This query does not actually work: a "CAST" appears in the inner projection, which breaks the projection matching.

Maybe another PR can fix this case.


It might also be worth filing an issue to track the defect


alamb commented Jul 19, 2024

Thanks again @y-f-u

@alamb alamb merged commit f195352 into apache:main Jul 19, 2024
23 checks passed
@phillipleblanc phillipleblanc deleted the fix-unparser-for-derived-table-with-columns branch July 22, 2024 00:35
Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Jul 23, 2024
…che#17) (apache#11505)

wiedld pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 31, 2024
apache#11505)
