Fix subquery alias table definition unparsing for SQLite #12331

Merged
merged 6 commits into from
Sep 6, 2024


@sgrebnov sgrebnov commented Sep 5, 2024

Which issue does this PR close?

SQLite does not support defining column aliases as part of a table alias definition. For example, the following query:

SELECT "customers2"."c_name2" FROM
(
	SELECT "customer"."c_name", "customer"."c_address"  FROM "customer"
) AS "customers2"("c_name2", "c_address2")

It must be rewritten as follows so that it can be executed on SQLite:

SELECT "customers2"."c_name2" FROM
(
	SELECT "customer"."c_name" as "c_name2", "customer"."c_address" as "c_address2"  FROM "customer"
) AS "customers2"

This PR addresses the issue by adding an unparser option that emits column aliases in the inner query's projection instead of in the outer table alias definition.
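The difference between the two queries above can be checked directly against SQLite. The following is a minimal sketch using Python's standard `sqlite3` module; the single-row `customer` table and the `try_query` helper are assumptions made for illustration and are not part of this PR.

```python
import sqlite3

# In-memory database with a minimal stand-in for the "customer" table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "customer" ("c_name" TEXT, "c_address" TEXT)')
conn.execute("""INSERT INTO "customer" VALUES ('Alice', '1 Main St')""")

# Column aliases declared on the table alias: SQLite rejects this form.
with_table_alias_columns = '''
SELECT "customers2"."c_name2" FROM
(
    SELECT "customer"."c_name", "customer"."c_address" FROM "customer"
) AS "customers2"("c_name2", "c_address2")
'''

# Column aliases moved into the inner projection: SQLite accepts this form.
with_projection_aliases = '''
SELECT "customers2"."c_name2" FROM
(
    SELECT "customer"."c_name" AS "c_name2", "customer"."c_address" AS "c_address2" FROM "customer"
) AS "customers2"
'''


def try_query(sql):
    """Hypothetical helper: return (rows, None) on success or (None, message) on failure."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


first_rows, first_error = try_query(with_table_alias_columns)
second_rows, second_error = try_query(with_projection_aliases)
```

Under these assumptions, the first query fails with an SQLite error while the second returns the aliased column.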

TPC-H Q13 is an example of a query that uses this kind of aliasing.

select
    c_count,
    count(*) as custdist
from
    (
        select
            c_custkey,
            count(o_orderkey)
        from
            customer left outer join orders on
                        c_custkey = o_custkey
                    and o_comment not like '%special%requests%'
        group by
            c_custkey
    ) as c_orders (c_custkey, c_count)
group by
    c_count
order by
    custdist desc,
    c_count desc;
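As an illustration, Q13 can be written with the column aliases moved into the subquery's projection and run on SQLite. The sketch below uses Python's standard `sqlite3` module; the reduced table schemas and the sample rows are assumptions invented for the example, and the rewritten query is hand-written rather than actual unparser output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, heavily reduced versions of the TPC-H tables with made-up rows.
conn.executescript("""
CREATE TABLE customer (c_custkey INTEGER);
CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER, o_comment TEXT);
INSERT INTO customer VALUES (1), (2), (3);
INSERT INTO orders VALUES
    (10, 1, 'regular order'),
    (11, 1, 'regular order'),
    (12, 2, 'special packages requests');
""")

# Q13 with the (c_custkey, c_count) aliases applied inside the subquery
# instead of on the c_orders table alias.
q13_for_sqlite = """
select
    c_count,
    count(*) as custdist
from
    (
        select
            c_custkey as c_custkey,
            count(o_orderkey) as c_count
        from
            customer left outer join orders on
                        c_custkey = o_custkey
                    and o_comment not like '%special%requests%'
        group by
            c_custkey
    ) as c_orders
group by
    c_count
order by
    custdist desc,
    c_count desc
"""

rows = conn.execute(q13_for_sqlite).fetchall()
```

With this sample data, customer 1 has two qualifying orders and customers 2 and 3 have none, so `rows` holds one group for `c_count = 0` and one for `c_count = 2`.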

Rationale for this change

This PR introduces a supports_column_alias_in_table_alias dialect parameter to control the behavior and adds logic to produce valid SQL for the SQLite dialect.

The rewrite is applied only when column aliases are provided in the table alias definition and supports_column_alias_in_table_alias is set to false (not supported).
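The condition described above can be sketched as a small string-building function. This is a toy model in Python, not the DataFusion unparser: `subquery_sql` and its parameters are hypothetical names, only the flag name `supports_column_alias_in_table_alias` comes from this PR, and the length check is an assumption about how mismatched alias lists should be handled.

```python
def subquery_sql(inner_columns, inner_from, table_alias, column_aliases,
                 supports_column_alias_in_table_alias):
    """Build SQL for an aliased subquery (hypothetical helper, not DataFusion's API)."""
    if not column_aliases or supports_column_alias_in_table_alias:
        # No rewrite needed: keep the column list (if any) on the table alias.
        projection = ", ".join(inner_columns)
        alias = table_alias
        if column_aliases:
            alias += "(" + ", ".join(column_aliases) + ")"
        return f"(SELECT {projection} FROM {inner_from}) AS {alias}"

    # Assumption: each alias pairs positionally with one projected column.
    if len(inner_columns) != len(column_aliases):
        raise ValueError("column alias count must match projection length")

    # Rewrite: push each alias into the inner projection, drop it from the table alias.
    projection = ", ".join(
        f"{col} AS {name}" for col, name in zip(inner_columns, column_aliases)
    )
    return f"(SELECT {projection} FROM {inner_from}) AS {table_alias}"


cols = ['"customer"."c_name"', '"customer"."c_address"']
aliases = ['"c_name2"', '"c_address2"']

default_sql = subquery_sql(cols, '"customer"', '"customers2"', aliases, True)
sqlite_sql = subquery_sql(cols, '"customer"', '"customers2"', aliases, False)
```

Here `default_sql` keeps the aliases on `"customers2"(...)`, while `sqlite_sql` carries them as `AS` clauses inside the inner `SELECT`.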

What changes are included in this PR?

See above

Are these changes tested?

Tested manually and added unit tests.

Are there any user-facing changes?

This PR introduces a new dialect parameter, supports_column_alias_in_table_alias, which controls whether column aliases are emitted as part of the table alias definition.

@github-actions github-actions bot added the sql SQL Planner label Sep 5, 2024
@sgrebnov sgrebnov force-pushed the sgrebnov/sqlite-subquery-alias-fix branch from 4f5d93b to fb173a1 Compare September 5, 2024 05:45
@sgrebnov sgrebnov force-pushed the sgrebnov/sqlite-subquery-alias-fix branch from fb173a1 to 0dd2e68 Compare September 5, 2024 05:49

@alamb alamb left a comment

Thank you @sgrebnov and @phillipleblanc -- this looks great to me

@alamb alamb merged commit cad4146 into apache:main Sep 6, 2024
24 checks passed
@phillipleblanc phillipleblanc deleted the sgrebnov/sqlite-subquery-alias-fix branch September 7, 2024 07:53