With Order Support for Memory Tables #1401
@@ -129,6 +129,9 @@ pub struct CreateTable {
     pub default_charset: Option<String>,
     pub collation: Option<String>,
     pub on_commit: Option<OnCommit>,
+    /// Datafusion "WITH ORDER" clause
+    /// <https://datafusion.apache.org/user-guide/sql/ddl.html#create-external-table/>
+    pub with_order: Vec<Vec<OrderByExpr>>,

Review comment: Can we include test cases demonstrating the behavior introduced by the changes?

     /// ClickHouse "ON CLUSTER" clause:
     /// <https://clickhouse.com/docs/en/sql-reference/distributed-ddl/>
     pub on_cluster: Option<Ident>,
@@ -5589,7 +5589,23 @@ impl<'a> Parser<'a> {
         let clustered_by = self.parse_optional_clustered_by()?;
         let hive_formats = self.parse_hive_formats()?;
         // PostgreSQL supports `WITH ( options )`, before `AS`
-        let with_options = self.parse_options(Keyword::WITH)?;
+        let mut with_options: Vec<SqlOption> = vec![];
+        let mut order_exprs: Vec<Vec<OrderByExpr>> = vec![];
+        if self.parse_keyword(Keyword::WITH) {
+            if self.parse_keyword(Keyword::ORDER) {

Review comment (with a suggested change): Also, I think we can keep the Postgres path separate since this feature doesn't apply to that dialect, so that this part remains the same: `let with_options = self.parse_options(Keyword::WITH)?;` We can make the functionality conditional with e.g.

+                self.expect_token(&Token::LParen)?;
+                loop {
+                    order_exprs.push(vec![self.parse_order_by_expr()?]);
+                    if !self.consume_token(&Token::Comma) {
+                        self.expect_token(&Token::RParen)?;
+                        break;
+                    }
+                }

Review comment: Can the

+            } else {
+                with_options = self.parse_options_in_parentheses()?;
+            }
+        }
+
         let table_properties = self.parse_options(Keyword::TBLPROPERTIES)?;

         let engine = if self.parse_keyword(Keyword::ENGINE) {
@@ -5738,6 +5754,7 @@ impl<'a> Parser<'a> {
             .cluster_by(create_table_config.cluster_by)
             .options(create_table_config.options)
             .primary_key(primary_key)
+            .with_order(order_exprs)
             .strict(strict)
             .build())
     }
@@ -6369,15 +6386,19 @@ impl<'a> Parser<'a> {

     pub fn parse_options(&mut self, keyword: Keyword) -> Result<Vec<SqlOption>, ParserError> {
         if self.parse_keyword(keyword) {
-            self.expect_token(&Token::LParen)?;
-            let options = self.parse_comma_separated(Parser::parse_sql_option)?;
-            self.expect_token(&Token::RParen)?;
-            Ok(options)
+            self.parse_options_in_parentheses()
         } else {
             Ok(vec![])
         }
     }

+    fn parse_options_in_parentheses(&mut self) -> Result<Vec<SqlOption>, ParserError> {
+        self.expect_token(&Token::LParen)?;
+        let options = self.parse_comma_separated(Parser::parse_sql_option)?;
+        self.expect_token(&Token::RParen)?;
+        Ok(options)
+    }
+
     pub fn parse_options_with_keywords(
         &mut self,
         keywords: &[Keyword],
@@ -12364,6 +12385,43 @@ mod tests {
         );
     }

+    #[test]

Review comment: Hmm, I think this would probably not be the ideal place for the test to live. If the feature isn't bound to any dialect, or if it uses a dialect method to guard the feature, we can add the tests to common, e.g. https://github.com/sqlparser-rs/sqlparser-rs/blob/main/tests/sqlparser_common.rs#L4405

+    fn parse_create_table_with_order() {
+        let sql = "CREATE TABLE test (foo INT, bar VARCHAR(256)) WITH ORDER (foo ASC)";
+        let ast = Parser::parse_sql(&GenericDialect {}, sql).unwrap();
+        match ast[0].clone() {
+            Statement::CreateTable(CreateTable { with_order, .. }) => {
+                assert_eq!(
+                    with_order,
+                    vec![vec![OrderByExpr {
+                        expr: Expr::Identifier(Ident::from("foo")),
+                        asc: Some(true),
+                        nulls_first: None,
+                        with_fill: None,
+                    }]]
+                );
+            }
+            _ => unreachable!(),
+        }
+
+        let sql = "CREATE TABLE test (foo INT, bar VARCHAR(256)) WITH ORDER (bar DESC NULLS FIRST)";
+        let ast = Parser::parse_sql(&GenericDialect {}, sql).unwrap();
+        match ast[0].clone() {
+            Statement::CreateTable(CreateTable { with_order, .. }) => {
+                assert_eq!(
+                    with_order,
+                    vec![vec![OrderByExpr {
+                        expr: Expr::Identifier(Ident::from("bar")),
+                        asc: Some(false),
+                        nulls_first: Some(true),
+                        with_fill: None,
+                    }]]
+                );
+            }
+            _ => unreachable!(),
+        }
+    }
+
     #[test]
     fn test_parse_multipart_identifier_positive() {
         let dialect = TestedDialects {
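To make the control flow of the `WITH` branch easier to follow, here is a minimal, standalone sketch that mirrors its shape with a toy tokenizer. It is not the sqlparser-rs API: `Cursor`, `WithClause`, `OrderItem`, and `parse_with` are hypothetical names introduced only for illustration, and the toy handles just bare column names with an optional `ASC`/`DESC`. As in the diff, each ordering expression is pushed as its own one-element group.

```rust
/// Hypothetical stand-in for an ordering expression: (column, asc flag).
type OrderItem = (String, Option<bool>);

/// Toy model of what can follow `WITH`; not a sqlparser-rs type.
#[derive(Debug, PartialEq)]
enum WithClause {
    /// `WITH ORDER (col [ASC|DESC], ...)`
    Order(Vec<Vec<OrderItem>>),
    /// `WITH (key = value, ...)`
    Options(Vec<(String, String)>),
    /// No `WITH` clause at the current position.
    Absent,
}

struct Cursor {
    tokens: Vec<String>,
    pos: usize,
}

impl Cursor {
    /// Split the input on whitespace, treating `(`, `)`, `,` and `=` as tokens.
    fn new(input: &str) -> Self {
        let tokens = input
            .replace('(', " ( ")
            .replace(')', " ) ")
            .replace(',', " , ")
            .replace('=', " = ")
            .split_whitespace()
            .map(str::to_string)
            .collect();
        Cursor { tokens, pos: 0 }
    }

    /// Consume the next token if it matches `expected` (case-insensitive).
    fn consume(&mut self, expected: &str) -> bool {
        match self.tokens.get(self.pos) {
            Some(t) if t.eq_ignore_ascii_case(expected) => {
                self.pos += 1;
                true
            }
            _ => false,
        }
    }

    /// Like `consume`, but a mismatch is an error.
    fn expect(&mut self, expected: &str) -> Result<(), String> {
        if self.consume(expected) {
            Ok(())
        } else {
            Err(format!("expected {:?}, found {:?}", expected, self.tokens.get(self.pos)))
        }
    }

    /// Return the next token unconditionally.
    fn next_token(&mut self) -> Result<String, String> {
        let tok = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        tok.ok_or_else(|| "unexpected end of input".to_string())
    }
}

fn parse_with(input: &str) -> Result<WithClause, String> {
    let mut c = Cursor::new(input);
    if !c.consume("WITH") {
        return Ok(WithClause::Absent);
    }
    if c.consume("ORDER") {
        c.expect("(")?;
        let mut order = Vec::new();
        loop {
            let column = c.next_token()?;
            let asc = if c.consume("ASC") {
                Some(true)
            } else if c.consume("DESC") {
                Some(false)
            } else {
                None
            };
            // One single-element group per expression, as in the diff.
            order.push(vec![(column, asc)]);
            if !c.consume(",") {
                c.expect(")")?;
                break;
            }
        }
        Ok(WithClause::Order(order))
    } else {
        // Fallback path: plain `WITH ( key = value, ... )` options.
        c.expect("(")?;
        let mut options = Vec::new();
        loop {
            let key = c.next_token()?;
            c.expect("=")?;
            options.push((key, c.next_token()?));
            if !c.consume(",") {
                c.expect(")")?;
                break;
            }
        }
        Ok(WithClause::Options(options))
    }
}
```

In this sketch, `parse_with("WITH ORDER (foo ASC, bar DESC)")` produces two groups of one item each, while `parse_with("WITH (format = parquet)")` takes the options path.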
Could we include here a link to the docs where this syntax comes from? (e.g. similar to the on_cluster below)

This was inspired by DataFusion's `WITH ORDER` statement, so I've added the link and tests, thank you!

FWIW I did some research and could not find any existing databases that use the `WITH ORDER` syntax.

ClickHouse does seem to have a way to specify order as part of a `CREATE TABLE` statement: https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree

@mertak-synnada rather than introducing special DataFusion-only syntax support, what do you think about extending DataFusion to use the existing ClickHouse syntax?

We might have to change the `GenericDialect` or add some feature to the `Dialect` trait to permit other dialects to parse such syntax, but it would be nice to align to some other existing syntax rather than creating something new.

I see two possible ways forward:

1. Support `ORDER BY` after `CREATE` and deprecate `WITH ORDER`. This way we would have ClickHouse-like syntax for both memory and external tables, and `WITH ORDER` syntax only for external tables. The latter asymmetry is a little weird, but we can explain it away by deprecating `WITH ORDER`.
2. Support `ORDER BY` and `WITH ORDER` for both memory and external tables in DF (they would be synonyms). We'd need to add `WITH ORDER` for ordinary `CREATE`s here though.

I am OK with both, with a slight preference for 2 because DF has had `WITH ORDER` for a while now. What do you think?
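
To illustrate what treating the two spellings as synonyms could look like, here is a small standalone sketch. It assumes a simplified clause of the form `ORDER BY (a, b)` or `WITH ORDER (a, b)` with single spaces between keywords. `OrderSyntax` and `parse_table_order` are hypothetical names for this example, not DataFusion or sqlparser-rs APIs. Both spellings yield the same column list, and the returned tag lets a caller warn on a deprecated spelling if that route is chosen.

```rust
/// Which spelling introduced the ordering clause (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum OrderSyntax {
    OrderBy,
    WithOrder,
}

/// Parse `ORDER BY (a, b)` or `WITH ORDER (a, b)` into the same column list.
/// Returns `None` when the clause uses neither spelling or is malformed.
fn parse_table_order(clause: &str) -> Option<(OrderSyntax, Vec<String>)> {
    let trimmed = clause.trim();
    // ASCII upper-casing keeps byte offsets aligned with `trimmed`.
    let upper = trimmed.to_ascii_uppercase();
    let (syntax, rest) = if upper.starts_with("WITH ORDER") {
        (OrderSyntax::WithOrder, &trimmed["WITH ORDER".len()..])
    } else if upper.starts_with("ORDER BY") {
        (OrderSyntax::OrderBy, &trimmed["ORDER BY".len()..])
    } else {
        return None;
    };
    // Require a parenthesized, comma-separated column list.
    let inner = rest.trim().strip_prefix('(')?.strip_suffix(')')?;
    let columns: Vec<String> = inner
        .split(',')
        .map(|c| c.trim().to_string())
        .filter(|c| !c.is_empty())
        .collect();
    if columns.is_empty() {
        None
    } else {
        Some((syntax, columns))
    }
}
```

Under these assumptions, `parse_table_order("ORDER BY (foo, bar)")` and `parse_table_order("WITH ORDER (foo, bar)")` differ only in the `OrderSyntax` tag.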

I think in this case we should distinguish between DataFusion's needs and sqlparser-rs's needs.

If we are going to introduce `WITH ORDER` to sqlparser-rs, I think it should follow the existing pattern of being connected to a specific dialect (in this case `DatafusionDialect`).

Thus I would personally vote for option 1 (change DF) so we avoid creating a new SQL dialect (both in this crate and in general) as much as possible.
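
As a rough sketch of what "connected to a specific dialect" can mean in code, the example below gates the `WITH ORDER` path behind a capability method on a toy trait. Everything here is an assumption made for illustration: `ToyDialect`, `supports_with_order`, `ToyPostgres`, `ToyDataFusion`, and `classify_with` are hypothetical and are not the real sqlparser-rs `Dialect` trait or its methods.

```rust
/// Toy dialect trait; the capability flag defaults to "not supported".
trait ToyDialect {
    /// Hypothetical guard for the `WITH ORDER (...)` clause.
    fn supports_with_order(&self) -> bool {
        false
    }
}

struct ToyPostgres;
struct ToyDataFusion;

// Keeps the default: `WITH` may only introduce `( options )`.
impl ToyDialect for ToyPostgres {}

impl ToyDialect for ToyDataFusion {
    fn supports_with_order(&self) -> bool {
        true
    }
}

#[derive(Debug, PartialEq)]
enum WithKind {
    Order,
    Options,
}

/// Decide which `WITH` form the token stream starts with for a given dialect.
fn classify_with(dialect: &dyn ToyDialect, tokens: &[&str]) -> Result<WithKind, String> {
    match tokens {
        // Only dialects that opt in take the `WITH ORDER` path.
        ["WITH", "ORDER", ..] if dialect.supports_with_order() => Ok(WithKind::Order),
        // Every dialect keeps the existing `WITH ( options )` path.
        ["WITH", "(", ..] => Ok(WithKind::Options),
        ["WITH", other, ..] => Err(format!("expected '(' after WITH, found {other}")),
        ["WITH"] => Err("unexpected end of input after WITH".to_string()),
        _ => Err("expected WITH".to_string()),
    }
}
```

With this shape, a dialect that does not opt in rejects `WITH ORDER` with an ordinary parse error, and the existing options path stays untouched.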
Makes sense - let's do it