For installation and running instructions, see Get started.
Main changes
SQL features
- Query syntax:
  - Technical preview: Supports `ASOF JOIN` to join and find the closest matching record by the event time or another ordered property. #18683
  - Supports `AGGREGATE:` prefixed scalar functions in streaming aggregation. #18205
  - Supports using user-defined aggregate functions as window functions. #18181
  - Supports blocking subscription cursors and configuring cursor timeouts. #18675
- SQL commands:
  - Enhances observability of cursors and subscription cursors by improving the output of the `SHOW SUBSCRIPTION CURSORS` and `SHOW CURSORS` commands. #18896
- SQL functions & operators:
  - Technical preview: Supports the table-valued function (TVF) `postgres_query`. #18811
- System catalog:
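As a rough illustration of the `ASOF JOIN` and `postgres_query` items in the SQL features list above, the sketch below uses hypothetical tables (`trades`, `quotes`), a hypothetical materialized view name, and placeholder connection values. It assumes that `ASOF JOIN` takes one equality condition plus one inequality condition, that it is used here in a streaming job, and that `postgres_query` takes its arguments in the order host, port, user, password, database, query. Check the feature documentation before relying on either form.

```sql
-- Hypothetical tables for illustration only; names and columns are assumptions.
CREATE TABLE trades (symbol varchar, trade_time timestamp, price numeric);
CREATE TABLE quotes (symbol varchar, quote_time timestamp, bid numeric);

-- For each trade, keep the closest quote at or before the trade time.
-- Assumes one equality condition (symbol) and one inequality condition (time).
CREATE MATERIALIZED VIEW trade_with_quote AS
SELECT t.symbol, t.trade_time, t.price, q.bid
FROM trades t
ASOF JOIN quotes q
ON t.symbol = q.symbol AND t.trade_time >= q.quote_time;

-- Run a query against an external PostgreSQL database through the TVF.
-- All connection values are placeholders; the argument order is an assumption.
SELECT *
FROM postgres_query(
    'localhost',            -- hostname
    '5432',                 -- port
    'postgres',             -- username
    'password',             -- password
    'mydb',                 -- database name
    'SELECT * FROM users;'  -- query executed on the PostgreSQL side
);
```

Both features are marked as technical preview in this release, so their syntax and behavior may change.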
Connectors
- Breaking change: Changes `scan.startup.mode=latest` for the NATS source connector to start consuming from the next available message instead of the last one. #18733
- Public preview: Supports shared Kafka sources, which can be disabled by the session variable `streaming_use_shared_source`. #18749
- Supports recursively scanning file sources. #18324
- Supports schemaless ingestion for data in JSON format from Kafka sources by using the `INCLUDE payload` clause. #18437
- Adds a set of options for the NATS source connector based on the async_nats crate. #17615
- Adds a required option, `consumer.durable_name`, for the NATS source connector. #18873
- Supports option `max_packet_size` for MQTT sources. #18520
- Supports option `database.encrypt` for SQL Server CDC sources. #18912
- Supports ingesting data from a partitioned table for PostgreSQL CDC sources. #18456
- Supports option `auto.schema.change` for PostgreSQL CDC sources to enable replicating Postgres table schema changes. #18760
- Requires the upstream table name to also be prefixed with the database name when creating a SQL Server CDC table. #18868
- Adds `JSON` encode for file sinks, allowing users to sink JSON files into object storage. #18744
- Supports the `create_table_if_not_exists` option for the Iceberg sink connector. #18362
- Supports WebHDFS sinks. #18293
- Removes option `bulk_write_max_entries` for MongoDB sink and option `default_max_batch_rows` for DynamoDB sink. Adds options `max_batch_item_nums` and `max_future_send_nums` for DynamoDB sink. #17645
- Sets sink decoupling as the default policy for MongoDB, DynamoDB, and Redis sink connectors. #17645
- Supports option `routing_column` for ElasticSearch sinks, allowing a column to be set as a routing key. #18698
- Supports specifying batching strategy when sinking data in Parquet format. #18472
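To make the shared Kafka source and `INCLUDE payload` items above more concrete, here is a minimal sketch. The table name, declared column, topic, and broker address are all hypothetical, and the `WITH` options follow the usual Kafka connector form rather than anything stated in these notes.

```sql
-- Opt out of shared Kafka sources for the current session
-- (the session variable named in the notes above).
SET streaming_use_shared_source = false;

-- Hypothetical table: topic, broker address, and column are placeholders.
-- INCLUDE payload asks the connector to expose the whole JSON message
-- alongside any declared columns, so the full schema need not be known up front.
CREATE TABLE events_raw (id int)
INCLUDE payload
WITH (
    connector = 'kafka',
    topic = 'events',
    properties.bootstrap.server = 'localhost:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```

Keeping one declared column next to the payload is a choice made for this sketch; whether a table can be created with the payload column alone is not covered by these notes.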
Installation and deployment
- Adds a CLI argument, `--license-key-path`, for the meta node, enabling a background task to watch and reload the license key from the specified file. #18768
Cluster configuration changes
- When `visibility_mode` is set to `all`, the latest uncommitted data will be queried, but consistency is no longer guaranteed between the tables. #18230
- Supports `SET TIME ZONE INTERVAL '+00:00' HOUR TO MINUTE` as equivalent to `SET TIME ZONE UTC`. #18705
- The etcd metastore is fully deprecated and unsupported. Users previously utilizing etcd metastore must manually migrate to a SQL backend (PostgreSQL, MySQL, or SQLite) to upgrade to v2.1.0.
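The two session-level items above can be tried from any SQL client. This is a minimal sketch that assumes `visibility_mode` is set with an ordinary `SET` statement, like other session variables.

```sql
-- These two statements are described above as equivalent.
SET TIME ZONE INTERVAL '+00:00' HOUR TO MINUTE;
SET TIME ZONE UTC;

-- Inspect the resulting session time zone.
SHOW timezone;

-- Read the latest uncommitted data in this session.
-- Per the note above, consistency between tables is not guaranteed in this mode.
SET visibility_mode = 'all';
```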
Full Changelog: v2.0.4...v2.1.0