Tracking: jsonb operations #7714

Closed
23 tasks done
xiangjinwu opened this issue Feb 6, 2023 · 5 comments
Labels: help wanted, type/feature

Comments


xiangjinwu commented Feb 6, 2023

Listing all jsonb operations from PostgreSQL, and which ones to implement now, soon, or defer until requested.

Basic

  • TEXT format input / output (same as cast from / to string)
  • BINARY format input / output

General for all types:

  • IS [NOT] NULL / CASE WHEN / COALESCE / etc

Eq / Hash / Ord

  • There is no plan to support these operations right now.
  • Also includes IS [NOT] DISTINCT / NULLIF / IN / etc

Cast to simple types

  • bool
  • smallint / int / bigint / decimal / real / double precision
  • There is no cast from these types to jsonb; use the function to_jsonb instead.

Accessing array / object

Informative / Debugging

Predicate

Construction / Mutation

Subscripting

  • Avoid j[index] / j[key] and use #> / #>> / -> / ->> instead.

Aggregation

Table Function (aka Set Returning Function)

Advanced

jsonpath type

Additional Details

While SQL arrays are 1-indexed, JSON arrays accessed from SQL are still 0-indexed.
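
To make the indexing difference concrete, here is a minimal Python sketch. It only models the behavior and is not RisingWave code: `sql_array_get` and `jsonb_array_get` are hypothetical helper names, and the sketch assumes out-of-range access yields NULL (modeled as `None`) and ignores negative indices.

```python
import json


def sql_array_get(arr, i):
    """Model of SQL `arr[i]`: 1-indexed, out of range yields None (NULL)."""
    return arr[i - 1] if 1 <= i <= len(arr) else None


def jsonb_array_get(doc, i):
    """Model of `doc -> i` on a JSON array: 0-indexed, out of range yields None."""
    if not isinstance(doc, list):
        return None
    return doc[i] if 0 <= i < len(doc) else None


# The same three elements, once as a SQL array and once as a JSON array.
sql_arr = ["a", "b", "c"]
json_arr = json.loads('["a", "b", "c"]')

print(sql_array_get(sql_arr, 1))     # first element of the SQL array
print(jsonb_array_get(json_arr, 1))  # second element of the JSON array
```

The same index `1` therefore picks different elements depending on whether the value is a SQL array or a JSON array.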

Our goal is to ingest JSON data and then extract strongly-typed columns from it. So operations for constructing, mutating, or comparing JSON are not of interest right now.

Dedicated string member access operator vs Cast to string

with t(v1) as (values (null::jsonb), ('null'), ('true'), ('1'), ('"a"'), ('[]'), ('{}')),
     j(v1) as (select ('{"k":' || v1::varchar || '}')::jsonb from t)
select
    v1 ->> 'k',
    (v1 -> 'k')::varchar,
    jsonb_typeof(v1 -> 'k')
from j order by 2;
----
a "a" string
1 1 number
[] [] array
NULL null null
true true boolean
{} {} object
NULL NULL NULL
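
The three columns above can be reproduced with a small Python model. This is only a sketch of the semantics shown in the query result, not the actual implementation: `arrow_text`, `arrow_cast`, and `typeof_member` are hypothetical names standing in for `->>`, `(... -> ...)::varchar`, and `jsonb_typeof(... -> ...)`, and SQL NULL is modeled as `None`.

```python
import json


def arrow_text(doc, key):
    """Model of `doc ->> key`: strings lose their quotes, JSON null becomes SQL NULL."""
    if not isinstance(doc, dict) or key not in doc:
        return None
    value = doc[key]
    if value is None:
        return None
    return value if isinstance(value, str) else json.dumps(value)


def arrow_cast(doc, key):
    """Model of `(doc -> key)::varchar`: the JSON text of the member, quotes kept."""
    if not isinstance(doc, dict) or key not in doc:
        return None
    return json.dumps(doc[key])


def typeof_member(doc, key):
    """Model of `jsonb_typeof(doc -> key)`."""
    if not isinstance(doc, dict) or key not in doc:
        return None
    value = doc[key]
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


# Same inputs as the query; the SQL NULL input makes the whole document NULL.
texts = ["null", "true", "1", '"a"', "[]", "{}"]
docs = [json.loads('{"k":' + t + "}") for t in texts] + [None]

for doc in docs:
    print(arrow_text(doc, "k"), arrow_cast(doc, "k"), typeof_member(doc, "k"))
```

The two access paths differ only for strings (quotes dropped vs kept) and for JSON null (SQL NULL vs the text `null`).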

Operations on object keys also work for array string elements
(TODO)

xiangjinwu self-assigned this Feb 6, 2023
github-actions bot added this to the release-0.1.17 milestone Feb 6, 2023
mergify bot pushed a commit that referenced this issue Feb 16, 2023
…7977)

Part of the `jsonb` type support (preview all on [this branch](https://github.com/risingwavelabs/risingwave/compare/XJ-jsonb-WIP-2?expand=1)):
* Introduce `Scalar` and `ScalarRef` **(this PR)**
* Introduce `ArrayBuilder` and `Array`
* Introduce `DataType::Jsonb`
* Add more expressions #7714

`serde_json` is chosen instead of `simd_json` for the following reasons:
* Better interoperability with other libraries (e.g. `postgres_types`).
* Despite its name, the `BorrowedValue` type of `simd_json` is not `Copy` and is not suitable for `JsonbRef`.
* `simd_json` uses `HashMap` but `serde_json` defaults to `BTreeMap`, which has a deterministic ordering and string representation.

Approved-By: TennyZhuang
mergify bot pushed a commit that referenced this issue Feb 17, 2023
Part of the `jsonb` type support (preview all on [this branch](https://github.com/risingwavelabs/risingwave/compare/XJ-jsonb-WIP-2?expand=1)):
* #7977
* Introduce `ArrayBuilder` and `Array` **(this PR)**
* Introduce `DataType::Jsonb`
* Add more expressions #7714

In this PR:
* The in-memory layout of `JsonbArray` is simply a `Vec` of roots of JSON trees. This is easier to operate on but does leave room for optimization.
* The protobuf layout is the same as `bytea` variable length array, with each element storing its value encoding. In case we switch to a newer protobuf layout, a new ArrayType enum value can be introduced without affecting parts other than `to_protobuf` and `from_protobuf`.
* Refactored `VarSizedValueReader` from returning `RefItem` into accepting `&mut ArrayBuilder`. This is to use `JsonbArrayBuilder::append_move` on the deserialized `OwnedItem`. It is impossible to get a `JsonbArray::RefItem` from `&[u8]`.
* Blanket implementation for `arrow` / `HashKeySerDe` / `RandValue`.

Approved-By: BugenZhao
mergify bot pushed a commit that referenced this issue Feb 17, 2023
Part of the `jsonb` type support:
* #7977
* #7986
* Introduce `DataType::Jsonb` **(this PR)**
* Add more expressions #7714

In this PR:
* Add `DataType::Jsonb`.
* Support constructing it from string and displaying it as string.
* Add e2e tests for `insert` and `select`.

Also tested for the following but not added in CI:
* prepared statement with BINARY format
* kafka+json source with a `jsonb` field

Approved-By: BugenZhao
xiangjinwu commented

Removing the milestone as the remaining operators are optional / long-term.

xiangjinwu removed this from the release-0.1.18 milestone Mar 1, 2023
fuyufjh added the help wanted label Jun 12, 2024

st1page commented Jul 8, 2024

jsonb_populate_record / jsonb_populate_recordset / jsonb_to_record / jsonb_to_recordset

Also, we might need another expression to optimize ARRAY(jsonb_populate_recordset()) and ARRAY(jsonb_populate_recordset()), so that users can quickly cast a jsonb array to an array of structs.

xiangjinwu commented

> Also, we might need another expression to optimize ARRAY(jsonb_populate_recordset()) and ARRAY(jsonb_populate_recordset()), so that users can quickly cast a jsonb array to an array of structs.

Opened a dedicated issue: #17617


xxchan commented Aug 22, 2024

What's still missing? Looks mostly finished

xiangjinwu commented

> What's still missing? Looks mostly finished

PostgreSQL 16 (and maybe the upcoming 17) is still actively adding more JSON expressions. But I am okay with either closing this or keeping it open. We have no plan to add those proactively in the near future.

xxchan closed this as completed Aug 22, 2024