Core,Open-API: Don't expose the last-column-id #11514

Merged (9 commits) on Nov 25, 2024

Fokko
Contributor

@Fokko Fokko commented Nov 11, 2024

I added this to the spec a while ago:

#7445

But I think this was a mistake, and we should not expose it in the public APIs, as it is much better to track it internally.

I noticed this while reviewing apache/iceberg-rust#587

Removing it from the Java APIs and updating the Open-API spec makes this much more resilient and no longer requires clients to compute the value. For example, when there are two conflicting schema changes, the last-column-id must be recomputed correctly when retrying the operation.
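
To illustrate the idea of tracking the value internally, here is a minimal sketch in which the metadata derives `last-column-id` from each schema as it is added, so a retried update never carries a stale, client-computed value. The `Field` and `TableMetadata` classes and the `highest_field_id` helper are hypothetical simplifications for illustration, not the actual Iceberg classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """Hypothetical schema field; `children` models nested struct fields."""
    field_id: int
    name: str
    children: List["Field"] = field(default_factory=list)


def highest_field_id(fields: List[Field]) -> int:
    """Return the largest field ID in a possibly nested field list, or 0 if empty."""
    highest = 0
    for f in fields:
        highest = max(highest, f.field_id, highest_field_id(f.children))
    return highest


@dataclass
class TableMetadata:
    """Hypothetical, heavily simplified table metadata."""
    last_column_id: int = 0
    schemas: List[List[Field]] = field(default_factory=list)

    def add_schema(self, fields: List[Field]) -> None:
        # Assumption: the counter is derived from the schema and never decreases,
        # so callers do not pass it in.
        self.last_column_id = max(self.last_column_id, highest_field_id(fields))
        self.schemas.append(fields)


meta = TableMetadata()
meta.add_schema([Field(1, "id"), Field(2, "point", [Field(3, "x"), Field(4, "y")])])
meta.add_schema([Field(1, "id"), Field(5, "ts")])  # a new column raises the counter
meta.add_schema([Field(1, "id")])  # dropping columns does not lower it
```

Because the counter is recomputed against whatever metadata the update is applied to, re-applying the same `add_schema` call after a conflict needs no extra bookkeeping on the client side.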

@RussellSpitzer
Member

I did not understand why this was there before. Do we have anyone or any implementations which benefit from having it there?

@Fokko
Contributor Author

Fokko commented Nov 12, 2024

Do we have anyone or any implementations which benefit from having it there?

I have the same question; I can't think of why you would need to set this externally.

@Fokko Fokko force-pushed the fd-remove-last-column-id branch from c8fcb68 to 6870e60 Compare November 14, 2024 09:00
@danielcweeks
Contributor

@Fokko as part of the proposed deprecation, we should update all tests that use this (I found multiple references).

@Fokko Fokko force-pushed the fd-remove-last-column-id branch from 0d935f5 to f79cb11 Compare November 15, 2024 07:23
@Fokko Fokko marked this pull request as ready for review November 15, 2024 07:24
@Fokko
Contributor Author

Fokko commented Nov 15, 2024

@danielcweeks You're right! I wanted to show that after updating the code, all the existing tests still pass.

I've updated the tests in a separate commit f79cb11

Contributor

@kevinjqliu kevinjqliu left a comment

@danielcweeks
Contributor

@Fokko since this changes the REST Spec, you should probably hold a quick vote, but LGTM

Contributor

@aihuaxu aihuaxu left a comment

LGTM.

Fokko and others added 3 commits November 20, 2024 18:00
Co-authored-by: Eduard Tudenhoefner <[email protected]>
Co-authored-by: Eduard Tudenhoefner <[email protected]>
Member

@hussein-awala hussein-awala left a comment

deprecated since 1.8.0, will be removed 1.9.0 or 2.0.0, use AddSchema(schema).

I wonder whether this interface is part of the Iceberg public API. The v1.0.0 release notes say:

From 1.0.0 forward, the project will follow semver in the public API module, iceberg-api.

Under semver, we cannot introduce breaking changes to the public API in a patch or minor version; we need to wait for the next major version, in our case 2.0.0. If this interface is considered part of the public API, then "will be removed 1.9.0 or 2.0.0" needs to be replaced by "will be removed in 2.0.0".

@nastra
Contributor

nastra commented Nov 21, 2024

@hussein-awala the Schema API is part of iceberg-core and thus allows things to be deprecated and then removed in the next minor release (see also https://iceberg.apache.org/contribute/#minor-version-deprecations-required). I'm guessing @Fokko mentioned 1.9.0 or 2.0.0 because we don't know yet whether there will be a 1.9.0 release or whether we go straight to 2.0.0 after 1.8.0
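
As a rough illustration of the deprecate-then-remove pattern discussed here, the sketch below keeps the old argument so existing callers still work during the deprecation window, but ignores it and emits a warning. The `AddSchema` class is a hypothetical stand-in, not the real Iceberg class, and the dict-based schema is only a placeholder.

```python
import warnings
from typing import Any, Dict, Optional


class AddSchema:
    """Hypothetical stand-in for an add-schema metadata update."""

    def __init__(self, schema: Dict[str, Any], last_column_id: Optional[int] = None):
        if last_column_id is not None:
            # Deprecated path: accept the argument so old callers keep working,
            # but ignore it; the value is assumed to be derived from the schema.
            warnings.warn(
                "last_column_id is deprecated and ignored; use AddSchema(schema)",
                DeprecationWarning,
                stacklevel=2,
            )
        self.schema = schema
```

Once the deprecation window has passed, the extra parameter can be dropped, leaving only `AddSchema(schema)`.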

Fokko added a commit to Fokko/iceberg-python that referenced this pull request Nov 25, 2024
This should not be part of the public API:

apache/iceberg#11514

This PR depends on a later version of the REST
catalog for the integration tests.
@Fokko Fokko merged commit 4b52dbd into apache:main Nov 25, 2024
50 checks passed
@Fokko Fokko deleted the fd-remove-last-column-id branch November 25, 2024 09:27
Fokko added a commit to apache/iceberg-python that referenced this pull request Nov 26, 2024
This should not be part of the public API:

apache/iceberg#11514

This PR depends on a later version of the REST
catalog for the integration tests.
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
* Core,Open-API: Don't expose the `last-column-id`

* Update the tests as well

* Add `deprecation` flag

* Wording

Co-authored-by: Eduard Tudenhoefner <[email protected]>

* Wording

Co-authored-by: Eduard Tudenhoefner <[email protected]>

* Wording

* Thanks Ryan!

* Remove `LOG`

---------

Co-authored-by: Eduard Tudenhoefner <[email protected]>