Tracking issue: Writing iceberg tables #346

Closed
2 of 7 tasks
ZENOTME opened this issue Apr 23, 2024 · 8 comments

Comments

@ZENOTME
Contributor

ZENOTME commented Apr 23, 2024

Inspired by #275 (comment), I created this issue to track all of our write tasks. It's based on the doc.

@liurenjie1024 liurenjie1024 changed the title Tracking:: Writing iceberg tables Tracking issue:: Writing iceberg tables Apr 24, 2024
@liurenjie1024 liurenjie1024 changed the title Tracking issue:: Writing iceberg tables Tracking issue: Writing iceberg tables Apr 24, 2024
@c-thiel
Collaborator

c-thiel commented Nov 16, 2024

Anyone else passing through here, check #700 for a more fine-grained overview and future planning.


Dear all,
we have recently been talking to many people working in the Rust data ecosystem for whom writes or compaction would be very important. Many are also willing to take up some of the issues here, me included.

My goal is to align with the community on good next steps and then distribute issues among us to get more focus, attention and developer time for the issues that are needed to implement writes and compaction.

In addition to the issues listed above, there are currently open PRs that are also important for writes:

Related PRs in iceberg-rust

Related other Issues / Discussions

I am sure I missed a few - please feel free to extend.

My proposal for a way forward would be as follows:

  1. It would be great to get some eyes on the already open PRs. I think "feat: support append data file and add e2e test" (#349) especially deserves some attention.
  2. If any issues from the list above are ready to be implemented and nobody is working on them yet, it would be great to highlight a few in a reply to this post so that people can pick them up and start working. Maybe @Xuanwo, @liurenjie1024, @Fokko you have an overview?
  3. We will have the first Iceberg Rust Community sync shortly, on the 28th of November! (Public Google Calendar, Mailing List). In my opinion this would be a great opportunity to align on next steps and maybe hand out some tasks.

Let me know what you think!

@adisheshkishore, @amitgilad3, @JanKaul, @jaychia, @kevinzwang, @mkarbo, @mehmetozsoy-synnada, @rampage644, @twuebi

@sdd
Contributor

sdd commented Nov 16, 2024

I've been intending to switch focus from reads to writes once my delete file read support PR is merged. I'll do my best to attend the meeting on the 28th (day before my birthday! 😁) and look forward to collaborating with whoever else gets involved to land writes in iceberg-rust!

@ZENOTME
Contributor Author

ZENOTME commented Nov 16, 2024

Thanks @c-thiel for raising this! I am also working on #340 and can send it as a PR soon. I will also be dedicating more effort to write support from now on. I think #349 is the first step toward write support in iceberg-rust, so reviews are welcome.

@kevinzwang

kevinzwang commented Nov 28, 2024

Thank you for compiling this @c-thiel @ZENOTME! I'm part of the Daft team and we're eagerly looking forward to write support. The tasks enumerated by @ZENOTME match our priorities well -- we would need at least appends, overwrites, and partitioned writes in order to migrate to iceberg-rust.

Let me know if this is being worked on already, but one thing we'd like to see is a public interface for building and committing snapshots. Daft handles writing data itself, and would use iceberg-rust only for the metadata operations to commit a write. Since this could also be used by the iceberg-rust writer, it makes sense to solidify that interface first.

Thanks for the work so far! We would love to further our collaboration on this, @jaychia and I will also be at the upcoming community sync to discuss more.

@ZENOTME
Contributor Author

ZENOTME commented Nov 28, 2024

> Thank you for compiling this @c-thiel @ZENOTME! I'm part of the Daft team and we're eagerly looking forward to write support. The tasks enumerated by @ZENOTME match our priorities well -- we would need at least appends, overwrites, and partitioned writes in order to migrate to iceberg-rust.
>
> Let me know if this is being worked on already, but one thing we'd like to see is a public interface for building and committing snapshots. Daft handles writing data itself, and would use iceberg-rust only for the metadata operations to commit a write. Since this could also be used by the iceberg-rust writer, it makes sense to solidify that interface first.
>
> Thanks for the work so far! We would love to further our collaboration on this, @jaychia and I will also be at the upcoming community sync to discuss more.

Hi @kevinzwang, I've recently been working on #340, #342, and #343 to support delete writes and partitioned writes.

Also, #349 has been merged, so I think iceberg-rust now supports committing data files.

@Fokko
Contributor

Fokko commented Nov 28, 2024

@ZENOTME I missed this issue, sorry for that. Should we merge this one into #700?

@ZENOTME
Contributor Author

ZENOTME commented Nov 28, 2024

> @ZENOTME I missed this issue, sorry for that. Should we merge this one into #700?

Sure! I think these are more about the writer.

@Xuanwo
Member

Xuanwo commented Nov 28, 2024

Tracked at #700

@Xuanwo Xuanwo closed this as completed Nov 28, 2024