destination-s3: don't reuse names of existing objects #45143
objectNameByPrefix.computeIfAbsent(
    objectPath,
) {
    var objectList: Set<String> = setOf()
You can probably use this method here, right?
lgtm
/publish-java-cdk
Makes sense:
- when writing a new file
  - try to get a partId for the prefix
    - start at 0
    - first time for the prefix: build a map of prefix -> { all existing full names with prefix }
    - while there's a conflict: increment and try again
    - (no need to store the object you just made, because all subsequent calls should just increment any time there's a conflict)
So:
Sync 1: write(0, 1, 2)
Sync 2: write(3, 4, 5, 6) delete(0, 1, 2)
Sync 3: write(0, 1, 2, 7, 8)
etc
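For illustration, the part-id allocation described in this comment could be sketched roughly as follows. This is not the connector's actual code: `PartIdAllocator`, `listExisting`, and `nextObjectName` are hypothetical names, and `listExisting` stands in for whatever S3 listing call the destination really uses.

```kotlin
// Hypothetical sketch of the allocation scheme; names are illustrative only.
class PartIdAllocator(
    // Assumed stand-in for an S3 listing: returns every existing object name under a prefix.
    private val listExisting: (String) -> Set<String>,
) {
    // prefix -> object names that existed the first time the prefix was seen
    private val existingByPrefix = HashMap<String, Set<String>>()

    // prefix -> next part id to try
    private val nextIdByPrefix = HashMap<String, Int>()

    fun nextObjectName(prefix: String): String {
        // List the bucket only once per prefix.
        val existing = existingByPrefix.getOrPut(prefix) { listExisting(prefix) }
        var id = nextIdByPrefix.getOrDefault(prefix, 0)
        // Skip every id whose name was already present before this sync.
        while ("$prefix$id" in existing) id++
        // Names handed out here are not stored: the counter only moves forward,
        // so they can never be handed out twice.
        nextIdByPrefix[prefix] = id + 1
        return "$prefix$id"
    }
}
```

With the Sync 3 state above (objects 3, 4, 5, 6 already present under a hypothetical prefix `data/`), five consecutive calls to `nextObjectName("data/")` return `data/0`, `data/1`, `data/2`, `data/7`, `data/8`.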
Instead of counting the number of existing files and naming new files based on that counter, we create files starting at 0 and avoid overwriting files that were already present.
The problem was that in case of an overwrite sync, the 1st sync would create files 1, 2, 3. Sync 2 would notice there are 3 files and would create files 4, 5, 6 and delete 1, 2, 3 at the end of the sync. Sync 3 and after would see there are 3 files, would overwrite 4, 5, 6 and delete them because they were here before the sync started, leaving us with no files.
fixes #6417
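The failure mode described above can be reproduced with a small model. This is a simplified sketch, not the connector's code: the bucket is modeled as a set of part ids, and `countBasedIds` and `overwriteSync` are hypothetical names for the old counter-based naming and the overwrite-sync cleanup.

```kotlin
// Old naming scheme (as described above): new ids start right after the existing file count.
fun countBasedIds(existing: Set<Int>, filesToWrite: Int): List<Int> =
    (1..filesToWrite).map { existing.size + it }

// Simplified overwrite sync: write the new files, then delete everything
// that was present before the sync started.
fun overwriteSync(bucket: MutableSet<Int>, filesToWrite: Int) {
    val preexisting = bucket.toSet()
    bucket.addAll(countBasedIds(preexisting, filesToWrite))
    bucket.removeAll(preexisting)
}

// Runs the given number of 3-file overwrite syncs and returns the bucket contents after each.
fun simulate(syncs: Int): List<Set<Int>> {
    val bucket = mutableSetOf<Int>()
    return List(syncs) {
        overwriteSync(bucket, 3)
        bucket.toSet()
    }
}
```

In this model, `simulate(3)` yields `{1, 2, 3}`, then `{4, 5, 6}`, then an empty set: the third sync rewrites 4, 5, 6 and then deletes them as pre-existing files.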