Add support of reading from archives in S3 #62259
Hey! This is an amazing feature. Which ClickHouse version is it gonna be in?
@StashOfCode 24.4 most likely.
src/Storages/StorageS3.cpp
@@ -1794,7 +2047,8 @@ namespace
         || getContext()->getSettingsRef().schema_inference_mode != SchemaInferenceMode::UNION)
         return;

-    String source = fs::path(configuration.url.uri.getHost() + std::to_string(configuration.url.uri.getPort())) / configuration.url.bucket / current_key_with_info->key;
+    String source = fs::path(configuration.url.uri.getHost() + std::to_string(configuration.url.uri.getPort()))
+        / configuration.url.bucket / current_key_with_info->getPath();
getPath() returns bucket + "/" + key_with_info->key (for example, in the non-archive case), so with this change the bucket will be added to the path twice.
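The duplication described in this comment can be reproduced in isolation. The snippet below is only a minimal sketch, not the actual ClickHouse code: `KeyInfo`, `sourceWithGetPath`, and `sourceWithKey` are hypothetical stand-ins, and `getPath()` is assumed to behave exactly as the comment states for the non-archive case (bucket + "/" + key).

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the key object from the diff (not the real ClickHouse type).
struct KeyInfo
{
    std::string bucket;
    std::string key;

    // Assumption taken from the review comment: for the non-archive case
    // getPath() already includes the bucket.
    std::string getPath() const { return bucket + "/" + key; }
};

// Mirrors the changed line: the bucket is appended explicitly and then again via getPath().
std::string sourceWithGetPath(const std::string & host_port, const KeyInfo & info)
{
    return (fs::path(host_port) / info.bucket / info.getPath()).generic_string();
}

// Mirrors the original line: the bucket is appended once, followed by the bare key.
std::string sourceWithKey(const std::string & host_port, const KeyInfo & info)
{
    return (fs::path(host_port) / info.bucket / info.key).generic_string();
}
```

With a key object holding bucket "mybucket" and key "data/file.csv", the first helper yields a path that contains "mybucket/mybucket/", while the second contains the bucket only once. Under this assumption, either the explicit bucket component or the bucket prefix inside getPath() has to go.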
9b09550
Ah, sorry, not after this one...
Continuation: #64703
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Previously, our s3 storage and s3 table function didn't support selecting from archive files. I created a solution that allows iterating over files inside archives in S3.