Antalya 25.6: Ports of #938, #931, #1005, #933 #1007
Make DataLake metadata more lazy
…/25.6.5/795_parquet_metadata_caching
…pport_compressed_metadata
…oad_metadata' into ports/25.6/amalgamation_of_metadata_prs
…orts/25.6.5/795_parquet_metadata_caching' into ports/25.6/amalgamation_of_metadata_prs
…bjects_object_storage_cache' into ports/25.6/amalgamation_of_metadata_prs
…eberg' into ports/25.6/amalgamation_of_metadata_prs
if (local_context->getSettingsRef()[Setting::use_object_storage_list_objects_cache] && object_storage->supportsListObjectsCache())
{
    auto & cache = ObjectStorageListObjectsCache::instance();
    ObjectStorageListObjectsCache::Key cache_key {object_storage->getDescription(), configuration->getNamespace(), configuration->getRawPath().cutGlobs(false)};
Here and below, I'm not entirely sure if configuration->getRawPath().cutGlobs(false) is the right alternative to what used to be configuration->getPathWithoutGlobs(). @arthurpassos please take a look.
I think you should pass true for supports_partial_prefix in Path::cutGlobs.
It is object storage, so partial prefixes are supported. Consider the following path: root/key1=val1/year=202{1..9}. We need root/key1=val1/year=202, but by providing false we will get root/key1=val1/.
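To make the difference between the two argument values concrete, here is a minimal sketch of the behavior described in this thread. The function name cutGlobsSketch and its body are assumptions written for illustration only; this is not the actual Path::cutGlobs implementation, and the set of glob characters ("*?{") is likewise assumed.

```cpp
#include <string>

// Hypothetical helper for illustration; NOT the ClickHouse Path::cutGlobs code.
// Returns the part of `path` that can be used as a listing prefix.
std::string cutGlobsSketch(const std::string & path, bool supports_partial_prefix)
{
    // Position of the first glob metacharacter (assumed set: '*', '?', '{').
    const auto first_glob = path.find_first_of("*?{");
    if (first_glob == std::string::npos)
        return path; // No globs: the whole path is the prefix.

    const std::string prefix = path.substr(0, first_glob);

    // Object storage can list by an arbitrary key prefix,
    // so everything before the first glob is usable.
    if (supports_partial_prefix)
        return prefix;

    // Otherwise fall back to the last complete directory component,
    // keeping the trailing '/'.
    const auto last_slash = prefix.rfind('/');
    if (last_slash == std::string::npos)
        return "";
    return prefix.substr(0, last_slash + 1);
}
```

Under these assumptions, calling the sketch on root/key1=val1/year=202{1..9} with true keeps the partial component year=202, while false drops back to the parent directory, so a cache key built from the false variant would cover a broader listing than the query needs.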
else
    object_info->metadata = object_storage->getObjectMetadata(path);
}
object_info->loadMetadata(object_storage, query_settings.ignore_non_existent_file);
In the original PRs it was just object_info->loadMetadata(object_storage), but the signature has changed, so I used query_settings.ignore_non_existent_file to plug the hole. Again, not sure if that's the right thing to do (@arthurpassos please check this and the other calls to object_info->loadMetadata too).
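For reviewers weighing what the extra argument should mean, the following is a rough sketch of one plausible contract for a lazy loadMetadata call that takes an ignore-missing flag. Every name here (ObjectMetadataSketch, ObjectInfoSketch, FetchSketch) is hypothetical, and the fetch callback stands in for the object storage; the real ClickHouse signature and semantics may differ.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical types for illustration; not the ClickHouse classes.
struct ObjectMetadataSketch
{
    std::size_t size_bytes = 0;
};

// Stand-in for the storage lookup: returns std::nullopt if the object is missing.
using FetchSketch = std::function<std::optional<ObjectMetadataSketch>(const std::string &)>;

struct ObjectInfoSketch
{
    std::string path;
    std::optional<ObjectMetadataSketch> metadata;

    // Load metadata on first use only. If the object is missing, either leave
    // `metadata` empty (ignore_non_existent_file == true) or throw.
    void loadMetadata(const FetchSketch & fetch, bool ignore_non_existent_file)
    {
        if (metadata)
            return; // Already loaded: no second round trip to storage.

        metadata = fetch(path);
        if (!metadata && !ignore_non_existent_file)
            throw std::runtime_error("Object does not exist: " + path);
    }
};
```

In this sketch the flag only decides whether a missing object is an error; whether that matches the intent of query_settings.ignore_non_existent_file at each call site is exactly the open question above.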
All stateless test failures seem to be caused by changes introduced in this PR.
It looks like either the required data files are missing or the tests failed to produce them (MinIO access problem?). @arthurpassos please take a look.
Amalgamation of multiple related PRs (in that order):
#933
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Lazy load metadata for DataLake. (#742 by @ianton-ru)
#938
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Parquet Metadata caching (#795 by @arthurpassos)
#931
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Cache for ListObjects calls (#743 by @arthurpassos)
#1005
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
ClickHouse now supports compressed metadata.json files for Iceberg. Fixes ClickHouse#70874. (ClickHouse#81451 by @arthurpassos)
closes: #938, #931, #1005, #933
Exclude tests: