fix(ingest/snowflake): Create all structured propery templates before assignation #12469
base: master
Codecov Report
All modified and coverable lines are covered by tests ✅
... and 5 files with indirect coverage changes
lint fix
Couple requests but looks good to me otherwise!
@@ -249,6 +249,11 @@ class SnowflakeV2Config(
        description="If enabled along with `extract_tags`, extracts snowflake's key-value tags as DataHub structured properties instead of DataHub tags.",
    )

    structured_properties_template_cache_invalidation_interval: int = Field(
Maybe hide this one from docs, feels more like an implementation detail
        for tag in self.data_dictionary.get_all_tags():
            if not self.config.tag_pattern.allowed(tag.tag_identifier()):
                continue
            # Do we need to filter based on database and schema or is it enough if we filter based on tag pattern?
I would say no, because you can apply tags from other databases / schemas
        tags = []
        for tag in cur:
            snowflake_tag = SnowflakeTag(
                database=tag["TAG_DATABASE"],
                schema=tag["TAG_SCHEMA"],
                name=tag["TAG_NAME"],
                value="",
            )
            tags.append(snowflake_tag)
        return tags
This could be written a bit more succinctly as a list comprehension.
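A rough sketch of what the list-comprehension form might look like. The `SnowflakeTag` dataclass here is a simplified stand-in for the real class, `tags_from_rows` is a hypothetical helper name, and `cur` is assumed to be an iterable of dict-like rows as in the hunk above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class SnowflakeTag:
    # Simplified stand-in for the real class; only the fields used in the hunk.
    database: str
    schema: str
    name: str
    value: str


def tags_from_rows(cur: Iterable[Dict[str, str]]) -> List[SnowflakeTag]:
    # Hypothetical helper: builds the same list as the explicit append loop.
    return [
        SnowflakeTag(
            database=tag["TAG_DATABASE"],
            schema=tag["TAG_SCHEMA"],
            name=tag["TAG_NAME"],
            value="",
        )
        for tag in cur
    ]
```

The behavior is unchanged; the comprehension only removes the temporary variable and the `append` call.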
                for workunit in self.gen_tag_as_structured_property_workunits(tag):
                    yield workunit
-                for workunit in self.gen_tag_as_structured_property_workunits(tag):
-                    yield workunit
+                yield from self.gen_tag_as_structured_property_workunits(tag)
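To illustrate why the suggested change is behavior-preserving, here is a minimal sketch comparing the two forms. `gen_workunits` is a hypothetical stand-in for `gen_tag_as_structured_property_workunits`, and the string values are placeholders rather than real work units.

```python
from typing import Iterable, Iterator


def gen_workunits(tag: str) -> Iterator[str]:
    # Hypothetical stand-in generator; yields placeholder strings.
    yield f"{tag}-workunit-1"
    yield f"{tag}-workunit-2"


def with_loop(tags: Iterable[str]) -> Iterator[str]:
    # Original form: re-yield each item from the inner generator.
    for tag in tags:
        for workunit in gen_workunits(tag):
            yield workunit


def with_yield_from(tags: Iterable[str]) -> Iterator[str]:
    # Suggested form: delegate to the inner generator directly.
    for tag in tags:
        yield from gen_workunits(tag)
```

For plain iteration like this, both generators produce the same items in the same order.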
@@ -219,6 +219,7 @@ def test_snowflake_tags_as_structured_properties(
            include_column_lineage=False,
            include_usage_stats=False,
            include_operational_stats=False,
            structured_properties_template_cache_invalidation_interval=1,
For the test I'd probably set this to 0, unless a nonzero value is actually needed, just to avoid unnecessary sleeps.
@@ -59,6 +78,46 @@ def _get_tags_on_object_without_propagation(
            raise ValueError(f"Unknown domain {domain}")
        return tags

    def create_structured_property_templates(self) -> Iterable[MetadataWorkUnit]:
        for tag in self.data_dictionary.get_all_tags():
            if not self.config.tag_pattern.allowed(tag.tag_identifier()):
Actually, I think `structured_property_pattern` makes more sense here, right? Our `tag_pattern` expects a key-value pair, not just a key.
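A small sketch of the mismatch this comment describes. `AllowPattern` is a simplified stand-in written for illustration, not the real DataHub pattern class, and the identifier shapes (`db.schema.pii` for a key, `db.schema.pii:high` for a key-value pair) are assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class AllowPattern:
    # Simplified stand-in for an allow-list pattern; not the real DataHub class.
    allow: List[str] = field(default_factory=lambda: [".*"])

    def allowed(self, identifier: str) -> bool:
        # An identifier passes if any allow regex matches from its start.
        return any(re.match(pattern, identifier) for pattern in self.allow)


# Assumed identifier shapes, for illustration only.
key_only = "db.schema.pii"
key_and_value = "db.schema.pii:high"

# A pattern written against key-value identifiers...
tag_pattern = AllowPattern(allow=[r"db\.schema\.pii:high$"])
# ...versus a pattern written against keys alone.
structured_property_pattern = AllowPattern(allow=[r"db\.schema\.pii$"])
```

Under these assumptions, a user's `tag_pattern` written for key-value identifiers would reject the key-only identifier available when creating templates, while a key-level pattern accepts it.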