Update Manifest v12 and run-results v6 json schemas #143

Open
wants to merge 1 commit into
base: main

@SumanMaharana SumanMaharana commented Jan 16, 2025

User description

Update Manifest v12 and run-results v6 json schemas


PR Type

Enhancement, Bug fix


Description

  • Updated dbt_version default values in multiple schemas to 1.10.0a1.

  • Introduced new freshness property in manifest_v12 schema with nested configurations.

  • Replaced enumerations with string types for granularity and grain-related fields in schemas.

  • Added new no-op status to run-results_v6 schema.


Changes walkthrough 📝

Relevant files
Enhancement
manifest_v12.py
Enhance manifest parser with freshness and granularity updates

dbt_artifacts_parser/parsers/manifest/manifest_v12.py

  • Updated dbt_version default to 1.10.0a1.
  • Added freshness property with nested build_after configuration.
  • Replaced enums with string types for granularity-related fields.
  • Adjusted depends_on references to new classes.
  • +169/-179

run_results_v6.py
Update run-results parser with new status

dbt_artifacts_parser/parsers/run_results/run_results_v6.py

  • Updated dbt_version default to 1.10.0a1.
  • Added no-op status to Status enum.
  • +2/-1

manifest_v12.json
Update manifest schema with freshness and tags

dbt_artifacts_parser/resources/manifest/manifest_v12.json

  • Updated dbt_version default to 1.10.0a1.
  • Added freshness property with nested build_after configuration.
  • Replaced enums with string types for granularity-related fields.
  • Added tags property with string or array options.
  • +280/-288

run-results_v6.json
Update run-results schema with new status

dbt_artifacts_parser/resources/run-results/run-results_v6.json

  • Updated dbt_version default to 1.10.0a1.
  • Added no-op status to status property.
  • +4/-3


    coderabbitai bot commented Jan 16, 2025

    Walkthrough

    The pull request introduces significant changes to the manifest_v12.py and run_results_v6.py files in the dbt artifacts parser. In manifest_v12.py, new classes like Period, DependsOn5, BuildAfter, and Freshness are added, and multiple existing classes are updated to use new dependency and freshness-related types. The run_results_v6.py file sees minor updates, including a dbt version change and the addition of a no_op status to the Status enum.

    Changes

    File: dbt_artifacts_parser/parsers/manifest/manifest_v12.py
    Changes:
    - Added new classes: Period, DependsOn5, BuildAfter, Freshness
    - Updated Nodes class with optional freshness attribute
    - Numerous classes updated to use new DependsOn6 and DependsOn18 dependency types

    File: dbt_artifacts_parser/parsers/run_results/run_results_v6.py
    Changes:
    - Updated dbt_version from '1.9.0b2' to '1.10.0a1'
    - Added no_op status to Status enum

    Sequence Diagram

    sequenceDiagram
        participant Manifest as ManifestV12
        participant Nodes as NodesClass
        participant Freshness as FreshnessClass
        participant Dependency as DependencyClass
    
        Manifest->>Nodes: Add optional freshness attribute
        Nodes->>Freshness: Create freshness configuration
        Manifest->>Dependency: Update dependency types
        Dependency->>Nodes: Apply new dependency management
    
    Poem

    🐰 Manifest magic, version twelve's dance,
    Dependencies twirl in a new-found trance
    Freshness blooms like a rabbit's spring leap
    Parsing artifacts, no detail too deep!
    Code evolves with each hoppy embrace 🌱


    📜 Recent review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between b7d4dba and da28ae4.

    ⛔ Files ignored due to path filters (2)
    • dbt_artifacts_parser/resources/manifest/manifest_v12.json is excluded by !**/*.json
    • dbt_artifacts_parser/resources/run-results/run-results_v6.json is excluded by !**/*.json
    📒 Files selected for processing (2)
    • dbt_artifacts_parser/parsers/manifest/manifest_v12.py (63 hunks)
    • dbt_artifacts_parser/parsers/run_results/run_results_v6.py (2 hunks)
    🔇 Additional comments (28)
    dbt_artifacts_parser/parsers/manifest/manifest_v12.py (26)

    20-20: Update dbt_version in Metadata class

    The dbt_version default value has been updated to '1.10.0a1' in the Metadata class. This change ensures the metadata reflects the correct dbt version corresponding to the updated schemas.


    825-850: Add model freshness configurations

    New classes Period, DependsOn5, BuildAfter, and Freshness have been introduced to support model freshness configurations. These additions enhance the ability to specify when models should be rebuilt based on time periods and dependencies.


    902-902: Include freshness attribute in Nodes4 class

    The optional freshness attribute has been added to the Nodes4 class, allowing models to define freshness criteria directly. This is consistent with the additions of the freshness-related classes.


    967-973: Update dependency structures in node classes

    The new DependsOn6 class has been defined, and the depends_on attribute in Nodes5, Nodes6, and Nodes7 has been updated to use DependsOn6. This update reflects the changes in dependency management and improves consistency across node types.

    Also applies to: 1018-1018, 1124-1124, 1330-1330


    1367-1367: Enhance source freshness configurations

    The Freshness1 class has been added to define freshness criteria for sources with warn_after, error_after, and filter attributes. The Sources class now includes loaded_at_query and freshness attributes. These enhancements provide more flexibility in source freshness definitions.

    Also applies to: 1455-1456


    1472-1472: Update macro dependency tracking

    A new DependsOn9 class has been introduced for macros, and the depends_on attribute in the Macros class now uses DependsOn9. This change unifies the dependency tracking mechanism for macros.

    Also applies to: 1504-1504


    1558-1558: Update dependencies in Exposures

    The DependsOn10 class has been added, and the depends_on attribute in Exposures now uses it. This ensures that exposures correctly reference their dependencies.

    Also applies to: 1587-1587


    Line range hint 1657-1668: Enhance metric type parameters

    Updates have been made to metric-related classes such as Numerator, Denominator, OffsetWindow, and Metric, including new attributes like offset_to_grain. These enhancements provide greater flexibility in metric calculations.

    Also applies to: 1685-1686, 1694-1694, 1725-1726


    1800-1801: Update CumulativeTypeParams with optional window

    The CumulativeTypeParams class now includes an optional window attribute, allowing more control over cumulative metric calculations.


    1873-1879: Adjust Metrics time granularity type

    The time_granularity attribute in the Metrics class has been changed from a specific Granularity enum to a generic str type. This change simplifies the specification of time granularity for metrics.


    1980-1980: Update dependency management in disabled nodes

    The DependsOn12 and DependsOn13 classes have been introduced, and the depends_on attribute in various disabled node classes now uses them. This ensures consistent dependency structures across all node types, including disabled ones.

    Also applies to: 2069-2069, 2135-2135, 2186-2186, 2283-2283


    2406-2406: Update depends_on in Disabled3

    The depends_on attribute in Disabled3 has been updated to use DependsOn13, aligning it with the new dependency classes.


    2578-2598: Add BuildAfter1 and Freshness2 classes

    New classes DependsOn17, BuildAfter1, and Freshness2 have been added to support freshness configurations in disabled models. These classes mirror earlier additions for model freshness.


    Line range hint 2633-2649: Include freshness in Disabled4

    The freshness attribute has been added to the Disabled4 model definitions, enabling freshness configurations for disabled models.


    2714-2721: Update dependencies in disabled SQL operations and tests

    The DependsOn18 class has been introduced, and the depends_on attribute in Disabled5 and Disabled6 now uses it. This ensures consistent dependency tracking across disabled SQL operations and tests.

    Also applies to: 2765-2765, 2862-2862


    3057-3057: Update depends_on in Disabled7

    The depends_on attribute in Disabled7 uses DependsOn18, maintaining consistency in dependency management for disabled snapshots.


    3084-3084: Enhance freshness configurations in disabled sources

    The Freshness3 class has been added, and the loaded_at_query and freshness attributes are now included in Disabled8. This supports freshness configurations in disabled sources.

    Also applies to: 3161-3162


    3215-3215: Update depends_on in disabled exposures

    The depends_on attribute in Disabled9 now uses DependsOn18, aligning with the updated dependency structure.


    Line range hint 3280-3317: Enhance metric parameters in disabled metrics

    Updates have been made to metric type parameters in disabled metrics, including new attributes like offset_window and offset_to_grain in Numerator1, Denominator1, and Metric1. These changes enhance the functionality of disabled metrics.

    Also applies to: 3364-3373, 3386-3386


    3435-3441: Update dependencies in disabled metrics

    The depends_on attribute in Disabled10 now uses DependsOn18, ensuring consistent dependency tracking.


    3537-3540: Update depends_on in disabled saved queries

    The depends_on attribute in Disabled11 uses DependsOn18, maintaining consistency in dependency management for disabled saved queries.


    3567-3573: Introduce entities, measures, and dimensions in semantic models

    New classes and enums such as Entity, Measure2, Dimension, and TimeGranularity have been added to define entities, measures, and dimensions within semantic models. These additions provide a structured approach to semantic model definitions.

    Also applies to: 3584-3584, 3642-3642, 3650-3663, 3700-3700, 3711-3711, 3741-3744


    Line range hint 3784-3819: Add definitions for disabled unit tests

    Classes related to unit tests, including GivenItem, Expect, Overrides, and Disabled13, have been added. These allow the definition of unit tests for models, even when they are disabled.


    3901-3907: Update depends_on in SavedQueries

    The depends_on attribute in SavedQueries now uses DependsOn18, ensuring consistent dependency tracking.


    3917-3923: Enhance semantic models with additional classes

    Additional classes and updates have been made to semantic models, including Entity1, Measure3, and Dimension1. These enhancements expand the capabilities and definitions within semantic models.

    Also applies to: 3934-3934, 3959-3959, 3995-3995, 4006-4006, 4036-4039


    Line range hint 4064-4091: Add unit test definitions

    New classes for unit tests (GivenItem1, Expect1, UnitTests) have been added, allowing for the definition and tracking of unit tests within the manifest.

    dbt_artifacts_parser/parsers/run_results/run_results_v6.py (2)

    19-19: Update dbt_version in Metadata class

    The dbt_version default value has been updated to '1.10.0a1' in the Metadata class. This change ensures that run_results_v6 aligns with the updated dbt version.


    30-30: Add no_op status to Status enum

    The no_op value has been added to the Status enum. This addition allows the parser to handle tasks with a 'no-op' status, improving the representation of operation statuses.
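The schema spells the new status as 'no-op', while the enum member is named no_op because a Python identifier cannot contain a hyphen. A minimal sketch of that mapping follows; only the no_op member is taken from the review above, and the other members and the parse_status helper are hypothetical, included just to make the example runnable.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical subset of members for illustration.
    success = 'success'
    error = 'error'
    skipped = 'skipped'
    # Member name uses an underscore; the serialized value keeps the hyphen.
    no_op = 'no-op'


def parse_status(raw: str) -> Status:
    """Look up a member by its serialized value; raises ValueError if unknown."""
    return Status(raw)


print(parse_status('no-op'))  # Status.no_op
```

Without the new member, a lookup of 'no-op' would raise ValueError, which is the kind of failure a parser hits when an artifact carries a status the enum does not yet know.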


    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Data Validation

    The removal of strict enums for granularity fields in favor of string types reduces type safety and validation. Consider adding runtime validation to ensure only valid granularity values are used.

    count: int
    granularity: str
    Backwards Compatibility

    The addition of new freshness property with nested build_after configuration may require careful validation to ensure backwards compatibility with existing manifests.

    class BuildAfter(BaseParserModel):
        model_config = ConfigDict(
            extra='forbid',
        )
        count: Optional[int] = 0
        period: Optional[Period] = 'hour'
        depends_on: Optional[DependsOn5] = 'any'
    
    
    class Freshness(BaseParserModel):
        model_config = ConfigDict(
            extra='forbid',
        )
        build_after: Optional[BuildAfter] = Field(None, title='ModelBuildAfter')


    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Score
    General
    Add string pattern validation to ensure granularity values remain consistent with previously allowed values

    Consider adding validation constraints or a pattern property to ensure that the
    granularity string values follow the previously defined enum values (nanosecond,
    microsecond, etc.) to maintain data consistency.

    dbt_artifacts_parser/resources/manifest/manifest_v12.json [9015-9016]

     "granularity": {
    -  "type": "string"
    +  "type": "string",
    +  "pattern": "^(nanosecond|microsecond|millisecond|second|minute|hour|day|week|month|quarter|year)$"
     }
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses a potential data consistency issue by ensuring that string values for granularity still follow the previously defined enum values, preventing invalid inputs while maintaining flexibility.

    8
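The same check can be sketched at runtime in Python, for callers who want to flag unexpected granularity values after parsing. The pattern is copied from the suggestion above; the is_valid_granularity helper is a hypothetical name, and whether rejecting other values is desirable depends on why the upstream schema relaxed the type, which this thread does not state.

```python
import re

# Pattern copied from the suggested JSON schema change above.
GRANULARITY_PATTERN = re.compile(
    r'^(nanosecond|microsecond|millisecond|second|minute|hour|day|week|month|quarter|year)$'
)


def is_valid_granularity(value: str) -> bool:
    """Return True only if the whole string is one of the listed granularities."""
    # fullmatch avoids accepting a value with a trailing newline.
    return GRANULARITY_PATTERN.fullmatch(value) is not None


print(is_valid_granularity('day'))
print(is_valid_granularity('fortnight'))
```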
    Use proper enum value instead of string literal for enum field default value

    The default value for period in BuildAfter class should be a proper enum value using
    Period.hour instead of the string literal 'hour' to ensure type safety and prevent
    potential runtime errors.

    dbt_artifacts_parser/parsers/manifest/manifest_v12.py [841]

    -period: Optional[Period] = 'hour'
    +period: Optional[Period] = Period.hour
    Suggestion importance[1-10]: 7

    Why: Using a string literal instead of the proper enum value for an enum field's default value could lead to type errors and inconsistencies. The suggestion correctly recommends using Period.hour for better type safety.

    7
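The difference between the two defaults can be illustrated with plain dataclasses standing in for the parser model. This is a sketch under that assumption: how pydantic treats an unvalidated default depends on model configuration and is not reproduced here, and the Period members other than hour, as well as both class names, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Period(Enum):
    # Hypothetical members; only `hour` appears in the snippet above.
    hour = 'hour'
    day = 'day'


@dataclass
class BuildAfterLiteral:
    # The annotation says Period, but the default is a plain str.
    period: Optional[Period] = 'hour'


@dataclass
class BuildAfterEnum:
    # The default is an actual enum member.
    period: Optional[Period] = Period.hour


print(type(BuildAfterLiteral().period).__name__)  # str
print(type(BuildAfterEnum().period).__name__)     # Period
print(BuildAfterLiteral().period == Period.hour)  # False
```

Code that compares the field against Period.hour, or reads .value from it, behaves differently depending on which default was used.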

    @yu-iskw
    Owner

    yu-iskw commented Jan 17, 2025

    @SumanMaharana Thank you for the contribution. However, I want to catch up with only the stable version, as the schema can be changed while releasing the next stable version.

    https://github.com/yu-iskw/dbt-artifacts-parser/blob/main/CONTRIBUTING.md#implementation-policy

    @SumanMaharana
    Author

    > @SumanMaharana Thank you for the contribution. However, I want to catch up with only the stable version, as the schema can be changed while releasing the next stable version.
    >
    > https://github.com/yu-iskw/dbt-artifacts-parser/blob/main/CONTRIBUTING.md#implementation-policy

    @yu-iskw So this can only be accepted after the release of dbt-core v1.10?
    I picked up the manifest and run_results schemas from the dbt-core main branch.

    @yu-iskw
    Owner

    yu-iskw commented Jan 19, 2025

    @SumanMaharana The main branch of dbt-core is currently in the middle of development toward 1.10. The latest stable version is at the tag v1.9.1. The manifest v12 and run results v6 in this repository come from v1.9.1.

    https://github.com/dbt-labs/dbt-core/releases/tag/v1.9.1

    I suppose that dbt-artifacts-parser v0.8.2 should work as long as you use a stable version of dbt-core. Did you encounter any incompatibility with your dbt environment?
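One way to see whether an artifact came from a stable or a pre-release dbt build is to read the version recorded in its metadata. The following is a rough sketch, assuming the artifact stores dbt_version and dbt_schema_version under a top-level metadata key and that pre-releases carry a suffix such as a1, b2, or rc1 (as in '1.10.0a1' and '1.9.0b2' mentioned earlier). The describe_artifact helper and the sample payload are hypothetical.

```python
import json
import re

# Matches x.y.z followed by a pre-release suffix such as a1, b2, or rc1.
PRERELEASE = re.compile(r'^\d+\.\d+\.\d+(a|b|rc)\d+')


def describe_artifact(artifact: dict) -> dict:
    """Summarize which dbt version and schema version an artifact reports."""
    metadata = artifact.get('metadata', {})
    version = metadata.get('dbt_version', '')
    return {
        'dbt_version': version,
        'dbt_schema_version': metadata.get('dbt_schema_version', ''),
        'is_prerelease': PRERELEASE.match(version) is not None,
    }


# Illustrative payload only; a real manifest would be loaded from a file.
sample = json.loads(
    '{"metadata": {"dbt_version": "1.10.0a1", '
    '"dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v12.json"}}'
)
print(describe_artifact(sample))
```

Version strings that do not follow this pattern are reported as not pre-release, so the flag is only a hint.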

    @SumanMaharana
    Author

    @yu-iskw I used the artifacts generated by dbt Cloud. While parsing those artifacts, validation errors are thrown for the entire manifest file and the run_results file.

    @yu-iskw
    Owner

    yu-iskw commented Jan 20, 2025

    @SumanMaharana I see. Do the schemas on the main branch work? I want to have the error logs, if you don't mind. As I don't use dbt Cloud, I can't try out dbt-artifacts-parser on dbt artifacts generated by dbt Cloud.

    @SumanMaharana
    Author

    > @SumanMaharana I see. Do the schemas on the main branch work? I want to have the error logs, if you don't mind. As I don't use dbt Cloud, I can't try out dbt-artifacts-parser on dbt artifacts generated by dbt Cloud.

    The error covers the whole manifest file (it's too long to share here). You can use this manifest file and run it; it generates the same error.
    If you still want to have a look, I'll share the whole error with you.

    @yu-iskw
    Owner

    yu-iskw commented Jan 27, 2025

    @SumanMaharana Let me clarify what you did. Did you try to parse this manifest file with dbt-artifacts-parser? This package is used for parsing dbt artifacts, not for those JSON schemas.

    @SumanMaharana
    Author

    > @SumanMaharana Let me clarify what you did. Did you try to parse this manifest file with dbt-artifacts-parser? This package is used for parsing dbt artifacts, not for those JSON schemas.

    @yu-iskw Nope, my bad, I pasted the wrong link there. I know that manifest is used to generate the Python model; it's not the actual manifest.
    I used the artifact generated by the dbt Cloud workflow after a dbt job is run.
    Screenshot 2025-01-28 at 11 09 25 AM
