feat: support custom config transformations #653

Merged
ChristoGrab merged 4 commits into main from christo/custom-config-migration
Jul 14, 2025

Conversation

ChristoGrab
Collaborator

@ChristoGrab ChristoGrab commented Jul 14, 2025

What

Adds support for custom config transformations for config migrations. Added to support the migration of source Mailchimp, which makes a separate API call to a metadata endpoint to obtain the connection's data center.

How

  • Added new CustomConfigTransformation class using the existing pattern for custom component support
  • Added the new class to the list of transformations accessible to the parent ConfigMigration component
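
As a rough illustration of the kind of class this enables, here is a minimal, self-contained sketch. It does not reproduce the PR's code: the `ConfigTransformation` base class below is a stand-in whose in-place `transform(config)` signature is an assumption, and `AddDataCenter`, its `lookup` callable, and the `default_data_center` parameter are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Mapping, MutableMapping, Optional


class ConfigTransformation(ABC):
    """Stand-in for the CDK interface; the in-place `transform` signature is an assumption."""

    @abstractmethod
    def transform(self, config: MutableMapping[str, Any]) -> None:
        """Mutate `config` in place."""


class AddDataCenter(ConfigTransformation):
    """Hypothetical custom transformation that fills in a `data_center` key."""

    def __init__(
        self,
        parameters: Optional[Mapping[str, Any]] = None,
        lookup: Optional[Callable[[Mapping[str, Any]], str]] = None,
    ) -> None:
        self._parameters = dict(parameters or {})
        # A real connector would call the metadata endpoint here. The default
        # stub only returns a fallback value taken from the parameters.
        self._lookup = lookup or (
            lambda config: self._parameters.get("default_data_center", "us1")
        )

    def transform(self, config: MutableMapping[str, Any]) -> None:
        # Leave configs that were already migrated untouched.
        if "data_center" not in config:
            config["data_center"] = self._lookup(config)
```

Injecting the lookup as a callable keeps the network call out of the class body, so the transformation can be exercised in unit tests without reaching a real API.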

Summary by CodeRabbit

  • New Features
    • Added support for custom configuration transformations in declarative source manifests, enabling specification of custom transformation classes with optional parameters.
  • Tests
    • Introduced tests validating custom configuration transformations, including parameter handling and transformation effects on configurations.

@github-actions github-actions bot added the enhancement New feature or request label Jul 14, 2025
👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@christo/custom-config-migration#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch christo/custom-config-migration

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe <command> - Runs any poe command in the CDK environment

github-actions bot commented Jul 14, 2025

PyTest Results (Fast)

3 695 tests  +2   3 684 ✅ +2   6m 12s ⏱️ -3s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 26d09a0. ± Comparison against base commit 940e1fc.

♻️ This comment has been updated with latest results.

github-actions bot commented Jul 14, 2025

PyTest Results (Full)

3 698 tests   3 687 ✅  18m 7s ⏱️
    1 suites     11 💤
    1 files        0 ❌

Results for commit 26d09a0.

♻️ This comment has been updated with latest results.

@pnilan pnilan self-requested a review July 14, 2025 17:18
Contributor

@pnilan pnilan left a comment

Looks good -- I had expected us to need custom config transformations at some point so I'm all for enabling this.

@ChristoGrab ChristoGrab marked this pull request as ready for review July 14, 2025 17:55
Contributor

coderabbitai bot commented Jul 14, 2025

📝 Walkthrough

Walkthrough

A new CustomConfigTransformation component type was introduced to the declarative configuration schema and models, allowing users to specify custom transformation classes for configuration normalization and migration. The factory was updated to instantiate these components, and new unit tests were added to validate custom transformation logic and parameterization.

Changes

File(s) | Change Summary
--- | ---
airbyte_cdk/sources/declarative/declarative_component_schema.yaml | Added CustomConfigTransformation definition; updated transformation lists to accept this new type.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py | Introduced CustomConfigTransformation Pydantic model; updated transformation fields in relevant models to accept the new type.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py | Registered CustomConfigTransformationModel for instantiation in the component factory.
unit_tests/sources/declarative/transformations/config_transformations/test_custom_config_transformation.py | Added tests for custom config transformation logic and handling of parameters.

Sequence Diagram(s)

sequenceDiagram
    participant DeclarativeManifest
    participant ModelParser
    participant ModelToComponentFactory
    participant CustomTransformationClass

    DeclarativeManifest->>ModelParser: Parse manifest with CustomConfigTransformation
    ModelParser->>ModelToComponentFactory: Instantiate CustomConfigTransformationModel
    ModelToComponentFactory->>CustomTransformationClass: Create instance using class_name and parameters
    CustomTransformationClass-->>ModelToComponentFactory: Return component instance
    ModelToComponentFactory-->>ModelParser: Return instantiated transformation
    ModelParser-->>DeclarativeManifest: Transformation ready for config processing
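
The flow in the diagram can be approximated with a short, self-contained sketch. This is not the CDK's factory code: `instantiate_custom_component`, the `demo_components` module, the `AddFlag` class, and the `$parameters` key are assumptions made for illustration, and the sketch only shows the general technique of resolving a dotted `class_name` with `importlib` and passing parameters to the constructor.

```python
import importlib
import sys
import types
from typing import Any, Mapping, MutableMapping


class AddFlag:
    """Hypothetical custom transformation used only for this demonstration."""

    def __init__(self, parameters: Mapping[str, Any]) -> None:
        self.parameters = parameters

    def transform(self, config: MutableMapping[str, Any]) -> None:
        # Add a boolean flag whose name comes from the parameters.
        config[self.parameters.get("flag_name", "migrated")] = True


# Register a throwaway module so the dotted path below can be imported
# without any files on disk (a demo-only trick, not something the CDK does).
demo_module = types.ModuleType("demo_components")
demo_module.AddFlag = AddFlag
sys.modules["demo_components"] = demo_module


def instantiate_custom_component(definition: Mapping[str, Any]) -> Any:
    """Resolve `class_name` to a class and build it with the given parameters."""
    module_name, _, class_name = definition["class_name"].rpartition(".")
    if not module_name:
        raise ValueError("class_name must be fully qualified: <module>.<Class>")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(parameters=definition.get("$parameters", {}))


# Assumed manifest fragment, written as a Python dict.
definition = {
    "type": "CustomConfigTransformation",
    "class_name": "demo_components.AddFlag",
    "$parameters": {"flag_name": "uses_new_auth"},
}

transformation = instantiate_custom_component(definition)
config = {"api_key": "secret"}
transformation.transform(config)
```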

Suggested reviewers

  • lazebnyi
  • aldogonzalez8

Would you like me to suggest additional test cases focusing on error handling or edge cases for custom transformation classes, or do you feel the current coverage is sufficient? Wdyt?


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 365a853 and 26d09a0.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🔭 Outside diff range comments (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

2163-2187: Pipeline formatting issue needs attention.

The Ruff linter is flagging formatting issues in this range. Since this is an auto-generated file from declarative_component_schema.yaml, would you mind running the code formatter or regenerating the file to resolve these style violations?

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

224-230: Pipeline fails – please sort the import you just added

ruff flagged the import block as unsorted after the addition of CustomConfigTransformationModel. Running ruff --fix or isort will automatically place the new import in the correct position (alongside the other Custom* models), keeping the huge import section deterministic. wdyt?


690-696: Explicit constructor could aid type-safety

CustomConfigTransformationModel is wired to the generic create_custom_component. That works, but all other *TransformationModel entries (e.g. AddFields, RemoveFields) have explicit helpers that enforce the ConfigTransformation interface at construction time.
Would it make sense to add a small create_custom_config_transformation wrapper (delegating to create_custom_component) so future refactors can rely on static dispatch? wdyt?
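
One way the suggested wrapper could look is sketched below. It is only an illustration under stated assumptions: the `ConfigTransformation` protocol is a stand-in for the CDK interface, the generic factory is passed in as a plain callable rather than being a method on the real factory class, and the runtime check only verifies that a `transform` attribute exists.

```python
from typing import Any, Callable, Mapping, MutableMapping, Protocol, runtime_checkable


@runtime_checkable
class ConfigTransformation(Protocol):
    """Assumed shape of the interface: a single in-place `transform` method."""

    def transform(self, config: MutableMapping[str, Any]) -> None: ...


def create_custom_config_transformation(
    definition: Mapping[str, Any],
    create_custom_component: Callable[[Mapping[str, Any]], Any],
) -> ConfigTransformation:
    """Delegate to the generic factory, then verify the result's interface."""
    component = create_custom_component(definition)
    # isinstance against a runtime-checkable Protocol only checks that the
    # `transform` attribute is present, not its signature.
    if not isinstance(component, ConfigTransformation):
        raise TypeError(
            f"{definition.get('class_name')} does not implement transform(config)"
        )
    return component
```

Failing at construction time means a misconfigured `class_name` surfaces when the manifest is parsed instead of when the migration first runs.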

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 940e1fc and 1444335.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2 hunks)
  • unit_tests/sources/declarative/transformations/config_transformations/test_custom_config_transformation.py (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6)

<retrieved_learning>
Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py file, the strict module name checks in _get_class_from_fully_qualified_class_name (requiring module_name to be "components" and module_name_full to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
</retrieved_learning>

<retrieved_learning>
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py file is auto-generated from declarative_component_schema.yaml and should be ignored in the recommended reviewing order.
</retrieved_learning>

<retrieved_learning>
Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in airbyte_cdk/cli/source_declarative_manifest/, including _run.py, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
</retrieved_learning>

<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>

<retrieved_learning>
Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in airbyte_cdk/cli/source_declarative_manifest/ is being imported from another repository, avoid suggesting modifications to it during the import process.
</retrieved_learning>

<retrieved_learning>
Learnt from: aaronsteers
PR: #174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
</retrieved_learning>

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (6)

<retrieved_learning>
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py file is auto-generated from declarative_component_schema.yaml and should be ignored in the recommended reviewing order.
</retrieved_learning>

<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>

<retrieved_learning>
Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in airbyte_cdk/cli/source_declarative_manifest/, including _run.py, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
</retrieved_learning>

<retrieved_learning>
Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py file, the strict module name checks in _get_class_from_fully_qualified_class_name (requiring module_name to be "components" and module_name_full to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
</retrieved_learning>

<retrieved_learning>
Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in airbyte_cdk/cli/source_declarative_manifest/ is being imported from another repository, avoid suggesting modifications to it during the import process.
</retrieved_learning>

<retrieved_learning>
Learnt from: ChristoGrab
PR: #221
File: airbyte_cdk/sources/utils/transform.py:0-0
Timestamp: 2025-01-16T00:50:39.069Z
Learning: In the TypeTransformer class, the data being transformed comes from API responses or source systems, so only standard JSON-serializable types are expected. The python_to_json mapping covers all expected types, and it's designed to fail fast (KeyError) on unexpected custom types rather than providing fallbacks.
</retrieved_learning>

unit_tests/sources/declarative/transformations/config_transformations/test_custom_config_transformation.py (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.

🧬 Code Graph Analysis (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • CustomConfigTransformation (163-174)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

[error] 5-617: Ruff: Import block is un-sorted or un-formatted. Organize imports. 1 error found, 1 fixable with the '--fix' option.

unit_tests/sources/declarative/transformations/config_transformations/test_custom_config_transformation.py

[error] 25-35: Ruff formatting check failed. The file requires reformatting to comply with code style.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py

[error] 2163-2187: Ruff formatting check failed. The file requires reformatting to comply with code style.

🔇 Additional comments (11)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

3852-3856: CustomConfigTransformation added to the normal-sync transformation list – looks good!
No issues spotted in this hunk.


3884-3890: CustomConfigTransformation also wired into ConfigMigration – great coverage
Everything hooks up cleanly here as well.

unit_tests/sources/declarative/transformations/config_transformations/test_custom_config_transformation.py (5)

1-10: LGTM!

The imports and file header look clean and follow the expected conventions.


12-19: Nice implementation!

The mock class structure is clean and well-documented. The constructor properly handles optional parameters with safe defaults.


21-32: Great logic implementation!

The transform method correctly identifies user keys and applies transformations appropriately. The parameter handling is clean and the comments provide helpful context.


35-43: Solid test coverage!

The test properly validates both the preservation of original config and the addition of transformed fields. Clean and focused.


46-56: Excellent parameterized test!

This test nicely validates that parameters are correctly applied during transformation. The assertions comprehensively check all expected fields.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (4)

163-175: The new CustomConfigTransformation class structure looks solid!

The implementation follows the established pattern of other custom classes in the codebase with proper Config settings, type literal, and class name validation. The description and example clearly indicate the expected fully-qualified naming format. This should work well for the Mailchimp migration use case mentioned in the PR objectives, wdyt?


2166-2166: Clean integration into ConfigMigration transformations union!

Adding CustomConfigTransformation to the transformations list allows it to be used within migration workflows, which aligns perfectly with the PR's goal of supporting configuration migrations.


2183-2183: Good addition to ConfigNormalizationRules transformations as well!

This ensures custom transformations can be used both in migrations and normalization rules, providing flexibility for different use cases.


1-5: Note: This is an auto-generated file

Based on the retrieved learnings, this file is auto-generated from declarative_component_schema.yaml and should generally be ignored in the reviewing order. The changes here should reflect updates made to the source YAML schema. Just wanted to make sure this is intentional and the schema file was updated accordingly, wdyt?

@ChristoGrab ChristoGrab merged commit 7d045f7 into main Jul 14, 2025
32 checks passed
@ChristoGrab ChristoGrab deleted the christo/custom-config-migration branch July 14, 2025 23:57