-
Notifications
You must be signed in to change notification settings - Fork 69
feat: Support connector_builder_project_id in get_custom_source_definition #830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Support connector_builder_project_id in get_custom_source_definition #830
Conversation
Original prompt from AJ Steers |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
📝 WalkthroughWalkthroughThe CloudWorkspace.get_custom_source_definition API now requires exactly one identifier: either Changes
Sequence DiagramsequenceDiagram
participant Caller
participant CloudWorkspace
participant API_Util
participant ConfigAPI
Caller->>CloudWorkspace: get_custom_source_definition(definition_id?, definition_type, connector_builder_project_id?)
alt invalid parameters
CloudWorkspace->>CloudWorkspace: validate exactly one identifier
CloudWorkspace-->>Caller: PyAirbyteInputError
else connector_builder_project_id with non-yaml
CloudWorkspace-->>Caller: PyAirbyteInputError
else YAML + connector_builder_project_id
CloudWorkspace->>API_Util: get_definition_id_for_connector_builder_project(workspace_id, builder_project_id)
API_Util->>ConfigAPI: POST /connector_builder_projects/get_with_manifest
ConfigAPI-->>API_Util: { sourceDefinitionId: id | null }
API_Util-->>CloudWorkspace: sourceDefinitionId or AirbyteError
alt id found
CloudWorkspace->>ConfigAPI: get_custom_yaml_source_definition(resolved_id)
ConfigAPI-->>CloudWorkspace: YAML manifest
CloudWorkspace-->>Caller: CustomCloudSourceDefinition
else not found / error
CloudWorkspace-->>Caller: AirbyteError
end
else YAML + definition_id
CloudWorkspace->>ConfigAPI: get_custom_yaml_source_definition(definition_id)
ConfigAPI-->>CloudWorkspace: YAML manifest
CloudWorkspace-->>Caller: CustomCloudSourceDefinition
else Docker
CloudWorkspace-->>Caller: NotImplementedError
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Would you like me to suggest combining the new validation tests into a single parametrized test to reduce duplication, wdyt? Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1760655651-custom-source-definition-builder-project-id' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1760655651-custom-source-definition-builder-project-id'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
airbyte/cloud/workspaces.py (1)
637-665: Consider documenting the two-call overhead for builder_project_id lookups.The implementation resolves
connector_builder_project_idtodefinition_id(lines 638-644) then fetches the definition (lines 658-664), resulting in two API calls. This is necessary since no direct endpoint exists, but might be worth noting in the docstring for users concerned about performance, wdyt?Optional addition to docstring:
Args: definition_id: The definition ID. Mutually exclusive with `connector_builder_project_id`. definition_type: Connector type ("yaml" or "docker"). Required. connector_builder_project_id: The connector builder project ID. Mutually exclusive with - `definition_id`. + `definition_id`. Note: Lookups by builder_project_id require two API calls + (resolve ID + fetch definition).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/cloud/workspaces.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/cloud/workspaces.py (2)
airbyte/exceptions.py (2)
PyAirbyteInputError(201-210)AirbyteError(432-447)airbyte/_util/api_util.py (2)
get_custom_yaml_source_definition(1050-1084)get_definition_id_for_connector_builder_project(1199-1234)
🪛 GitHub Actions: Run Linters
airbyte/cloud/workspaces.py
[error] 594-594: Command 'poetry run ruff check .' failed with lint error: E501 Line too long (101 > 100).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (4)
airbyte/cloud/workspaces.py (4)
584-590: The signature doesn't enforce keyword-only fordefinition_id.The PR summary mentions changing to keyword-only (breaking change), but
definition_idappears before the*marker, so it's still positional. Was the intent to make both parameters keyword-only?If so, consider moving
definition_idafter the*:def get_custom_source_definition( self, - definition_id: str | None = None, *, + definition_id: str | None = None, definition_type: Literal["yaml", "docker"], connector_builder_project_id: str | None = None, ) -> CustomCloudSourceDefinition:This would make all three parameters keyword-only, which aligns with the breaking change mentioned in the PR objectives. Wdyt?
606-615: LGTM! XOR validation is correct.The validation logic correctly ensures exactly one of
definition_idorconnector_builder_project_idis provided using the XOR pattern(definition_id is None) == (connector_builder_project_id is None).
617-624: LGTM! Type constraint validation is correct.The check appropriately restricts
connector_builder_project_idto YAML definitions only, with clear error messaging.
646-656: LGTM! Error handling for missing builder_project_id is clear.The error message and context clearly indicate when a
connector_builder_project_iddoesn't resolve to a definition, making debugging straightforward.
|
Fixed! Changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte/cloud/workspaces.py (1)
594-595: Line length issue already flagged in previous review.The line length violation (E501) on line 594 has already been reported. Please address the fix suggested in the previous review comment.
🧹 Nitpick comments (1)
airbyte/cloud/workspaces.py (1)
627-654: Consider reducing code duplication by refactoring the fetch logic, wdyt?The implementation correctly handles both lookup paths, but lines 629-636 and 647-654 duplicate the fetch-and-return logic. This could be simplified by resolving
definition_idfirst (whenconnector_builder_project_idis provided), then having a single fetch path.Here's a refactored version that reduces duplication:
if definition_type == "yaml": + # Resolve definition_id if using connector_builder_project_id + resolved_definition_id = definition_id + if connector_builder_project_id is not None: + resolved_definition_id = api_util.get_definition_id_for_connector_builder_project( + workspace_id=self.workspace_id, + builder_project_id=connector_builder_project_id, + api_root=self.api_root, + client_id=self.client_id, + client_secret=self.client_secret, + ) + + # Fetch definition using resolved ID - if definition_id is not None: - result = api_util.get_custom_yaml_source_definition( - workspace_id=self.workspace_id, - definition_id=definition_id, - api_root=self.api_root, - client_id=self.client_id, - client_secret=self.client_secret, - ) - return CustomCloudSourceDefinition._from_yaml_response(self, result) # noqa: SLF001 - - if connector_builder_project_id is not None: - definition_id = api_util.get_definition_id_for_connector_builder_project( - workspace_id=self.workspace_id, - builder_project_id=connector_builder_project_id, - api_root=self.api_root, - client_id=self.client_id, - client_secret=self.client_secret, - ) - - result = api_util.get_custom_yaml_source_definition( - workspace_id=self.workspace_id, - definition_id=definition_id, - api_root=self.api_root, - client_id=self.client_id, - client_secret=self.client_secret, - ) - return CustomCloudSourceDefinition._from_yaml_response(self, result) # noqa: SLF001 + result = api_util.get_custom_yaml_source_definition( + workspace_id=self.workspace_id, + definition_id=resolved_definition_id, + api_root=self.api_root, + client_id=self.client_id, + client_secret=self.client_secret, + ) + return CustomCloudSourceDefinition._from_yaml_response(self, result) # noqa: SLF001
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/_util/api_util.py(1 hunks)airbyte/cloud/workspaces.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte/_util/api_util.py
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/cloud/workspaces.py (3)
airbyte/cloud/connectors.py (3)
connector_builder_project_id(369-387)CustomCloudSourceDefinition(261-591)_from_yaml_response(527-539)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/_util/api_util.py (2)
get_custom_yaml_source_definition(1050-1084)get_definition_id_for_connector_builder_project(1199-1246)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/cloud/workspaces.py (3)
603-606: Nice documentation of error cases!The Raises section clearly documents both validation scenarios. This helps users understand what inputs are valid.
607-616: LGTM! XOR validation is correctly implemented.The logic correctly enforces that exactly one of the two parameters must be provided. The error message is clear and the context aids debugging.
618-625: Appropriate type restriction validation.Correctly enforces that
connector_builder_project_idis only valid for YAML definitions. The error message clearly explains the constraint.
PyTest Results (Full)382 tests +1 364 ✅ - 1 26m 49s ⏱️ +23s For more details on these failures, see this check. Results for commit 44309a8. ± Comparison against base commit 813d0d0. ♻️ This comment has been updated with latest results. |
|
I've added the integration test as requested, but it's currently failing. The issue is that the It appears that when a YAML source definition is published via
However, these are separate entities and the reverse lookup (builder_project_id → definition_id) doesn't seem to be available through this endpoint. The existing test in builder_project_id = fetched.connector_builder_project_id
if builder_project_id:
# Only test if builder_project_id existsShould we:
Let me know which direction you'd like to take! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/cloud/workspaces.py(1 hunks)tests/integration_tests/cloud/test_custom_definitions.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/integration_tests/cloud/test_custom_definitions.py (4)
airbyte/cloud/connectors.py (3)
name(113-121)name(305-309)connector_builder_project_id(369-387)airbyte/cloud/workspaces.py (2)
get_custom_source_definition(584-650)publish_custom_source_definition(461-552)airbyte/_util/text_util.py (1)
generate_random_suffix(14-22)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)
airbyte/cloud/workspaces.py (3)
airbyte/cloud/connectors.py (3)
connector_builder_project_id(369-387)CustomCloudSourceDefinition(261-591)_from_yaml_response(527-539)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/_util/api_util.py (2)
get_definition_id_for_connector_builder_project(1199-1246)get_custom_yaml_source_definition(1050-1084)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (6)
tests/integration_tests/cloud/test_custom_definitions.py (3)
80-80: LGTM!Good catch updating this to use keyword arguments—aligns perfectly with the method signature change.
88-95: Nice addition to exercise the builder project lookup!This round-trip validation (fetch by definition_id → get builder_project_id → fetch by builder_project_id) is a great way to ensure consistency. The conditional check for
builder_project_idis also appropriate.
173-203: Excellent validation coverage!This unit test nicely covers the XOR validation cases and the type constraint for
connector_builder_project_id. Clean and focused.airbyte/cloud/workspaces.py (3)
586-589: Method signature looks good.The optional parameters with XOR validation are correctly implemented. Note that making
definition_idoptional is indeed a breaking change (as documented in the PR), but the updated tests demonstrate the migration path.
591-606: Clear documentation.The docstring accurately describes the new parameters and validation rules. The line breaking at 594-595 also resolves the previous lint concern.
641-650: LGTM!The fetch logic is clean and correct. The assert at line 642 is safe given the validation above ensures exactly one parameter is provided and definition_id is resolved from connector_builder_project_id when needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/cloud/workspaces.py(1 hunks)tests/integration_tests/cloud/test_custom_definitions.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte/cloud/workspaces.py (3)
airbyte/cloud/connectors.py (1)
connector_builder_project_id(369-387)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/_util/api_util.py (2)
get_definition_id_for_connector_builder_project(1199-1246)get_custom_yaml_source_definition(1050-1084)
tests/integration_tests/cloud/test_custom_definitions.py (4)
airbyte/cloud/connectors.py (3)
name(113-121)name(305-309)connector_builder_project_id(369-387)airbyte/cloud/workspaces.py (1)
get_custom_source_definition(584-650)airbyte/_util/text_util.py (1)
generate_random_suffix(14-22)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
| if definition_type != "yaml": | ||
| raise NotImplementedError( | ||
| "Docker custom source definitions are not yet supported. " | ||
| "Only YAML manifest-based custom sources are currently available." | ||
| ) | ||
|
|
||
| if (definition_id is None) == (connector_builder_project_id is None): | ||
| raise exc.PyAirbyteInputError( | ||
| message=( | ||
| "Must specify EXACTLY ONE of definition_id or connector_builder_project_id" | ||
| ), | ||
| context={ | ||
| "definition_id": definition_id, | ||
| "connector_builder_project_id": connector_builder_project_id, | ||
| }, | ||
| ) | ||
|
|
||
| if connector_builder_project_id: | ||
| if definition_type != "yaml": | ||
| raise exc.PyAirbyteInputError( | ||
| message="connector_builder_project_id is only valid for yaml definition_type", | ||
| context={ | ||
| "definition_type": definition_type, | ||
| "connector_builder_project_id": connector_builder_project_id, | ||
| }, | ||
| ) | ||
| definition_id = api_util.get_definition_id_for_connector_builder_project( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorder validation so Docker calls hit the intended PyAirbyteInputError
Right now the definition_type != "yaml" guard fires before we evaluate the mutually exclusive parameter check, so calling with definition_type="docker" and a connector_builder_project_id raises NotImplementedError instead of the documented PyAirbyteInputError. Could we move the XOR validation (and the builder-project/type check) ahead of the NotImplementedError so we emit the expected input error? Something along these lines might help, wdyt?
- if definition_type != "yaml":
- raise NotImplementedError(
- "Docker custom source definitions are not yet supported. "
- "Only YAML manifest-based custom sources are currently available."
- )
-
- if (definition_id is None) == (connector_builder_project_id is None):
+ if (definition_id is None) == (connector_builder_project_id is None):
raise exc.PyAirbyteInputError(
message=(
"Must specify EXACTLY ONE of definition_id or connector_builder_project_id"
),
context={
"definition_id": definition_id,
"connector_builder_project_id": connector_builder_project_id,
},
)
- if connector_builder_project_id:
- if definition_type != "yaml":
- raise exc.PyAirbyteInputError(
- message="connector_builder_project_id is only valid for yaml definition_type",
- context={
- "definition_type": definition_type,
- "connector_builder_project_id": connector_builder_project_id,
- },
- )
+ if connector_builder_project_id and definition_type != "yaml":
+ raise exc.PyAirbyteInputError(
+ message="connector_builder_project_id is only valid for yaml definition_type",
+ context={
+ "definition_type": definition_type,
+ "connector_builder_project_id": connector_builder_project_id,
+ },
+ )
+
+ if definition_type != "yaml":
+ raise NotImplementedError(
+ "Docker custom source definitions are not yet supported. "
+ "Only YAML manifest-based custom sources are currently available."
+ )
+
+ if connector_builder_project_id:
definition_id = api_util.get_definition_id_for_connector_builder_project(
workspace_id=self.workspace_id,
builder_project_id=connector_builder_project_id,
api_root=self.api_root,
client_id=self.client_id,
client_secret=self.client_secret,
)🤖 Prompt for AI Agents
In airbyte/cloud/workspaces.py between lines 607 and 633, reorder the validation
logic so that the XOR check for mutually exclusive definition_id and
connector_builder_project_id parameters happens before the definition_type
"yaml" check. This means first validate that exactly one of definition_id or
connector_builder_project_id is provided and that connector_builder_project_id
is only valid with "yaml" definition_type. Only after these validations pass,
check if definition_type is not "yaml" to raise NotImplementedError. This way,
calls with definition_type "docker" and connector_builder_project_id correctly
raise PyAirbyteInputError as expected.
|
Devin, Is this PR still worth working on? Compare with main to see if these features are still needed or already added. Then, if worth continuing, please reconcile the merge conflicts. |
…ition Allow get_custom_source_definition to accept either definition_id or connector_builder_project_id. Adds validation to ensure exactly one parameter is provided and that connector_builder_project_id is only used with yaml definition_type. - Modified method signature to accept optional definition_id and connector_builder_project_id parameters - Added validation for mutually exclusive parameters - Added validation for yaml-only connector_builder_project_id usage - Implemented lookup by connector_builder_project_id via list and match - Added test coverage for new parameter validation - Updated existing tests to use keyword arguments Requested by: AJ Steers (@aaronsteers) Co-Authored-By: AJ Steers <[email protected]>
Replace list-and-filter approach with direct API call to /connector_builder_projects/get_with_manifest endpoint. - Added get_definition_id_for_connector_builder_project() API function - Updated get_custom_source_definition to use the new API function - More efficient reverse lookup from builder_project_id to definition_id Suggested by: AJ Steers (@aaronsteers) Co-Authored-By: AJ Steers <[email protected]>
- Fix line length issue in docstring (E501) - Changed get_definition_id_for_connector_builder_project to raise AirbyteError instead of returning None when definition not found - Simplified caller code in get_custom_source_definition to remove None check since API function now raises directly Fixes lint error and addresses feedback from @aaronsteers Co-Authored-By: AJ Steers <[email protected]>
Mypy wasn't able to infer that definition_id is guaranteed to be set, so added an assertion to help the type checker. Co-Authored-By: AJ Steers <[email protected]>
d9f9aba to
44309a8
Compare
|
PR rebased on latest main! ✅ Summary:
Remaining question:
Let me know how you'd like to proceed with the integration test! |
feat: Support connector_builder_project_id in get_custom_source_definition
This PR is blocked by https://github.com/airbytehq/airbyte-platform-internal/pull/18082 which investigates API endpoint improvements needed for the reverse lookup functionality (builder_project_id → definition_id).
The current implementation uses
/connector_builder_projects/get_with_manifestbut this endpoint returnssourceDefinitionId: nullfor published YAML connectors, preventing the intended functionality from working reliably.Summary
Modified
CloudWorkspace.get_custom_source_definition()to accept eitherdefinition_idORconnector_builder_project_idas lookup parameters, with validation to ensure exactly one is provided.Key changes:
connector_builder_project_idparameter (yaml-only)Implementation note: Uses
/connector_builder_projects/get_with_manifestAPI endpoint to resolve builder_project_id to definition_id. However, this endpoint currently returns null for published connectors (see blocker above).Review & Testing Checklist for Human
get_custom_source_definition()Recommended Test Plan
definition_idonly (should work as before)connector_builder_project_idonly for a yaml definition (currently may fail due to API returning null)connector_builder_project_idfor docker type (should raise PyAirbyteInputError)connector_builder_project_id(should raise AirbyteError)Notes
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Important
Auto-merge enabled.
This PR is set to merge automatically when all requirements are met.