-
Notifications
You must be signed in to change notification settings - Fork 18
Feat: New CDK-Native FAST Standard tests, replaces CAT #349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis set of changes introduces a comprehensive framework for Airbyte connector testing, including new test suite base classes, scenario modeling, and job running utilities within the Changes
Sequence Diagram(s)sequenceDiagram
participant TestSuite as DeclarativeSourceTestSuite
participant Scenario as ConnectorTestScenario
participant Runner as run_test_job
participant Connector as ConcurrentDeclarativeSource
participant Entrypoint as entrypoint_wrapper
TestSuite->>Scenario: Load scenario from YAML
TestSuite->>TestSuite: create_connector(scenario)
TestSuite->>Connector: Instantiate with config, manifest, (components)
TestSuite->>Runner: run_test_job(connector, verb, scenario)
Runner->>Connector: Call connector method (e.g., check, read, discover)
Connector->>Entrypoint: Execute command
Entrypoint-->>Connector: Return EntrypointOutput
Connector-->>Runner: Return results
Runner-->>TestSuite: Assert results and errors
Would you like to see a more detailed breakdown of the test scenario loading and execution flow, or does this high-level overview suffice for your needs? Wdyt? Tip ⚡💬 Agentic Chat (Pro Plan, General Availability)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
/autofix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
airbyte_cdk/test/declarative/utils/job_runner.py (1)
56-57
:⚠️ Potential issueConsider addressing the TODO comment for protocol checking.
The comment "TODO: Get a valid protocol check here" suggests this check might not be fully implemented or needs improvement. This is a critical check for validating connector types. Would you consider implementing a more robust protocol check here, as suggested in past reviews?
Based on past review comments, remove the TODO and implement a proper check:
- elif isinstance(connector, IConnector): # TODO: Get a valid protocol check here + elif isinstance(connector, IConnector): # Perform a valid protocol check
🧹 Nitpick comments (6)
airbyte_cdk/test/declarative/utils/job_runner.py (2)
92-94
: Document the hack more clearly or consider refactoring.The comment mentions a hack related to the source needing the catalog early, but it's not fully explained why the try/except block needs to be redundant. Could you add more details about this limitation or consider refactoring to avoid the hack?
- # This is a bit of a hack because the source needs the catalog early. - # Because it *also* can fail, we have to redundantly wrap it in a try/except block. + # The source requires catalog information early in its initialization process, + # before command execution. Since failures can occur both during this early access + # and during command execution, we need exception handling in both places.
101-105
: Consider a more maintainable error formatting approach.The error formatting with nested list comprehension, string replacement, and joining might be hard to maintain. What do you think about extracting this to a helper function for better readability?
+def _format_error_messages(errors): + """Format error messages from traces into a readable string.""" + return "\n\n".join( + [str(err.trace.error).replace("\\n", "\n") for err in errors if err.trace], + ) + def run_test_job( # ... existing code ... if result.errors and not test_scenario.expect_exception: raise AssertionError( - "\n\n".join( - [str(err.trace.error).replace("\\n", "\n") for err in result.errors if err.trace], - ) + _format_error_messages(result.errors) )unit_tests/test/test_standard_tests.py (2)
13-24
: Great test cases, but consider testing more connector implementations.The parametrized test includes good positive and negative cases. Have you considered adding more realistic connector implementations, such as mocks with the necessary methods, to validate more complex cases?
@pytest.mark.parametrize( "input, expected", [ (DeclarativeSource, True), (Source, True), + (type("MockConnector", (), {"spec": lambda self, logger: None}), True), + (type("InvalidConnector", (), {"not_spec": lambda self, logger: None}), False), (None, False), ("", False), ([], False), ({}, False), (object(), False), ], )
25-31
: Consider testing instantiated connector classes.While you're testing both class types and instances, you might want to add specific tests for instantiated DeclarativeSource and Source classes to ensure they pass the instance check. Would that be useful?
Also, a simpler way to express the type check logic might be:
def test_is_iconnector_check(input: Any, expected: bool) -> None: """Assert whether inputs are valid as an IConnector object or class.""" if isinstance(input, type): assert issubclass(input, IConnector) == expected - return - - assert isinstance(input, IConnector) == expected + else: + assert isinstance(input, IConnector) == expectedunit_tests/resources/source_pokeapi_w_components_py/integration_tests/test_airbyte_standards.py (2)
10-11
: Add type hint for metafunc parameter.The metafunc parameter is missing a type hint. According to the relevant code snippet, it should be a pytest.Metafunc type. WDYT?
-def pytest_generate_tests(metafunc) -> None: +def pytest_generate_tests(metafunc: pytest.Metafunc) -> None: generate_tests(metafunc)
14-21
: Good implementation, but consider demonstrating customization.This is a great minimal implementation. To make this example more educational, would it be helpful to include an example of customizing the test suite with specific scenarios or overrides? This would demonstrate how developers can extend the base functionality for their specific needs.
For example, you could add a custom scenario method:
@classmethod def get_scenarios(cls) -> list[ConnectorTestScenario]: """Return the test scenarios for this test suite.""" # This could load from a file or define scenarios programmatically return [ ConnectorTestScenario( name="Basic PokeAPI test", config_dict={"pokemon_name": "pikachu"}, expect_exception=False, ) ]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
airbyte_cdk/test/declarative/models/scenario.py
(1 hunks)airbyte_cdk/test/declarative/test_suites/connector_base.py
(1 hunks)airbyte_cdk/test/declarative/test_suites/declarative_sources.py
(1 hunks)airbyte_cdk/test/declarative/test_suites/destination_base.py
(1 hunks)airbyte_cdk/test/declarative/test_suites/source_base.py
(1 hunks)airbyte_cdk/test/declarative/utils/job_runner.py
(1 hunks)pyproject.toml
(2 hunks)unit_tests/resources/source_pokeapi_w_components_py/integration_tests/test_airbyte_standards.py
(1 hunks)unit_tests/source_declarative_manifest/conftest.py
(2 hunks)unit_tests/source_declarative_manifest/test_source_declarative_w_custom_components.py
(4 hunks)unit_tests/test/test_standard_tests.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- airbyte_cdk/test/declarative/models/scenario.py
🚧 Files skipped from review as they are similar to previous changes (7)
- airbyte_cdk/test/declarative/test_suites/destination_base.py
- pyproject.toml
- unit_tests/source_declarative_manifest/test_source_declarative_w_custom_components.py
- airbyte_cdk/test/declarative/test_suites/source_base.py
- airbyte_cdk/test/declarative/test_suites/declarative_sources.py
- airbyte_cdk/test/declarative/test_suites/connector_base.py
- unit_tests/source_declarative_manifest/conftest.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
unit_tests/resources/source_pokeapi_w_components_py/integration_tests/test_airbyte_standards.py (2)
airbyte_cdk/test/declarative/test_suites/declarative_sources.py (1)
DeclarativeSourceTestSuite
(29-94)airbyte_cdk/test/declarative/test_suites/connector_base.py (1)
generate_tests
(31-72)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: preview_docs
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/test/declarative/utils/job_runner.py (1)
95-98
:❓ Verification inconclusive
Using a private API could lead to maintenance issues.
You're calling
entrypoint_wrapper._run_command
which is a non-public API, as indicated by the leading underscore and the noqa comment. Consider asking the maintainers about making this method public or creating a public wrapper function to avoid potential future breakages.
🌐 Web query:
What are best practices for handling private API dependencies in Python?
💡 Result:
When handling private API dependencies in Python, effective strategies ensure maintainability, security, and reproducibility. Here are some best practices:
1. Using Private Package Indexes
- Host private dependencies on a private package index, such as Devpi, PyPiServer, or Artifact Registry. These tools enable centralized management and secure access to dependencies[1][7].
- Use the
--index-url
or--extra-index-url
options inrequirements.txt
to specify the private repository URL[7].2. Version Control and Reproducibility
- Practice semantic versioning for all private dependencies to signal breaking changes and simplify compatibility management[4].
- Generate a lock file, like
requirements-freeze.txt
, to freeze dependencies and ensure consistent versions across environments[2].3. Editable Installs for Development
- During development, install private packages in editable mode using
pip install -e /path/to/package
for fast iteration without rebuilding packages. Usepyproject.toml
orsetup.py
to define dependencies explicitly[1].4. Dependency Installation from Git
- Install private dependencies directly from version control systems like GitHub or GitLab using URIs in
requirements.txt
. Example:Ensure authentication tokens are stored securely in environment variables[5].git+https://${GITLAB_USER}:${GITLAB_TOKEN}@gitlab.com/user/project.git@branch_name
5. Group Dependencies
- Separate dependencies into groups (e.g.,
dev
,testing
,prod
) to install only the required subset for specific use cases. This aids in managing and organizing dependencies effectively[2].6. Automated Testing
- Implement integration and unit tests to check API compatibility and ensure private dependencies function correctly after updates[4].
- Use a continuous integration (CI) pipeline to automate these tests, guaranteeing stability.
7. Dependency Injection
- Use dependency injection to decouple modules in your application. This approach simplifies testing and allows swapping dependencies for mock implementations when needed[3].
8. Security and Sensitive Data Handling
- Secure API keys or tokens for accessing private repositories by storing them in environment variables or using secret management tools[5][10].
- Regularly monitor and rotate secrets like API tokens to minimize exposure risks[10].
9. Comprehensive Documentation
- Maintain detailed documentation for each private API dependency, including its purpose, endpoints, and data handling. Use tools like Swagger or Postman to document and test APIs[4].
By following these practices, you can effectively manage private API dependencies in Python projects, ensuring security, maintainability, and smooth development workflows.
Citations:
- 1: https://discuss.python.org/t/dependency-management-with-many-private-local-packages/66213
- 2: https://www.fuzzylabs.ai/blog-post/managing-python-dependencies
- 3: https://arjancodes.com/blog/python-dependency-injection-best-practices/
- 4: https://dmwebsoft.com/best-practices-for-managing-api-dependencies-in-large-projects
- 5: https://docs.readthedocs.com/platform/stable/guides/private-python-packages.html
- 6: https://snyk.io/blog/how-to-secure-rest-api/
- 7: https://cloud.google.com/run/docs/runtimes/python-dependencies
- 8: https://community.temporal.io/t/python-dependency-management/12935
- 9: https://auth0.com/blog/developing-restful-apis-with-python-and-flask/
- 10: https://www.pythoncentral.io/how-to-handle-sensitive-data-in-python-best-practices-and-common-mistakes/
Attention: Private API usage in job_runner.py may become a future maintenance risk
It looks like you’re currently invoking the non-public method
entrypoint_wrapper._run_command
(as signaled by the underscore and accompanying# noqa
comment), which may expose the code to unforeseen changes since it's intended for internal use. Best practices for handling private API dependencies recommend either confirming with the maintainers about exposing a public interface or wrapping this call in a dedicated public method, so that potential breaking changes can be isolated.
- File:
airbyte_cdk/test/declarative/utils/job_runner.py
- Lines: 95-98
- Snippet:
result: entrypoint_wrapper.EntrypointOutput = entrypoint_wrapper._run_command( # noqa: SLF001 # Non-public API source=connector_obj, # type: ignore [arg-type] args=args, expecting_exception=test_scenario.expect_exception,Would you consider reaching out to the maintainers about making this method public or perhaps adding a stable public wrapper for it? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (2)
airbyte_cdk/test/standard_tests/pytest_hooks.py (1)
1-61
: 💡 Verification agent🧩 Analysis chain
Fix Ruff formatting issue
Ruff formatting check is failing on this file. Running the auto-formatter should resolve this.
🏁 Script executed:
#!/bin/bash # Run Ruff formatter on the file to see what changes it would make rg -q "ruff" pyproject.toml && python -m ruff format --diff airbyte_cdk/test/standard_tests/pytest_hooks.py || echo "Ruff not configured in pyproject.toml"Length of output: 412
Ruff Formatting Issue in
pytest_hooks.py
– Please Apply AutoformattingIt looks like Ruff is suggesting a minor reformat on line 12—specifically, adding an extra newline after the module docstring. Could you please run the autoformatter (e.g.,
python -m ruff format airbyte_cdk/test/standard_tests/pytest_hooks.py
) to automatically apply the change? wdyt?🧰 Tools
🪛 GitHub Actions: Linters
[error] 1-1: Ruff formatting check failed. File would be reformatted.
airbyte_cdk/test/standard_tests/connector_base.py (1)
1-151
: 💡 Verification agent🧩 Analysis chain
Fix Ruff formatting issue
Ruff formatting check is failing on this file. Running the auto-formatter should resolve this.
🏁 Script executed:
#!/bin/bash # Run Ruff formatter on the file to see what changes it would make rg -q "ruff" pyproject.toml && python -m ruff format --diff airbyte_cdk/test/standard_tests/connector_base.py || echo "Ruff not configured in pyproject.toml"Length of output: 1188
Apply autoformatter changes to fix Ruff formatting discrepancies
The autoformatter diff indicates two adjustments:
- Remove the extra blank line before the declaration of
class ConnectorTestSuiteBase(abc.ABC)
.- Reformat the assert statement in the
test_check
method to use a parenthesized condition and move the error message after the closing parenthesis.Could you please apply these autoformatter changes? wdyt?
🧰 Tools
🪛 GitHub Actions: Linters
[error] 1-1: Ruff formatting check failed. File would be reformatted.
🧹 Nitpick comments (9)
airbyte_cdk/test/standard_tests/pytest_hooks.py (1)
53-56
: Error message inconsistencyThe error message mentions a missing 'scenarios' attribute, but the code is actually checking for a 'get_scenarios' method. This could be confusing.
- raise ValueError( - f"Test class {test_class} does not have a 'scenarios' attribute. " - "Please define the 'scenarios' attribute in the test class." - ) + raise ValueError( + f"Test class {test_class} does not have a 'get_scenarios' method. " + "Please define the 'get_scenarios' method in the test class." + )Wdyt about updating the error message to match what we're actually checking for?
airbyte_cdk/test/standard_tests/declarative_sources.py (2)
18-24
: Add error handling for md5_checksumThe method works well, but would benefit from error handling for file I/O operations.
def md5_checksum(file_path: Path) -> str: """Helper function to calculate the MD5 checksum of a file. This is used to calculate the checksum of the `components.py` file, if it exists. """ - with open(file_path, "rb") as file: - return md5(file.read()).hexdigest() + try: + with open(file_path, "rb") as file: + return md5(file.read()).hexdigest() + except IOError as e: + raise IOError(f"Failed to read file for checksum calculation: {e}") from eWhat do you think about adding error handling here?
77-82
: Consider adding a comment about AIRBYTE_ENABLE_UNSAFE_CODEThe environment variable enables unsafe code execution, which might be a security consideration.
if cls.components_py_path and cls.components_py_path.exists(): + # Enable unsafe code execution for components.py injection + # This is necessary to allow custom Python code in components.py to be executed os.environ["AIRBYTE_ENABLE_UNSAFE_CODE"] = "true" config["__injected_components_py"] = cls.components_py_path.read_text() config["__injected_components_py_checksums"] = { "md5": md5_checksum(cls.components_py_path), }Adding a small comment here would help explain the security implications, wdyt?
airbyte_cdk/test/standard_tests/source_base.py (2)
93-95
: Improve error message in test_basic_readThe error message could be more descriptive to aid troubleshooting.
if not result.records: - raise AssertionError("Expected records but got none.") # noqa: TRY003 + raise AssertionError( + f"Expected records but got none for scenario: {scenario.name}. " + f"Check the connector configuration and verify stream availability." + ) # noqa: TRY003What do you think about enhancing this error message with more context?
121-121
: Add comment for setting scenario.status = "failed"Modifying the input parameter could lead to confusion without clear documentation.
# Set expected status to "failed" to ensure the test fails if the connector. + # This is intentionally modifying the scenario object to expect failure + # since we're testing with an invalid catalog scenario.status = "failed"Would adding a clearer comment help future maintainers understand the intent here?
airbyte_cdk/test/standard_tests/connector_base.py (4)
51-51
: Type ignore needs explanationThere's a type ignore annotation without explanation of why it's needed.
- connector = cls.connector # type: ignore + connector = cls.connector # type: ignore # connector is declared as possibly None, but at this point it's expected to have a valueWould adding a comment explaining the reason for the type ignore be helpful here?
84-106
: Consider optimizing directory searchThe method searches through parent directories which could be slow in deep directory structures.
What about adding a maximum depth parameter or using a more efficient approach if performance becomes an issue? This isn't urgent but might be worth considering for large projects.
141-141
: Add comment for filtering out iam_role testsThe explicit filtering of tests containing "iam_role" deserves an explanation.
tests_scenarios = [ ConnectorTestScenario.model_validate(test) for test in all_tests_config["acceptance_tests"][category]["tests"] - if "iam_role" not in test["config_path"] + # Skip IAM role tests as they require special handling with AWS credentials + if "iam_role" not in test["config_path"] ]It would be helpful to explain why we're excluding these specific tests, wdyt?
66-81
: Consider consolidating test_check implementationsThis class has a
test_check
method that's very similar to the one inSourceTestSuiteBase
.Since both this class and
SourceTestSuiteBase
implement very similartest_check
methods, would it make sense to refactor to avoid duplication? Perhaps by keeping the implementation here and havingSourceTestSuiteBase
call the parent method?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
airbyte_cdk/test/pytest_config/plugin.py
(1 hunks)airbyte_cdk/test/standard_tests/__init__.py
(1 hunks)airbyte_cdk/test/standard_tests/connector_base.py
(1 hunks)airbyte_cdk/test/standard_tests/declarative_sources.py
(1 hunks)airbyte_cdk/test/standard_tests/destination_base.py
(1 hunks)airbyte_cdk/test/standard_tests/pytest_hooks.py
(1 hunks)airbyte_cdk/test/standard_tests/source_base.py
(1 hunks)unit_tests/resources/source_pokeapi_w_components_py/components.py
(2 hunks)unit_tests/resources/source_pokeapi_w_components_py/integration_tests/test_airbyte_standards.py
(1 hunks)unit_tests/source_declarative_manifest/test_source_declarative_w_custom_components.py
(4 hunks)
✅ Files skipped from review due to trivial changes (2)
- airbyte_cdk/test/standard_tests/destination_base.py
- airbyte_cdk/test/standard_tests/init.py
🚧 Files skipped from review as they are similar to previous changes (4)
- airbyte_cdk/test/pytest_config/plugin.py
- unit_tests/resources/source_pokeapi_w_components_py/integration_tests/test_airbyte_standards.py
- unit_tests/source_declarative_manifest/test_source_declarative_w_custom_components.py
- unit_tests/resources/source_pokeapi_w_components_py/components.py
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/test/standard_tests/pytest_hooks.py
[error] 1-1: Ruff formatting check failed. File would be reformatted.
airbyte_cdk/test/standard_tests/connector_base.py
[error] 1-1: Ruff formatting check failed. File would be reformatted.
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (6)
airbyte_cdk/test/standard_tests/pytest_hooks.py (2)
42-48
: LGTM: Good early return patternThe early return when test_class is None is a clean approach.
58-60
: Good approach to scenario handlingThe implementation for retrieving scenarios from the test class and parametrizing tests is clean and effective. This enables flexible test parameterization without requiring custom test methods.
airbyte_cdk/test/standard_tests/declarative_sources.py (2)
38-48
: LGTM: Good error handling in manifest_yaml_pathThe property correctly checks if the manifest file exists and raises a clear error message if not. The use of classproperty is also appropriate here.
50-60
: LGTM: Well-structured components_py_path propertyThe property correctly handles the case where the components.py file doesn't exist by returning None. This is a clean approach.
airbyte_cdk/test/standard_tests/source_base.py (2)
32-54
: LGTM: Well-implemented test_check methodThe method effectively validates the connector's ability to establish a connection and return the expected message type.
83-83
:❓ Verification inconclusive
Consider addressing type ignore comments
The use of type ignore annotations might be hiding potential issues.
Can we revisit why these type ignores are needed and see if we can address the underlying issue instead?
🏁 Script executed:
#!/bin/bash # Check for discover_result.catalog.catalog.streams to understand why type ignore is needed rg -A 3 "EntrypointOutput" airbyte_cdk/test/entrypoint_wrapper.pyLength of output: 837
Revisit Type Ignore Usage on Discover Result
It looks like the
# type: ignore
was added to suppress issues—likely stemming from the waydiscover_result.catalog.catalog
is typed (possibly dealing with optional or union types). Would you be open to one of the following approaches?
- Refining the type definitions: Perhaps updating the type hints for the objects involved (e.g., ensuring that the inner
catalog
is non-optional) might remove the need for ignoring.- Adding a runtime check: Instead of bypassing the type check, we could add a guard to verify that
catalog
is indeed present before iterating over itsstreams
.WDYT? This would help ensure we’re not potentially overlooking a subtle type issue.
/autofix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
APPROVED, thanks for answering my annoying questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
airbyte_cdk/test/standard_tests/_job_runner.py (2)
87-92
: Missing cleanup for temporary files?The code creates temporary files for configs but doesn't have explicit cleanup. Would using a context manager or try/finally block to ensure these temp files get deleted be better for preventing disk space issues during long test runs? Wdyt?
- config_path = ( - Path(tempfile.gettempdir()) / "airbyte-test" / f"temp_config_{uuid.uuid4().hex}.json" - ) - config_path.parent.mkdir(parents=True, exist_ok=True) - config_path.write_text(orjson.dumps(test_scenario.config_dict).decode()) - args += ["--config", str(config_path)] + with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as temp_config: + config_path = Path(temp_config.name) + temp_config.write(orjson.dumps(test_scenario.config_dict)) + args += ["--config", str(config_path)] + # Register for cleanup after the function completes + # (Implementation depends on your preferred approach)
99-106
: Similar cleanup concern for catalog temp files.As with the config files, catalog temp files could benefit from proper cleanup. Would using a context manager be preferred here as well?
airbyte_cdk/test/standard_tests/models/scenario.py (3)
46-58
: Config dictionary retrieval looks good.The
get_config_dict
method appropriately handles both direct dictionary config and loading from file paths.Could adding some error handling around the YAML loading make this more robust against malformed files? Wdyt?
if self.config_path: - return cast(dict[str, Any], yaml.safe_load(self.config_path.read_text())) + try: + return cast(dict[str, Any], yaml.safe_load(self.config_path.read_text())) + except yaml.YAMLError as e: + raise ValueError(f"Failed to parse config file {self.config_path}: {e}") from e
60-62
: Boolean expression could be clearer.The
expect_exception
property uses an or expression that might be a bit difficult to parse at first glance.@property def expect_exception(self) -> bool: - return self.status and self.status == "failed" or False + return self.status == "failed" if self.status else False
64-66
: Safe property access.The
instance_name
property assumesconfig_path
is notNone
when accessing.stem
.@property def instance_name(self) -> str: - return self.config_path.stem if self.config_path else "Unnamed Scenario" + return self.config_path.stem if self.config_path is not None else "Unnamed Scenario"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
airbyte_cdk/test/standard_tests/_job_runner.py
(1 hunks)airbyte_cdk/test/standard_tests/connector_base.py
(1 hunks)airbyte_cdk/test/standard_tests/declarative_sources.py
(1 hunks)airbyte_cdk/test/standard_tests/models/__init__.py
(1 hunks)airbyte_cdk/test/standard_tests/models/scenario.py
(1 hunks)airbyte_cdk/test/standard_tests/source_base.py
(1 hunks)unit_tests/test/test_standard_tests.py
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- airbyte_cdk/test/standard_tests/models/init.py
- airbyte_cdk/test/standard_tests/declarative_sources.py
🚧 Files skipped from review as they are similar to previous changes (3)
- unit_tests/test/test_standard_tests.py
- airbyte_cdk/test/standard_tests/source_base.py
- airbyte_cdk/test/standard_tests/connector_base.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (9)
airbyte_cdk/test/standard_tests/_job_runner.py (7)
24-41
: Clean error formatting function!This function effectively converts errors from the entrypoint output to readable strings, handling the no-error case gracefully and ensuring that error traces are properly formatted.
44-55
: Great use of Protocol for the connector interface.The @runtime_checkable Protocol approach provides a clean way to define the connector interface. The docstring nicely explains the current dependency on 'spec' and future direction.
65-71
: Configuration handling looks good.The code properly converts ConfiguredAirbyteCatalog to dictionary format if needed. This ensures compatibility regardless of the input format.
72-81
: Connector instantiation looks clean.The code handles multiple connector input types (class, factory, instance) elegantly, with proper type checking and error handling.
112-114
: Unclear comment about catalog access.The comment mentions "a bit of a hack because the source needs the catalog early" but there's no code implementing this hack. Is this comment outdated or is something missing here?
125-146
: Clean check verb validation.The code properly validates CONNECTION_STATUS messages for the check verb, including ensuring a single message is present and that failed status is reported when exceptions are expected.
148-157
: Good exception handling for non-check verbs.The error validation logic for non-check verbs is clean and appropriate, making assertions about errors based on the expected exception status.
airbyte_cdk/test/standard_tests/models/scenario.py (2)
19-45
: Well-structured connector test scenario model.The Pydantic model provides a clear structure for test scenarios with appropriate nested classes for related data. All fields are properly typed.
68-74
: String representation looks good.The
__str__
method provides a nice hierarchy of identifiers for the test scenario, making debugging easier.
Some drive-by fixes have been spun out here:
Test
#477Test PRs for three connectors are here:
In each of these, I've disabled CAT and replaced with the new FAST Standard tests.
✅ In scope here
integration_tests
orunit_tests
- is agnostic.acceptanceTests
) can be copy-pasted to theintegrationTests
section ofmetadata.yaml
.acceptanceTests
withintegrationTests
to toggle over to the new tests (after creating the Pytest scaffold file). (The linked PRs all disable CAT.)❌ Future Scope (not included here):
Merging the above will significantly decrease the number of changed files here in this PR.
Loom Walkthrough
Here is the walkthrough of the Hubspot connector.
https://www.loom.com/share/7f8d8a8dabef43049b478b119ed4aff0
Published as version:
6.45.4.post72.dev14497997772
(f39f480)Using the recursive acronym "FAST", this PR adds Fast Airbyte Standard Tests - a replacement for CAT.
FAST tests are:
Summary by CodeRabbit
New Features
Bug Fixes
Chores