Skip to content

tests: add docker-based tests to FAST standard tests #586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 68 commits into from
Jun 13, 2025

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Jun 6, 2025

Enhance the testing framework by adding Docker-based tests and updating the .gitignore to exclude JUnit results. Improve code readability in test calls.

Summary by CodeRabbit

  • New Features

    • Introduced a CLI command to test connector Docker images, enabling tests on pre-built or newly built images.
    • Added support for passing extra arguments to pytest during connector testing.
    • Added a context manager for creating temporary config files in test scenarios.
    • Added new pytest CLI options and fixtures for flexible control over connector image usage, stream reading, and test scenario selection.
    • Added serializers for Airbyte catalog and stream models.
    • Added a new optional field for suggested streams in connector metadata.
    • Added a new metadata YAML file for the source-pokeapi connector.
  • Enhancements

    • Streamlined and simplified the connector test workflow in GitHub Actions.
    • Improved separation of concerns by refactoring test execution logic in the CLI.
    • Expanded test scenario status options to include "exception".
    • Enhanced Docker image build and command execution utilities for better output handling and directory management.
    • Extended entrypoint output handling to support file-based messages and memory-efficient iteration.
    • Improved test message handling by using public APIs instead of private attributes.
    • Refined secrets fetching logic to better handle CI environment detection and printing of secret masks.
    • Adjusted connector root path resolution to always return absolute paths.
    • Added warnings and error handling for missing or malformed test configuration files.
  • Bug Fixes

    • Fixed test skipping logic for scenarios expecting exceptions to avoid false failures.
  • Refactor

    • Refactored test suite base classes to enhance Docker-based connector testing support.
    • Updated internal mappings and imports for test suite management.
    • Reorganized entrypoint output class for better message handling and iteration.

Copy link

github-actions bot commented Jun 6, 2025

PyTest Results (Fast)

3 666 tests  +10   3 655 ✅ +9   5m 58s ⏱️ +36s
    1 suites ± 0      11 💤 +1 
    1 files   ± 0       0 ❌ ±0 

Results for commit 9fcc198. ± Comparison against base commit a1b575f.

This pull request removes 4 and adds 14 tests. Note that renamed tests count towards both.
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_basic_read['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_check['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_discover['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_fail_read_with_bad_catalog['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_basic_read['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_check['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_discover['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_docker_image_build_and_check['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_docker_image_build_and_read['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_docker_image_build_and_spec
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_fail_read_with_bad_catalog['valid_config' Test Scenario]
unit_tests.sources.declarative.incremental.test_concurrent_perpartitioncursor ‑ test_duplicate_partition_after_closing_partition_cursor_deleted
unit_tests.sources.declarative.incremental.test_concurrent_perpartitioncursor ‑ test_duplicate_partition_after_closing_partition_cursor_exists
unit_tests.sources.declarative.incremental.test_concurrent_perpartitioncursor ‑ test_duplicate_partition_while_processing
…

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Jun 6, 2025

PyTest Results (Full)

3 669 tests  +10   3 658 ✅ +9   17m 27s ⏱️ +15s
    1 suites ± 0      11 💤 +1 
    1 files   ± 0       0 ❌ ±0 

Results for commit 9fcc198. ± Comparison against base commit a1b575f.

This pull request removes 4 and adds 14 tests. Note that renamed tests count towards both.
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_basic_read['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_check['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_discover['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_fail_read_with_bad_catalog['valid_config.yaml' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_basic_read['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_check['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_discover['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_docker_image_build_and_check['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_docker_image_build_and_read['valid_config' Test Scenario]
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_docker_image_build_and_spec
unit_tests.resources.source_pokeapi_w_components_py.integration_tests.test_airbyte_standards.TestSuiteSourcePokeAPI ‑ test_fail_read_with_bad_catalog['valid_config' Test Scenario]
unit_tests.sources.declarative.incremental.test_concurrent_perpartitioncursor ‑ test_duplicate_partition_after_closing_partition_cursor_deleted
unit_tests.sources.declarative.incremental.test_concurrent_perpartitioncursor ‑ test_duplicate_partition_after_closing_partition_cursor_exists
unit_tests.sources.declarative.incremental.test_concurrent_perpartitioncursor ‑ test_duplicate_partition_while_processing
…

♻️ This comment has been updated with latest results.

@aaronsteers
Copy link
Contributor Author

aaronsteers commented Jun 6, 2025

I believe the test is failing because our manifest-only Dockerfile is not correct yet.

Complaint is of not having __injected_* config elements, which are needed when invoking source-declarative-manifest directly but shouldn't be needed when invoking the built version of the connector.

image

https://github.com/airbytehq/airbyte-python-cdk/runs/43642867333

@aaronsteers aaronsteers marked this pull request as ready for review June 7, 2025 08:23
@Copilot Copilot AI review requested due to automatic review settings June 7, 2025 08:23
@aaronsteers
Copy link
Contributor Author

/format-fix

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Enhance the standard test framework to support Docker-based connector tests, streamline test suite mapping, and expose a new CLI command for image testing.

  • Add a DockerConnectorTestSuite and related pytest hooks for building and running connector images
  • Refactor build_connector_image to return the image tag instead of exiting on success
  • Introduce new CLI commands (airbyte-cdk image test and connector test) with --connector-image and --pytest-arg options

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
unit_tests/resources/source_pokeapi_w_components_py/metadata.yaml New test metadata for source-pokeapi connector
airbyte_cdk/utils/docker.py Change build_connector_image to return a string tag; adjust paths
airbyte_cdk/test/standard_tests/util.py Update TEST_CLASS_MAPPING keys, import DockerConnectorTestSuite, refine get_connector_root_dir override
airbyte_cdk/test/standard_tests/source_base.py Hoist entrypoint_wrapper import under TYPE_CHECKING
airbyte_cdk/test/standard_tests/pytest_hooks.py Add --connector-image CLI option and fixture for image override
airbyte_cdk/test/standard_tests/models/scenario.py Extend status enum, refine expect_exception, add temp config file context manager
airbyte_cdk/test/standard_tests/docker_base.py New base class for Docker-based connector tests with build/run methods
airbyte_cdk/test/standard_tests/connector_base.py Refactor ConnectorTestSuiteBase to inherit from Docker base class
airbyte_cdk/cli/airbyte_cdk/_image.py Add image test command under airbyte-cdk image CLI
airbyte_cdk/cli/airbyte_cdk/_connector.py Rename and extend connector test command with --pytest-arg support
Comments suppressed due to low confidence (2)

airbyte_cdk/utils/docker.py:137

  • The docstring for build_connector_image hasn’t been updated to reflect that it now returns the built image tag as a string; consider adding a Returns: section describing the returned str.
def build_connector_image(

airbyte_cdk/test/standard_tests/util.py:20

  • [nitpick] The type annotation restricts mapping values to DockerConnectorTestSuite, but entries like SourceTestSuiteBase and DeclarativeSourceTestSuite inherit from ConnectorTestSuiteBase; updating the annotation to type[ConnectorTestSuiteBase] would be more accurate.
TEST_CLASS_MAPPING: dict[

Copy link
Contributor

coderabbitai bot commented Jun 7, 2025

📝 Walkthrough

Walkthrough

This update introduces a new Docker-based connector image test command, refactors the connector test CLI for improved argument handling, and adds a helper for running connector tests. It also introduces a new Docker test suite base class, updates test scenario handling with temporary config file support, modifies workflow steps, and updates metadata and test utilities to align with the new testing architecture. Additionally, it enhances Docker image building with multi-platform support and flexible output capturing, and refactors the EntrypointOutput class for memory-efficient message iteration.

Changes

File(s) Change Summary
airbyte_cdk/cli/airbyte_cdk/_connector.py Refactored connector test CLI: renamed command, added --pytest-arg, moved test logic to run_connector_tests helper.
airbyte_cdk/cli/airbyte_cdk/_image.py Added new image test CLI command for Docker image testing, supporting image selection/building and test execution.
airbyte_cdk/test/standard_tests/__init__.py Changed import of ConnectorTestScenario to source from models.scenario instead of connector_base.
airbyte_cdk/test/standard_tests/connector_base.py Simplified ConnectorTestSuiteBase: now inherits from DockerConnectorTestSuite, removed config/scenario methods.
airbyte_cdk/test/standard_tests/docker_base.py Introduced DockerConnectorTestSuite class for Docker-based connector test logic, scenario loading, and test methods.
airbyte_cdk/test/standard_tests/pytest_hooks.py Added --connector-image pytest option and connector_image_override fixture for Docker image override in tests; added --read-from-streams and --read-scenarios options and fixtures.
airbyte_cdk/test/standard_tests/util.py Updated test class mapping for Java connectors to use DockerConnectorTestSuite; fixed method override to classmethod.
airbyte_cdk/utils/docker.py Changed build_connector_image to return image tag string, ensured Dockerfile dir exists, improved output and flow; enhanced run_docker_command to support flexible output capturing.
airbyte_cdk/test/models/scenario.py Added "exception" status, implemented with_temp_config_file context manager for temp config file handling; renamed id field to _id with new id property.
unit_tests/resources/source_pokeapi_w_components_py/metadata.yaml Added new metadata YAML for source-pokeapi connector with build, registry, and test suite configuration.
.github/workflows/connector-tests.yml Streamlined workflow: added source-intercom, simplified test step, removed output parsing and artifact upload.
airbyte_cdk/models/airbyte_protocol_serializers.py Added serializers for AirbyteCatalog and AirbyteStream.
airbyte_cdk/models/connector_metadata.py Added optional suggestedStreams field to ConnectorMetadata.
airbyte_cdk/test/entrypoint_wrapper.py Refactored EntrypointOutput to support file-based input, generator-based message iteration, and added uncaught exception handling.
airbyte_cdk/cli/airbyte_cdk/_secrets.py Modified fetch command to handle --print-ci-secrets-masks as optional bool with smarter default based on CI env var.
airbyte_cdk/test/standard_tests/source_base.py Updated test_check to use public connection_status_messages property instead of private _messages list; added conditional skips for exception scenarios.
airbyte_cdk/utils/connector_paths.py Modified find_connector_root to return absolute paths for connector root detection.
pyproject.toml Added explicit setuptools dependency to address undeclared transitive dependency issues.

Sequence Diagram(s)

Connector Test Command Flow

sequenceDiagram
    participant User
    participant CLI
    participant run_connector_tests
    participant pytest

    User->>CLI: airbyte-cdk connector test [args]
    CLI->>run_connector_tests: Prepare connector info and pytest args
    run_connector_tests->>pytest: Run pytest with args
    pytest-->>run_connector_tests: Test results
    run_connector_tests-->>CLI: Exit or report result
Loading

Image Test Command Flow

sequenceDiagram
    participant User
    participant CLI
    participant Docker
    participant run_connector_tests
    participant pytest

    User->>CLI: airbyte-cdk image test [--image X]
    CLI->>Docker: (Optional) Build image if --image not provided
    Docker-->>CLI: Image tag
    CLI->>run_connector_tests: Pass connector info and pytest args (with image)
    run_connector_tests->>pytest: Run pytest -m image_tests --connector-image
    pytest-->>run_connector_tests: Test results
    run_connector_tests-->>CLI: Exit or report result
Loading

Possibly related PRs

Would you like to explore further refinements to the CLI argument structure or test scenario parametrization, wdyt?

Suggested reviewers

  • dbgold17
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
unit_tests/resources/source_pokeapi_w_components_py/metadata.yaml (1)

16-19: Consider clarifying the base image versioning approach

The comment suggests updating to the latest version, but then uses a specific SHA256 hash for reproducibility. These seem contradictory - perhaps rephrase the comment to clarify the intended update strategy? Something like "Check for newer versions at https://hub.docker.com/r/airbyte/python-connector-base" might be clearer, wdyt?

airbyte_cdk/cli/airbyte_cdk/_image.py (1)

92-165: Well-structured image test command with good error handling!

The command follows the established patterns and provides a clean interface for testing Docker images. The logic flow is clear: verify Docker → resolve metadata → build image if needed → run tests.

A few observations:

  • The hardcoded tag "dev-latest" is reasonable for development testing
  • The no_verify=True when building for tests makes sense for performance
  • Good error handling throughout

One small suggestion - wdyt about adding a brief note in the docstring about what the image_tests marker expects? This could help users understand which tests will run.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f398ab6 and dc788fa.

📒 Files selected for processing (11)
  • airbyte_cdk/cli/airbyte_cdk/_connector.py (4 hunks)
  • airbyte_cdk/cli/airbyte_cdk/_image.py (2 hunks)
  • airbyte_cdk/test/standard_tests/__init__.py (1 hunks)
  • airbyte_cdk/test/standard_tests/connector_base.py (1 hunks)
  • airbyte_cdk/test/standard_tests/docker_base.py (1 hunks)
  • airbyte_cdk/test/standard_tests/models/scenario.py (4 hunks)
  • airbyte_cdk/test/standard_tests/pytest_hooks.py (1 hunks)
  • airbyte_cdk/test/standard_tests/source_base.py (2 hunks)
  • airbyte_cdk/test/standard_tests/util.py (2 hunks)
  • airbyte_cdk/utils/docker.py (4 hunks)
  • unit_tests/resources/source_pokeapi_w_components_py/metadata.yaml (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/test/standard_tests/__init__.py (3)
airbyte_cdk/test/standard_tests/connector_base.py (1)
  • ConnectorTestSuiteBase (29-114)
airbyte_cdk/test/standard_tests/declarative_sources.py (1)
  • DeclarativeSourceTestSuite (27-98)
airbyte_cdk/test/standard_tests/models/scenario.py (1)
  • ConnectorTestScenario (25-107)
airbyte_cdk/cli/airbyte_cdk/_image.py (4)
airbyte_cdk/cli/airbyte_cdk/_connector.py (1)
  • run_connector_tests (155-204)
airbyte_cdk/utils/docker.py (3)
  • verify_docker_installation (398-404)
  • build_connector_image (137-268)
  • ConnectorImageBuildError (22-35)
airbyte_cdk/utils/connector_paths.py (1)
  • resolve_connector_name_and_directory (84-148)
airbyte_cdk/models/connector_metadata.py (2)
  • MetadataFile (72-97)
  • from_file (80-97)
🪛 GitHub Actions: Linters
airbyte_cdk/test/standard_tests/pytest_hooks.py

[error] 32-32: mypy error: Returning Any from function declared to return "str | None" [no-any-return]

airbyte_cdk/test/standard_tests/docker_base.py

[error] 1-1: Ruff formatting check failed. File would be reformatted. Run 'ruff format' to fix code style issues.

🪛 Pylint (3.3.7)
airbyte_cdk/test/standard_tests/models/scenario.py

[refactor] 79-79: Boolean expression may be simplified to False

(R1709)

airbyte_cdk/test/standard_tests/docker_base.py

[error] 43-43: Method 'acceptance_test_config_path' should have "self" as first argument

(E0213)


[error] 73-73: Method 'acceptance_test_config_path' has no 'read_text' member

(E1101)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
🔇 Additional comments (25)
airbyte_cdk/test/standard_tests/source_base.py (1)

5-5: Nice improvement with the TYPE_CHECKING import pattern!

Moving the entrypoint_wrapper import to a TYPE_CHECKING block is a great practice that prevents runtime imports while preserving type annotations. This should help avoid potential circular import issues too, wdyt?

Also applies to: 24-26

airbyte_cdk/test/standard_tests/__init__.py (1)

35-35: Good refactoring to move ConnectorTestScenario to the models module!

Moving ConnectorTestScenario from connector_base to models.scenario makes the code organization cleaner and more logical. The models module is a better home for data structures like this, wdyt?

airbyte_cdk/test/standard_tests/util.py (3)

13-13: LGTM on the DockerConnectorTestSuite integration!

The import and type annotation updates look good and align with the new Docker-based testing infrastructure.

Also applies to: 21-22


26-26: Question about the mapping change from "declarative" to "java"

I notice you changed the key from "declarative" to "java" in the TEST_CLASS_MAPPING. Are you adding Java connector support or replacing the declarative mapping? The current mapping shows:

  • "manifest-only": DeclarativeSourceTestSuite (still there)
  • "java": DockerConnectorTestSuite (new)

Just want to confirm this is the intended change and not accidentally removing declarative support, wdyt?


71-71: Great fix with the classmethod lambda!

Converting the lambda to classmethod(lambda cls: connector_directory) ensures proper class method behavior. This is essential for the method to work correctly when called on the dynamically created test suite class.

airbyte_cdk/test/standard_tests/pytest_hooks.py (2)

19-27: Excellent addition of the --connector-image CLI option!

This new pytest option provides great flexibility for Docker-based testing by allowing users to specify pre-built images instead of building during tests. The implementation looks clean and well-documented.


29-33: ⚠️ Potential issue

Fix the mypy type issue in the fixture

The pipeline shows a mypy error because request.config.getoption() returns Any, but your function is annotated to return str | None. Here's a quick fix, wdyt?

 @pytest.fixture
 def connector_image_override(request: pytest.FixtureRequest) -> str | None:
     """Return the value of --connector-image, or None if not set."""
-    return request.config.getoption("--connector-image")
+    return request.config.getoption("--connector-image") or None

This ensures we return None for falsy values and satisfies the type checker.

Likely an incorrect or invalid review comment.

🧰 Tools
🪛 GitHub Actions: Linters

[error] 32-32: mypy error: Returning Any from function declared to return "str | None" [no-any-return]

unit_tests/resources/source_pokeapi_w_components_py/metadata.yaml (1)

1-1: Clarify the purpose of this test resource file

Since this file is copied from the source-pokeapi connector and placed in a test resources directory, could you confirm this is intended as a test fixture? If so, it might be helpful to add a comment clarifying its test-specific purpose to avoid confusion with the actual connector metadata, wdyt?

airbyte_cdk/test/standard_tests/models/scenario.py (2)

50-50: Good enhancement to support exception status!

The addition of "exception" status and the updated expect_exception property logic looks correct. The static analysis hint about simplifying the boolean expression appears to be a false positive - the current implementation properly handles the case when self.status is None.

Also applies to: 79-79


93-107: Well-implemented temporary config file management!

The with_temp_config_file context manager is nicely implemented with proper cleanup handling. Great attention to detail with:

  • Setting readable permissions for other processes (important for Docker containers)
  • Using suppress(OSError) for cleanup to avoid masking exceptions
  • Proper use of tempfile with descriptive prefix/suffix
airbyte_cdk/test/standard_tests/connector_base.py (3)

29-30: Clean refactoring to leverage DockerConnectorTestSuite!

Nice work simplifying this class by delegating Docker-specific functionality to the parent class. This separation of concerns makes the code more maintainable.


50-60: Good defensive programming with directory restoration

Excellent use of try/finally to ensure the working directory is restored even if the import fails. This prevents side effects that could affect other tests.


67-76: Thoughtful error handling with case-insensitive fallback

The case-insensitive class name matching is a nice developer experience enhancement. The error messages are also clear and helpful for debugging import issues.

airbyte_cdk/cli/airbyte_cdk/_image.py (1)

13-13: Good integration with the refactored connector testing logic!

Nice to see the reuse of the new run_connector_tests helper function from _connector.py. This promotes consistency between the connector test and image test commands.

airbyte_cdk/utils/docker.py (3)

146-146: Great improvement to make the function return the built image tag!

This change enables programmatic use of the built image, which is exactly what the new image test command needs. The return type annotation makes the API clear.


168-168: Nice refactoring with the connector_dockerfile_dir variable and directory creation!

The introduction of connector_dockerfile_dir improves readability and reduces duplication. The explicit directory creation with mkdir(parents=True, exist_ok=True) is a good defensive practice that ensures the build process is more robust.

Also applies to: 173-174, 197-198


261-268: Solid return logic that maintains the existing behavior while adding the return value!

The function now properly returns the image tag on success while preserving the existing exit-on-failure behavior. The updated success messages are more descriptive too.

airbyte_cdk/cli/airbyte_cdk/_connector.py (4)

104-104: Good explicit command naming!

Explicitly setting the command name to "test" while using connector_test as the function name provides clarity about the CLI interface.


117-123: Nice addition of the --pytest-arg option for flexibility!

This gives users the ability to pass additional pytest arguments, which is very useful for debugging and customizing test runs. The multiple=True parameter allows for multiple arguments to be passed, which is perfect.


141-152: Clean delegation to the extracted helper function!

The command function now focuses on CLI argument parsing and delegates the actual test execution. This separation of concerns is excellent and enables the reuse we see in the image test command.


155-204: Well-extracted helper function with comprehensive test setup!

The run_connector_tests function encapsulates all the pytest execution logic nicely. The function handles:

  • pytest import validation
  • test suite creation
  • working directory management
  • temporary test file generation
  • argument passing
  • JUnit XML output

This is exactly the kind of reusable function that makes the codebase more maintainable. Great work!

airbyte_cdk/test/standard_tests/docker_base.py (4)

27-50: Solid foundation with good utility methods!

The DockerConnectorTestSuite class provides essential infrastructure for Docker-based testing. The utility methods like get_test_class_dir(), get_connector_root_dir(), and acceptance_test_config_path are well-designed.

Note: The pylint errors about the acceptance_test_config_path are false positives - @classproperty is the correct pattern here, and Path.read_text() definitely exists.

🧰 Tools
🪛 Pylint (3.3.7)

[error] 43-43: Method 'acceptance_test_config_path' should have "self" as first argument

(E0213)


52-101: Comprehensive scenario loading with good error handling!

The get_scenarios() method handles the complexity of loading and filtering acceptance test scenarios well. I particularly like:

  • The warning when config files are missing (instead of failing hard)
  • Filtering out IAM role tests (good for avoiding sensitive configs)
  • Path resolution relative to connector root

The error handling for malformed config files is also robust.

🧰 Tools
🪛 Pylint (3.3.7)

[error] 73-73: Method 'acceptance_test_config_path' has no 'read_text' member

(E1101)


103-137: Well-designed Docker spec test with appropriate guards!

The test method includes good practices:

  • Skipping when Docker isn't available
  • Using the image_tests marker for selective execution
  • Proper image override handling
  • Building with a consistent tag when needed

The logic is clean and the error handling with check=True ensures failures are properly reported.


139-191: Excellent Docker check test with volume mounting for configs!

This test demonstrates sophisticated Docker usage with:

  • Volume mounting for config files
  • Temporary file management via context manager
  • Skipping tests expected to fail
  • Proper container lifecycle management

The use of scenario.with_temp_config_file() is particularly elegant for managing test configs.

@dbgold17 dbgold17 self-requested a review June 10, 2025 16:56
@aaronsteers aaronsteers changed the title test: add docker-based tests to FAST standard tests tests: add docker-based tests to FAST standard tests Jun 11, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
airbyte_cdk/test/standard_tests/models/scenario.py (4)

12-17: Consider grouping standard-lib imports for clarity, wdyt?

The new json, tempfile, contextlib additions are intermixed with typing imports later on. Aligning with PEP 8 (“std-lib, third-party, local”) grouping would slightly improve readability.


21-23: TYPE_CHECKING guard looks unused here

TYPE_CHECKING is imported but not referenced in this file after the recent edits. Could we drop it to avoid dead imports, wdyt?


78-80: Boolean expression can be simplified for readability

(self.status and self.status in {...}) or False works, yet the same result is obtained via return self.status in {...} because None is never in the set. Would a simpler one-liner make intent clearer, wdyt?

-        return (self.status and self.status in {"failed", "exception"}) or False
+        return self.status in {"failed", "exception"}
🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 79-79: Boolean expression may be simplified to False

(R1709)


93-106: Nice utility – two micro suggestions

  1. File permissions: we currently OR with 0o444, which leads to 0o644. If the goal is world-read and read-only, maybe replace with an explicit 0o444 write-masked assignment?
  2. Resource warnings: NamedTemporaryFile closes automatically, but using delete=False keeps the descriptor open until the context ends. Would explicitly closing via temp_file.close() before changing permissions avoid subtle platform edge-cases, wdyt?

Example diff:

-            temp_file.write(json.dumps(config))
-            temp_file.flush()
+            json.dump(config, temp_file)
+            temp_file.flush()
+            temp_file.close()
-            temp_path = Path(temp_file.name)
-            temp_path.chmod(temp_path.stat().st_mode | 0o444)
+            temp_path = Path(temp_file.name)
+            temp_path.chmod(0o444)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc788fa and b5c0174.

📒 Files selected for processing (1)
  • airbyte_cdk/test/standard_tests/models/scenario.py (4 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
airbyte_cdk/test/standard_tests/models/scenario.py

[refactor] 79-79: Boolean expression may be simplified to False

(R1709)

🪛 GitHub Actions: Linters
airbyte_cdk/test/standard_tests/models/scenario.py

[error] 1-1: Ruff formatting check failed. File would be reformatted.

⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: preview_docs
  • GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/test/standard_tests/models/scenario.py (2)

50-50: Literal now contains "exception" – great!

The extra status broadens scenario expressiveness and is backward-compatible.


1-107: Ruff formatter failure

CI reports Ruff would reformat this file. Running ruff format airbyte_cdk/test/standard_tests/models/scenario.py (or pre-commit run --all-files) locally before pushing should keep the check green.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 33-33: Too few public methods (0/2)

(R0903)


[refactor] 37-37: Too few public methods (0/2)

(R0903)


[refactor] 79-79: Boolean expression may be simplified to False

(R1709)

🪛 GitHub Actions: Linters

[error] 1-1: Ruff formatting check failed. File would be reformatted.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🔭 Outside diff range comments (1)
airbyte_cdk/cli/airbyte_cdk/_secrets.py (1)

98-103: ⚠️ Potential issue

Is default=None compatible with a bool flag?

click.option(..., type=bool, is_flag=True, default=None) relies on Click accepting a non-boolean default for a flag.
Click’s docs state that default must match flag_value’s type (bool), so None may raise a BadParameter at runtime.
Perhaps expose three-state behaviour via an explicit choice (type=click.Choice(["auto", "true", "false"], case_sensitive=False)) or keep is_flag=True but post-process False versus “not-provided” with ctx.get_parameter_source() – wdyt?

♻️ Duplicate comments (2)
airbyte_cdk/test/entrypoint_wrapper.py (2)

48-48: Duplicate import still shadows the canonical AirbyteMessage

The second import re-introduces the shadowing previously flagged.
Could we drop or alias it to avoid accidental type mismatches, wdyt?


65-82: ⚠️ Potential issue

Constructor still rejects AirbyteMessage instances passed by _run_command

_run_command builds messages: list[AirbyteMessage] yet EntrypointOutput.__init__ assumes list[str], leading to a TypeError in _parse_message.

Would adapting the constructor like below solve the mismatch?

-        messages: list[str] | None = None,
+        messages: list[AirbyteMessage | str] | None = None,
@@
-        if messages:
-            try:
-                self._messages = [self._parse_message(message) for message in messages]
-            except V2ValidationError as exception:
-                raise ValueError("All messages are expected to be AirbyteMessage") from exception
+        if messages:
+            self._messages = [
+                m if isinstance(m, AirbyteMessage) else self._parse_message(m)  # accept both
+                for m in messages
+            ]

This keeps the streaming enhancements while restoring runtime safety – wdyt?

🧹 Nitpick comments (2)
airbyte_cdk/cli/airbyte_cdk/_secrets.py (1)

184-196: Minor readability nit – collapse the env logic?

The two branches mutate print_ci_secrets_masks in place.
Would you be open to pulling the “auto-detect” logic into a small helper, e.g.

def _resolve_mask_flag(flag: bool | None) -> bool:
    if flag and "CI" not in os.environ:
        click.echo("…", err=True)
        return False
    if flag is None:
        return bool(os.environ.get("CI"))
    return flag

and then simply call it once? Wdyt?

.github/workflows/connector-tests.yml (1)

173-182: Optional: fail fast on missing secrets instead of silent success?

airbyte-cdk secrets fetch inside CI can silently succeed even when no secrets are fetched (e.g. wrong GCP creds).
Would you like to add --print-ci-secrets-masks --error-on-missing (or similar) so failures surface earlier, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b13f37b and 4514b88.

📒 Files selected for processing (5)
  • .github/workflows/connector-tests.yml (4 hunks)
  • airbyte_cdk/cli/airbyte_cdk/_secrets.py (2 hunks)
  • airbyte_cdk/test/entrypoint_wrapper.py (5 hunks)
  • airbyte_cdk/test/standard_tests/connector_base.py (3 hunks)
  • airbyte_cdk/test/standard_tests/source_base.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/test/standard_tests/connector_base.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte_cdk/test/standard_tests/source_base.py (1)
airbyte_cdk/test/entrypoint_wrapper.py (2)
  • connection_status_messages (149-150)
  • get_message_iterator (202-222)
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: Check: 'source-amplitude
  • GitHub Check: Check: 'source-intercom
  • GitHub Check: Check: 'source-pokeapi
  • GitHub Check: Check: 'source-shopify
  • GitHub Check: Check: 'source-zendesk-support
  • GitHub Check: Check: 'source-hardcoded-records
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/test/standard_tests/source_base.py (1)

51-55: Nice switch to the public accessor

Using connection_status_messages removes the old private-attribute reach-in – looks great!

Copy link
Contributor

@dbgold17 dbgold17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked through this over Zoom with @aaronsteers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
airbyte_cdk/test/standard_tests/docker_base.py (3)

137-146: Image is rebuilt for every test – cache a single build via a session fixture?

test_docker_image_build_and_spec unconditionally builds the connector image when no override is provided.
test_docker_image_build_and_check/read do the very same, so the same image may be built up to three times per test run.
Would introducing a @pytest.fixture(scope="session") that builds once and injects the tag into the other tests speed things up, wdyt?


288-305: Temp files mounted read-only?

Docker runs mount the config with -v {temp_file}:{container_config_path} (no :ro).
Granting write access to the container for secrets isn’t necessary and could surprise connector code that tries to overwrite it.
Consider appending :ro to the volume spec to enforce read-only semantics, wdyt?


313-323: discover_result.stdout.splitlines() may include ANSI escape codes

If the connector logs coloured output, the split lines fed to EntrypointOutput will still carry ANSI sequences, which the message parser treats as invalid JSON and converts to log messages.
Should we strip ANSI codes (e.g. re.sub(r"\x1b\\[[0-9;]*m", "", line)) before parsing to reduce noise, wdyt?

airbyte_cdk/test/entrypoint_wrapper.py (3)

65-76: Loading messages from file vs. list – re-parsing on every call is costly

get_message_iterator() reopens and reparses the file each time a high-level accessor is invoked, so records, state_messages, logs, etc. each pay the I/O cost.
Caching the parsed list after the first pass (similar to how _messages is cached) could avoid this overhead while still supporting generator semantics.
Would you consider memoising the parsed result or reusing an internal iterator, wdyt?


248-267: Generator comprehension captures the outer iterator – subtle bug

When safe_iterator=True, the returned generator closes over message_generator; once it’s exhausted by one consumer, subsequent consumers will see nothing.
Maybe yield directly inside a new generator function to avoid shared state, e.g.:

def _iter_filtered():
    for m in self.get_message_iterator():
        if m.type in message_types:
            yield m
return _iter_filtered()

Would that make the behaviour more predictable, wdyt?


337-340: Type hint mismatch – messages can include non-JSON log lines

_run_command assembles messages: list[str], but some elements are plain log lines, not serialized JSON.
The constructor handles it gracefully, yet callers may expect strictly JSON strings.
Documenting this or renaming the variable (raw_lines?) could reduce confusion – what do you think?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4514b88 and 9fcc198.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • .github/workflows/connector-tests.yml (2 hunks)
  • airbyte_cdk/test/entrypoint_wrapper.py (8 hunks)
  • airbyte_cdk/test/standard_tests/docker_base.py (1 hunks)
  • airbyte_cdk/test/standard_tests/source_base.py (3 hunks)
  • airbyte_cdk/utils/connector_paths.py (1 hunks)
  • airbyte_cdk/utils/docker.py (12 hunks)
  • pyproject.toml (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/utils/connector_paths.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • airbyte_cdk/test/standard_tests/source_base.py
  • .github/workflows/connector-tests.yml
  • airbyte_cdk/utils/docker.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte_cdk/test/entrypoint_wrapper.py (3)
airbyte_cdk/models/airbyte_protocol.py (2)
  • AirbyteStreamState (55-57)
  • AirbyteMessage (79-88)
unit_tests/sources/test_source.py (3)
  • state (127-128)
  • state (131-132)
  • catalog (70-93)
airbyte_cdk/test/utils/reading.py (1)
  • catalog (12-14)
🪛 Pylint (3.3.7)
airbyte_cdk/test/standard_tests/docker_base.py

[error] 59-59: Method 'acceptance_test_config_path' should have "self" as first argument

(E0213)


[error] 88-88: Method 'acceptance_test_config_path' has no 'read_text' member

(E1101)


[refactor] 228-228: Too many local variables (25/15)

(R0914)


[refactor] 228-228: Too many branches (13/12)

(R0912)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: source-amplitude
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-intercom
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
🔇 Additional comments (1)
pyproject.toml (1)

92-94: Dependency is added but simultaneously marked “unused” – double-check the intention, wdyt?

setuptools is now a first-class dependency (👍), yet it is still listed under DEP002 ignores as “unused”.
If CDK code never imports setuptools directly, deptry will indeed flag it – but at that point it is expected rather than unused.
Would it be cleaner to remove it from the ignore list (or move it to DEP003, “transitive but used”) so the tooling reflects the new ground truth?

@aaronsteers aaronsteers merged commit dcdd18b into main Jun 13, 2025
26 of 27 checks passed
@aaronsteers aaronsteers deleted the aj/tests/add-docker-based-standard-tests branch June 13, 2025 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants