@aaronsteers
Contributor

@aaronsteers aaronsteers commented Nov 12, 2025

feat: Add Hercules AI test agent example (do not merge)

Summary

This PR adds an example demonstrating how to use Hercules (an open-source AI testing agent) to test PyAirbyte functionality. This is the first part of a "bake off" comparison between Hercules and Goose AI test agents.

Key additions:

  • Documentation on using Hercules with PyAirbyte in examples/ai-test-agents/hercules/README.md
  • Gherkin feature file with two test scenarios (test_pyairbyte.feature)
  • Requirements file for Hercules dependencies (requirements.txt)

Important note: Due to a dependency conflict (Hercules requires psutil <6.0.0 while airbyte-cdk requires psutil 6.1.0), Hercules must be installed in a separate virtual environment. This example documents the workaround approach rather than integrating Hercules as a direct dependency.
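To make the conflict concrete, here is a minimal Python sketch that compares the two constraints quoted above. The helper names are illustrative only, and the dotted-integer parsing is an assumption that holds for plain versions such as 6.1.0 but not for pre-release tags.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '6.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_hercules(psutil_version: str, upper_bound: str = "6.0.0") -> bool:
    """Return True if the version is below Hercules' exclusive upper bound."""
    return parse_version(psutil_version) < parse_version(upper_bound)


# airbyte-cdk's psutil version, as stated in this PR description.
CDK_PSUTIL = "6.1.0"

if __name__ == "__main__":
    ok = satisfies_hercules(CDK_PSUTIL)
    print(f"psutil {CDK_PSUTIL} acceptable to Hercules: {ok}")
```

No single psutil release can be both below 6.0.0 and equal to 6.1.0, which is why the two packages cannot share one environment.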

Review & Testing Checklist for Human

This is a YELLOW risk PR (documentation/example only, but untested):

  • Verify the Gherkin test scenarios are valid - I created these based on PyAirbyte's API but have NOT tested them with Hercules
  • Check if the separate environment approach aligns with the "bake off" intent - This is more of a "how to use together" guide than a true integration
  • Review the installation instructions - Confirm the Hercules installation steps are correct and complete

Test Plan

To verify this example works:

  1. Create a separate virtual environment
  2. Install Hercules following the README instructions
  3. Set up an OpenAI API key
  4. Run Hercules with the provided feature file
  5. Verify the test scenarios execute successfully
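Steps 1 and 2 could be scripted roughly as follows. This is a sketch under assumptions: the environment directory name is hypothetical, the version minimums are the ones the review summary reports for requirements.txt, and the Playwright browser download step is included because the README is said to cover Playwright setup. The function only builds the commands; it does not run them.

```python
import sys
from pathlib import Path


def hercules_setup_commands(env_dir: str = ".venv-hercules") -> list[list[str]]:
    """Build (but do not run) the commands for an isolated Hercules environment."""
    env = Path(env_dir)
    # Virtual environments keep executables in Scripts/ on Windows, bin/ elsewhere.
    bin_dir = env / ("Scripts" if sys.platform == "win32" else "bin")
    env_python = str(bin_dir / "python")
    return [
        # Step 1: create the separate virtual environment.
        [sys.executable, "-m", "venv", str(env)],
        # Step 2: install Hercules and Playwright inside that environment.
        [env_python, "-m", "pip", "install",
         "testzeus-hercules>=0.2.0", "playwright>=1.40.0"],
        # Download the browsers that Playwright drives.
        [env_python, "-m", "playwright", "install"],
    ]


if __name__ == "__main__":
    for command in hercules_setup_commands():
        print(" ".join(command))
```

To actually execute the steps, each command could be passed to `subprocess.run(command, check=True)`. Note that the interpreter used to create the environment should match the Python version Hercules requires.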

Notes

Summary by CodeRabbit

  • Documentation

    • Added comprehensive guide for using Hercules AI testing agent to test PyAirbyte, including setup and usage instructions.
  • Tests

    • Added test scenarios for PyAirbyte integration, covering basic functionality and connector discovery workflows.

- Add example demonstrating how to use Hercules to test PyAirbyte
- Include Gherkin feature files with test scenarios
- Document installation and usage in separate environment
- Note dependency conflict with airbyte-cdk (psutil version)

Co-Authored-By: AJ Steers <[email protected]>
@devin-ai-integration
Contributor

Original prompt from AJ Steers
Received message in Slack channel #ask-devin-ai:

@Devin - Let's have an AI-Test-Agent bake off between Hercules and Goose. Create a minimal implementation for both in PyAirbyte in separate PRs.
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1762924311392839

@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1762924591-hercules-ai-test-agent' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1762924591-hercules-ai-test-agent'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@github-actions

PyTest Results (Fast Tests Only, No Creds)

312 tests  ±0   312 ✅ ±0   5m 51s ⏱️ -2s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 05b3a94. ± Comparison against base commit 782b1f2.

@coderabbitai
Contributor

coderabbitai bot commented Nov 12, 2025

Walkthrough

Three new files added to the examples/ai-test-agents/hercules/ directory: a README documenting Hercules AI test agent setup and usage for PyAirbyte testing, a requirements.txt file specifying dependencies, and a BDD-style feature file containing test scenarios for PyAirbyte integration.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Hercules AI Test Agent Example: `examples/ai-test-agents/hercules/README.md` | Documentation covering prerequisites, separate virtual environment installation due to dependency conflicts, setup instructions for Hercules and Playwright, usage guidelines including LLM API key configuration, test scenario overview, expected outputs (JUnit XML, HTML reports, logs), and limitations with alternative approaches suggested (Docker, CI/CD). |
| Hercules AI Test Agent Example: `examples/ai-test-agents/hercules/requirements.txt` | Dependencies file with testzeus-hercules (≥0.2.0) and playwright (≥1.40.0), plus advisory comments about isolated virtual environment installation. |
| Hercules AI Test Agent Example: `examples/ai-test-agents/hercules/test_pyairbyte.feature` | BDD feature file with two test scenarios: (1) basic PyAirbyte functionality covering source creation, connectivity checks, data reads, and pandas conversion; (2) connector discovery validating registry access and source-faker availability. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • README documentation: Verify setup instructions are clear and accurate, and that the dependency conflict notes are complete
  • Feature file scenarios: Ensure BDD scenarios correctly reflect PyAirbyte's API usage and cover meaningful test cases

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: adding a Hercules AI test agent example to the repository, with an explicit note that it's not ready to merge. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
examples/ai-test-agents/hercules/requirements.txt (1)

4-5: Consider updating playwright to a more recent minimum version.

The good news: testzeus-hercules>=0.2.0 is already pinned to the latest stable (released Nov 1, 2025), so that's perfect. However, playwright's latest stable is 1.56.0 (released Nov 11, 2025), while your minimum is 1.40.0 from late 2023. Would bumping this to something like >=1.45.0 or higher make sense? This would ensure the environment has more recent bug fixes and features without being overly restrictive. Wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 782b1f2 and 05b3a94.

📒 Files selected for processing (3)
  • examples/ai-test-agents/hercules/README.md (1 hunks)
  • examples/ai-test-agents/hercules/requirements.txt (1 hunks)
  • examples/ai-test-agents/hercules/test_pyairbyte.feature (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-08-31T01:20:08.405Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.

Applied to files:

  • examples/ai-test-agents/hercules/test_pyairbyte.feature
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (9)
examples/ai-test-agents/hercules/README.md (4)

15-29: Installation instructions look solid!

The separate virtual environment approach and the installation steps are well documented. The explanation of the psutil dependency conflict is clear and helpful.


64-73: Great job documenting limitations and alternatives!

I appreciate that you've been transparent about the dependency conflict limitations and provided concrete alternative approaches. This is really helpful for users trying to adopt this example.


11-11: No changes needed—the Python version requirement is accurate.

Based on the verification, testzeus-hercules requires Python 3.11, which matches the README documentation. This isn't overly restrictive; it's the actual upstream requirement of the Hercules package. The documentation is correct as-is.


38-45: All CLI arguments verified as correct.

The command arguments you've included—--input-file, --output-path, --test-data-path, --llm-model, and --llm-model-api-key—are all documented and correct for testzeus-hercules. The example looks solid as-is!
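For readers who want to drive the run from Python, the invocation could be assembled like this. It is a sketch, not the README's command: the five flags are the ones listed above, while the executable name `testzeus-hercules`, the default model `gpt-4o`, the `OPENAI_API_KEY` variable name, and the directory arguments are all assumptions made for illustration.

```python
import os


def hercules_command(
    feature_file: str,
    output_dir: str,
    test_data_dir: str,
    model: str = "gpt-4o",  # assumed model name
    api_key_env: str = "OPENAI_API_KEY",  # assumed environment variable
) -> list[str]:
    """Build the Hercules invocation as an argument list without running it."""
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"Set {api_key_env} before running Hercules")
    return [
        "testzeus-hercules",  # assumed executable name, matching the package name
        "--input-file", feature_file,
        "--output-path", output_dir,
        "--test-data-path", test_data_dir,
        "--llm-model", model,
        "--llm-model-api-key", api_key,
    ]
```

Reading the key from the environment keeps it out of shell history and source control. The resulting list can be handed to `subprocess.run(..., check=True)` from inside the Hercules virtual environment.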

examples/ai-test-agents/hercules/test_pyairbyte.feature (5)

8-8: No changes needed—the import statement is correct.

The script output confirms that import airbyte as ab is the correct and standard convention throughout the PyAirbyte codebase. This is even documented in airbyte/__init__.py itself, which states: "Most examples in the PyAirbyte documentation use the import airbyte as ab convention." The file at line 8 uses the right pattern.


21-21: No issues found – the cache creation API usage is correct.

The code uses ab.new_local_cache() which is the correct method. It's properly exported from the main PyAirbyte module and widely used throughout the codebase in examples and tests with this same no-argument pattern. The method signature supports optional parameters, so calling it without arguments is valid and intended.


25-25: The "users" stream exists in source-faker and is correctly referenced.

The source-faker connector generates an "e-commerce-like" dataset with users, products, and purchases, and all streams (users, products, and purchases) declare id as their respective primary keys. The codebase verification confirms this across multiple test files where cache["users"] is consistently accessed without issues.


29-29: The "id" column assertion is correct—no changes needed.

The "id" field is indeed available in the source-faker connector's users stream. The assertion in your test appropriately validates for this column's presence, so the code is solid as-is.
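Taken together, the calls verified in the comments above suggest the first scenario amounts to something like the following. This is a reconstruction under assumptions, not the script embedded in the feature file: the `count` config value and the helper names are hypothetical, and the PyAirbyte import is deferred so the file loads even where PyAirbyte is not installed.

```python
def read_faker_users(count: int = 100):
    """Read the source-faker 'users' stream into a pandas DataFrame."""
    # Imported lazily so this module can be loaded without PyAirbyte installed.
    import airbyte as ab

    source = ab.get_source(
        "source-faker",
        config={"count": count},  # assumed config, not taken from the feature file
        install_if_missing=True,
    )
    source.check()  # connectivity check
    source.select_all_streams()
    cache = ab.new_local_cache()  # no-argument form, as noted in the review
    source.read(cache=cache)
    return cache["users"].to_pandas()


def has_id_column(columns) -> bool:
    """The check the scenario makes on the resulting DataFrame's columns."""
    return "id" in list(columns)
```

Calling `read_faker_users()` needs PyAirbyte in the active environment and may install the connector on first use; `has_id_column(df.columns)` then mirrors the "id" assertion.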


52-52: The .name attribute is correct—no changes needed.

The code properly accesses the .name attribute on connector metadata objects returned by get_available_connectors(). The ConnectorMetadata class in airbyte/registry.py defines name: str as a field, and ConnectorBase (the parent class for Source/Destination connectors) also exposes .name as a property. Both contexts support the attribute access used in line 52.

Likely an incorrect or invalid review comment.

Comment on lines +47 to +54
## Test Scenario

The included `test_pyairbyte.feature` file contains a simple Gherkin scenario that tests basic PyAirbyte functionality:

- Installing PyAirbyte
- Creating a source connector
- Reading data from the source
- Validating the data

🛠️ Refactor suggestion | 🟠 Major

Update to reflect both test scenarios.

The README says "Test Scenario" (singular) but the feature file actually contains TWO scenarios:

  1. Test PyAirbyte source connector with faker (lines 3-36)
  2. Test PyAirbyte connector discovery (lines 38-61)

Would you mind updating this section to mention both scenarios? Something like:

## Test Scenarios

The included `test_pyairbyte.feature` file contains two Gherkin scenarios:

1. **Basic functionality test**: Tests installing PyAirbyte, creating a source connector, reading data, and validating the results
2. **Connector discovery test**: Tests querying the registry to discover available source connectors

Both scenarios validate core PyAirbyte functionality.

WDYT?

🤖 Prompt for AI Agents
In examples/ai-test-agents/hercules/README.md around lines 47 to 54, the "Test
Scenario" section incorrectly refers to a single scenario though the feature
file contains two; update the heading and body to plural "Test Scenarios" and
describe both scenarios: (1) the basic functionality test — installing
PyAirbyte, creating a source connector (faker), reading data and validating
results, and (2) the connector discovery test — querying the registry to
discover available source connectors; end with a brief line that both scenarios
validate core PyAirbyte functionality.

When I create a Python script to discover available connectors:
"""
import airbyte as ab
from airbyte.registry import get_available_connectors

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the registry import path.

Is from airbyte.registry import get_available_connectors the correct import path? Would be good to verify this is the actual module structure in PyAirbyte. WDYT?


🏁 Script executed:

#!/bin/bash
# Search for registry module and get_available_connectors function
ast-grep --pattern $'def get_available_connectors($$$) {
  $$$
}'

Length of output: 0


Let me try different approaches to verify the registry import:


🏁 Script executed:

#!/bin/bash
# Search for get_available_connectors function definition using rg
rg -nP 'def\s+get_available_connectors' --type=py

Length of output: 113


🏁 Script executed:

#!/bin/bash
# Find files related to registry module
fd -e py '.*registry.*' | head -20

Length of output: 185


🏁 Script executed:

#!/bin/bash
# Search for imports of get_available_connectors to see how it's used
rg -nP 'from\s+airbyte\.registry\s+import|import.*get_available_connectors' --type=py

Length of output: 923


🏁 Script executed:

#!/bin/bash
# Look for airbyte module structure
fd -type d -name 'airbyte' | head -5

Length of output: 170


Let me check the actual test file to see the full context:


🏁 Script executed:

#!/bin/bash
# Get the actual content of the test file around lines 43-52
cat examples/ai-test-agents/hercules/test_pyairbyte.feature | sed -n '38,61p' | cat -n

Length of output: 1153


🏁 Script executed:

#!/bin/bash
# Search for connector_type parameter usage in the codebase
rg -nP 'connector_type\s*=' --type=py -A2 -B2

Length of output: 3502


🏁 Script executed:

#!/bin/bash
# Search for how get_available_connectors is called in actual code
rg -nP 'get_available_connectors\(' --type=py -A2 -B1 | head -50

Length of output: 3141


The import is correct, but the test script has critical bugs in how it uses the function.

The import path from airbyte.registry import get_available_connectors is valid and properly exported. However, the test script has two issues:

  1. Line 46: The parameter should be install_type, not connector_type. Usage in the codebase shows: get_available_connectors(install_type="docker") or install_type="yaml", not connector_type="source".

  2. Line 52: Since get_available_connectors() returns list[str] (connector names as strings), iterating with c.name won't work. Should be: for c in sources if c == "source-faker" instead.

Wdyt on addressing these two issues in the test script?

🤖 Prompt for AI Agents
In examples/ai-test-agents/hercules/test_pyairbyte.feature around lines 43–52,
the test misuses get_available_connectors: change the keyword arg from
connector_type to install_type (e.g.,
get_available_connectors(install_type="docker") or "yaml") and update the loop
to treat returned items as strings (e.g., check equality against "source-faker"
rather than accessing c.name). Ensure any assertions or membership checks use
string comparisons or contains checks on the list of connector names.

from airbyte.registry import get_available_connectors

# Get list of available source connectors
sources = get_available_connectors(connector_type="source")

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the get_available_connectors API.

Could you verify that get_available_connectors(connector_type="source") is the correct function signature? Want to make sure the parameter name is connector_type and not something like type or kind. WDYT?


🏁 Script executed:

#!/bin/bash
# Find the get_available_connectors function definition
rg -n "def get_available_connectors" --type=py -A 10

Length of output: 850


I found the issue! The function definition shows install_type as the parameter, but the code is calling it with connector_type="source". The API documentation confirms that get_available_connectors takes an install_type parameter, not connector_type. The function definition shows install_type accepts values like InstallType.PYTHON, InstallType.JAVA, InstallType.DOCKER, and InstallType.YAML.

Fix the parameter name in get_available_connectors() call.

The function signature uses install_type, not connector_type. Additionally, "source" is not a valid value for install_type—valid options are "python", "docker", "yaml", "java", or None based on the installation method, not connector type. If you need to filter for source connectors specifically, the current API doesn't support that parameter. What's your intention here?

🤖 Prompt for AI Agents
In examples/ai-test-agents/hercules/test_pyairbyte.feature around line 46, the
call uses connector_type="source" but the function signature expects
install_type; change the argument name to install_type and pass one of the valid
install type values ("python", "docker", "yaml", "java") or None as appropriate
(e.g., install_type="python"); if your intent was to filter by connector
category (“source”), remove this unsupported parameter and instead call
get_available_connectors() without install_type and apply a separate filter on
the returned connectors for type=="source" after retrieving them.
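A corrected discovery step along the lines suggested here might look like the sketch below. It relies on the review's finding that `get_available_connectors()` returns connector names as strings. Filtering by the `source-` name prefix is an assumption about naming, not a documented registry feature, and both helper names are hypothetical.

```python
def source_connector_names(names: list[str]) -> list[str]:
    """Keep only names that start with 'source-' (an assumed naming convention)."""
    return [name for name in names if name.startswith("source-")]


def discover_sources() -> list[str]:
    """Query the registry and return the names that look like source connectors."""
    # Imported lazily so the helper above works without PyAirbyte installed.
    from airbyte.registry import get_available_connectors

    # No connector_type argument; install_type is left at its default.
    return source_connector_names(get_available_connectors())
```

With this shape, the scenario's final check becomes a plain membership test such as `"source-faker" in discover_sources()`, with no `.name` attribute access.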

@github-actions

PyTest Results (Full)

381 tests  ±0   365 ✅ ±0   25m 18s ⏱️ + 1m 12s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 05b3a94. ± Comparison against base commit 782b1f2.



2 participants