feat: Add Hercules AI test agent example (do not merge) #861

aaronsteers · 2025-11-12T05:21:48Z

feat: Add Hercules AI test agent example (do not merge)

Summary

This PR adds an example demonstrating how to use Hercules (an open-source AI testing agent) to test PyAirbyte functionality. This is the first part of a "bake off" comparison between Hercules and Goose AI test agents.

Key additions:

Documentation on using Hercules with PyAirbyte in examples/ai-test-agents/hercules/README.md
Gherkin feature file with two test scenarios (test_pyairbyte.feature)
Requirements file for Hercules dependencies (requirements.txt)

Important note: Due to a dependency conflict (Hercules requires psutil <6.0.0 while airbyte-cdk requires psutil 6.1.0), Hercules must be installed in a separate virtual environment. This example documents the workaround approach rather than integrating Hercules as a direct dependency.
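
The conflict described above can be checked mechanically. The following is a minimal sketch, assuming the two constraints are exactly `psutil <6.0.0` (Hercules) and `psutil 6.1.0` (airbyte-cdk) as quoted in this description. The helper names and the candidate versions are hypothetical, and the parser handles only plain dotted numeric versions.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '6.1.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def satisfies_hercules(version: str) -> bool:
    # Constraint quoted in the PR description: psutil <6.0.0
    return parse_version(version) < parse_version("6.0.0")


def satisfies_airbyte_cdk(version: str) -> bool:
    # Constraint quoted in the PR description: psutil 6.1.0
    return parse_version(version) == parse_version("6.1.0")


def can_share_environment(version: str) -> bool:
    """True only if one psutil version would satisfy both packages."""
    return satisfies_hercules(version) and satisfies_airbyte_cdk(version)


# Illustrative candidates only; none can satisfy both constraints at once.
candidates = ["5.9.8", "6.0.0", "6.1.0"]
compatible = [v for v in candidates if can_share_environment(v)]
print(compatible)  # → []
```

Because the second constraint pins a version that the first one excludes, no candidate can ever pass both checks, which is why the example uses a separate virtual environment.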

Review & Testing Checklist for Human

This is a YELLOW risk PR (documentation/example only, but untested):

Verify the Gherkin test scenarios are valid - I created these based on PyAirbyte's API but have NOT tested them with Hercules
Check if the separate environment approach aligns with the "bake off" intent - This is more of a "how to use together" guide than a true integration
Review the installation instructions - Confirm the Hercules installation steps are correct and complete

Test Plan

To verify this example works:

Create a separate virtual environment
Install Hercules following the README instructions
Set up an OpenAI API key
Run Hercules with the provided feature file
Verify the test scenarios execute successfully
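
The first two steps of the test plan can be sketched as below. This is a hedged illustration, not the README's exact instructions: the environment directory name `.venv-hercules` and the helper `build_setup_commands` are hypothetical, and the Playwright setup mentioned in the README is omitted because its exact command does not appear in this thread. The function only builds the commands; it does not run them.

```python
import sys
from pathlib import Path


def build_setup_commands(venv_dir: Path, requirements: Path) -> list[list[str]]:
    """Return the commands that create an isolated environment and install into it."""
    # The interpreter inside a virtual environment lives in a platform-specific folder.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    venv_python = venv_dir / bin_dir / "python"
    return [
        # Step 1: create the separate virtual environment.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # Step 2: install the Hercules dependencies into that environment only.
        [str(venv_python), "-m", "pip", "install", "-r", str(requirements)],
    ]


commands = build_setup_commands(
    Path(".venv-hercules"),  # hypothetical directory name
    Path("examples/ai-test-agents/hercules/requirements.txt"),
)
for command in commands:
    print(" ".join(command))
```

Each command can then be passed to `subprocess.run(command, check=True)`. Since the review notes that testzeus-hercules requires Python 3.11, the interpreter used to create the environment should be a 3.11 one.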

Notes

This is part 1 of 2 for the AI test agent bake-off (Goose implementation still pending)
The dependency conflict prevents Hercules from being a direct PyAirbyte dependency
The Gherkin scenarios are untested examples that may need refinement
Requested by: AJ Steers ([email protected], @aaronsteers)
Devin session: https://app.devin.ai/sessions/0d255eb4e8c14c6a93cd7cbc5c8f9a72

Summary by CodeRabbit

Documentation
- Added comprehensive guide for using Hercules AI testing agent to test PyAirbyte, including setup and usage instructions.
Tests
- Added test scenarios for PyAirbyte integration, covering basic functionality and connector discovery workflows.

- Add example demonstrating how to use Hercules to test PyAirbyte
- Include Gherkin feature files with test scenarios
- Document installation and usage in separate environment
- Note dependency conflict with airbyte-cdk (psutil version)

Co-Authored-By: AJ Steers <[email protected]>

devin-ai-integration · 2025-11-12T05:21:52Z

Original prompt from AJ Steers

Received message in Slack channel #ask-devin-ai:

@Devin - Let's have an AI-Test-Agent bake off between Hercules and Goose. Create a minimal implementation for both in PyAirbyte in separate PRs.
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1762924311392839

devin-ai-integration · 2025-11-12T05:21:53Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

github-actions · 2025-11-12T05:22:10Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1762924591-hercules-ai-test-agent' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1762924591-hercules-ai-test-agent'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/fix-pr - Fixes most formatting and linting issues
/poetry-lock - Updates poetry.lock file
/test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


github-actions · 2025-11-12T05:28:50Z

PyTest Results (Fast Tests Only, No Creds)

312 tests ±0 312 ✅ ±0 5m 51s ⏱️ -2s
1 suites ±0 0 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit 05b3a94. ± Comparison against base commit 782b1f2.

coderabbitai · 2025-11-12T05:29:14Z

📝 Walkthrough

Walkthrough

Three new files added to the examples/ai-test-agents/hercules/ directory: a README documenting Hercules AI test agent setup and usage for PyAirbyte testing, a requirements.txt file specifying dependencies, and a BDD-style feature file containing test scenarios for PyAirbyte integration.

Changes

Cohort / File(s)	Summary
Hercules AI Test Agent Example `examples/ai-test-agents/hercules/README.md`	Documentation covering prerequisites, separate virtual environment installation due to dependency conflicts, setup instructions for Hercules and Playwright, usage guidelines including LLM API key configuration, test scenario overview, expected outputs (JUnit XML, HTML reports, logs), and limitations with alternative approaches suggested (Docker, CI/CD).
Hercules AI Test Agent Example `examples/ai-test-agents/hercules/requirements.txt`	Dependencies file with testzeus-hercules (≥0.2.0) and playwright (≥1.40.0), plus advisory comments about isolated virtual environment installation.
Hercules AI Test Agent Example `examples/ai-test-agents/hercules/test_pyairbyte.feature`	BDD feature file with two test scenarios: (1) basic PyAirbyte functionality covering source creation, connectivity checks, data reads, and pandas conversion; (2) connector discovery validating registry access and source-faker availability.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

README documentation: Verify setup instructions are clear and accurate, and that the dependency conflict notes are complete
Feature file scenarios: Ensure BDD scenarios correctly reflect PyAirbyte's API usage and cover meaningful test cases

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly describes the main change: adding a Hercules AI test agent example to the repository, with an explicit note that it's not ready to merge.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

examples/ai-test-agents/hercules/requirements.txt (1)

4-5: Consider updating playwright to a more recent minimum version.

The good news: testzeus-hercules>=0.2.0 is already pinned to the latest stable (released Nov 1, 2025), so that's perfect. However, playwright's latest stable is 1.56.0 (released Nov 11, 2025), while your minimum is 1.40.0 from late 2023. Would bumping this to something like >=1.45.0 or higher make sense? This would ensure the environment has more recent bug fixes and features without being overly restrictive. Wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 782b1f2 and 05b3a94.

📒 Files selected for processing (3)

examples/ai-test-agents/hercules/README.md (1 hunks)
examples/ai-test-agents/hercules/requirements.txt (1 hunks)
examples/ai-test-agents/hercules/test_pyairbyte.feature (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2024-08-31T01:20:08.405Z

Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.

Applied to files:

examples/ai-test-agents/hercules/test_pyairbyte.feature

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Windows)
GitHub Check: Pytest (All, Python 3.10, Windows)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (No Creds)

🔇 Additional comments (9)

examples/ai-test-agents/hercules/README.md (4)

15-29: Installation instructions look solid!

The separate virtual environment approach and the installation steps are well documented. The explanation of the psutil dependency conflict is clear and helpful.

64-73: Great job documenting limitations and alternatives!

I appreciate that you've been transparent about the dependency conflict limitations and provided concrete alternative approaches. This is really helpful for users trying to adopt this example.

11-11: No changes needed—the Python version requirement is accurate.

Based on the verification, testzeus-hercules requires Python 3.11, which matches the README documentation. This isn't overly restrictive; it's the actual upstream requirement of the Hercules package. The documentation is correct as-is.

38-45: All CLI arguments verified as correct.

The command arguments you've included—--input-file, --output-path, --test-data-path, --llm-model, and --llm-model-api-key—are all documented and correct for testzeus-hercules. The example looks solid as-is!
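
As a rough illustration of how those five flags fit together, the sketch below assembles an argument list. Only the flag names come from this review. The executable name `testzeus-hercules`, the `OPENAI_API_KEY` environment variable, the model name, and the directory paths are assumptions made for the example, and `build_hercules_command` is a hypothetical helper.

```python
import os


def build_hercules_command(
    feature_file: str,
    output_dir: str,
    test_data_dir: str,
    model: str,
    api_key: str,
    executable: str = "testzeus-hercules",  # assumed console script name
) -> list[str]:
    """Assemble the Hercules invocation from the flags verified in the review."""
    return [
        executable,
        "--input-file", feature_file,
        "--output-path", output_dir,
        "--test-data-path", test_data_dir,
        "--llm-model", model,
        "--llm-model-api-key", api_key,
    ]


# Read the key from the environment rather than hard-coding it (variable name assumed).
api_key = os.environ.get("OPENAI_API_KEY", "<your-api-key>")
command = build_hercules_command(
    feature_file="test_pyairbyte.feature",
    output_dir="output",        # hypothetical path
    test_data_dir="test_data",  # hypothetical path
    model="gpt-4o",             # placeholder model name
    api_key=api_key,
)

# Mask the key before printing so it does not end up in logs.
print(" ".join("***" if arg == api_key else arg for arg in command))
```

Building the command as a list rather than a single string avoids shell quoting problems when it is later handed to `subprocess.run`.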

examples/ai-test-agents/hercules/test_pyairbyte.feature (5)

8-8: No changes needed—the import statement is correct.

The script output confirms that import airbyte as ab is the correct and standard convention throughout the PyAirbyte codebase. This is even documented in airbyte/__init__.py itself, which states: "Most examples in the PyAirbyte documentation use the import airbyte as ab convention." The file at line 8 uses the right pattern.

21-21: No issues found – the cache creation API usage is correct.

The code uses ab.new_local_cache() which is the correct method. It's properly exported from the main PyAirbyte module and widely used throughout the codebase in examples and tests with this same no-argument pattern. The method signature supports optional parameters, so calling it without arguments is valid and intended.

25-25: The "users" stream exists in source-faker and is correctly referenced.

The source-faker connector generates an "e-commerce-like" dataset with users, products, and purchases, and all streams (users, products, and purchases) declare id as their respective primary keys. The codebase verification confirms this across multiple test files where cache["users"] is consistently accessed without issues.

29-29: The "id" column assertion is correct—no changes needed.

The "id" field is indeed available in the source-faker connector's users stream. The assertion in your test appropriately validates for this column's presence, so the code is solid as-is.
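
For orientation, the basic scenario discussed in the comments above (source creation, connectivity check, read into a local cache, pandas conversion, `id` column check) might look roughly like the sketch below. It is not the feature file's actual script, which is not reproduced in this thread. The `count` config value and the helper names are assumptions, and PyAirbyte is imported lazily so the module loads even where it is not installed.

```python
def read_faker_users(record_count: int = 100):
    """Read the source-faker 'users' stream and return it as a pandas DataFrame."""
    # Lazy import: PyAirbyte is only needed when this function is actually called.
    import airbyte as ab

    source = ab.get_source(
        "source-faker",
        config={"count": record_count},  # assumed config for the example
        install_if_missing=True,
    )
    source.check()               # connectivity check; raises on failure
    source.select_all_streams()  # users, products, and purchases
    cache = ab.new_local_cache() # no-argument form, as confirmed in the review
    source.read(cache=cache)
    return cache["users"].to_pandas()


def has_id_column(columns) -> bool:
    """Check the assertion from the scenario: the frame exposes an 'id' column."""
    return "id" in list(columns)
```

A caller would run `frame = read_faker_users()` followed by `assert has_id_column(frame.columns)` inside an environment where PyAirbyte is installed.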

52-52: The .name attribute is correct—no changes needed.

The code properly accesses the .name attribute on connector metadata objects returned by get_available_connectors(). The ConnectorMetadata class in airbyte/registry.py defines name: str as a field, and ConnectorBase (the parent class for Source/Destination connectors) also exposes .name as a property. Both contexts support the attribute access used in line 52.

Likely an incorrect or invalid review comment.

coderabbitai · 2025-11-12T05:29:17Z

examples/ai-test-agents/hercules/README.md

+## Test Scenario
+
+The included `test_pyairbyte.feature` file contains a simple Gherkin scenario that tests basic PyAirbyte functionality:
+
+- Installing PyAirbyte
+- Creating a source connector
+- Reading data from the source
+- Validating the data


🛠️ Refactor suggestion | 🟠 Major

Update to reflect both test scenarios.

The README says "Test Scenario" (singular) but the feature file actually contains TWO scenarios:

Test PyAirbyte source connector with faker (lines 3-36)

Test PyAirbyte connector discovery (lines 38-61)

Would you mind updating this section to mention both scenarios? Something like:

## Test Scenarios

The included `test_pyairbyte.feature` file contains two Gherkin scenarios:

1. **Basic functionality test**: Tests installing PyAirbyte, creating a source connector, reading data, and validating the results
2. **Connector discovery test**: Tests querying the registry to discover available source connectors

Both scenarios validate core PyAirbyte functionality.

WDYT?

🤖 Prompt for AI Agents

In examples/ai-test-agents/hercules/README.md around lines 47 to 54, the "Test Scenario" section incorrectly refers to a single scenario though the feature file contains two; update the heading and body to plural "Test Scenarios" and describe both scenarios: (1) the basic functionality test — installing PyAirbyte, creating a source connector (faker), reading data and validating results, and (2) the connector discovery test — querying the registry to discover available source connectors; end with a brief line that both scenarios validate core PyAirbyte functionality.

coderabbitai · 2025-11-12T05:29:17Z

examples/ai-test-agents/hercules/test_pyairbyte.feature

+    When I create a Python script to discover available connectors:
+      """
+      import airbyte as ab
+      from airbyte.registry import get_available_connectors


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the registry import path.

Is from airbyte.registry import get_available_connectors the correct import path? Would be good to verify this is the actual module structure in PyAirbyte. WDYT?

🏁 Script executed:

#!/bin/bash
# Search for registry module and get_available_connectors function
ast-grep --pattern $'def get_available_connectors($$$) { $$$ }'

Length of output: 0

Let me try different approaches to verify the registry import:

🏁 Script executed:

#!/bin/bash
# Search for get_available_connectors function definition using rg
rg -nP 'def\s+get_available_connectors' --type=py

Length of output: 113

🏁 Script executed:

#!/bin/bash
# Find files related to registry module
fd -e py '.*registry.*' | head -20

Length of output: 185

🏁 Script executed:

#!/bin/bash
# Search for imports of get_available_connectors to see how it's used
rg -nP 'from\s+airbyte\.registry\s+import|import.*get_available_connectors' --type=py

Length of output: 923

🏁 Script executed:

#!/bin/bash
# Look for airbyte module structure
fd -type d -name 'airbyte' | head -5

Length of output: 170

Let me check the actual test file to see the full context:

🏁 Script executed:

#!/bin/bash
# Get the actual content of the test file around lines 43-52
cat examples/ai-test-agents/hercules/test_pyairbyte.feature | sed -n '38,61p' | cat -n

Length of output: 1153

🏁 Script executed:

#!/bin/bash
# Search for connector_type parameter usage in the codebase
rg -nP 'connector_type\s*=' --type=py -A2 -B2

Length of output: 3502

🏁 Script executed:

#!/bin/bash
# Search for how get_available_connectors is called in actual code
rg -nP 'get_available_connectors\(' --type=py -A2 -B1 | head -50

Length of output: 3141

The import is correct, but the test script has critical bugs in how it uses the function.

The import path from airbyte.registry import get_available_connectors is valid and properly exported. However, the test script has two issues:

Line 46: The parameter should be install_type, not connector_type. Usage in the codebase shows: get_available_connectors(install_type="docker") or install_type="yaml", not connector_type="source".

Line 52: Since get_available_connectors() returns list[str] (connector names as strings), iterating with c.name won't work. Should be: for c in sources if c == "source-faker" instead.

Wdyt on addressing these two issues in the test script?

🤖 Prompt for AI Agents

In examples/ai-test-agents/hercules/test_pyairbyte.feature around lines 43–52, the test misuses get_available_connectors: change the keyword arg from connector_type to install_type (e.g., get_available_connectors(install_type="docker") or "yaml") and update the loop to treat returned items as strings (e.g., check equality against "source-faker" rather than accessing c.name). Ensure any assertions or membership checks use string comparisons or contains checks on the list of connector names.
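
A minimal sketch of the corrected discovery step, based only on what this comment reports: the keyword is `install_type`, and the function returns connector names as strings. The helper names are hypothetical, and the default of `"docker"` simply reuses one of the values the review found in the codebase.

```python
def discover_connectors(install_type: str = "docker") -> list[str]:
    """Return connector names from the PyAirbyte registry."""
    # Lazy import so this module can be loaded without PyAirbyte installed.
    from airbyte.registry import get_available_connectors

    # The keyword is install_type, not connector_type.
    return get_available_connectors(install_type=install_type)


def is_connector_available(names: list[str], connector: str = "source-faker") -> bool:
    """Entries are plain strings, so compare names directly instead of reading .name."""
    return connector in names


# Example with a stand-in list; a real run would pass discover_connectors().
print(is_connector_available(["source-faker", "source-github"]))  # → True
```

Keeping the membership check separate from the registry call lets the assertion be exercised without network access or an installed PyAirbyte.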

coderabbitai · 2025-11-12T05:29:17Z

examples/ai-test-agents/hercules/test_pyairbyte.feature

+      from airbyte.registry import get_available_connectors
+
+      # Get list of available source connectors
+      sources = get_available_connectors(connector_type="source")


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the get_available_connectors API.

Could you verify that get_available_connectors(connector_type="source") is the correct function signature? Want to make sure the parameter name is connector_type and not something like type or kind. WDYT?

🏁 Script executed:

#!/bin/bash
# Find the get_available_connectors function definition
rg -n "def get_available_connectors" --type=py -A 10

Length of output: 850

I found the issue! The function definition shows install_type as the parameter, but the code is calling it with connector_type="source". The API documentation confirms that get_available_connectors takes an install_type parameter, not connector_type. The function definition shows that install_type accepts values like InstallType.PYTHON, InstallType.JAVA, InstallType.DOCKER, and InstallType.YAML.

Fix the parameter name in get_available_connectors() call.

The function signature uses install_type, not connector_type. Additionally, "source" is not a valid value for install_type—valid options are "python", "docker", "yaml", "java", or None based on the installation method, not connector type. If you need to filter for source connectors specifically, the current API doesn't support that parameter. What's your intention here?

🤖 Prompt for AI Agents

In examples/ai-test-agents/hercules/test_pyairbyte.feature around line 46, the call uses connector_type="source" but the function signature expects install_type; change the argument name to install_type and pass one of the valid install type values ("python", "docker", "yaml", "java") or None as appropriate (e.g., install_type="python"); if your intent was to filter by connector category (“source”), remove this unsupported parameter and instead call get_available_connectors() without install_type and apply a separate filter on the returned connectors for type=="source" after retrieving them.
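
If the intent was to keep only source connectors, one way to apply the separate filter suggested above is to match on the name itself. This is a sketch under the assumption that source connector names start with `source-`, as `source-faker` does; the helper name and the sample list are hypothetical.

```python
def filter_source_names(names: list[str]) -> list[str]:
    """Keep only names that follow the assumed 'source-' prefix convention."""
    return sorted(name for name in names if name.startswith("source-"))


# Stand-in for the list of names returned by the registry call.
sample = ["destination-duckdb", "source-faker", "source-github"]
print(filter_source_names(sample))  # → ['source-faker', 'source-github']
```

Filtering after retrieval keeps the registry call free of unsupported keyword arguments.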

github-actions · 2025-11-12T05:48:27Z

PyTest Results (Full)

381 tests ±0 365 ✅ ±0 25m 18s ⏱️ + 1m 12s
1 suites ±0 16 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit 05b3a94. ± Comparison against base commit 782b1f2.

devin-ai-integration bot assigned aaronsteers Nov 12, 2025

aaronsteers mentioned this pull request Nov 12, 2025

feat: Add Goose AI developer agent example (do not merge) #862

Draft

5 tasks

coderabbitai bot requested changes Nov 12, 2025
