airbytehq · aaronsteers · Nov 12, 2025 · coderabbitai · Nov 12, 2025 · coderabbitai
diff --git a/examples/ai-test-agents/hercules/README.md b/examples/ai-test-agents/hercules/README.md
@@ -0,0 +1,73 @@
+# Hercules AI Test Agent for PyAirbyte
+
+This example demonstrates how to use [Hercules](https://github.com/test-zeus-ai/testzeus-hercules), an open-source AI testing agent, to test PyAirbyte functionality.
+
+## About Hercules
+
+Hercules is an open-source testing agent (described by its maintainers as the world's first) that takes test scenarios written in Gherkin. It can run UI, API, and other kinds of tests without manually scripted test code.
+
+## Prerequisites
+
+1. Python 3.11 or higher
+2. Hercules installed in a separate virtual environment (due to dependency conflicts with PyAirbyte)
+3. OpenAI API key or other LLM provider credentials
+
+## Installation
+
+Hercules and PyAirbyte's `airbyte-cdk` dependency require conflicting versions of `psutil`, so install Hercules in a separate virtual environment:
+
+```bash
+# Create a separate virtual environment for Hercules
+python -m venv hercules-env
+source hercules-env/bin/activate  # On Windows: hercules-env\Scripts\activate
+
+# Install Hercules
+pip install testzeus-hercules
+
+# Install Playwright (required by Hercules)
+playwright install --with-deps
+```
+
+## Usage
+
+1. Set your LLM API key:
+   ```bash
+   export OPENAI_API_KEY="your-api-key-here"
+   ```
+
+2. Run Hercules with the test feature file:
+   ```bash
+   testzeus-hercules --input-file test_pyairbyte.feature \
+                     --output-path ./output \
+                     --test-data-path ./test_data \
+                     --llm-model gpt-4o \
+                     --llm-model-api-key "$OPENAI_API_KEY"
+   ```
+
+## Test Scenario
+
+The included `test_pyairbyte.feature` file contains a simple Gherkin scenario that tests basic PyAirbyte functionality:
+
+- Installing PyAirbyte
+- Creating a source connector
+- Reading data from the source
+- Validating the data
+
+## Output
+
+Hercules will generate:
+- JUnit XML test results in `./output/`
+- HTML test report in `./output/`
+- Execution proofs (screenshots, videos, network logs) in `./proofs/`
+- Detailed logs in `./log_files/`
+
+## Limitations
+
+Due to dependency conflicts (specifically psutil version requirements), Hercules cannot be installed in the same environment as PyAirbyte. This example demonstrates using Hercules in a separate environment to test PyAirbyte functionality.
+
+## Alternative Approach
+
+For integrated testing, consider:
+1. Using Docker to run Hercules in an isolated container
+2. Creating a CI/CD pipeline that runs Hercules tests separately
+3. Using Hercules to test PyAirbyte's CLI or API endpoints rather than importing it directly
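A possible helper for the "CI/CD pipeline" option listed under Alternative Approach: a minimal sketch that assembles the `testzeus-hercules` invocation from the README's Usage section in Python, so a pipeline step can build the command without embedding the API key in a script. The function names `build_hercules_command` and `redact` are hypothetical and not part of Hercules or PyAirbyte; only the CLI flags are taken from the README.

```python
import os
import shlex


def build_hercules_command(
    feature_file: str,
    output_path: str = "./output",
    test_data_path: str = "./test_data",
    llm_model: str = "gpt-4o",
    api_key_env: str = "OPENAI_API_KEY",
) -> list[str]:
    """Build the argument list for the Hercules CLI call shown in the README.

    Hypothetical helper: the flags mirror the README, the defaults are
    the README's example values.
    """
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"Environment variable {api_key_env} is not set")
    return [
        "testzeus-hercules",
        "--input-file", feature_file,
        "--output-path", output_path,
        "--test-data-path", test_data_path,
        "--llm-model", llm_model,
        "--llm-model-api-key", api_key,
    ]


def redact(command: list[str], secret_flag: str = "--llm-model-api-key") -> str:
    """Return a printable command line with the value after secret_flag masked."""
    masked = list(command)
    for index, arg in enumerate(masked[:-1]):
        if arg == secret_flag:
            masked[index + 1] = "***"
    return shlex.join(masked)
```

Run from inside the Hercules virtual environment, the returned list can be passed to `subprocess.run(command, check=True)`, while `redact(command)` gives a version that is safe to write to CI logs.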
diff --git a/examples/ai-test-agents/hercules/requirements.txt b/examples/ai-test-agents/hercules/requirements.txt
@@ -0,0 +1,5 @@
+# Requirements for running Hercules AI test agent
+# Note: Install these in a separate virtual environment due to dependency conflicts with PyAirbyte
+
+testzeus-hercules>=0.2.0
+playwright>=1.40.0
diff --git a/examples/ai-test-agents/hercules/test_pyairbyte.feature b/examples/ai-test-agents/hercules/test_pyairbyte.feature
@@ -0,0 +1,61 @@
+Feature: PyAirbyte Basic Functionality Test
+
+  Scenario: Test PyAirbyte source connector with faker
+    Given I have Python installed on my system
+    When I install PyAirbyte using "pip install airbyte"
+    And I create a Python script with the following code:
+      """
+      import airbyte as ab
+
+      # Create a source connector
+      source = ab.get_source(
+          "source-faker",
+          config={"count": 10},
+          install_if_missing=True
+      )
+
+      # Check the connection, then select streams (read() needs a selection)
+      source.check()
+      source.select_all_streams()
+      # Read data into a local cache
+      cache = ab.new_local_cache()
+      result = source.read(cache)
+
+      # Get data from a stream
+      df = cache["users"].to_pandas()
+
+      # Validate we got data
+      assert len(df) > 0, "No data was read from source"
+      assert "id" in df.columns, "Expected 'id' column not found"
+
+      print(f"Successfully read {len(df)} records from source-faker")
+      """
+    And I run the Python script
+    Then the script should execute successfully
+    And I should see output containing "Successfully read"
+    And I should see output containing "records from source-faker"
+
+  Scenario: Test PyAirbyte connector discovery
+    Given I have PyAirbyte installed
+    When I create a Python script to discover available connectors:
+      """
+      from airbyte.registry import get_available_connectors
+
+      # get_available_connectors() returns connector names as plain strings
+      connector_names = get_available_connectors()
+      sources = [name for name in connector_names if name.startswith("source-")]
+
+      # Validate we have connectors
+      assert len(sources) > 0, "No source connectors found"
+
+      # Check that source-faker is available
+      faker_found = "source-faker" in sources
+      assert faker_found, "source-faker not found in available connectors"
+
+      print(f"Found {len(sources)} source connectors")
+      print("source-faker is available")
+      """
+    And I run the Python script
+    Then the script should execute successfully
+    And I should see output containing "source connectors"
+    And I should see output containing "source-faker is available"
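The `Then` steps in both scenarios reduce to two checks: the script exits successfully, and its output contains certain substrings. As a rough illustration of that contract outside Hercules, the sketch below runs a Python snippet in a subprocess and applies the same checks. `run_and_check` is a hypothetical name, and this is not how Hercules evaluates the steps internally; it only shows what the scenarios expect of the script.

```python
import subprocess
import sys


def run_and_check(script_source: str, expected_substrings: list[str]) -> str:
    """Run a Python snippet and verify its exit code and stdout.

    Hypothetical helper mirroring the feature file's Then steps.
    """
    completed = subprocess.run(
        [sys.executable, "-c", script_source],
        capture_output=True,
        text=True,
        check=False,
    )
    # "Then the script should execute successfully"
    if completed.returncode != 0:
        raise AssertionError(
            f"Script exited with code {completed.returncode}:\n{completed.stderr}"
        )
    # "And I should see output containing ..."
    for expected in expected_substrings:
        if expected not in completed.stdout:
            raise AssertionError(
                f"Expected {expected!r} in output, got:\n{completed.stdout}"
            )
    return completed.stdout
```

For the first scenario, the call would look like `run_and_check(script, ["Successfully read", "records from source-faker"])`, where `script` holds the docstring body from the feature file and PyAirbyte is installed in the interpreter's environment.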