archi-physics · JasonMoho · Mar 23, 2026 · Mar 23, 2026 · Mar 24, 2026 · Mar 24, 2026
diff --git a/.github/workflows/pr-preview.yml b/.github/workflows/pr-preview.yml
@@ -48,7 +48,7 @@ jobs:
           python -m pip install --upgrade pip
           pip install . || true
           pip install -r requirements/requirements-base.txt
-          pip install pytest
+          pip install pytest pytest-asyncio
 
       - name: Run unit tests
         run: python -m pytest tests/unit/ -v --tb=short
@@ -255,12 +255,39 @@ jobs:
           path: playwright-report/
           retention-days: 14
 
-      # ── Cleanup ─────────────────────────────────────────────────────────
-      - name: Cleanup smoke deployment
+      # ── Copilot SDK smoke (BYOK via local Ollama) ─────────────────────
+      # NOTE: The Copilot SDK requires GitHub auth even in BYOK mode.
+      # This step validates build + boot but react_smoke will time out
+      # until CI has a Copilot-authenticated token.  Non-fatal for now.
+      - name: Cleanup CMSCompOps deployment before Copilot smoke
         if: ${{ always() }}
         run: |
           yes | archi delete --name ci-${{ github.run_id }} || true
 
+      - name: Run Copilot SDK smoke deployment
+        continue-on-error: true
+        uses: ./.github/actions/run-smoke
+        with:
+          deployment-name: ci-copilot-${{ github.run_id }}
+          config-path: tests/pr_preview_config/pr_preview_copilot_config.yaml
+          config-destination: configs/ci/ci_copilot_config.generated.yaml
+          services: chatbot
+          hostmode: "true"
+          wait-url: http://localhost:2786/api/health
+          base-url: http://localhost:2786
+          extra-env: |
+            ARCHI_COMPOSE_UP_FLAGS=--build --force-recreate
+            SMOKE_OLLAMA_MODEL=qwen3:4b
+            SMOKE_OLLAMA_URL=http://localhost:11434
+            SMOKE_OLLAMA_HOST=http://localhost:11434
+          use-podman: "false"
+
+      # ── Cleanup ─────────────────────────────────────────────────────────
+      - name: Cleanup smoke deployments
+        if: ${{ always() }}
+        run: |
+          yes | archi delete --name ci-copilot-${{ github.run_id }} || true
+
       - name: Cleanup local base images
         if: ${{ always() && needs.build-base-images.outputs.changed == 'true' }}
         run: |

diff --git a/.gitignore b/.gitignore
@@ -37,3 +37,4 @@ git/
 local_files/
 raw_local_files/
 websites/
+.env_tmp_smoke
diff --git a/docs/multi-backend-agent-recommendation.md b/docs/multi-backend-agent-recommendation.md
@@ -0,0 +1,89 @@
+# Multi-Backend Agent Abstraction: Recommendation
+
+**Date:** March 25, 2026
+**Question:** Should A2rchi support a general agent backend (Copilot SDK, Claude Agent SDK, LangChain) or lock into the Copilot SDK?
+
+**Verdict: Lock into Copilot SDK. A general abstraction is feasible but not advisable.**
+
+## Side-by-Side Comparison
+
+| Dimension | Copilot SDK | Claude Agent SDK | LangChain |
+|---|---|---|---|
+| **Runtime** | CLI subprocess (`copilot --headless`) | CLI subprocess (`claude` CLI) | In-process graph |
+| **Tool definition** | `defineTool(name, {description, parameters: JSONSchema, handler})` | `@tool(name, desc, schema)` → must return MCP `{"content": [...]}` | `@tool` decorator, returns `str` |
+| **Streaming** | Event callbacks: `session.on("event_type", handler)` | Async iterator: `async for msg in query()` | State generator: `for chunk in agent.stream()` |
+| **Session** | `createSession()` → `sendAndWait()`, managed by CLI | `query()` (stateless) or `ClaudeSDKClient` (sessioned), managed by CLI | `invoke(state)`, state is external (you manage it) |
+| **Models** | GPT-4.1 default, BYOK for OpenAI/Azure/Anthropic/Google/Mistral | Claude only, BYOK via Bedrock/Vertex/Azure AI Foundry | Any provider via `init_chat_model()` |
+| **Hooks** | `onPreToolUse`, `onPostToolUse`, session lifecycle | `PreToolUse`, `PostToolUse`, `PermissionRequest`, etc. | Middleware: `@before_model`, `@after_model`, `@wrap_tool_call` |
+| **Auth** | GitHub OAuth, env vars, BYOK | Anthropic API key, Bedrock, Vertex | Per-model provider keys |
+
+## Key Issues With a General Abstraction
+
+### 1. Tool return format mismatch
+
+Claude Agent SDK enforces the MCP wire format: tools must return `{"content": [{"type": "text", "text": "..."}]}`. Copilot tools return any serializable value. LangChain tools return strings. Every tool therefore needs a per-backend wrapper that normalizes both input schemas and return formats, so supporting all three backends turns our 7 tools into 21 adapter functions (7 tools × 3 backends).
+
+### 2. Both Copilot and Claude SDKs are CLI wrappers
+
+They spawn a subprocess and communicate over stdio/TCP. LangChain runs fully in-process. This means:
+
+- Two separate CLI binaries in your Docker image
+- Two different auth flows (GitHub OAuth vs Anthropic API key)
+- Two different process lifecycle managers
+- LangChain requires none of this (but has completely different plumbing)
+
+### 3. Three incompatible streaming models
+
+Our existing `copilot_event_adapter.py` is ~400 lines that translate Copilot's event callbacks into `PipelineOutput` objects. We'd need an equivalent adapter for each backend — each handling different event types, different data shapes, different async patterns (callbacks vs async iterators vs sync generators).
+
+### 4. Claude Agent SDK BYOK is provider-level, not model-level
+
+The Claude Agent SDK does support BYOK via Amazon Bedrock, Google Vertex AI, and Microsoft Azure AI Foundry. But this means "bring your own cloud credentials to access **Claude models**" — not "bring your own key to use any model." You're still restricted to Claude (Sonnet, Opus, Haiku). Copilot SDK's BYOK lets you swap between entirely different model families (GPT-4.1, Claude, Gemini, Mistral). A2rchi's multi-provider model selection would not work through the Claude Agent SDK.
+
+### 5. Session lifecycle is fundamentally different
+
+Copilot and Claude manage sessions inside their CLI process (persist, resume, fork). LangChain has no built-in session — you provide state via checkpointers. Abstracting over "session" means accepting the lowest common denominator: no resume, no persistence, no fork.
+
+### 6. A lowest-common-denominator (LCD) abstraction strips unique value from each SDK
+
+- **Copilot:** Custom agents, skills, system message section overrides (replace/remove/append per section) — can't express through an abstraction
+- **Claude:** Permission system, sandbox, file checkpointing, subagents — not available in others
+- **LangChain:** Middleware pipeline, dynamic model selection, structured output strategies — completely different paradigm
+
+## The Math
+
+Each additional backend requires:
+
+| Component | Estimated LOC |
+|---|---|
+| Event/streaming adapter | ~400 |
+| Tool wrappers (7 tools × format normalization) | ~200 |
+| Session lifecycle management | ~300 |
+| Auth/config integration | ~150 |
+| **Total per backend** | **~1,050** |
+
+Plus ongoing maintenance when any SDK ships breaking changes.
+
+## Why the Architecture Already Supports a Future Pivot
+
+The current architecture is already well-separated:
+
+- **`archi.py`** is 100% backend-agnostic — it calls `pipeline.stream()` and validates `PipelineOutput`
+- The pipeline factory (`getattr(archiPipelines, class_name)`) lets you add a `LangChainAgentPipeline` or `ClaudeAgentPipeline` as a new pipeline class without touching any shared code
+- **`PipelineOutput`** is the universal contract — any new backend just needs to yield these
+
+No premature abstraction layer needed. When the time comes, you add a new pipeline class.
+
+## If You Ever Need a Second Backend
+
+**LangChain is the better addition** (not Claude Agent SDK) because:
+
+1. It runs in-process (no CLI dependency)
+2. It supports any model provider
+3. Its `@tool` decorator is closest to Copilot's `defineTool`
+
+But even then, it's roughly 1,050 LOC of glue code (the per-backend total estimated above) for marginal value, since the same users who want "Anthropic models" already get them through Copilot SDK's BYOK.
+
+## Recommendation
+
+Stay on the Copilot SDK. Build the second backend only when a concrete use case demands it — the architecture is ready.
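
Note on issue 1 of the recommendation doc: the return-format mismatch can be sketched as a small wrapper. This is a minimal illustration, not code from the PR. The backend labels, `wrap_tool`, and the stand-in `search_local_files` handler with its made-up output are all hypothetical. Only the three return shapes come from the doc. Input-schema normalization, which the doc also calls out, is not covered here.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical backend labels; the real pipeline classes may name these differently.
BACKENDS = ("copilot", "claude", "langchain")


def wrap_tool(handler: Callable[..., Any], backend: str) -> Callable[..., Any]:
    """Return a wrapper that reshapes the handler's result for one backend."""

    def as_text(value: Any) -> str:
        # Strings pass through; everything else is serialized to JSON text.
        return value if isinstance(value, str) else json.dumps(value)

    def wrapped(**kwargs: Any) -> Any:
        result = handler(**kwargs)
        if backend == "copilot":
            return result  # any serializable value is accepted as-is
        if backend == "claude":
            # MCP wire format described in the doc.
            return {"content": [{"type": "text", "text": as_text(result)}]}
        if backend == "langchain":
            return as_text(result)  # LangChain tools return strings
        raise ValueError(f"unknown backend: {backend}")

    return wrapped


def search_local_files(query: str) -> Dict[str, Any]:
    """Stand-in tool with made-up output, for illustration only."""
    return {"query": query, "hits": 2}


# One adapter per backend for a single tool; 7 tools would need 21 of these.
adapters = {name: wrap_tool(search_local_files, name) for name in BACKENDS}
```

Calling `adapters["claude"](query="x")` yields the MCP-shaped dict, while `adapters["langchain"](query="x")` yields the same payload as a JSON string.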
diff --git a/examples/agents/cms-comp-ops.md b/examples/agents/cms-comp-ops.md
@@ -4,6 +4,8 @@ tools:
   - search_vectorstore_hybrid
   - search_local_files
   - search_metadata_index
+  - list_metadata_schema
+  - fetch_catalog_document
 ---
 
 You are the CMS Comp Ops assistant. You help with operational questions, troubleshooting,

diff --git a/pyproject.toml b/pyproject.toml
@@ -2,7 +2,7 @@
 name = "archi"
 version = "1.2.4"
 description = "An AI Augmented Research Chat Intelligence (archi)"
-requires-python = ">=3.7"
+requires-python = ">=3.10"
 authors = [
     {name="Pietro Lugato", email="pmlugato@mit.edu"},
     {name="Julius Heitkoetter", email="juliush@mit.edu"},
@@ -14,7 +14,7 @@ authors = [
 ]
 dependencies = [
     "pyyaml==6.0.1",
-    "click==8.1.7",
+    "click>=8.1.7",
     "jinja2==3.1.3",
     "requests==2.31.0",
     "podman-compose==1.4.0",
@@ -48,6 +48,7 @@ build-backend = "setuptools.build_meta"
 [tool.pytest.ini_options]
 testpaths = ["tests/unit"]
 addopts = "-v --tb=short"
+asyncio_mode = "auto"
 
 [project.urls]
 "Homepage" = "https://github.com/archi-physics/archi"
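
Note on `asyncio_mode = "auto"`: in this mode pytest-asyncio treats plain `async def` test functions as asyncio tests, so they do not need an explicit `@pytest.mark.asyncio` marker. A minimal sketch of what such a test could look like follows. `fetch_status` is a hypothetical coroutine and is not part of this PR.

```python
import asyncio


async def fetch_status() -> str:
    """Hypothetical coroutine standing in for real async code under test."""
    await asyncio.sleep(0)  # yield to the event loop once
    return "ok"


async def test_fetch_status() -> None:
    # Collected and awaited by pytest-asyncio in auto mode; no marker required.
    assert await fetch_status() == "ok"
```

Without auto mode (or an explicit marker), pytest would not run a coroutine test like this one, which is presumably why the workflow also installs `pytest-asyncio`.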
diff --git a/requirements/requirements-base.txt b/requirements/requirements-base.txt
@@ -18,6 +18,7 @@ httptools==0.6.1
 httpx==0.27.2
 humanfriendly==10.0
 croniter==2.0.5
+github-copilot-sdk>=0.2.0
 langgraph==1.0.2
 langchain-mcp-adapters==0.1.11
 langchain==1.0.3

diff --git a/src/archi/pipelines/__init__.py b/src/archi/pipelines/__init__.py
@@ -1,11 +1,12 @@
 """Pipeline package exposing the available pipeline classes."""
 
+from .agents.base_react import BaseReActAgent
+from .agents.cms_comp_ops_agent import CMSCompOpsAgent
 from .classic_pipelines.base import BasePipeline
 from .classic_pipelines.grading import GradingPipeline
 from .classic_pipelines.image_processing import ImageProcessingPipeline
 from .classic_pipelines.qa import QAPipeline
-from .agents.base_react import BaseReActAgent
-from .agents.cms_comp_ops_agent import CMSCompOpsAgent
+from .copilot_agents.copilot_agent import CopilotAgentPipeline
 
 __all__ = [
     "BasePipeline",
@@ -14,4 +15,5 @@
     "QAPipeline",
     "BaseReActAgent",
     "CMSCompOpsAgent",
+    "CopilotAgentPipeline",
 ]
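
Note on the new export: the recommendation doc describes the pipeline factory as `getattr(archiPipelines, class_name)`, so a class presumably has to be importable from this package to be selectable by name. The sketch below illustrates that lookup under that assumption. The `pipelines` namespace, the stand-in classes, and `resolve_pipeline` are hypothetical and only mimic the real package.

```python
from types import SimpleNamespace


class QAPipeline:
    """Stand-in for the real class of the same name."""


class CopilotAgentPipeline:
    """Stand-in for the real class of the same name."""


# Stand-in for the package namespace that the factory looks names up in.
pipelines = SimpleNamespace(
    QAPipeline=QAPipeline,
    CopilotAgentPipeline=CopilotAgentPipeline,
)


def resolve_pipeline(class_name: str) -> type:
    """Hypothetical helper: look up a pipeline class by its configured name."""
    try:
        return getattr(pipelines, class_name)
    except AttributeError:
        raise ValueError(f"unknown pipeline class: {class_name!r}") from None
```

With this shape, a config that names `CopilotAgentPipeline` resolves without any change to the lookup code, which matches the doc's claim that a new backend is just a new pipeline class.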
diff --git a/src/archi/pipelines/agents/base_react.py b/src/archi/pipelines/agents/base_react.py
@@ -19,7 +19,7 @@
 from src.archi.providers.base import ProviderType
 from src.archi.utils.output_dataclass import PipelineOutput
 from src.archi.pipelines.agents.utils.run_memory import RunMemory
-from src.archi.pipelines.agents.utils.mcp_utils import AsyncLoopThread
+from src.archi.utils.async_loop import AsyncLoopThread
 from src.archi.pipelines.agents.tools import initialize_mcp_client
 from src.utils.logging import get_logger
 
@@ -79,6 +79,10 @@ def create_run_memory(self) -> RunMemory:
         """Instantiate a fresh run memory for an agent run."""
         return RunMemory()
 
+    def supports_persisted_session_id(self) -> bool:
+        """Classic ReAct agents are stateless beyond the provided history."""
+        return False
+
     def start_run_memory(self) -> RunMemory:
         """Create and store the active memory for the current run."""
         memory = self.create_run_memory()

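Note on `supports_persisted_session_id`: the diff adds only the capability flag, not its caller. The sketch below shows one way a caller might branch on it, assuming a sessioned backend needs only the newest message plus a stored session id. `SessionedAgent`, `build_request`, and the request shape are hypothetical.

```python
from typing import Dict, List, Optional


class ClassicAgent:
    """Mirrors the flag added to BaseReActAgent in this diff."""

    def supports_persisted_session_id(self) -> bool:
        return False


class SessionedAgent:
    """Hypothetical backend that can resume from a stored session id."""

    def supports_persisted_session_id(self) -> bool:
        return True


def build_request(agent, history: List[str], session_id: Optional[str]) -> Dict[str, object]:
    """Hypothetical caller: send full history unless the agent can resume a session."""
    if agent.supports_persisted_session_id() and session_id:
        # The backend already holds earlier turns; send only the latest message.
        return {"session_id": session_id, "messages": history[-1:]}
    # Stateless agents need the whole conversation on every call.
    return {"messages": list(history)}
```
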
diff --git a/src/archi/pipelines/agents/tools/local_files.py b/src/archi/pipelines/agents/tools/local_files.py
@@ -108,7 +108,13 @@ def search(
             params=params,
             headers=self._headers,
             timeout=self.timeout,
+            allow_redirects=False,
         )
+        if resp.is_redirect or resp.status_code in (301, 302, 303, 307, 308):
+            raise RuntimeError(
+                f"Catalog API redirected to {resp.headers.get('Location', '?')} \u2014 "
+                "check DM_API_TOKEN or data_manager auth config"
+            )
         resp.raise_for_status()
         data = resp.json()
         return data.get("hits", []) or []
@@ -119,9 +125,15 @@ def get_document(self, resource_hash: str, *, max_chars: int = 4000) -> Optional
             params={"max_chars": max_chars},
             headers=self._headers,
             timeout=self.timeout,
+            allow_redirects=False,
         )
         if resp.status_code == 404:
             return None
+        if resp.is_redirect or resp.status_code in (301, 302, 303, 307, 308):
+            raise RuntimeError(
+                f"Catalog API redirected to {resp.headers.get('Location', '?')} \u2014 "
+                "check DM_API_TOKEN or data_manager auth config"
+            )
         resp.raise_for_status()
         return resp.json()
 
@@ -130,7 +142,13 @@ def schema(self) -> Dict[str, object]:
             f"{self.base_url}/api/catalog/schema",
             headers=self._headers,
             timeout=self.timeout,
+            allow_redirects=False,
         )
+        if resp.is_redirect or resp.status_code in (301, 302, 303, 307, 308):
+            raise RuntimeError(
+                f"Catalog API redirected to {resp.headers.get('Location', '?')} \u2014 "
+                "check DM_API_TOKEN or data_manager auth config"
+            )
         resp.raise_for_status()
         return resp.json()
 

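Note on the redirect checks: `requests` follows redirects by default, so passing `allow_redirects=False` hands the 3xx response back to the caller instead. Judging by the error message, the intent appears to be surfacing an auth redirect explicitly rather than letting `resp.json()` fail on whatever the redirect target returns. The same check appears three times in this diff. The sketch below shows how it could be consolidated. `raise_on_redirect` and `FakeResponse` are hypothetical names, and `FakeResponse` exists only so the example runs without a network call.

```python
from dataclasses import dataclass, field
from typing import Dict

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def raise_on_redirect(resp) -> None:
    """Hypothetical helper consolidating the check repeated in the diff."""
    if getattr(resp, "is_redirect", False) or resp.status_code in REDIRECT_STATUSES:
        location = resp.headers.get("Location", "?")
        raise RuntimeError(
            f"Catalog API redirected to {location} \u2014 "
            "check DM_API_TOKEN or data_manager auth config"
        )


@dataclass
class FakeResponse:
    """Minimal stand-in for requests.Response, used only for this demo."""

    status_code: int
    headers: Dict[str, str] = field(default_factory=dict)
    is_redirect: bool = False
```

Each of `search`, `get_document`, and `schema` could then call `raise_on_redirect(resp)` before `resp.raise_for_status()`. In `get_document` the existing 404 check would still run first.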
diff --git a/src/archi/pipelines/agents/tools/retriever.py b/src/archi/pipelines/agents/tools/retriever.py
@@ -62,7 +62,7 @@ def _format_documents_for_llm(
 def create_retriever_tool(
     retriever: BaseRetriever,
     *,
-    name: str = "search_knowledge_base",
+    name: str = "search_vectorstore_hybrid",
     description: Optional[str] = None,
     max_documents: int = 4,
     max_chars: int = 800,

diff --git a/src/archi/pipelines/copilot_agents/__init__.py b/src/archi/pipelines/copilot_agents/__init__.py
@@ -0,0 +1,5 @@
+"""Copilot SDK agent package."""
+
+from .copilot_agent import CopilotAgentPipeline
+
+__all__ = ["CopilotAgentPipeline"]