archi-physics · hassan11196 · Mar 15, 2026
diff --git a/docs/docs/agents_tools.md b/docs/docs/agents_tools.md
@@ -250,6 +250,31 @@ external information retrieval.
 - Each MCP tool is wrapped for synchronous execution so it integrates seamlessly with the ReAct agent loop
 - Tool names from MCP servers are namespaced to avoid conflicts with built-in tools
 
+### Built-in Archi MCP Server
+
+When `services.mcp_server.enabled: true` is set, the chat service also exposes
+its own MCP server at `/mcp/sse`. This lets IDEs and MCP clients connect
+directly to an Archi deployment and use Archi-native tools over SSE.
+
+The built-in Archi MCP server currently exposes these read-only tools:
+
+- `archi_query` — ask the deployment a question through the normal RAG/chat pipeline
+- `archi_list_documents` — page through indexed documents with source, status, and enabled state
+- `archi_search_document_metadata` — search by metadata fields such as `source_type`, `ticket_id`, `url`, or `relative_path`
+- `archi_list_metadata_schema` — inspect the metadata keys and common values supported by metadata search
+- `archi_search_document_content` — grep-like exact or regex search over indexed document contents
+- `archi_get_document_content` — fetch the raw text content for a document by hash
+- `archi_get_document_chunks` — inspect stored chunk boundaries and chunk text for a document
+- `archi_get_data_stats` — view corpus-level document, chunk, source, and ingestion statistics
+- `archi_get_deployment_info` — inspect active model, retrieval settings, embedding config, and MCP runtime info
+- `archi_list_agents` — list available agent specs and their configured tools
+- `archi_get_agent_spec` — fetch the full markdown agent spec for a named agent
+- `archi_health` — basic deployment/database health check
+
+These tools are especially useful from VS Code, Cursor, Claude Desktop, and
+Claude Code when you want direct access to Archi's indexed corpus without
+having to proxy through a separate MCP server.
+
 ---
 
 ## Vector Store & Retrieval
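
Reviewer note on the built-in MCP server docs above: a minimal sketch of how a client might address the endpoint, using only the Python standard library. The path `/mcp/sse` and the example host come from the docs in this diff. The helper name `build_mcp_sse_request` is hypothetical, and sending the token from `/mcp/auth` as an `Authorization: Bearer` header is an assumption that the diff does not confirm. The request is built but never sent.

```python
from typing import Optional
from urllib.parse import urljoin
from urllib.request import Request

MCP_SSE_PATH = "/mcp/sse"  # endpoint path taken from the docs in this diff


def build_mcp_sse_request(base_url: str, token: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for the built-in MCP SSE endpoint.

    Hypothetical helper for illustration; not part of Archi.
    """
    # Join without losing a path prefix such as https://host/archi
    url = urljoin(base_url.rstrip("/") + "/", MCP_SSE_PATH.lstrip("/"))
    headers = {"Accept": "text/event-stream"}
    if token:
        # Assumption: the bearer token is passed in the standard Authorization header.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers, method="GET")


req = build_mcp_sse_request("https://chat.example.org", token="example-token")
print(req.full_url)
```

In practice an MCP client library or the IDE would manage the SSE session itself; the sketch only illustrates the URL and header shape a client would likely need.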

diff --git a/docs/docs/services.md b/docs/docs/services.md
@@ -25,6 +25,7 @@ The primary user-facing service. Provides a web-based chat application for inter
 - Streaming responses with tool-call visualization
 - Agent selector dropdown for switching between agents
 - Built-in [Data Viewer](data_sources.md#data-viewer) at `/data`
+- Optional built-in MCP server at `/mcp/sse` for IDE and agent integrations
 - Settings panel for model/provider selection
 - [BYOK](models_providers.md#bring-your-own-key-byok) support
 - Conversation history
@@ -51,6 +52,24 @@ services:
 archi create [...] --services chatbot
 ```
 
+### Built-in MCP Server
+
+The chat service can expose Archi itself as an MCP server over Server-Sent
+Events. Enable it when you want tools like VS Code, Cursor, Claude Desktop, or
+Claude Code to connect directly to your deployment.
+
+```yaml
+services:
+  mcp_server:
+    enabled: true
+    url: "https://chat.example.org"
+```
+
+- **Endpoint:** `/mcp/sse`
+- **Auth page:** `/mcp/auth` for generating bearer tokens when auth is enabled
+- **Tools exposed:** query, document discovery, metadata search, content grep,
+  chunk inspection, corpus stats, deployment info, and agent-spec inspection
+
 ---
 
 ## Service Status Board & Alert Banners
@@ -279,31 +298,194 @@ archi create [...] --services chatbot,redmine-mailer
 
 ## Mattermost Interface
 
-Reads posts from a Mattermost forum and posts draft responses to a specified channel.
+Connects Archi to a Mattermost channel. Supports two operating modes:
 
-### Configuration
+- **Webhook mode** — Mattermost pushes outgoing webhooks to Archi (recommended)
+- **Polling mode** — Archi polls a channel periodically via the Mattermost API
+
+**Default port:** `5000`
+
+### Setup
+
+#### Secrets
+
+```bash
+# Required for webhook mode
+MATTERMOST_WEBHOOK=https://mattermost.example.com/hooks/...  # Incoming webhook URL
+MATTERMOST_OUTGOING_TOKEN=...                                 # Outgoing webhook token for request validation
+
+# Required for polling mode only
+MATTERMOST_PAK=...                       # Personal Access Token for the bot account
+MATTERMOST_CHANNEL_ID_READ=...           # Channel to read posts from
+MATTERMOST_CHANNEL_ID_WRITE=...          # Channel to post responses to
+
+# Required for SSO auth (db mode)
+SSO_CLIENT_ID=...
+SSO_CLIENT_SECRET=...
+BYOK_ENCRYPTION_KEY=...                  # Used to encrypt stored refresh tokens
+PG_PASSWORD=...
+```
+
+#### Basic Configuration
 
 ```yaml
 services:
   mattermost:
-    update_time: 60
+    update_time: 60       # polling interval in seconds (polling mode only)
+    port: 5000
+    external_port: 5000
 ```
 
-### Secrets
+#### Running
 
 ```bash
-MATTERMOST_WEBHOOK=...
-MATTERMOST_PAK=...
-MATTERMOST_CHANNEL_ID_READ=...
-MATTERMOST_CHANNEL_ID_WRITE=...
+archi create [...] --services chatbot,mattermost
 ```
 
-### Running
+---
 
-```bash
-archi create [...] --services chatbot,mattermost
+### Authentication
+
+By default, auth is disabled and the bot responds to all users. When auth is enabled, two modes are available.
+
+#### Mode 1: Config (Static Allowlist)
+
+Roles are assigned to Mattermost users via a static map in the config. No SSO or database required.
+
+```yaml
+services:
+  mattermost:
+    auth:
+      enabled: true
+      token_store: config
+      default_role: mattermost-restricted  # role for users not in user_roles
+      user_roles:
+        jsmith: [archi-expert]             # Mattermost username → list of roles
+        ahmedmu: [archi-admins]
+        someuser: [archi-expert, base-user]
+```
+
+- Users in `user_roles` get the specified roles.
+- Users not in `user_roles` get `default_role`.
+- If `default_role` is not defined in `auth_roles`, those users have no permissions and are denied.
+
+#### Mode 2: DB / SSO (Recommended)
+
+Roles come from the CERN SSO JWT. On a user's first message, the bot replies with a login link. After the user authenticates, their roles are stored in the database and reused for subsequent messages, so no re-login is required until the session expires.
+
+```yaml
+services:
+  mattermost:
+    auth:
+      enabled: true
+      token_store: db
+      session_lifetime_days: 30     # full re-login required after this period
+      roles_refresh_hours: 24       # silent background role refresh interval
+      login_base_url: "https://your-mattermost-service-host:5000"
+      sso:
+        server_metadata_url: "https://auth.cern.ch/auth/realms/cern/.well-known/openid-configuration"
+        token_endpoint: "https://auth.cern.ch/auth/realms/cern/protocol/openid-connect/token"
+```
+
+**SSO registration requirement:** The callback URL `<login_base_url>/mattermost-auth/callback` must be registered as a valid redirect URI in your SSO client (Keycloak / CERN Auth).
+
+**Login flow:**
+
+```
+1. User sends message to bot (no token stored)
+2. Bot replies: "Please login: https://<host>:5000/mattermost-auth?state=<user_id>&username=<username>"
+3. User clicks link → redirected to CERN SSO
+4. After SSO login → redirected to /mattermost-auth/callback
+5. Roles extracted from JWT, stored in mattermost_tokens table
+6. User sees success page, closes tab, returns to Mattermost
+7. Future messages use stored roles (silent refresh every 24h)
+```
+
+**Session lifecycle:**
+
+| Event | Behaviour |
+|-------|-----------|
+| First message | Login link sent |
+| Token valid, roles fresh | Respond normally |
+| Roles stale (`> roles_refresh_hours`) | Silent refresh via stored refresh token |
+| Session expired (`> session_lifetime_days`) | Login link sent again |
+| Admin invalidates token | Login link sent on next message |
+
+---
+
+### Role-Based Access Control
+
+Mattermost auth integrates with the same RBAC system used by the chat app. Roles are defined under `services.chat_app.auth.auth_roles`.
+
+#### Restricting Access
+
+To allow only users with a specific role (e.g. `archi-expert` and above), add the `mattermost:access` permission to those roles and **not** to `base-user`:
+
+```yaml
+services:
+  chat_app:
+    auth:
+      auth_roles:
+        roles:
+          base-user:
+            permissions:
+              - chat:query
+              - chat:history
+              # no mattermost:access here
+
+          archi-expert:
+            inherits: [base-user]
+            permissions:
+              - mattermost:access   # grants access to the Mattermost bot
+              - documents:view
+              - config:view
+              # ...
+
+          archi-admins:
+            permissions:
+              - "*"                 # wildcard includes mattermost:access
+
+        permissions:
+          mattermost:access:
+            description: "Access the Mattermost bot"
+            category: "mattermost"
 ```
 
+- `base-user` only → denied with "you don't have permission" message
+- `archi-expert` → allowed (has `mattermost:access`)
+- `archi-admins` → allowed (wildcard)
+
+#### Tool-Level Permissions
+
+Tool permissions work the same way as in the chat app. Add permissions such as `tools:http_get` to roles that should be able to use specific agent tools. The Mattermost user context is propagated through the full call stack, so tool checks also apply to requests that originate from Mattermost.
+
+```yaml
+          archi-expert:
+            permissions:
+              - mattermost:access
+              - tools:http_get      # allow HTTP GET tool for this role
+```
+
+#### Database
+
+A `mattermost_tokens` table is required when using `token_store: db`. It is created automatically by `init.sql` on first deploy. For existing deployments, run the migration manually:
+
+```sql
+CREATE TABLE IF NOT EXISTS mattermost_tokens (
+    mattermost_user_id  VARCHAR(255) PRIMARY KEY,
+    mattermost_username VARCHAR(255),
+    email               VARCHAR(255),
+    roles               JSONB NOT NULL DEFAULT '[]',
+    refresh_token       BYTEA,
+    token_expires_at    TIMESTAMPTZ,
+    roles_refreshed_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
+    created_at          TIMESTAMPTZ NOT NULL DEFAULT NOW(),
+    updated_at          TIMESTAMPTZ NOT NULL DEFAULT NOW()
+);
+```
+
+Refresh tokens are encrypted at rest using `pgp_sym_encrypt` from the PostgreSQL `pgcrypto` extension (requires `BYOK_ENCRYPTION_KEY`).
+
 ---
 
 ## Grafana Monitoring
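
Reviewer note on the Mattermost RBAC docs above: a minimal sketch of the access rules as documented (static `user_roles` map, `default_role` fallback, `inherits`, and the `"*"` wildcard). The role data is copied from the YAML examples in this diff. The function names and the resolution algorithm are assumptions for illustration and are not Archi's actual RBAC registry.

```python
from typing import Dict, List, Optional, Set

# Role definitions mirroring the auth_roles example in the docs.
ROLE_DEFS: Dict[str, dict] = {
    "base-user": {"permissions": ["chat:query", "chat:history"]},
    "archi-expert": {
        "inherits": ["base-user"],
        "permissions": ["mattermost:access", "documents:view", "config:view"],
    },
    "archi-admins": {"permissions": ["*"]},
}

# Static allowlist mirroring the token_store: config example.
USER_ROLES: Dict[str, List[str]] = {
    "jsmith": ["archi-expert"],
    "ahmedmu": ["archi-admins"],
}
DEFAULT_ROLE = "mattermost-restricted"  # intentionally not defined in ROLE_DEFS


def resolve_roles(username: str) -> List[str]:
    """Return the roles for a Mattermost username, falling back to the default role."""
    return USER_ROLES.get(username, [DEFAULT_ROLE])


def effective_permissions(role: str, seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect a role's permissions, following `inherits` and guarding against cycles."""
    seen = set() if seen is None else seen
    if role in seen or role not in ROLE_DEFS:
        return set()  # unknown roles contribute no permissions
    seen.add(role)
    spec = ROLE_DEFS[role]
    perms = set(spec.get("permissions", []))
    for parent in spec.get("inherits", []):
        perms |= effective_permissions(parent, seen)
    return perms


def has_permission(roles: List[str], required: str) -> bool:
    """True if any role grants the permission directly, by inheritance, or by wildcard."""
    for role in roles:
        perms = effective_permissions(role)
        if "*" in perms or required in perms:
            return True
    return False


for user in ("jsmith", "ahmedmu", "someone-else"):
    print(user, has_permission(resolve_roles(user), "mattermost:access"))
```

Under these assumptions, a user who falls back to an undefined `default_role` ends up with an empty permission set and is denied, which matches the behaviour the docs describe.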

diff --git a/src/archi/pipelines/agents/base_react.py b/src/archi/pipelines/agents/base_react.py
@@ -998,20 +998,39 @@ def refresh_agent(
         extra_tools: Optional[Sequence[Callable]] = None,
         middleware: Optional[Sequence[Callable]] = None,
         force: bool = False,
+        user_id: Optional[str] = None,
     ) -> CompiledStateGraph:
         """Ensure the LangGraph agent reflects the latest tool set."""
         base_tools = list(static_tools) if static_tools is not None else self.tools
         toolset: List[Callable] = list(base_tools)
 
         if "mcp" in self.selected_tool_names:
-            if self._mcp_tools is None:
-                built = self._build_mcp_tools()
-                self._mcp_tools = list(built or [])
-            toolset.extend(self._mcp_tools)
+            # When user_id is present, always rebuild so each request fetches a
+            # fresh (possibly refreshed) token from the DB for SSO-auth servers.
+            # Without a user_id (anonymous), cache the tools as before.
+            if self._mcp_tools is None or user_id:
+                built = self._build_mcp_tools(user_id=user_id)
+                if not user_id:
+                    self._mcp_tools = list(built or [])
+                toolset.extend(built or [])
+            else:
+                toolset.extend(self._mcp_tools)
 
         if extra_tools:
             toolset.extend(extra_tools)
 
+        # OpenAI enforces a hard 128-tool limit per request.
+        _OPENAI_MAX_TOOLS = 128
+        if len(toolset) > _OPENAI_MAX_TOOLS:
+            logger.warning(
+                f"Toolset has {len(toolset)} tools, exceeding OpenAI max of {_OPENAI_MAX_TOOLS}. "
+                f"Truncating MCP tools to fit. Static tools ({len(base_tools)}) are preserved."
+            )
+            # toolset is ordered base + MCP + extra, so trim only the MCP slice in the middle.
+            n_base, n_extra = len(base_tools), len(extra_tools) if extra_tools else 0
+            mcp_end = n_base + max(0, _OPENAI_MAX_TOOLS - n_base - n_extra)
+            toolset = toolset[:mcp_end] + toolset[len(toolset) - n_extra:]
+
         middleware = list(middleware) if middleware is not None else self.middleware
 
         requires_refresh = (
@@ -1057,14 +1076,14 @@ def _build_static_tools(self) -> List[Callable]:
         static_names = [name for name in selected if name != "mcp"]
         return self._select_tools_from_registry(static_names)
 
-    def _build_mcp_tools(self) -> List[Callable]:
+    def _build_mcp_tools(self, user_id: Optional[str] = None) -> List[Callable]:
         """Retrieve MCP tools from servers defined in the config and keep those server connections alive"""
         try:
             self._async_runner = AsyncLoopThread.get_instance()
 
             # Initialize MCP client on the background loop
             # The client and sessions will live on this loop
-            client, mcp_tools = self._async_runner.run(initialize_mcp_client())
+            client, mcp_tools = self._async_runner.run(initialize_mcp_client(user_id=user_id))
             if client is None:
                 logger.info("No MCP servers configured.")
                 return None
@@ -1153,7 +1172,8 @@ def _prepare_agent_inputs(self, **kwargs) -> Dict[str, Any]:
         if hasattr(self, "_vector_tools"):
             extra_tools = self._vector_tools if self._vector_tools else None  # type: ignore[attr-defined]
 
-        self.refresh_agent(extra_tools=extra_tools)
+        user_id = kwargs.get("user_id")
+        self.refresh_agent(extra_tools=extra_tools, user_id=user_id)
 
         inputs = self._prepare_inputs(history=kwargs.get("history"))
         history_messages = inputs["history"]
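
Reviewer note on the tool-limit handling in `refresh_agent`: the comment there states that static and extra tools are kept and only MCP tools are trimmed. One way to make that hold regardless of list order is to keep the three groups separate until the final concatenation. The sketch below uses a hypothetical helper name (`assemble_toolset`) and plain strings as stand-ins for tool callables; the 128 limit is the value given in the diff.

```python
from typing import Any, List, Optional, Sequence

MAX_TOOLS = 128  # per-request tool limit as stated in the diff


def assemble_toolset(
    base_tools: Sequence[Any],
    mcp_tools: Sequence[Any],
    extra_tools: Optional[Sequence[Any]] = None,
    max_tools: int = MAX_TOOLS,
) -> List[Any]:
    """Combine tool groups, trimming only the MCP group to respect max_tools.

    Hypothetical helper for illustration; not part of Archi.
    """
    base = list(base_tools)
    extras = list(extra_tools or [])
    # Whatever room remains after static and extra tools goes to MCP tools.
    mcp_budget = max(0, max_tools - len(base) - len(extras))
    return base + list(mcp_tools)[:mcp_budget] + extras


base = ["search", "fetch", "calc"]
mcp = [f"mcp_{i}" for i in range(200)]
extras = ["vector_a", "vector_b"]
result = assemble_toolset(base, mcp, extras)
print(len(result))
```

Note that if the static and extra tools alone exceed the limit, this sketch drops every MCP tool but still returns more than `max_tools` entries, so a caller would need a separate policy for that case.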

diff --git a/src/archi/pipelines/agents/tools/base.py b/src/archi/pipelines/agents/tools/base.py
@@ -35,7 +35,30 @@ def check_tool_permission(required_permission: str) -> tuple[bool, Optional[str]
     try:
         from flask import session, has_request_context
         from src.utils.rbac.registry import get_registry
-
+
+        # Check Mattermost context first — covers webhook mode (Flask context, no session)
+        # and polling mode (no Flask context). ContextVar is set by mattermost_user_context().
+        try:
+            from src.utils.rbac.mattermost_context import get_mattermost_context
+            mm_ctx = get_mattermost_context()
+            if mm_ctx is not None:
+                registry = get_registry()
+                if registry.has_permission(mm_ctx.roles, required_permission):
+                    logger.debug(
+                        f"Mattermost user @{mm_ctx.username} granted '{required_permission}'"
+                    )
+                    return True, None
+                logger.info(
+                    f"Mattermost user @{mm_ctx.username} denied '{required_permission}' "
+                    f"(roles: {mm_ctx.roles})"
+                )
+                return False, (
+                    f"Permission denied for @{mm_ctx.username}: "
+                    f"requires '{required_permission}'."
+                )
+        except Exception as mm_exc:
+            logger.debug(f"Mattermost context check skipped: {mm_exc}")
+
         # If we're not in a request context, allow the tool (for testing/CLI usage)
         if not has_request_context():
             logger.debug("No request context, allowing tool access")