Merged
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -38,3 +38,6 @@ strict = true
testpaths = ["src"]
python_files = ["test_*.py"]
asyncio_mode = "auto"
+markers = [
+    "integration: marks integration tests that require running services",
+]
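As a hedged illustration (not part of the diff): tests tagged with the new `integration` marker can be deselected in default runs. The test names below are hypothetical.

```python
import pytest

# Hypothetical test module showing the new "integration" marker in use.
# Deselect these in default runs with: pytest -m "not integration"

@pytest.mark.integration
def test_redis_round_trip():
    # A real test here would talk to a running Redis instance.
    assert True

def test_pure_logic():
    # Unmarked tests always run.
    assert 1 + 1 == 2
```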
35 changes: 28 additions & 7 deletions src/agents/orchestrator/agent.py
@@ -3,7 +3,8 @@

import asyncio
import json
-from datetime import datetime
+import logging
+from datetime import UTC, datetime
from pathlib import Path
from typing import Any, AsyncIterator

@@ -18,6 +19,13 @@
from .workflow_state import WorkflowState


+logger = logging.getLogger(__name__)
+
+
+def _utc_iso_timestamp() -> str:
+    return datetime.now(UTC).isoformat()
Comment on lines +25 to +26 (Copilot AI, Apr 17, 2026):

The PR description says utcnow usage was replaced with timezone-aware UTC timestamps, but there are still datetime.utcnow() call sites elsewhere (e.g., src/apps/api/workers/tasks.py, src/apps/api/services/launch_service.py, and several repositories). Either narrow the description to the orchestrator events/HITL timestamps or update the remaining occurrences for consistency.


def _load_system_prompt() -> str:
prompt_path = Path(__file__).parent / "prompts" / "system.md"
if not prompt_path.exists():
@@ -68,7 +76,7 @@ async def run(self, payload: dict[str, Any]) -> AgentResult:
"type": "agent_started",
"launch_id": launch_id,
"agent_id": agent_id,
-"timestamp": datetime.utcnow().isoformat(),
+"timestamp": _utc_iso_timestamp(),
},
)

Expand Down Expand Up @@ -98,7 +106,7 @@ async def run(self, payload: dict[str, Any]) -> AgentResult:
"launch_id": launch_id,
"agent_id": agent_id,
"error": state.failure_reason,
-"timestamp": datetime.utcnow().isoformat(),
+"timestamp": _utc_iso_timestamp(),
},
)
break
@@ -111,7 +119,7 @@ async def run(self, payload: dict[str, Any]) -> AgentResult:
"launch_id": launch_id,
"agent_id": agent_id,
"output": output,
-"timestamp": datetime.utcnow().isoformat(),
+"timestamp": _utc_iso_timestamp(),
},
)

Expand All @@ -134,7 +142,13 @@ async def run(self, payload: dict[str, Any]) -> AgentResult:
break

state.resume_from_hitl(edits if isinstance(edits, dict) else None)
-await self._hitl_store.clear(launch_id)
+try:
+    await self._hitl_store.clear(launch_id)
+except Exception:
+    logger.warning(
+        "Failed to clear HITL state",
+        extra={"launch_id": launch_id},
+    )
Comment on lines +145 to +151 (Copilot AI, Apr 17, 2026):

The broad except Exception here can unintentionally swallow task cancellation (e.g., asyncio.CancelledError) and also hides non-Redis bugs (like serialization errors). Consider catching Redis-specific exceptions (e.g., redis.exceptions.RedisError / connection/timeouts) and re-raising cancellation so shutdown/cancellation works correctly; also include the exception details in the log so the root cause isn't lost.
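A minimal sketch of the pattern this comment suggests. `RedisError` is a local stand-in for `redis.exceptions.RedisError` so the example is self-contained, and `clear_hitl_state`/`FlakyStore` are hypothetical names. Note that on Python 3.8+ `asyncio.CancelledError` derives from `BaseException`, so a bare `except Exception` already lets cancellation propagate; the explicit clause below just documents the intent.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

class RedisError(Exception):
    """Local stand-in for redis.exceptions.RedisError."""

async def clear_hitl_state(store, launch_id: str) -> None:
    try:
        await store.clear(launch_id)
    except asyncio.CancelledError:
        raise  # never swallow task cancellation
    except RedisError:
        # exc_info=True keeps the root cause in the log record.
        logger.warning("Failed to clear HITL state for launch %s",
                       launch_id, exc_info=True)

class FlakyStore:
    async def clear(self, launch_id: str) -> None:
        raise RedisError("connection refused")

asyncio.run(clear_hitl_state(FlakyStore(), "launch-123"))  # logged, not raised
```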
Comment on lines +148 to +151 (Copilot AI, Apr 17, 2026):

extra={...} won't show up with the current logging configuration (e.g., logging.basicConfig(...) in src/apps/api/main.py uses the default formatter, which ignores custom extra fields). If you want launch_id to be visible in logs, include it in the formatted message/args (or switch to a formatter/structured logger that prints extra fields).
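Both options the comment mentions can be sketched briefly (logger names here are hypothetical):

```python
import logging

# Option 1: interpolate launch_id into the message itself;
# visible regardless of formatter.
logging.getLogger("orchestrator").warning(
    "Failed to clear HITL state (launch_id=%s)", "launch-123"
)

# Option 2: a formatter that prints the custom field passed via extra=.
# Caveat: this formatter raises KeyError for records that lack launch_id,
# so it should only be installed on a dedicated handler/logger.
fmt = logging.Formatter("%(levelname)s %(message)s launch_id=%(launch_id)s")
record = logging.LogRecord(
    "orchestrator", logging.WARNING, __file__, 0,
    "Failed to clear HITL state", None, None,
)
record.launch_id = "launch-123"
print(fmt.format(record))  # WARNING Failed to clear HITL state launch_id=launch-123
```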

if state.failed:
status = "failed"
@@ -172,7 +186,7 @@ async def _pause_for_hitl(
"checkpoint": checkpoint,
"agent_id": state.completed_agents[-1] if state.completed_agents else "unknown",
"output_preview": output,
-"created_at": datetime.utcnow().isoformat(),
+"created_at": _utc_iso_timestamp(),
}
await self._hitl_store.set_pending(state.launch_id, pending)
await self._publish_event(
@@ -198,7 +212,14 @@ async def _wait_for_hitl_resolution(
return None

async def _publish_event(self, launch_id: str, event: dict[str, Any]) -> None:
-    await self._session_store.publish(f"launch:{launch_id}:events", event)
+    # Event streaming should not block workflow completion if Redis is unavailable.
+    try:
+        await self._session_store.publish(f"launch:{launch_id}:events", event)
+    except Exception:
Comment on lines 214 to +218 (Copilot AI, Apr 17, 2026):

New "best-effort publish" behavior isn't covered by tests in src/agents/orchestrator/tests/test_agent.py. Please add a test that simulates Redis being unavailable (e.g., patch _session_store.publish to raise a Redis connection error) and asserts run() still completes and returns the expected status.
+        logger.warning(
+            "Failed to publish orchestrator event",
+            extra={"launch_id": launch_id, "event_type": event.get("type")},
+        )
Comment on lines 214 to +222 (Copilot AI, Apr 17, 2026):

Catching all Exception here will also hide non-Redis errors (e.g., JSON encoding problems) and can suppress cancellation if asyncio.CancelledError is caught. To keep "best-effort Redis" without masking real bugs, catch Redis-specific exceptions and ensure cancellation is re-raised; also log the exception (message/stack) so failures are diagnosable.

async def stream(self, payload: dict[str, Any]) -> AsyncIterator[str]:
result = await self.run(payload)