1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -32,3 +32,4 @@ tests/.tmp/
*.log
*.txt
.kode/
.kode-observability-http/
43 changes: 43 additions & 0 deletions README.md
Expand Up @@ -90,6 +90,48 @@ export OPEN_SANDBOX_ENDPOINT=http://127.0.0.1:8080 # optional
export OPEN_SANDBOX_IMAGE=ubuntu # optional
```

## Observability

KODE keeps observability as an SDK-facing capability first:

- runtime metrics via `agent.getMetricsSnapshot()`
- runtime observations via `agent.getObservationReader()` / `agent.subscribeObservations()`
- optional OTEL bridge via `observability.otel`
- optional persisted observation query via `observability.persistence`

Minimal persisted-observation example:

```typescript
import {
  Agent,
  JSONStoreObservationBackend,
  createStoreBackedObservationReader,
} from '@shareai-lab/kode-sdk';

const storeDir = './.kode';
const observationBackend = new JSONStoreObservationBackend(storeDir);

// `deps` is assumed to be defined earlier (not shown in this snippet).
const agent = await Agent.create({
  templateId: 'assistant',
  observability: {
    persistence: {
      backend: observationBackend,
    },
  },
}, deps);

// Runtime views: read from the live agent process.
const runtimeSnapshot = agent.getMetricsSnapshot();
const runtimeObservations = agent.getObservationReader().listObservations();

// Persisted view: read back through the storage backend.
const persistedReader = createStoreBackedObservationReader(observationBackend);
const persistedObservations = await persistedReader.listObservations({ limit: 50 });
```

If you want to expose these metrics or observations over HTTP, do it in your application on top of readers/backends, not inside `Agent` itself. `examples/08-observability-http.ts` is an application-layer example, not an SDK-owned HTTP feature.

Run the full example locally with `npm run example:observability-http`.

## Architecture for Scale

For production deployments serving many users, we recommend the **Worker Microservice Pattern**:
Expand Down Expand Up @@ -150,6 +192,7 @@ See [docs/en/guides/architecture.md](./docs/en/guides/architecture.md) for detai
| [Concepts](./docs/en/getting-started/concepts.md) | Core concepts explained |
| **Guides** | |
| [Events](./docs/en/guides/events.md) | Three-channel event system |
| [Observability](./docs/en/guides/observability.md) | Metrics, observations, persistence, and app-layer exposure |
| [Tools](./docs/en/guides/tools.md) | Built-in tools & custom tools |
| [E2B Sandbox](./docs/en/guides/e2b-sandbox.md) | E2B cloud sandbox integration |
| [OpenSandbox](./docs/en/guides/opensandbox-sandbox.md) | OpenSandbox self-hosted sandbox integration |
43 changes: 43 additions & 0 deletions README.zh-CN.md
Expand Up @@ -90,6 +90,48 @@ export OPEN_SANDBOX_ENDPOINT=http://127.0.0.1:8080 # optional
export OPEN_SANDBOX_IMAGE=ubuntu # optional
```

## Observability

KODE exposes observability primarily as an SDK capability:

- Runtime metrics: `agent.getMetricsSnapshot()`
- Runtime observations: `agent.getObservationReader()` / `agent.subscribeObservations()`
- Optional OTEL bridge: `observability.otel`
- Optional persisted observation queries: `observability.persistence`

Minimal persisted-observation example:

```typescript
import {
  Agent,
  JSONStoreObservationBackend,
  createStoreBackedObservationReader,
} from '@shareai-lab/kode-sdk';

const storeDir = './.kode';
const observationBackend = new JSONStoreObservationBackend(storeDir);

// `deps` is assumed to be defined earlier (not shown in this snippet).
const agent = await Agent.create({
  templateId: 'assistant',
  observability: {
    persistence: {
      backend: observationBackend,
    },
  },
}, deps);

// Runtime views: read from the live agent process.
const runtimeSnapshot = agent.getMetricsSnapshot();
const runtimeObservations = agent.getObservationReader().listObservations();

// Persisted view: read back through the storage backend.
const persistedReader = createStoreBackedObservationReader(observationBackend);
const persistedObservations = await persistedReader.listObservations({ limit: 50 });
```

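If you want to expose these metrics or observations over HTTP, wrap the readers/backends in your application layer rather than having `Agent` listen on a port itself. `examples/08-observability-http.ts` is only an application-layer example, not an HTTP capability built into the SDK.

Run the full example locally with `npm run example:observability-http`.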

## Supported Providers

| Provider | Streaming | Tool Calling | Reasoning | Files |
Expand All @@ -110,6 +152,7 @@ export OPEN_SANDBOX_IMAGE=ubuntu # optional
| [Concepts](./docs/zh-CN/getting-started/concepts.md) | Core concepts explained |
| **Guides** | |
| [Events](./docs/zh-CN/guides/events.md) | Three-channel event system |
| [Observability](./docs/zh-CN/guides/observability.md) | Metrics, observations, persistence, and app-layer exposure |
| [Tools](./docs/zh-CN/guides/tools.md) | Built-in tools and custom tools |
| [E2B Sandbox](./docs/zh-CN/guides/e2b-sandbox.md) | E2B cloud sandbox integration |
| [OpenSandbox](./docs/zh-CN/guides/opensandbox-sandbox.md) | OpenSandbox self-hosted sandbox integration |
31 changes: 29 additions & 2 deletions docs/en/examples/playbooks.md
Expand Up @@ -153,7 +153,33 @@ const stats = await store.aggregateStats(agent.agentId);

---

## 6. Combined: Approval + Collaboration + Scheduling
## 6. Observability Readers + Application HTTP Wrapper

- **Goal**: Read runtime/persisted observations from the SDK and optionally expose them through your own app-layer HTTP service.
- **Example**: `examples/08-observability-http.ts`
- **Run**: `npm run example:observability-http`
- **Key Steps**:
  1. Read point-in-time metrics with `agent.getMetricsSnapshot()`.
  2. Read live in-memory observations with `agent.getObservationReader()` or `agent.subscribeObservations()`.
  3. Configure `observability.persistence.backend` and query history with `createStoreBackedObservationReader(...)`.
  4. Map your own routes, auth, tenant checks, and response shaping in application code.
- **Considerations**:
  - Prefer the runtime reader for "what is happening now" and the persisted reader for audit/history views.
  - Treat `metadata.__debug` as internal/debug-only data; do not expose it blindly to external consumers.
  - Keep HTTP, auth, rate limiting, and dashboard concerns outside the SDK core.

```typescript
const metrics = agent.getMetricsSnapshot();
const runtimeReader = agent.getObservationReader();
const persistedReader = createStoreBackedObservationReader(observationBackend);

const runtime = runtimeReader.listObservations({ limit: 20 });
const persisted = await persistedReader.listObservations({ agentIds: [agent.agentId], limit: 50 });
```

---

## 7. Combined: Approval + Collaboration + Scheduling

- **Scenario**: A code review bot in which a Planner splits tasks and assigns them to Specialists, tool operations require approval, and scheduled reminders enforce the SLA.
- **Implementation**:
Expand Down Expand Up @@ -184,12 +210,13 @@ const stats = await store.aggregateStats(agent.agentId);

- [Getting Started](../getting-started/quickstart.md)
- [Events Guide](../guides/events.md)
- [Observability Guide](../guides/observability.md)
- [Multi-Agent Systems](../advanced/multi-agent.md)
- [Database Guide](../guides/database.md)

---

## 7. CLI Agent Application
## 8. CLI Agent Application

Build command-line AI assistants like Claude Code or Cursor.

166 changes: 166 additions & 0 deletions docs/en/guides/observability.md
@@ -0,0 +1,166 @@
# Observability Guide

KODE exposes observability as SDK capabilities first, not as an application server.

That means the SDK gives you structured metrics, observations, persistence hooks, and OTEL bridging. Your application decides whether to expose them through HTTP, dashboards, alerting, or internal admin tools.

---

## What KODE Includes

- Runtime metrics via `agent.getMetricsSnapshot()`
- Runtime observation reads via `agent.getObservationReader()`
- Runtime observation streaming via `agent.subscribeObservations()`
- Optional persisted observation queries via `observability.persistence`
- Optional OTEL export via `observability.otel`

## What KODE Deliberately Does Not Include

- Built-in HTTP server lifecycle
- Built-in auth, tenant isolation, or rate limiting
- Built-in observability dashboard UI
- Opinionated public API contracts for app delivery

Those concerns belong in your application layer.

---

## Runtime Metrics and Observations

Use runtime readers when you want to inspect the current agent process without waiting for external exports.

```typescript
const metrics = agent.getMetricsSnapshot();
const reader = agent.getObservationReader();

const latest = reader.listObservations({
  kinds: ['generation', 'tool'],
  limit: 20,
});

// Stream observations for the current run as they are emitted.
for await (const envelope of agent.subscribeObservations({ runId: metrics.currentRunId })) {
  console.log(envelope.observation.kind, envelope.observation.name);
}
```

Typical runtime uses:

- show "live now" generation/tool activity in an admin panel
- inspect approval waits, tool errors, and compression events
- derive counters without polling raw event buses
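
As a rough illustration of the "derive counters" use, the sketch below tallies observations by `kind`. The `CountableEnvelope` type is a hypothetical stand-in that only mirrors the `envelope.observation.kind` and `envelope.observation.name` fields used above; whether `listObservations()` returns envelopes of exactly this shape is an assumption, so adapt the type to the SDK's real one.

```typescript
// Hypothetical minimal shape; the real SDK envelope type may carry more fields.
interface CountableEnvelope {
  observation: {
    kind: string;
    name: string;
  };
}

// Tally how many observations of each kind were seen.
function countByKind(envelopes: CountableEnvelope[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const { observation } of envelopes) {
    counts[observation.kind] = (counts[observation.kind] ?? 0) + 1;
  }
  return counts;
}
```

A caller could feed it the result of a runtime read, for example `countByKind(reader.listObservations({ limit: 100 }))`, assuming the returned items match the stand-in type.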

---

## Persisted Observations

Use persisted readers when you need history, audit views, or process-restart durability.

```typescript
import {
  Agent,
  JSONStoreObservationBackend,
  createStoreBackedObservationReader,
} from '@shareai-lab/kode-sdk';

const observationBackend = new JSONStoreObservationBackend('./.kode-observability');

const agent = await Agent.create({
  templateId: 'assistant',
  observability: {
    persistence: {
      backend: observationBackend,
    },
  },
}, deps);

const persistedReader = createStoreBackedObservationReader(observationBackend);
const history = await persistedReader.listObservations({
  agentIds: [agent.agentId],
  kinds: ['agent_run', 'generation', 'tool'],
  limit: 50,
});
```

Use persisted storage for:

- audit timelines
- run replay pages
- offline analytics jobs
- debugging after process restart
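
For the audit-timeline use, one possible shape is to group persisted observations per run and order each group by time. This is a sketch only: the `TimelineEntry` type, and in particular its `runId` and `timestamp` fields, are assumptions made for illustration and are not confirmed parts of the persisted observation schema.

```typescript
// Hypothetical record shape; `runId` and `timestamp` are assumed field names.
interface TimelineEntry {
  runId: string;
  kind: string;
  name: string;
  timestamp: number; // assumed to be epoch milliseconds
}

// Group entries by run, then sort each run's entries oldest-first.
function buildRunTimelines(entries: TimelineEntry[]): Map<string, TimelineEntry[]> {
  const byRun = new Map<string, TimelineEntry[]>();
  for (const entry of entries) {
    const bucket = byRun.get(entry.runId) ?? [];
    bucket.push(entry);
    byRun.set(entry.runId, bucket);
  }
  byRun.forEach((bucket) => bucket.sort((a, b) => a.timestamp - b.timestamp));
  return byRun;
}
```

Doing the grouping in application code keeps the persisted reader query simple (agent, kinds, limit) and leaves presentation decisions to the page that renders the timeline.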

---

## OTEL Bridge

If your platform already standardizes on OpenTelemetry, enable the bridge and ship translated spans to your collector.

```typescript
const agent = await Agent.create({
  templateId: 'assistant',
  observability: {
    otel: {
      enabled: true,
      serviceName: 'kode-agent',
      exporter: {
        protocol: 'http/json',
        endpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT!,
      },
    },
  },
}, deps);
```

Keep KODE's native observation model as your source of truth. OTEL is best treated as an interoperability/export path.
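
The configuration above reads the endpoint with a non-null assertion, which passes `undefined` through silently if the variable is unset. A minimal sketch of a guard follows. `buildOtelConfig` and `OtelBridgeConfig` are hypothetical helper names, not SDK exports, and the config fields simply mirror the ones shown above.

```typescript
// Hypothetical local type mirroring the fields used in the example above.
interface OtelBridgeConfig {
  enabled: boolean;
  serviceName: string;
  exporter: { protocol: 'http/json'; endpoint: string };
}

// Return a bridge config only when an endpoint is actually configured.
function buildOtelConfig(
  env: Record<string, string | undefined>,
  serviceName = 'kode-agent',
): OtelBridgeConfig | undefined {
  const endpoint = env['OTEL_EXPORTER_OTLP_ENDPOINT']?.trim();
  if (!endpoint) return undefined;
  return {
    enabled: true,
    serviceName,
    exporter: { protocol: 'http/json', endpoint },
  };
}
```

In an application this might be wired as `otel: buildOtelConfig(process.env)`, assuming the SDK treats an omitted `otel` block as "bridge disabled"; if it does not, branch on the result before calling `Agent.create`.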

---

## Data Safety and Capture Boundaries

KODE supports configurable capture levels through `observability.capture`:

- `off`
- `summary`
- `full`
- `redacted`

Prefer `summary` or `redacted` for production unless you have a clear compliance reason to store more detail.

Also note:

- provider-specific raw payloads are not part of the public observation schema
- debug-only extensions may appear under `metadata.__debug`
- `metadata.__debug` should be treated as internal/private and filtered before external exposure

This keeps the public observation model safer and more stable.
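
Filtering `metadata.__debug` before external exposure could look like the sketch below. `DebuggableObservation` is a hypothetical stand-in type that assumes `metadata` is a plain key-value object; the helper returns a copy and leaves the input untouched.

```typescript
// Hypothetical minimal shape; only `metadata.__debug` is taken from the guide.
interface DebuggableObservation {
  kind: string;
  name: string;
  metadata?: Record<string, unknown>;
}

// Return a copy of the observation with `metadata.__debug` removed.
function stripDebug<T extends DebuggableObservation>(observation: T): T {
  if (!observation.metadata || !('__debug' in observation.metadata)) {
    return observation;
  }
  const publicMetadata: Record<string, unknown> = { ...observation.metadata };
  delete publicMetadata['__debug'];
  return { ...observation, metadata: publicMetadata };
}
```

Applying this at the boundary where data leaves your application (for example with `observations.map(stripDebug)`) keeps internal readers free to use the debug data while external consumers never see it.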

---

## Exposing Observability over HTTP

If you need HTTP endpoints, build them in your app on top of the SDK readers/backends.

Reference example:

- `examples/08-observability-http.ts`
- run with `npm run example:observability-http`

That example demonstrates:

- a normal app-owned HTTP server
- `POST /agents/demo/send` to drive an agent run
- `GET /api/observability/.../metrics` for runtime metrics
- `GET /api/observability/.../observations/runtime` for live observation reads
- `GET /api/observability/.../observations/persisted` for persisted history

This boundary is intentional: the SDK provides observability primitives, while the app owns transport, auth, and presentation.
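
To make the boundary concrete, here is a transport-agnostic sketch of app-owned route matching. The path layout (`/observability/agents/:agentId/...`), the `limit` query parameter, and all names are invented for illustration; they are not the routes used by `examples/08-observability-http.ts`.

```typescript
// Hypothetical route descriptors owned by the application, not the SDK.
type ObservabilityRoute =
  | { kind: 'metrics'; agentId: string }
  | { kind: 'runtime'; agentId: string; limit: number }
  | { kind: 'persisted'; agentId: string; limit: number };

// Parse a `limit` query value, falling back to a default and capping at `max`.
function parseLimit(raw: string | undefined, fallback = 20, max = 200): number {
  const n = Number.parseInt(raw ?? '', 10);
  if (!Number.isFinite(n) || n <= 0) return fallback;
  return Math.min(n, max);
}

// Map a method and URL onto a route descriptor, or undefined if nothing matches.
function matchObservabilityRoute(method: string, rawUrl: string): ObservabilityRoute | undefined {
  if (method !== 'GET') return undefined;
  const parts = rawUrl.split('?');
  const path = parts[0] ?? '';
  const query = parts[1] ?? '';
  const match = /^\/observability\/agents\/([A-Za-z0-9_-]+)\/(metrics|runtime|persisted)$/.exec(path);
  if (!match) return undefined;
  const agentId = match[1];
  const kind = match[2];
  if (!agentId || !kind) return undefined;
  if (kind === 'metrics') return { kind: 'metrics', agentId };
  const limit = parseLimit(/(?:^|&)limit=(\d+)(?:&|$)/.exec(query)?.[1]);
  if (kind === 'runtime' || kind === 'persisted') return { kind, agentId, limit };
  return undefined;
}
```

A handler in any HTTP framework can call `matchObservabilityRoute(req.method, req.url)`, run its own auth and tenant checks, and only then call `agent.getMetricsSnapshot()`, the runtime reader, or the persisted reader depending on `kind`.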

---

## Recommended Rollout

1. Start with runtime metrics and runtime observation readers.
2. Add persisted observation storage for auditability.
3. Add OTEL export only if your platform needs centralized telemetry.
4. Add app-layer HTTP or UI only after the data model and filtering policy are clear.

This order keeps the SDK integration stable and avoids prematurely coupling KODE to one delivery surface.
29 changes: 28 additions & 1 deletion docs/zh-CN/examples/playbooks.md
Expand Up @@ -153,7 +153,33 @@ const stats = await store.aggregateStats(agent.agentId);

---

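## 6. Combined: Approval + Collaboration + Scheduling
## 6. Observability Reads and App-Layer HTTP Wrapping

- **Goal**: Read runtime/persisted observations from the SDK and decide, within your own application boundary, whether to expose them over HTTP.
- **Example**: `examples/08-observability-http.ts`
- **Run**: `npm run example:observability-http`
- **Key Steps**:
  1. Read the current metrics snapshot with `agent.getMetricsSnapshot()`.
  2. Read runtime observations with `agent.getObservationReader()` or `agent.subscribeObservations()`.
  3. Configure a backend for `observability.persistence.backend` and query history with `createStoreBackedObservationReader(...)`.
  4. Define your own routes, auth, tenant isolation, and response trimming in application code.
- **Considerations**:
  - The runtime reader suits "what is happening now"; the persisted reader suits audit and history views.
  - Treat `metadata.__debug` strictly as internal debug data; do not expose it externally as-is.
  - Keep HTTP, auth, rate limiting, and dashboards outside the SDK.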

```typescript
const metrics = agent.getMetricsSnapshot();
const runtimeReader = agent.getObservationReader();
const persistedReader = createStoreBackedObservationReader(observationBackend);

const runtime = runtimeReader.listObservations({ limit: 20 });
const persisted = await persistedReader.listObservations({ agentIds: [agent.agentId], limit: 50 });
```

---

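## 7. Combined: Approval + Collaboration + Scheduling

- **Scenario**: A code review bot in which a Planner splits tasks and assigns them to different Specialists, tool operations require approval, and scheduled reminders enforce the SLA.
- **Implementation**: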
Expand Down Expand Up @@ -184,5 +210,6 @@ const stats = await store.aggregateStats(agent.agentId);

- [Getting Started](../getting-started/quickstart.md)
- [Events Guide](../guides/events.md)
- [Observability Guide](../guides/observability.md)
- [Multi-Agent Systems](../advanced/multi-agent.md)
- [Database Guide](../guides/database.md)