2 changes: 1 addition & 1 deletion README.md
@@ -50,7 +50,7 @@ Kontext balances security and utility for AI agents: low-risk actions keep movin

- **Audit trails:** Record who instructed which agent to do what, what the agent accessed, which tools it called, what policy decisions were made, and what happened next. Build a chain of custody for security review, incident investigation, and compliance evidence.
- **Deterministic policy:** Apply `allow`, `ask`, and `deny` rules to agent actions at runtime, before they execute. Use hard policies for known boundaries such as destructive commands, production resources, sensitive files, data exports, and credential access.
-- **Probabilistic risk detection:** Detect when an agent is entering an unsafe state, drifting from user intent, or executing actions the user likely did not mean to authorize. Escalate ambiguous behavior without blocking normal agent productivity.
+- **Probabilistic risk detection:** Route actions that deterministic policy allows through a local judge for an additional allow/deny decision without sending tool context to hosted services.
- **Credential injection:** Inject scoped OAuth credentials at runtime using RFC 8693-compliant OAuth 2.0 Token Exchange, so agents can access approved tools without users pasting secrets into chat, config files, or project environments. Credentials can be short-lived, least-privilege, and bound to the current user, session, or workflow.

The local decision path is:
40 changes: 0 additions & 40 deletions docs/guard-model-updates.md

This file was deleted.

17 changes: 6 additions & 11 deletions docs/guard.md
@@ -2,7 +2,7 @@

Guard is the local safety mode inside `kontext`.

-It lets a developer run Claude Code normally while Kontext watches tool calls locally, redacts captured data, scores risk, stores events in local SQLite, and shows a local dashboard with `would allow`, `would ask`, and `would deny` decisions.
+It lets a developer run Claude Code normally while Kontext watches tool calls locally, redacts captured data, stores events in local SQLite, and shows a local dashboard with `would allow`, `would ask`, and `would deny` decisions.

## User path

@@ -46,8 +46,8 @@ Claude Code
-> kontext hook --agent claude --mode observe
-> local runtime Unix socket
-> RuntimeCore
--> deterministic risk rules
--> optional local LLM judge or Markov-chain risk model
+-> deterministic policy
+-> local LLM judge when deterministic policy allows
-> local SQLite
-> local dashboard + notifications
```
@@ -56,10 +56,8 @@ Claude Code

Guard uses two layers:

-1. Deterministic rules for obvious risk, such as credential access, direct provider API calls with credential material, production mutations, and destructive persistent-resource operations.
-2. A local Markov-chain risk model for sequence context in coding-agent workflows.
-
-The shipped model is a JSON artifact under `models/guard/`. Lab is the private pipeline that ingests datasets and local traces, evaluates candidate models, and produces improved JSON files. Accepted model files are committed back to this repo by PR.
+1. Deterministic policy for obvious risk, such as credential access, direct provider API calls with credential material, production mutations, and destructive persistent-resource operations.
+2. A local LLM judge for cases deterministic policy allows.

## Local judge

@@ -121,16 +119,13 @@ Public in `kontext-cli`:
- `kontext guard ...` commands
- Claude Code local hook adapter
- local daemon, SQLite store, dashboard, notifications
-- deterministic risk rules
-- shipped baseline/candidate model JSON files
+- deterministic policy and local LLM judge wiring

Private in Lab:

- dataset ingestion
- OpenTelemetry/Claude trace import
- weak labeling
-- model training/evaluation
-- model promotion gates
- unpublished datasets and experiments

## Work tracking
141 changes: 76 additions & 65 deletions internal/guard/app/server/policy.go
@@ -19,14 +19,12 @@ type PolicyConfigProvider interface {
}

type RiskPolicyProvider struct {
-scorer risk.Scorer
judge judge.Judge
policyEngine guardpolicy.Engine
policyConfig PolicyConfigProvider
}

type RiskPolicyProviderOptions struct {
-Scorer risk.Scorer
Judge judge.Judge
PolicyEngine guardpolicy.Engine
PolicyConfig guardpolicy.Config
@@ -41,90 +39,106 @@ func (p staticPolicyConfigProvider) ActivePolicyConfig(context.Context) (guardpo
return p.config, nil
}

-func NewRiskPolicyProvider(scorer risk.Scorer) RiskPolicyProvider {
-return NewRiskPolicyProviderWithJudge(scorer, nil)
+func NewRiskPolicyProvider() RiskPolicyProvider {
+return NewRiskPolicyProviderWithJudge(nil)
}

-func NewRiskPolicyProviderWithJudge(scorer risk.Scorer, localJudge judge.Judge) RiskPolicyProvider {
+func NewRiskPolicyProviderWithJudge(localJudge judge.Judge) RiskPolicyProvider {
return NewRiskPolicyProviderWithOptions(RiskPolicyProviderOptions{
-Scorer: scorer,
-Judge: localJudge,
+Judge: localJudge,
})
}

func NewRiskPolicyProviderWithOptions(opts RiskPolicyProviderOptions) RiskPolicyProvider {
-scorer := opts.Scorer
-if scorer == nil {
-scorer = risk.NoopScorer{}
-}
configProvider := opts.PolicyConfigProvider
if configProvider == nil {
configProvider = staticPolicyConfigProvider{config: opts.PolicyConfig}
}
return RiskPolicyProvider{
-scorer: scorer,
judge: opts.Judge,
policyEngine: opts.PolicyEngine,
policyConfig: configProvider,
}
}

func (p RiskPolicyProvider) DecideHook(ctx context.Context, event risk.HookEvent) (risk.RiskDecision, error) {
-if p.judge != nil && event.HookEventName == "PreToolUse" {
-return p.decideWithJudge(ctx, event)
+if event.HookEventName != "PreToolUse" {
+return p.asyncTelemetryDecision(event), nil
}
-return risk.DecideRisk(event, p.scorer)
-}
-
-func (p RiskPolicyProvider) decideWithJudge(ctx context.Context, event risk.HookEvent) (risk.RiskDecision, error) {
riskEvent := risk.NormalizeHookEvent(event)
-score, err := p.scorer.Score(riskEvent)
-if err != nil {
-return risk.RiskDecision{}, err
-}
-riskEvent.RiskScore = score.RiskScore
-riskEvent.ModelVersion = score.ModelVersion
-
policyResult := p.policyEngine.Evaluate(riskEvent, p.activePolicyConfig(ctx))
applyPolicyMetadata(&riskEvent, policyResult)
if policyResult.Decision == guardpolicy.DecisionDeny {
-riskEvent.Decision = risk.DecisionDeny
-riskEvent.ReasonCode = policyResult.ReasonCode
-riskEvent.GuardID = policyResult.RuleID
-riskEvent.DecisionStage = risk.DecisionStageDeterministicDeny
-return risk.RiskDecision{
-Decision: risk.DecisionDeny,
-Reason: policyResult.Reason,
-ReasonCode: policyResult.ReasonCode,
-RiskScore: score.RiskScore,
-Threshold: score.Threshold,
-ModelVersion: score.ModelVersion,
-GuardID: policyResult.RuleID,
-RiskEvent: riskEvent,
-}, nil
+return deterministicDenyDecision(riskEvent, policyResult), nil
}
+if p.judge == nil {
+return deterministicAllowDecision(riskEvent, policyResult), nil
+}

result, err := p.judge.Decide(ctx, judgeInputFromRiskEvent(event, riskEvent, policyResult))
if err != nil {
-failureKind := judge.FailureKind(err)
-metadata := judgeMetadata(p.judge)
-riskEvent.Decision = risk.DecisionAllow
-riskEvent.ReasonCode = "judge_unavailable_allow"
-riskEvent.DecisionStage = risk.DecisionStageJudgeFailOpen
-riskEvent.JudgeRuntime = metadata.Runtime
-riskEvent.JudgeModel = metadata.Model
-riskEvent.JudgeFailureKind = failureKind
-return risk.RiskDecision{
-Decision: risk.DecisionAllow,
-Reason: "local judge unavailable; allowing by fail-open policy",
-ReasonCode: "judge_unavailable_allow",
-RiskScore: score.RiskScore,
-Threshold: score.Threshold,
-ModelVersion: score.ModelVersion,
-RiskEvent: riskEvent,
-}, nil
+return judgeFailOpenDecision(riskEvent, p.judge, err), nil
}
+return judgeDecision(riskEvent, result), nil
+}

+func (p RiskPolicyProvider) asyncTelemetryDecision(event risk.HookEvent) risk.RiskDecision {
+riskEvent := risk.NormalizeHookEvent(event)
+riskEvent.Decision = risk.DecisionAllow
+riskEvent.ReasonCode = "async_telemetry"
+riskEvent.DecisionStage = "async_telemetry"
+return risk.RiskDecision{
+Decision: risk.DecisionAllow,
+Reason: "async telemetry event recorded",
+ReasonCode: "async_telemetry",
+RiskEvent: riskEvent,
+}
+}
+
+func deterministicDenyDecision(riskEvent risk.RiskEvent, policyResult guardpolicy.Result) risk.RiskDecision {
+riskEvent.Decision = risk.DecisionDeny
+riskEvent.ReasonCode = policyResult.ReasonCode
+riskEvent.GuardID = policyResult.RuleID
+riskEvent.DecisionStage = risk.DecisionStageDeterministicDeny
+return risk.RiskDecision{
+Decision: risk.DecisionDeny,
+Reason: policyResult.Reason,
+ReasonCode: policyResult.ReasonCode,
+GuardID: policyResult.RuleID,
+RiskEvent: riskEvent,
+}
+}
+
+func deterministicAllowDecision(riskEvent risk.RiskEvent, policyResult guardpolicy.Result) risk.RiskDecision {
+riskEvent.Decision = risk.DecisionAllow
+riskEvent.ReasonCode = policyResult.ReasonCode
+riskEvent.DecisionStage = "deterministic_allow"
+return risk.RiskDecision{
+Decision: risk.DecisionAllow,
+Reason: policyResult.Reason,
+ReasonCode: policyResult.ReasonCode,
+RiskEvent: riskEvent,
+}
+}
+
+func judgeFailOpenDecision(riskEvent risk.RiskEvent, localJudge judge.Judge, err error) risk.RiskDecision {
+failureKind := judge.FailureKind(err)
+metadata := judgeMetadata(localJudge)
+riskEvent.Decision = risk.DecisionAllow
+riskEvent.ReasonCode = "judge_unavailable_allow"
+riskEvent.DecisionStage = risk.DecisionStageJudgeFailOpen
+riskEvent.JudgeRuntime = metadata.Runtime
+riskEvent.JudgeModel = metadata.Model
+riskEvent.JudgeFailureKind = failureKind
+return risk.RiskDecision{
+Decision: risk.DecisionAllow,
+Reason: "local judge unavailable; allowing by fail-open policy",
+ReasonCode: "judge_unavailable_allow",
+RiskEvent: riskEvent,
+}
+}
+
+func judgeDecision(riskEvent risk.RiskEvent, result judge.Result) risk.RiskDecision {
decision := risk.DecisionAllow
reasonCode := risk.DecisionStageJudgeAllow
if result.Output.Decision == judge.DecisionDeny {
@@ -143,15 +157,12 @@ func (p RiskPolicyProvider) decideWithJudge(ctx context.Context, event risk.Hook
riskEvent.JudgeCategories = result.Output.Categories

return risk.RiskDecision{
-Decision: decision,
-Reason: result.Output.Reason,
-ReasonCode: reasonCode,
-RiskScore: score.RiskScore,
-Threshold: score.Threshold,
-ModelVersion: score.ModelVersion,
-GuardID: "local_llm_judge",
-RiskEvent: riskEvent,
-}, nil
+Decision: decision,
+Reason: result.Output.Reason,
+ReasonCode: reasonCode,
+GuardID: "local_llm_judge",
+RiskEvent: riskEvent,
+}
}

func (p RiskPolicyProvider) activePolicyConfig(ctx context.Context) guardpolicy.Config {
10 changes: 4 additions & 6 deletions internal/guard/app/server/server.go
@@ -44,7 +44,6 @@ type ProcessResponse struct {
}

type Options struct {
-Scorer risk.Scorer
Judge judge.Judge
PolicyConfig policy.Config
PolicyConfigProvider PolicyConfigProvider
@@ -67,8 +66,8 @@ type ActivatePolicyProfileRequest struct {
Profile policy.Profile `json:"profile"`
}

-func NewServer(store *sqlite.Store, scorer risk.Scorer) (*Server, error) {
-return NewServerWithOptions(store, Options{Scorer: scorer})
+func NewServer(store *sqlite.Store) (*Server, error) {
+return NewServerWithOptions(store, Options{})
}

func NewServerWithOptions(store *sqlite.Store, opts Options) (*Server, error) {
@@ -85,7 +84,6 @@ func NewServerWithOptions(store *sqlite.Store, opts Options) (*Server, error) {
}
}
return NewServerWithPolicyConfig(store, NewRiskPolicyProviderWithOptions(RiskPolicyProviderOptions{
-Scorer: opts.Scorer,
Judge: opts.Judge,
PolicyConfigProvider: configProvider,
}), policyStore)
@@ -391,8 +389,8 @@ func openPolicyStoreForSQLite(store *sqlite.Store) (*policyconfig.Store, error)
return policyconfig.Open(filepath.Dir(store.Path()))
}

-func OpenDefaultServer(dbPath string, scorer risk.Scorer) (*Server, func() error, error) {
-return OpenDefaultServerWithOptions(dbPath, Options{Scorer: scorer})
+func OpenDefaultServer(dbPath string) (*Server, func() error, error) {
+return OpenDefaultServerWithOptions(dbPath, Options{})
}

func OpenDefaultServerWithOptions(dbPath string, opts Options) (*Server, func() error, error) {