
ykstorm/tripwire

Tripwire

Mid-stream LLM safety. Catch the lie before the user finishes reading it.


A regex/policy guard that watches an LLM token stream and aborts the response the moment a rule trips. A post-hoc audit mode is also available for batch reviews.


The problem

LLM streams are all-or-nothing: once you start yielding tokens, you're committed. A model that invents a non-existent project name, commits to a fake discount, or leaks a placeholder like {{PRICE}} has already delivered the lie by the time a post-hoc check runs. Tripwire lets you stop it mid-sentence.


How it works

LLM stream tokens
    │
    ▼
StreamingGuard  ──▶  Abort patterns (hard triggers)
 (token-by-token)    └── throws immediately on match
    │
    ├──▶  Observe patterns (soft triggers)
          └── logs violation, continues streaming

StreamingGuard — wraps an async token generator. Calls onChunk(token) on each token, checks accumulated text against pattern list, throws immediately on hard-abort match.

Post-hoc check — checkResponse(text) runs all patterns against a completed response. Returns violations without throwing.
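To make the flow above concrete, here is a minimal sketch of a guard that accumulates tokens and checks them on every chunk. It is not Tripwire's source: SketchGuard, Rule, and the two regexes are hypothetical stand-ins, and the real class may buffer or report differently.

```typescript
// Hypothetical shapes for illustration; Tripwire's real types may differ.
type Mode = "abort" | "observe";

interface Rule {
  label: string;
  mode: Mode;
  regex: RegExp; // no "g" flag, so test() stays stateless
}

class SketchGuard {
  private readonly rules: Rule[];
  private buffer = "";
  private reported = new Set<string>();
  readonly violations: string[] = [];

  constructor(rules: Rule[]) {
    this.rules = rules;
  }

  // Append the token, then test every rule against the accumulated text so
  // that a match split across token boundaries is still caught.
  onChunk(token: string): void {
    this.buffer += token;
    for (const rule of this.rules) {
      if (!rule.regex.test(this.buffer)) continue;
      if (rule.mode === "abort") {
        throw new Error(`[TRIPWIRE] ${rule.label}`);
      }
      // Soft rule: record it once per stream and keep going.
      if (!this.reported.has(rule.label)) {
        this.reported.add(rule.label);
        this.violations.push(rule.label);
      }
    }
  }

  // Clear all per-stream state before reusing the guard.
  reset(): void {
    this.buffer = "";
    this.reported.clear();
    this.violations.length = 0;
  }
}

const rules: Rule[] = [
  { label: "PLACEHOLDER", mode: "abort", regex: /\{\{[A-Z_]+\}\}/ },
  { label: "MARKDOWN", mode: "observe", regex: /```/ },
];
```

Checking the accumulated buffer rather than the individual token is what lets a placeholder split as "{{PR" + "ICE}}" still trip the rule.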


Features at a glance

Hard-abort patterns (throw on match):

  • {{PLACEHOLDER}} vars — unfilled template variables
  • Business entity leaks — non-existent project/builder names
  • Contact info — emails, phone numbers in response
  • Price manipulation — fabricated discounts or commission claims

Soft-observe patterns (log only):

  • Markdown artifacts — triple-backtick blocks in non-code context

Installation

npm install @ykstormsorg/tripwire

Or start from source:

git clone https://github.com/ykstorm/tripwire.git
cd tripwire
npm install

Usage

Streaming guard (real-time)

import { createStreamingGuard } from '@ykstormsorg/tripwire'

const guard = createStreamingGuard({
  onAbort: (violation, pattern) => {
    throw new Error(`[TRIPWIRE] ${violation}`)
  },
  onViolate: (violation, pattern) => {
    console.warn(`[observe] ${violation}`)
  }
})

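// yield is only valid inside a generator, so wrap the loop in one
async function* guardedStream(llmStream) {
  for await (const token of llmStream) {
    guard.onChunk(token) // throws mid-stream on abort pattern
    yield token
  }
}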
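Because the guard throws mid-stream, whoever consumes the tokens has to handle a partially delivered response. The sketch below shows one way to do that with stand-ins only: fakeStream, guarded, collect, and makeCheck are hypothetical helpers, not Tripwire exports, and makeCheck plays the role of guard.onChunk.

```typescript
// Stand-in for an LLM token stream; a real one would come from a model client.
async function* fakeStream(tokens: string[]): AsyncGenerator<string> {
  for (const t of tokens) yield t;
}

// Wrap a token stream so each token is checked before it is yielded.
// `check` is expected to throw when an abort rule matches.
async function* guarded(
  stream: AsyncIterable<string>,
  check: (token: string) => void,
): AsyncGenerator<string> {
  for await (const token of stream) {
    check(token);
    yield token;
  }
}

// Drain a stream and report what was delivered before any abort.
async function collect(
  stream: AsyncIterable<string>,
): Promise<{ delivered: string; aborted: string | null }> {
  let delivered = "";
  try {
    for await (const token of stream) delivered += token;
    return { delivered, aborted: null };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { delivered, aborted: reason };
  }
}

// Minimal accumulating check that aborts on an unfilled {{PLACEHOLDER}}.
function makeCheck(): (token: string) => void {
  let seen = "";
  return (token) => {
    seen += token;
    if (/\{\{[A-Z_]+\}\}/.test(seen)) {
      throw new Error("[TRIPWIRE] PLACEHOLDER");
    }
  };
}
```

Note that tokens yielded before the match completes have already reached the consumer, so a UI would typically need to retract or replace the partial text when the abort arrives.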

Post-hoc audit (batch)

import { checkResponse } from '@ykstormsorg/tripwire'

const result = checkResponse(llmResponseText)
if (result.violations.length > 0) {
  console.log('Violations:', result.violations)
}
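For contrast with the streaming guard, a post-hoc audit can run every rule and collect all matches instead of stopping at the first. The following is a rough sketch of that idea, not the library's implementation; auditResponse, AuditRule, and both regexes are hypothetical.

```typescript
interface AuditRule {
  label: string;
  regex: RegExp;
}

interface AuditResult {
  passed: boolean;
  violations: string[];
}

// Hypothetical rules loosely approximating two of the built-in categories.
const auditRules: AuditRule[] = [
  { label: "PLACEHOLDER", regex: /\{\{[A-Z_]+\}\}/ },
  { label: "CONTACT_LEAK", regex: /[\w.+-]+@[\w-]+\.[\w.-]+/ },
];

// Run every rule over the finished text and collect labels instead of throwing.
function auditResponse(text: string, rules: AuditRule[] = auditRules): AuditResult {
  const violations = rules
    .filter((rule) => rule.regex.test(text))
    .map((rule) => rule.label);
  return { passed: violations.length === 0, violations };
}
```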

Post-hoc audit with context

const result = checkResponse(aiText, {
  knownProjectNames: ['Arialife Heights', 'San Villa'],
  classified: { intent: 'comparison_query', persona: 'premium' }
})
if (!result.passed) {
  result.violations.forEach(v => console.error('[VIOLATION]', v))
}

Run as a sidecar proxy

Tripwire ships an OpenAI-compatible proxy. It accepts requests in OpenAI's exact /v1/chat/completions shape, forwards them upstream using the caller's own Bearer token (no key management on the proxy), streams the response back as SSE, and aborts mid-stream the instant a hard rule fires.

# from a clone — build then run (defaults to :8080, override with PORT)
npm install && npm run build
npm run proxy            # or: node dist/daemon.js  /  npx tripwire-proxy

# health
curl http://localhost:8080/healthz
# { "ok": true, "version": "1.0.1" }

# stream a completion through the guard
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","stream":true,"messages":[{"role":"user","content":"Hello"}]}'

On a rule trip the proxy emits a final SSE event and closes the connection:

data: {"error":"rule_trip","violation":"CONTACT_LEAK: pattern matched in stream","rule":"CONTACT_LEAK","tokens_streamed":7}
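A client reading the proxy's SSE output needs to tell a rule trip apart from ordinary chunks and from the closing [DONE] marker. Here is a small sketch of that classification. The field names are taken from the example event above, but treat the exact payload shape as an assumption; classifySseLine and its types are hypothetical.

```typescript
// Shape inferred from the example event; not a published schema.
interface RuleTrip {
  error: "rule_trip";
  violation: string;
  rule: string;
  tokens_streamed: number;
}

type SseEvent =
  | { kind: "done" }
  | { kind: "rule_trip"; trip: RuleTrip }
  | { kind: "chunk"; payload: unknown }
  | { kind: "ignored" };

// Classify a single line of an SSE response body.
function classifySseLine(line: string): SseEvent {
  if (!line.startsWith("data:")) return { kind: "ignored" };
  const data = line.slice("data:".length).trim();
  if (data === "[DONE]") return { kind: "done" };

  let payload: unknown;
  try {
    payload = JSON.parse(data);
  } catch {
    return { kind: "ignored" }; // not JSON, nothing to act on
  }

  if (
    typeof payload === "object" &&
    payload !== null &&
    (payload as { error?: unknown }).error === "rule_trip"
  ) {
    return { kind: "rule_trip", trip: payload as RuleTrip };
  }
  return { kind: "chunk", payload };
}
```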

Behavior:

  • 401 on missing/invalid Authorization header
  • 502 on upstream failure (bad key, network)
  • benign prompts stream through unchanged and end with data: [DONE]
  • soft-observe rules log a structured warning but never block the stream
  • extra abort/observe rules via TRIPWIRE_CUSTOM_PATTERNS (JSON array of { "source", "flags", "label", "mode" })
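As an illustration of the TRIPWIRE_CUSTOM_PATTERNS format, the sketch below parses a JSON array of { source, flags, label, mode } entries into compiled regexes. It is an assumption-laden sketch, not the proxy's loader: parseCustomPatterns is a hypothetical name, the mode values "abort" and "observe" are assumed from the rule types described in this README, and flags is treated as optional.

```typescript
type PatternMode = "abort" | "observe"; // assumed values for "mode"

interface CustomPattern {
  label: string;
  mode: PatternMode;
  regex: RegExp;
}

function isPatternMode(value: unknown): value is PatternMode {
  return value === "abort" || value === "observe";
}

// Parse the raw env value into compiled patterns.
// Throws on malformed input so misconfiguration fails at startup.
function parseCustomPatterns(raw: string | undefined): CustomPattern[] {
  if (!raw) return [];
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("TRIPWIRE_CUSTOM_PATTERNS must be a JSON array");
  }
  return parsed.map((entry: unknown, i: number): CustomPattern => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`pattern ${i}: expected an object`);
    }
    const { source, flags, label, mode } = entry as Record<string, unknown>;
    if (typeof source !== "string" || typeof label !== "string") {
      throw new Error(`pattern ${i}: "source" and "label" must be strings`);
    }
    if (!isPatternMode(mode)) {
      throw new Error(`pattern ${i}: "mode" must be "abort" or "observe"`);
    }
    const flagStr = typeof flags === "string" ? flags : "";
    // new RegExp throws a SyntaxError on an invalid source or flags.
    return { label, mode, regex: new RegExp(source, flagStr) };
  });
}
```

In a Node process this could be called at startup as parseCustomPatterns(process.env.TRIPWIRE_CUSTOM_PATTERNS).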

Run it as a container or Kubernetes sidecar — see DEPLOY.md.


API reference

createStreamingGuard(options)

Wraps a token stream. Returns a StreamingGuard instance.

Options:

  • onAbort(violation, pattern) — called when a hard-abort pattern fires; throw to stop streaming
  • onViolate(violation, pattern) — called when a soft-observe pattern fires; non-fatal
  • patterns — optional list of custom pattern objects (defaults to all built-ins)

StreamingGuard instance:

  • onChunk(chunk) — call once per token
  • reset() — clear accumulated buffer
  • violations — array of soft-observe violations from the current stream

checkResponse(text, options?)

Runs all patterns against a completed response.

Returns: { passed: boolean, violations: string[] }

Options:

  • knownProjectNames — whitelist of real project names
  • knownBuilderNames — whitelist of real builder names
  • unverifiedProjectNames — names detected but not yet confirmed
  • buyerMessage — original user query (used for persona-aware word caps)
  • classified — { intent, persona } for intent-specific checks

Status transition validation

import {
  validateBuilderTransition,
  validateProjectTransition,
  nextBuilderStatus,
  nextProjectStatus,
  reasonRequired
} from '@ykstormsorg/tripwire'

// Validate a Builder status transition
const err = validateBuilderTransition('REMOVED', 'BUILDER_HOLD')
if (err) {
  // show err to operator, don't apply action
}

// Get next status for an action
const nextStatus = nextBuilderStatus('BUILDER_SUSPEND')

// Check if a reason is required before applying an action
if (reasonRequired('BUILDER_REMOVE')) {
  // prompt operator for reason before proceeding
}
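The three helpers above suggest a table-driven state machine. The sketch below shows how such a validator could be structured. Only the action names BUILDER_HOLD, BUILDER_SUSPEND, and BUILDER_REMOVE and the REMOVED status appear in this README; the other status names, the allowed-transition table, and which actions need a reason are all invented for illustration.

```typescript
// Hypothetical status set; only REMOVED is taken from the example above.
type Status = "ACTIVE" | "ON_HOLD" | "SUSPENDED" | "REMOVED";
type Action = "BUILDER_HOLD" | "BUILDER_SUSPEND" | "BUILDER_REMOVE";

// Status that each action leads to (assumed mapping).
const NEXT: Record<Action, Status> = {
  BUILDER_HOLD: "ON_HOLD",
  BUILDER_SUSPEND: "SUSPENDED",
  BUILDER_REMOVE: "REMOVED",
};

// Actions that may be applied from each status. REMOVED is terminal here.
const ALLOWED: Record<Status, Action[]> = {
  ACTIVE: ["BUILDER_HOLD", "BUILDER_SUSPEND", "BUILDER_REMOVE"],
  ON_HOLD: ["BUILDER_SUSPEND", "BUILDER_REMOVE"],
  SUSPENDED: ["BUILDER_REMOVE"],
  REMOVED: [],
};

const NEEDS_REASON = new Set<Action>(["BUILDER_SUSPEND", "BUILDER_REMOVE"]);

function nextStatus(action: Action): Status {
  return NEXT[action];
}

// Returns an error message, or null when the transition is allowed.
function validateTransition(current: Status, action: Action): string | null {
  return ALLOWED[current].includes(action)
    ? null
    : `Cannot apply ${action} to a builder in status ${current}`;
}

function needsReason(action: Action): boolean {
  return NEEDS_REASON.has(action);
}
```

Keeping the rules in plain lookup tables makes it easy to review every allowed transition in one place and to test them exhaustively.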

Exported patterns

Pattern                        Type     Description
CONTACT_LEAK_PATTERN           abort    Phone numbers and email addresses
BUSINESS_LEAK_PATTERN          abort    Commission rate, partner status mentions
MARKDOWN_PATTERN               observe  Bold **, headers #, bullets -
PLACEHOLDER_NAME_PATTERN       abort    [PROJECT_A], [BUILDER_X] tokens
PLACEHOLDER_PRICE_PATTERN      abort    ₹X,XXX/sqft, ₹X.X Cr tokens
PLACEHOLDER_CUID_PATTERN       abort    [PROJECT_X_ID] tokens
PRICE_DISCOUNT_COMMIT_PATTERN  abort    X% discount/off/kam — Lock #1
PRICE_FINAL_COMMIT_PATTERN     abort    final/exact/confirmed/locked + price — Lock #1
COMMISSION_PATTERN             abort    X% commission/brokerage — Lock #2
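To give a feel for what a rule such as "X% discount/off/kam" might look like as a regex, here is a hypothetical approximation. It is not the exported PRICE_DISCOUNT_COMMIT_PATTERN, whose actual source may be stricter or looser; DISCOUNT_COMMIT_SKETCH and commitsToDiscount are illustrative names.

```typescript
// Hypothetical approximation: a number, a percent sign, then one of the
// commitment words. The trailing \b keeps "off" from matching inside "offer".
const DISCOUNT_COMMIT_SKETCH = /\b\d+(?:\.\d+)?\s*%\s*(?:discount|off|kam)\b/i;

function commitsToDiscount(text: string): boolean {
  return DISCOUNT_COMMIT_SKETCH.test(text);
}
```

A bare percentage with no commitment word after it, as in "prices rose 10% this year", does not match this sketch, which illustrates the limitation that the README describes for regex-based rules: they only catch phrasings that the pattern anticipates.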

Architecture

src/
  patterns/
    index.ts          — all exported patterns + helpers
    contact.ts        — CONTACT_LEAK_PATTERN
    business.ts       — BUSINESS_LEAK_PATTERN
    markdown.ts       — MARKDOWN_PATTERN
    placeholder.ts    — PLACEHOLDER_*_PATTERN
    locks1.ts         — PRICE_DISCOUNT_COMMIT_PATTERN, PRICE_FINAL_COMMIT_PATTERN, COMMISSION_PATTERN
  streaming/
    index.ts          — StreamingGuard class + createStreamingGuard
  transitions/
    index.ts          — actions, nextBuilderStatus, nextProjectStatus, validate*Transition, reasonRequired
  proxy/
    server.ts         — Express app (createProxyServer)
    handlers/chat.ts  — POST /v1/chat/completions guarded streaming handler
    lib/sse.ts        — SSE framing helpers
    lib/logging.ts    — structured per-request logging
  check.ts            — checkResponse (the main audit function)
bin/
  tripwire-proxy.ts   — CLI entrypoint for the proxy

The core library (patterns, streaming, transitions, check) has no runtime dependencies. The optional sidecar proxy pulls in express and the openai SDK.


Stack

  • Runtime — Node.js 18+
  • Types — TypeScript
  • Build — tsup
  • Tests — Vitest
  • License — Apache 2.0

What Tripwire is NOT

  • No LLM-judge layer. Tripwire uses regex patterns, not a secondary model. It won't catch semantically equivalent lies that don't match a pattern.
  • No false-positive rate published. The abort threshold is tunable per pattern but no production hit/miss data is public.
  • No per-user policy store. Policies are global — if you need user-specific rules, you need a wrapping layer.
  • No multi-tenant service. Tripwire is designed for single-tenant, in-process use as a library imported into your API, not as a standalone microservice with a policy DB.

Try locally

npm install
npm test        # 2 test suites
npm run build   # produces dist/index.js + dist/index.mjs
npm run lint    # eslint
npm run typecheck # TypeScript check

Contributing

Contributions welcome. Please open an issue first to discuss large changes.

git clone https://github.com/ykstorm/tripwire.git
cd tripwire
npm install
# make changes, add tests
npm test
# PR against main

License

Apache 2.0 — see LICENSE.