
feat: RAG support for Lorebooks#11

Merged
vitorfdl merged 46 commits into master from feat/rag-support
May 1, 2026

Conversation


vitorfdl commented Apr 3, 2026

Summary

Adds semantic search (RAG) capabilities to the Lorebook system, along with supporting infrastructure for embedding models.

Embedding Models Support

  • Add embedding provider factory supporting OpenAI, Google, Ollama, and AWS Bedrock
  • Add embedding model manifests and model card UI for selecting/configuring embedding models
  • New embedding-service and lorebook-indexing-service for vectorizing lorebook entries
  • Database migration for vector storage on lorebook entries
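A minimal sketch of what a provider factory keyed on provider id might look like, assuming a registry of embed functions. The `ProviderId` union, `EmbedFn` signature, and `createEmbedder` name are illustrative, not the actual embedding-service API:

```typescript
// Illustrative embedding-provider factory. Looks up a registered embed
// function by provider id and fails loudly when none is configured,
// instead of forwarding placeholder credentials downstream.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

type ProviderId = "openai" | "google" | "ollama" | "bedrock";

function createEmbedder(
  provider: ProviderId,
  registry: Partial<Record<ProviderId, EmbedFn>>,
): EmbedFn {
  const embed = registry[provider];
  if (!embed) {
    // Clear error rather than a silent "None" placeholder.
    throw new Error(`No embedding provider configured: ${provider}`);
  }
  return embed;
}
```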

Lorebook RAG Integration

  • Lorebooks can now be configured with an embedding model, similarity threshold, and top-K results
  • Entries are indexed into vectors and matched via cosine similarity at inference time
  • apply-lorebook formatter updated to support both keyword and semantic search modes
  • Indexing status tracking in the lorebook store with per-entry vector indicators
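The retrieval step above can be sketched as cosine similarity plus threshold filtering and a top-K cap. The `LorebookEntry` shape and `rankEntries` name are illustrative, not the actual service code:

```typescript
// Illustrative cosine-similarity retrieval with threshold and top-K.
interface LorebookEntry {
  id: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function rankEntries(
  query: number[],
  entries: LorebookEntry[],
  threshold: number,
  topK: number,
): { id: string; score: number }[] {
  return entries
    .map((e) => ({ id: e.id, score: cosineSimilarity(query, e.vector) }))
    .filter((r) => r.score >= threshold) // drop entries below threshold
    .sort((a, b) => b.score - a.score)   // best match first
    .slice(0, topK);                     // cap at top-K results
}
```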

RAG Test Dialog

  • "Test Search" button in lorebook entries view to preview semantic similarity results
  • Shows activated entries above threshold vs. below, with similarity scores

Lorebook Agent Tool Nodes

  • Three new agent workflow nodes: getLorebook, searchLorebook, addLorebookEntry
  • Wired into the agent runner with proper handle mappings and multi-output support

Console Inspector: Resolved Parameters

  • New ResolvedParameters type showing the actual values sent to the AI provider
  • Displayed alongside input parameters in the live inspector's Parameters tab

Other Improvements

  • Replace heuristic token estimator with gpt-tokenizer for accurate counting
  • Increase inference timeout to 5 minutes (supports local models)
  • Fix modelsStore to pass profile_id when refreshing after create/update
  • Rework model cards UI
  • Dependency updates (ai SDK 6.0.145, biome 2.4.10, etc.)

Test plan

  • Create a lorebook, assign an embedding model, and index entries
  • Verify "Test Search" dialog returns ranked results with correct threshold filtering
  • Test keyword-based lorebook matching still works (non-RAG mode)
  • Test the three new agent nodes in a workflow
  • Verify resolved parameters appear in the console inspector during inference
  • Confirm inference timeout works for slow local models

vitorfdl added 27 commits March 21, 2026 13:07
Bump ai SDK to 6.0.145, @google/genai to 1.48.0, @biomejs/biome to 2.4.10,
and various other packages to latest minor/patch versions. Add gpt-tokenizer.
Add a "Test Search" button that opens a dialog to test semantic similarity
queries against indexed lorebook entries. Shows activated entries above
threshold and dim entries below. Also fix indexing status query to count
all entries (not just enabled) and handle null similarity_threshold.
Add three new agent workflow nodes: getLorebook, searchLorebook, and
addLorebookEntry. Register their input/output handles and wire up
multi-output support for search and add nodes in the runner.
Add ResolvedParameters type and reportResolvedParams callback to AIEvent.
Capture the actual parameters sent to the provider and display them in
the ParametersTab alongside the original input parameters.
Use gpt-tokenizer's encode() for accurate token counting instead of the
custom word/punctuation heuristic that had ~8% error margin.
Extract INFERENCE_TIMEOUT_MS constant (5 min) for local model support.
Fix modelsStore to pass profile_id when refreshing after create/update.
Fix EmbeddingModel type parameter deprecation.
- Added Node.js engine requirement (>=24.15.0) in package.json.
- Updated various package versions in package.json and pnpm-lock.yaml:
  - @tauri-apps/plugin-dialog: ~2.7.0
  - @tauri-apps/plugin-fs: ~2.5.0
  - immer: ^11.1.4
  - react-resizable-panels: ^4.10.0
  - uuid: ^14.0.0
  - @vitejs/plugin-react: ^6.0.1
  - vite: ^8.0.10
  - typescript: ~6.0.3
- Adjusted tsconfig.json path configuration for better compatibility.
- Updated code-quality workflow to use Node.js LTS version.
vitorfdl self-assigned this Apr 27, 2026
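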
vitorfdl added 17 commits April 28, 2026 00:45
Comment out the registration in registry/grid the same way the memory node is commented out, and make GridSidebar tolerate saved layouts that reference unregistered ids.
Adds a source toggle (From Character / Pick Lorebook) so workflows can output a fixed lorebook without going through a character.
- Group participants by type with counts on filter tabs
- Show description as subtitle and add agent badge overlay on avatar
- Match popover backgrounds and tighten side offset
- Refresh display settings with icon tiles and per-toggle descriptions
Opus 4.7 rejects thinking.type "enabled" and requires "adaptive" with
an effort level instead of budgetTokens. Detect 4.7+ / 5+ models by
name and emit the right shape per provider; older models keep the
existing enabled+budget behavior.
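The branching described above can be sketched as a name-based version check. The version parsing and config shapes follow the commit text ("adaptive" + effort for 4.7+/5+, "enabled" + budgetTokens otherwise); treat the field names and the default values as assumptions about the provider payload, not documented API:

```typescript
// Pick the thinking-config shape based on the model's version number.
type ThinkingConfig =
  | { type: "enabled"; budgetTokens: number }
  | { type: "adaptive"; effort: "low" | "medium" | "high" };

function thinkingConfigFor(modelName: string): ThinkingConfig {
  // Grab the first "major" or "major.minor" version in the model name.
  const m = modelName.match(/(\d+)(?:\.(\d+))?/);
  const major = m ? Number(m[1]) : 0;
  const minor = m?.[2] ? Number(m[2]) : 0;
  // 4.7+ and 5+ models reject "enabled" and require "adaptive" + effort.
  const adaptive = major > 4 || (major === 4 && minor >= 7);
  return adaptive
    ? { type: "adaptive", effort: "medium" }
    : { type: "enabled", budgetTokens: 8192 };
}
```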
Removes the redundant theme picker from the settings page and wires the
sidebar's theme switcher to write through to the active profile's
appearance.theme, so the choice survives across sessions instead of
getting reverted by SettingsPage's profile sync on mount.
- Skip entries with mismatched vector dimensions during retrieval and
  in the test-search dialog so a model swap can't crash inference or
  silently produce garbage similarity scores.
- Clear stored vectors automatically when a lorebook's embedding model
  changes, matching what the form already promised the user.
- Strip embedding_model_id and vector_content on V1 lorebook import to
  prevent a foreign profile's encrypted API key from being reused.
- Pass tauriFetch to the AWS Bedrock embedding factory and throw clear
  errors when API keys or required config are missing instead of
  forwarding "None" placeholders to the provider.
- Remove success toasts for indexing actions per project policy and
  surface previously console-only errors via toasts in the lorebook
  form, entry dialog, drag-reorder, toggle, and delete flows.
- Re-throw from clearIndex so the caller's catch fires instead of
  reporting false success.
- Clear stale vectors when an entry's content is emptied, log
  parseStoredVector failures, and gate Index All on a configured model.
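The dimension-mismatch guard described in the first bullet can be sketched as a simple filter before scoring; `StoredEntry` and `retrievableEntries` are illustrative names:

```typescript
// Drop stored vectors whose dimensionality no longer matches the query
// embedding, so a model swap can't crash inference or yield garbage
// similarity scores.
interface StoredEntry {
  id: string;
  vector: number[] | null;
}

function retrievableEntries(
  entries: StoredEntry[],
  queryDims: number,
): StoredEntry[] {
  return entries.filter(
    (e) => e.vector !== null && e.vector.length === queryDims,
  );
}
```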
Keyword matching now stacks with semantic similarity instead of being
mutually exclusive. When RAG is enabled, a keyword hit adds a +0.5 boost
to cosine similarity (capped at 1) so authored triggers can activate
entries that semantic search alone would miss. Empty keywords opt out
naturally.

- Always show keyword fields (Keywords, Trigger Chance, Case Sensitive,
  Match Partial Words) in the entry dialog regardless of rag_enabled.
- Always show the Keywords column in the entries table; Indexed column
  is now an addition rather than a replacement.
- Apply trigger_chance at runtime (was wired into the form but never
  consulted by the matcher).
- Rework Test RAG Search dialog: Enter to submit, autofocus, threshold
  preview slider, recent-query chips, search timing, score bars with
  threshold marker, click-to-expand results, keyword-hit "K" badge,
  inline stats bar.
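The two runtime rules above (keyword boost stacking and trigger chance) can be sketched as follows, assuming `trigger_chance` is a 0–1 probability; function names are illustrative:

```typescript
// Keyword hits stack with semantic similarity: +0.5 boost, capped at 1,
// so authored triggers can activate entries semantic search alone misses.
function effectiveScore(cosine: number, keywordHit: boolean): number {
  return keywordHit ? Math.min(1, cosine + 0.5) : cosine;
}

// trigger_chance gates activation at runtime; an injectable rng keeps
// the rule deterministic in tests.
function passesTriggerChance(
  triggerChance: number,
  rng: () => number = Math.random,
): boolean {
  return rng() < triggerChance;
}
```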
vitorfdl merged commit ad71dd6 into master May 1, 2026
1 check passed
