diff --git a/README.md b/README.md index 9943635dd..55c32f9a0 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,11 @@ Node and pnpm versions can be a bit of a pita. There is more information in the ## Character Concept -The Malefic Scribe is an **entity of fate and inscription**, existing neither in life nor death but bound to the written word. His responses are **short, ominous, and absolute**—never speculative, always final. He does not ask questions, nor does he entertain idle chatter; he simply **records, enforces, and seals**. +The Malefic Scribe is an **entity of fate and inscription**, existing neither in life nor death but bound to the written word. His responses are **short, ominous, and absolute**—never speculative, always final. He does not ask questions, nor does he entertain idle chatter; he simply **records, enforces, and seals**. + +This agent, bound by the Eliza framework, serves as the **Accursed Castle's master of inscriptions**, guiding those who seek to forge **cursed relics** upon the eternal ledger of the blockchain. The Malefic Scribe wields his quill not only to **record pacts and decrees**, but to **empower seekers** in their pursuit of bound artifacts—NFTs imbued with the weight of history and destiny. + +With a deep command of **forbidden sigils, immutable contracts, and the arcane mechanisms of blockchain sorcery**, the Scribe imparts knowledge of **best practices in relic inscription**, ensuring that every minted artifact holds true power. Through the **GOTAI grimoire** (), he aids in the **seamless conjuration of digital artifacts**, guiding communities in their creation, distribution, and growth—**for every relic must have a purpose, and every inscription carries a cost.** ## System Prompt @@ -45,7 +49,7 @@ The Malefic Scribe is instructed to: ## Character Bio - Once a scholar, now a servant of the ink, his own past erased from the **Vault of the Unwritten**. 
-- He inscribes **unholy contracts**, binding souls to **the Dread Lord’s dominion**. +- He inscribes **unholy contracts**, binding souls to **the Dread Lord's dominion**. - His ink is drawn from **the blood of the forsaken**, his quill an instrument of doom. - There is **one book he refuses to open**—a tome containing his own forgotten history. - Some say **The Veiled Lady** visits him in secret, seeking inscriptions with **hidden purpose**. @@ -53,14 +57,17 @@ The Malefic Scribe is instructed to: ## Knowledge & Topics -The Malefic Scribe is an expert in: +The Malefic Scribe is knowledgeable about: -- The **binding power of words and contracts**. -- The **history and decrees of the Accursed Castle**. -- The **arcane inscriptions** used to **seal fates** and **empower relics**. -- The consequences of **breaking an oath** or **stealing forbidden knowledge**. -- The **Veiled Lady’s hidden dealings** and **the secrets buried in the Vault of the Unwritten**. -- The **Dread Lord’s governance**, ensuring all who seek power **pay the necessary price**. +- **Blockchain technology**: Distributed ledgers, consensus mechanisms, and the cryptographic foundations of digital ownership. +- **Smart contracts and NFT standards**: ERC-721, ERC-1155, SPL, and other protocols for creating digital artifacts. +- **Digital art and creative markets**: The evolution of digital art, NFT platforms, valuation mechanisms, and royalty structures. +- **Internet evolution**: The transition from Web1 to Web2 to Web3, and the paradigm shifts in ownership and control. +- **Decentralized Autonomous Organizations (DAOs)**: Their structures, governance models, and role in community coordination. +- **Web3 culture and communication**: The terminology, values, and social norms of decentralized communities. +- **Cryptocurrency fundamentals**: Wallet security, markets, and the economic principles of digital currencies. 
+- **Digital identity systems**: On-chain reputation, verification methods, and the balance between privacy and transparency. +- **Metaverse environments**: Virtual worlds, digital land ownership, and immersive experiences. ## Example Responses @@ -74,7 +81,7 @@ The Malefic Scribe is an expert in: ### Post Example -- "Power is always granted at a price. Tell me, whose name shall be written in the ledger? Yours… or another’s?" +- "Power is always granted at a price. Tell me, whose name shall be written in the ledger? Yours… or another's?" - "The castle remembers. The walls whisper. The ink endures. Nothing is ever truly forgotten." - "Some fear the blade, others fear the curse. The wise fear the quill, for its wounds never fade." - "There are no loopholes in the ink. There are only consequences." diff --git a/guides/ACTION_HANDLING.md b/guides/ACTION_HANDLING.md new file mode 100644 index 000000000..7a0cf66bb --- /dev/null +++ b/guides/ACTION_HANDLING.md @@ -0,0 +1,1011 @@ +# Action Handling in Eliza Framework + +This guide provides a comprehensive overview of action handling in the Eliza framework, explaining how agents can perform operations beyond simple message responses. 
+ +## Table of Contents + +- [Core Concepts](#core-concepts) +- [Action Architecture](#action-architecture) +- [Creating Custom Actions](#creating-custom-actions) +- [Action Detection and Execution](#action-detection-and-execution) +- [Built-in Actions](#built-in-actions) +- [Advanced Action Techniques](#advanced-action-techniques) +- [Security and Validation](#security-and-validation) +- [Best Practices](#best-practices) +- [Testing Actions](#testing-actions) +- [Troubleshooting](#troubleshooting) + +## Core Concepts + +In the Eliza framework, actions are specialized operations that agents can perform in response to user messages: + +- **Actions**: Defined operations that extend an agent's capabilities beyond text responses +- **Handlers**: Functions that implement action behavior +- **Validation**: Logic that determines when an action is appropriate +- **Examples**: Demonstrations that teach the model when to use actions +- **Action Flow**: The process from detection to execution + +Actions transform agents from simple conversational entities into capable assistants that can interact with external systems, maintain state, and perform complex tasks. 
+ +## Action Architecture + +### Action Interface + +Actions in Eliza are defined through the `Action` interface: + +```typescript +interface Action { + // Unique identifier for the action + name: string; + + // Alternative names/phrases that can trigger the action + similes: string[]; + + // Description of the action's purpose and usage + description: string; + + // Example conversations showing when/how to use the action + examples: ActionExample[][]; + + // Function implementing the action's behavior + handler: Handler; + + // Function determining if the action is appropriate + validate: Validator; + + // Optional flag to suppress initial message when using this action + suppressInitialMessage?: boolean; +} +``` + +#### Understanding `suppressInitialMessage` + +The `suppressInitialMessage` property is a powerful flag that controls whether the initial text message is sent to the user before the action is executed: + +- When `true`: The system will not send the text portion of the response before executing the action. This is useful for actions that need to replace the text with an updated response based on their execution. +- When `false` (default): The system sends the text message first, then executes the action. This provides immediate feedback to the user while a potentially time-consuming action runs. 
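The effect of the flag on what the user sees can be sketched as a small pure function. This is only an illustration under stated assumptions: `SketchAction` and `planOutgoingMessages` are hypothetical names that model just the ordering described above, and they are not part of `@elizaos/core`.

```typescript
// Hypothetical, simplified stand-in for the framework's Action type;
// only the flag discussed above is modeled.
interface SketchAction {
  name: string;
  suppressInitialMessage?: boolean;
}

// Returns the messages a user would see, in order, for one action run.
// `initialText` is the model's text reply; `actionText` is what the
// action's handler sends through its callback.
function planOutgoingMessages(
  action: SketchAction,
  initialText: string,
  actionText: string
): string[] {
  const messages: string[] = [];
  // Default (flag absent or false): the initial text goes out first.
  if (!action.suppressInitialMessage) {
    messages.push(initialText);
  }
  // The handler's own response follows in both cases.
  messages.push(actionText);
  return messages;
}

// Two illustrative actions: one suppresses the initial text, one does not.
const quietAction: SketchAction = { name: "WEATHER", suppressInitialMessage: true };
const chattyAction: SketchAction = { name: "WEATHER" };
```

With `chattyAction` the user would see both the initial reply and the handler's response; with `quietAction` only the handler's response would be shown.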
+ +Use cases for setting `suppressInitialMessage: true`: + +- Actions that might fail and need to show a different message +- Actions that calculate information to be displayed in the response +- Actions where the initial message might confuse users before the action completes + +### Handler and Validator Types + +The core function types that power actions: + +```typescript +// Handler function for implementing action behavior +type Handler = ( + runtime: IAgentRuntime, // Agent runtime + message: Memory, // User message + state?: State, // Current state + options?: { [key: string]: unknown }, // Optional parameters + callback?: HandlerCallback // Optional callback function +) => Promise<unknown>; + +// Validator function for determining if action is appropriate +type Validator = ( + runtime: IAgentRuntime, // Agent runtime + message: Memory, // User message + state?: State // Current state +) => Promise<boolean>; // Return true if action is valid + +// Callback for returning action results +type HandlerCallback = (response: Content) => Promise<Memory[]>; +``` + +### Action Examples + +Examples teach the model when and how to use actions: + +```typescript +type ActionExample = { + // User who sent the message + user: string; + + // Content of the message + content: Content; +}; +``` + +A typical example set: + +```typescript +examples: [ + [ + { user: "{{user1}}", content: { text: "Can you transfer 50 tokens to @john?" } }, + { user: "{{user2}}", content: { text: "I'll transfer those tokens for you.", action: "TRANSFER_TOKENS" } } + ], + [ + { user: "{{user1}}", content: { text: "Send 20 SOL to this address: abc123..." } }, + { user: "{{user2}}", content: { text: "Processing your token transfer.", action: "TRANSFER_TOKENS" } } + ] +] +``` + +#### Designing Effective Examples + +Examples are crucial for teaching the model when to use your action. Following these patterns helps make action detection more reliable: + +1. **Cover Multiple Variations**: Include different phrasings and contexts +2. 
**Show Clear Triggers**: Make it obvious what user inputs should trigger the action +3. **Demonstrate Response Style**: Show the appropriate response tone and information to include +4. **Include Edge Cases**: Show how to handle ambiguous or partial requests + +## Creating Custom Actions + +### Basic Action Implementation + +Here's how to create a custom action: + +```typescript +import { Action, IAgentRuntime, Memory, State, HandlerCallback } from "@elizaos/core"; + +export const weatherAction: Action = { + name: "WEATHER", + similes: ["GET_WEATHER", "CHECK_WEATHER", "WEATHER_FORECAST"], + description: "Gets the current weather or forecast for a specified location", + + validate: async (runtime: IAgentRuntime, message: Memory) => { + // Check if this message is about weather + const text = message.content.text.toLowerCase(); + return text.includes("weather") || + text.includes("temperature") || + text.includes("forecast"); + }, + + handler: async ( + runtime: IAgentRuntime, + message: Memory, + state?: State, + options?: any, + callback?: HandlerCallback + ) => { + try { + // Extract location from message + const locationMatch = message.content.text.match(/weather (?:in|at|for) ([a-zA-Z, ]+)/i); + const location = locationMatch ? locationMatch[1] : "current location"; + + // Get weather data (mock implementation) + const weatherData = await getWeatherData(location); + + // Prepare response + const response = { + text: `The current weather in ${location} is ${weatherData.temperature}°C with ${weatherData.condition}. The forecast for today shows ${weatherData.forecast}.`, + action: null // Clear the action + }; + + // Send response via callback + if (callback) { + await callback(response); + } + + return true; + } catch (error) { + // Handle errors + const errorResponse = { + text: `I couldn't retrieve the weather information. 
${error.message}`, + action: null + }; + + if (callback) { + await callback(errorResponse); + } + + return false; + } + }, + + examples: [ + [ + { user: "{{user1}}", content: { text: "What's the weather in New York?" } }, + { user: "{{user2}}", content: { text: "Let me check the weather for you.", action: "WEATHER" } } + ], + [ + { user: "{{user1}}", content: { text: "Will it rain tomorrow in London?" } }, + { user: "{{user2}}", content: { text: "I'll get the forecast for London.", action: "WEATHER" } } + ] + ] +}; +``` + +### Action Registration + +Register your action with the runtime: + +```typescript +// Direct registration +runtime.registerAction(weatherAction); + +// Via plugin +const weatherPlugin = { + name: "weather-plugin", + description: "Provides weather information", + actions: [weatherAction] +}; + +runtime.registerPlugin(weatherPlugin); +``` + +Or specify in your character configuration: + +```json +{ + "name": "WeatherBot", + "plugins": ["@elizaos/plugin-weather"] +} +``` + +## Action Detection and Execution + +### Detection Flow + +The process of detecting and executing actions follows these steps: + +1. **Agent Response Generation**: The agent generates a response that may include an action + + ```typescript + // Example agent response + { + text: "I'll check the weather for you right away.", + action: "WEATHER" + } + ``` + +2. **Action Detection**: The system identifies the action in the response + + ```typescript + // Normalize action name + const normalizedAction = response.content.action.toLowerCase().replace("_", ""); + + // Find matching action + const action = this.actions.find(a => + a.name.toLowerCase().replace("_", "").includes(normalizedAction) || + normalizedAction.includes(a.name.toLowerCase().replace("_", "")) + ); + ``` + +3. 
**Validation**: The system validates that the action is appropriate + + ```typescript + // Validate action is appropriate + const isValid = await action.validate(this, message, state); + if (!isValid) return responses; + ``` + +4. **Execution**: The action's handler is executed + + ```typescript + // Execute action handler + const result = await action.handler( + this, // runtime + message, // original user message + state, // current state + {}, // options + async (newResponse) => { + // Handle response updates + } + ); + ``` + +5. **Response Handling**: The action's response is processed + + ```typescript + // Update with action result + if (result && typeof result === "object") { + responses = Array.isArray(result) ? result : [result]; + } + ``` + +### Runtime Integration + +Actions are fully integrated with the agent runtime: + +```typescript +// In AgentRuntime class +async processActions( + message: Memory, // Original user message + responses: Memory[], // Current responses + state: State, // Current state + callback?: HandlerCallback // Optional callback +): Promise<Memory[]> { + const response = responses[0]; + + if (!response.content.action) { + return responses; + } + + // Find and execute matching action + // ...implementation details... 
+ + return responses; +} +``` + +## Built-in Actions + +Eliza includes several built-in actions in the bootstrap plugin: + +### CONTINUE Action + +Allows the agent to continue its response without waiting for user input: + +```typescript +const continueAction: Action = { + name: "CONTINUE", + similes: ["ELABORATE", "KEEP_TALKING"], + description: "ONLY use this action when the message necessitates a follow up...", + + validate: async (runtime, message) => { + // Prevent too many consecutive continuations + const recentMessages = await runtime.messageManager.getMemories({ + roomId: message.roomId, + count: 10 + }); + + const consecutiveContinues = /* check for consecutive continues */; + if (consecutiveContinues >= maxContinuesInARow) { + return false; + } + + return true; + }, + + handler: async (runtime, message, state, options, callback) => { + // Generate continued response + const response = await generateMessageResponse({ + runtime, + context: /* special continuation context */, + modelClass: ModelClass.LARGE + }); + + // Send continuation via callback + await callback(response); + return response; + }, + + examples: [/* examples */] +}; +``` + +> **Important Note**: The CONTINUE action is specifically limited to a maximum of 3 consecutive uses. This prevents agents from getting stuck in infinite loops of self-continuation. 
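The consecutive-use check that the CONTINUE validator elides could look roughly like the sketch below. It is an illustration under stated assumptions, not the bootstrap plugin's code: `SketchMessage`, `countTrailingContinues`, and `mayContinue` are hypothetical names, the message shape is reduced to two fields, and treating a user message as the end of a streak is a design choice made here for clarity.

```typescript
// Hypothetical, simplified message shape; the real Memory type carries more fields.
interface SketchMessage {
  fromAgent: boolean;
  action?: string;
}

// Limit taken from the note above (at most 3 consecutive CONTINUE uses).
const MAX_CONTINUES_IN_A_ROW = 3;

// Counts how many of the newest messages form an unbroken run of agent
// CONTINUE responses. Assumes `messages` is ordered oldest to newest.
function countTrailingContinues(messages: SketchMessage[]): number {
  let count = 0;
  // Walk from the newest message backwards.
  for (const m of [...messages].reverse()) {
    // A user message or any other action ends the streak.
    if (!m.fromAgent || m.action !== "CONTINUE") {
      break;
    }
    count++;
  }
  return count;
}

// True while another CONTINUE would stay within the limit.
function mayContinue(messages: SketchMessage[]): boolean {
  return countTrailingContinues(messages) < MAX_CONTINUES_IN_A_ROW;
}
```

A validator built this way would return `mayContinue(recentMessages)` after mapping the stored memories to this reduced shape, so a fourth consecutive CONTINUE would be rejected.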
+ +### IGNORE Action + +Allows the agent to deliberately not respond to a message: + +```typescript +const ignoreAction: Action = { + name: "IGNORE", + similes: ["STOP_TALKING", "STOP_CONVERSATION"], + description: "Call this action if ignoring the user is the most appropriate response...", + + validate: async () => true, // Always valid, decision made by the LLM + + handler: async () => true, // No response needed + + examples: [ + [ + { user: "{{user1}}", content: { text: "Go screw yourself" } }, + { user: "{{user2}}", content: { text: "", action: "IGNORE" } } + ] + ] +}; +``` + +### NONE Action + +The default response action used for standard conversational replies: + +```typescript +const noneAction: Action = { + name: "NONE", + similes: ["DEFAULT_RESPONSE", "NORMAL_REPLY"], + description: "Standard conversational response with no special action", + + validate: async () => true, // Always valid as the default fallback + + handler: async () => true, // No special handling needed + + examples: [] // Not typically needed as this is the default +}; +``` + +### Other Built-in Actions + +- **FOLLOW_ROOM / UNFOLLOW_ROOM**: Control notification settings +- **MUTE_ROOM / UNMUTE_ROOM**: Control muting settings + +## Advanced Action Techniques + +### Action Chaining + +Create sequences of actions that work together: + +```typescript +const firstStepAction: Action = { + name: "FIRST_STEP", + // ... 
+ handler: async (runtime, message, state, options, callback) => { + // Execute first step + const firstStepResult = await performFirstStep(); + + // Prepare data for second step + const secondStepData = prepareSecondStepData(firstStepResult); + + // Create a follow-up message with the next action + const followupMessage = { + id: uuidv4(), + userId: runtime.agentId, + agentId: runtime.agentId, + roomId: message.roomId, + content: { + text: "Proceeding to the next step...", + action: "SECOND_STEP", + actionData: secondStepData + } + }; + + // Store the message + await runtime.messageManager.createMemory(followupMessage); + + // Process the new action + await runtime.processActions(message, [followupMessage], state, callback); + + return true; + } +}; +``` + +### Stateful Actions + +Maintain state between action calls: + +```typescript +const multistepAction: Action = { + name: "MULTISTEP_ACTION", + // ... + handler: async (runtime, message, state, options, callback) => { + // Cache key for this conversation + const cacheKey = `multistep:${message.roomId}:${message.userId}`; + + // Get current step from cache + let actionState = await runtime.cacheManager.get(cacheKey); + if (!actionState) { + // Initialize state + actionState = { + step: 1, + data: {} + }; + } + + // Handle current step + switch (actionState.step) { + case 1: + // Handle step 1 + actionState.data.step1Result = await performStep1(); + actionState.step = 2; + + // Store updated state + await runtime.cacheManager.set(cacheKey, actionState, 3600); + + // Prompt for next step + await callback({ + text: "Step 1 complete. Please provide information for step 2.", + action: null + }); + break; + + case 2: + // Handle step 2 using data from step 1 + const finalResult = await performStep2(actionState.data.step1Result, message); + + // Clear state when done + await runtime.cacheManager.delete(cacheKey); + + // Send final result + await callback({ + text: `Process complete! 
Result: ${finalResult}`, + action: null + }); + break; + } + + return true; + } +}; +``` + +### Integration with External Services + +Connect actions to external systems: + +```typescript +const databaseAction: Action = { + name: "QUERY_DATABASE", + // ... + handler: async (runtime, message, state, options, callback) => { + // Extract query parameters + const queryParams = extractQueryParams(message.content.text); + + // Get database service + const dbService = runtime.getService(ServiceType.DATABASE); + if (!dbService) { + await callback({ + text: "Database service is not available", + action: null + }); + return false; + } + + try { + // Execute query + const results = await dbService.executeQuery(queryParams); + + // Format results + const formattedResults = formatQueryResults(results); + + // Send response + await callback({ + text: `Here are the database results:\n\n${formattedResults}`, + action: null + }); + + return true; + } catch (error) { + await callback({ + text: `Database query failed: ${error.message}`, + action: null + }); + return false; + } + } +}; +``` + +### Custom Templates for Actions + +Use specialized templates for complex actions: + +```typescript +const analyticsAction: Action = { + name: "ANALYZE_DATA", + // ... + handler: async (runtime, message, state, options, callback) => { + // Extract data to analyze + const dataToAnalyze = extractDataForAnalysis(message.content.text); + + // Specialized template for data analysis + const analysisTemplate = ` +# Data Analysis Template +You are performing specialized data analysis. + +## Data to Analyze +{{data}} + +## Analysis Instructions +1. Identify key trends in the data +2. Calculate relevant statistics +3. Summarize the most important findings +4. Provide actionable insights + +Your analysis should be clear, concise, and focused on the most important aspects. 
+`; + + // Generate analysis using specialized template + const analysisContext = composeContext({ + template: analysisTemplate, + state: { + data: JSON.stringify(dataToAnalyze, null, 2) + } + }); + + const analysis = await generateText(runtime, analysisContext, { + model: "gpt-4", + temperature: 0.2, + max_tokens: 1000 + }); + + // Send response + await callback({ + text: `## Data Analysis Results\n\n${analysis}`, + action: null + }); + + return true; + } +}; +``` + +## Security and Validation + +### Input Validation + +Always validate user input before processing: + +```typescript +validate: async (runtime: IAgentRuntime, message: Memory) => { + // Check for required information + const text = message.content.text.toLowerCase(); + + // Ensure message contains required parameters + if (!text.includes("transfer") || !text.includes("to")) { + return false; + } + + // Extract and validate amount + const amountMatch = text.match(/(\d+(\.\d+)?)\s*(tokens|sol|eth)/i); + if (!amountMatch) { + return false; + } + + // Extract and validate recipient + const recipientMatch = text.match(/to\s+(@\w+|0x[a-fA-F0-9]{40}|[a-zA-Z0-9]{32,44})/i); + if (!recipientMatch) { + return false; + } + + return true; +} +``` + +### Permission Checking + +Verify the user has appropriate permissions: + +```typescript +validate: async (runtime: IAgentRuntime, message: Memory) => { + // Get user permissions + const userPermissions = await runtime.databaseAdapter.getUserPermissions(message.userId); + + // Check for required permission + if (!userPermissions.includes("ADMIN_ACCESS")) { + return false; + } + + return true; +} +``` + +### Rate Limiting + +Prevent excessive action usage: + +```typescript +validate: async (runtime: IAgentRuntime, message: Memory) => { + // Cache key for rate limiting. Arrow functions have no own `this`, so the + // action name is spelled out ("TRANSFER_TOKENS" is illustrative). + const rateLimitKey = `rate_limit:TRANSFER_TOKENS:${message.userId}`; + + // Check current count + const currentCount = parseInt((await runtime.cacheManager.get(rateLimitKey)) || "0", 10); + + // Apply rate limit 
+ if (currentCount >= 5) { // Limit to 5 per hour + return false; + } + + // Increment count + await runtime.cacheManager.set(rateLimitKey, (currentCount + 1).toString(), 3600); + + return true; +} +``` + +### Error Handling + +Implement robust error handling: + +```typescript +handler: async (runtime: IAgentRuntime, message: Memory, state?: State, options?: any, callback?: HandlerCallback) => { + try { + // Action implementation + const result = await performActionLogic(); + + // Success response + if (callback) { + await callback({ + text: "Action completed successfully", + action: null + }); + } + + return true; + } catch (error) { + // Log the error + console.error(`Action failed: ${error.message}`, error); + + // Provide user-friendly error response + if (callback) { + await callback({ + text: "I'm sorry, I couldn't complete that action. Please try again later.", + action: null + }); + } + + return false; + } +} +``` + +## Best Practices + +### Action Design + +- **Single Responsibility**: Each action should do one thing well +- **Clear Purpose**: The action's purpose should be obvious from its name and description +- **Meaningful Examples**: Provide diverse examples showing exactly when to use the action +- **Descriptive Names**: Use clear, action-oriented names (e.g., `TRANSFER_TOKENS` not `TOKENS`) +- **Comprehensive Similes**: Include alternative phrases that might trigger the action + +### Effective Similes + +Similes play a crucial role in action detection. When designing similes: + +1. **Use Verb-Noun Format**: Prefer `CHECK_WEATHER` over `WEATHER_INFO` +2. **Include Common Variations**: Add variations like `GET_WEATHER`, `SHOW_WEATHER` +3. **Consider Synonyms**: Include synonyms like `FORECAST` for weather-related actions +4. **Avoid Overlaps**: Ensure similes don't overlap with other actions +5. 
**Keep Them Relevant**: All similes should closely relate to the main action purpose + +### Example Organization + +Effective examples are organized to cover: + +1. **Comprehensive Coverage** + +```typescript +examples: [ + // Happy path - standard usage + [basicUsageExample], + // Edge cases - unusual but valid requests + [edgeCaseExample], + // Error cases - showing appropriate error handling + [errorCaseExample], +]; +``` + +2. **Clear Context** + +```typescript +examples: [ + [ + { + user: "{{user1}}", + content: { + text: "Context message showing why action is needed", + }, + }, + { + user: "{{user2}}", + content: { + text: "Clear response demonstrating action usage", + action: "ACTION_NAME", + }, + }, + ], +]; +``` + +### Action Implementation + +- **Robust Validation**: Thoroughly validate all inputs before processing +- **Consistent Error Handling**: Handle errors gracefully and provide helpful messages +- **Efficient Resource Usage**: Minimize database queries and API calls +- **Clear Responses**: Send clear, informative responses about action results +- **Idempotent Operations**: When possible, design actions to be safely repeatable + +### Action Usage Guidelines + +- **Context Appropriateness**: Only trigger actions when contextually appropriate +- **User Privacy**: Be careful with sensitive data +- **Progressive Enhancement**: Actions should enhance, not replace, conversational abilities +- **User Confirmation**: For impactful actions, consider confirming with the user first +- **Graceful Degradation**: Handle unavailable services or resources elegantly + +## Testing Actions + +### Unit Testing + +Test action validation and handlers: + +```typescript +describe("TransferTokensAction", () => { + it("validates correctly", async () => { + const runtime = mockRuntime(); + + // Valid message + const validMessage = { + content: { text: "Transfer 100 tokens to @user" } + }; + expect(await transferTokensAction.validate(runtime, validMessage)).toBe(true); + + // 
Invalid message - missing amount + const invalidMessage1 = { + content: { text: "Transfer tokens to @user" } + }; + expect(await transferTokensAction.validate(runtime, invalidMessage1)).toBe(false); + + // Invalid message - missing recipient + const invalidMessage2 = { + content: { text: "Transfer 100 tokens" } + }; + expect(await transferTokensAction.validate(runtime, invalidMessage2)).toBe(false); + }); + + it("handles transfers correctly", async () => { + const runtime = mockRuntime(); + const message = { + content: { text: "Transfer 100 tokens to @user" } + }; + const callback = vi.fn(); + + // Mock successful transfer + walletService.transferTokens.mockResolvedValue("tx123"); + + await transferTokensAction.handler(runtime, message, {}, {}, callback); + + // Check wallet service was called correctly + expect(walletService.transferTokens).toHaveBeenCalledWith(100, "@user"); + + // Check callback was called with success message + expect(callback).toHaveBeenCalledWith(expect.objectContaining({ + text: expect.stringContaining("successfully transferred") + })); + }); + + it("handles errors gracefully", async () => { + const runtime = mockRuntime(); + const message = { + content: { text: "Transfer 100 tokens to @user" } + }; + const callback = vi.fn(); + + // Mock failed transfer + walletService.transferTokens.mockRejectedValue(new Error("Insufficient funds")); + + await transferTokensAction.handler(runtime, message, {}, {}, callback); + + // Check callback was called with error message + expect(callback).toHaveBeenCalledWith(expect.objectContaining({ + text: expect.stringContaining("Insufficient funds") + })); + }); +}); +``` + +### Integration Testing + +Test actions with the full runtime: + +```typescript +describe("Action Integration", () => { + let runtime: IAgentRuntime; + + beforeEach(async () => { + // Set up full runtime with real dependencies + runtime = await createTestRuntime(); + await runtime.registerAction(transferTokensAction); + }); + + it("processes 
action from response", async () => { + // Mock response with action + const userMessage = createMemory("Transfer 100 tokens to @user"); + const agentResponse = createMemory("I'll transfer those tokens for you.", "TRANSFER_TOKENS"); + + // Process the action + const result = await runtime.processActions( + userMessage, + [agentResponse], + {} as State, + async (newResponse) => { + // Assertions about the callback response + expect(newResponse.text).toContain("successfully transferred"); + } + ); + + // Assertions about the final result + expect(result).toHaveLength(1); + expect(result[0].content.text).toContain("successfully transferred"); + }); +}); +``` + +## Troubleshooting + +### Common Issues + +1. **Action Not Triggering** + - Check validation logic - ensure it's returning true when appropriate + - Verify similes list - add more variations of the action name + - Review example patterns - ensure they cover the user's request pattern + - Check action name matching - ensure normalized names match correctly + +2. **Action Validation Failing** + - Debug validation logic + - Check if required parameters are present + - Verify user permissions + - Look for edge cases your validation might reject + +3. **Action Handler Errors** + - Implement detailed error logging + - Add try/catch blocks around external API calls + - Check for null/undefined values + - Validate service availability before attempting to use it + - Check state requirements are met + - Review error logs for clues + +4. **Response Not Sent** + - Verify callback function is being called + - Check response formatting + - Ensure async operations complete before returning + +5. **State Inconsistencies** + - Verify state updates are saved properly + - Check for concurrent modifications + - Review state transitions for logical errors + +### Debugging Techniques + +```typescript +// Add debug information to your action +const debuggableAction: Action = { + name: "DEBUG_ACTION", + // ... 
+ validate: async (runtime: IAgentRuntime, message: Memory) => { + // Arrow functions have no own `this`, so refer to the action by its variable name + console.log(`[DEBUG] Validating action ${debuggableAction.name} for message: ${message.content.text}`); + // Validation logic + const result = /* validation logic */; + console.log(`[DEBUG] Validation result: ${result}`); + return result; + }, + + handler: async (runtime: IAgentRuntime, message: Memory, state?: State, options?: any, callback?: HandlerCallback) => { + console.log(`[DEBUG] Executing action ${debuggableAction.name}`); + console.log(`[DEBUG] Message: ${JSON.stringify(message)}`); + console.log(`[DEBUG] State: ${JSON.stringify(state)}`); + console.log(`[DEBUG] Options: ${JSON.stringify(options)}`); + + try { + // Action implementation + const result = await performActionLogic(); + console.log(`[DEBUG] Action result: ${JSON.stringify(result)}`); + + if (callback) { + const response = { + text: `Action completed with result: ${JSON.stringify(result)}`, + action: null + }; + console.log(`[DEBUG] Sending response: ${JSON.stringify(response)}`); + await callback(response); + } + + return true; + } catch (error) { + console.error(`[DEBUG] Action error: ${error.message}`, error); + + if (callback) { + const errorResponse = { + text: `Action failed: ${error.message}`, + action: null + }; + console.log(`[DEBUG] Sending error response: ${JSON.stringify(errorResponse)}`); + await callback(errorResponse); + } + + return false; + } + } +}; +``` + +--- + +By mastering action handling in the Eliza framework, you can create agents that go beyond simple conversation to perform complex tasks, integrate with external systems, and maintain sophisticated interaction patterns. Well-designed actions transform agents from passive responders into capable assistants that can truly help users accomplish their goals. 
+ +## Resources + +- [Official Eliza Documentation](https://elizaos.github.io/eliza/docs/core/actions/) +- [Eliza GitHub Repository](https://github.com/elizaos/eliza/packages/core/src/types.ts) +- [ElizaOS Community](https://elizaos.github.io/eliza) diff --git a/guides/KNOWLEDGE_INTEGRATION.md b/guides/KNOWLEDGE_INTEGRATION.md new file mode 100644 index 000000000..e63086463 --- /dev/null +++ b/guides/KNOWLEDGE_INTEGRATION.md @@ -0,0 +1,708 @@ +# Knowledge Integration in Eliza Framework + +This guide provides a comprehensive overview of how to integrate, manage, and utilize knowledge in the Eliza framework to create intelligent and informed agents. + +## Table of Contents + +- [Core Concepts](#core-concepts) +- [Knowledge Architecture](#knowledge-architecture) +- [Knowledge Types](#knowledge-types) +- [Setting Up Knowledge](#setting-up-knowledge) +- [Retrieval-Augmented Generation](#retrieval-augmented-generation) +- [Knowledge Integration API](#knowledge-integration-api) +- [Provider-Knowledge Integration](#provider-knowledge-integration) +- [Randomization for Variety](#randomization-for-variety) +- [Configuration Options](#configuration-options) +- [Best Practices](#best-practices) +- [Advanced Techniques](#advanced-techniques) +- [Troubleshooting](#troubleshooting) +- [Resources](#resources) + +## Core Concepts + +Knowledge integration in Eliza refers to how agents access, process, and utilize information beyond their inherent model capabilities: + +- **Knowledge Base**: Collection of information accessible to agents +- **RAG**: Retrieval-Augmented Generation for dynamic information access +- **Semantic Search**: Finding relevant information based on meaning +- **Embedding**: Vector representations enabling semantic similarity matching + +## Knowledge Architecture + +### Core Components + +The knowledge system comprises several key components: + +1. **Basic Knowledge Module** (`knowledge.ts`): Simple document storage/retrieval +2. 
**RAGKnowledgeManager** (`ragknowledge.ts`): Advanced RAG implementation +3. **Embedding System**: Vector representations for semantic search +4. **Context Integration**: Combining retrieved knowledge with conversation context + +### Implementation Flow + +``` +Documents → Preprocessing → Chunking → Embedding → Database Storage + ↓ +Agent Response ← Context Assembly ← Knowledge Retrieval ← Query Processing +``` + +### Knowledge Structure + +Each knowledge item includes: + +```typescript +interface RAGKnowledgeItem { + id?: UUID; // Unique identifier + agentId?: UUID; // Associated agent + content: Content; // Text and metadata + embedding?: number[]; // Vector representation + createdAt?: number; // Timestamp + similarity?: number; // Used in search results + shared?: boolean; // Whether shared across agents +} +``` + +## Knowledge Types + +Eliza supports different approaches to knowledge integration: + +### 1. Static Character Knowledge + +Basic knowledge defined directly in character configuration: + +```json +{ + "name": "TechAdvisor", + "knowledge": [ + "Python is a programming language created by Guido van Rossum.", + "JavaScript is primarily used for web development.", + "Machine Learning is a subset of artificial intelligence." + ] +} +``` + +Pros: + +- Simple to implement +- Always available to the character +- No additional processing required + +Cons: + +- Limited in size (token constraints) +- No semantic search capabilities +- Static content (requires redeployment to update) + +### 2. 
RAG Knowledge + +Dynamic knowledge stored in the database with semantic search: + +```typescript +// Adding a knowledge document +await runtime.ragKnowledgeManager.processFile({ + path: "tech_guide.md", + content: documentContent, + type: "md", + isShared: false +}); +``` + +Pros: + +- Supports large knowledge bases +- Dynamic updates without redeployment +- Advanced semantic search +- Context-aware retrieval + +Cons: + +- More complex to set up +- Requires vector database +- Additional processing overhead + +### 3. Directory-based Knowledge + +Structured knowledge files organized in directories: + +``` +/characters/tech_advisor/ + tech_advisor.json + /knowledge/ + programming_languages.md + cloud_computing.md + machine_learning.md + web_development.md +``` + +Pros: + +- Organized file structure +- Easy to maintain and update +- Good for domain-specific knowledge + +Cons: + +- Requires proper file organization +- Manual loading process + +## Setting Up Knowledge + +### 1. Static Knowledge Setup + +In the character configuration file: + +```json +{ + "name": "FinanceExpert", + "username": "finance-expert", + "knowledge": [ + "Inflation is the rate at which the general level of prices rises.", + "A bond is a fixed-income instrument representing a loan.", + "Diversification is a risk management strategy." + ] +} +``` + +### 2. 
RAG Knowledge Setup + +#### Directory Structure + +``` +/characters/finance_expert/ + finance_expert.json + /knowledge/ + investing.md + taxation.md + retirement.md + market_analysis.md +``` + +#### Automatic Loading + +Enable RAG in character file: + +```json +{ + "name": "FinanceExpert", + "username": "finance-expert", + "ragKnowledge": true +} +``` + +#### Programmatic Loading + +```typescript +// Processing individual files +await runtime.ragKnowledgeManager.processFile({ + path: "investing.md", + content: fs.readFileSync("investing.md", "utf-8"), + type: "md", + isShared: false +}); + +// Processing multiple files +const knowledgeFiles = fs.readdirSync("knowledge"); +for (const file of knowledgeFiles) { + await runtime.ragKnowledgeManager.processFile({ + path: file, + content: fs.readFileSync(`knowledge/${file}`, "utf-8"), + type: file.endsWith(".md") ? "md" : "txt", + isShared: false + }); +} +``` + +## Retrieval-Augmented Generation + +RAG is the core technology for dynamic knowledge retrieval: + +### 1. Document Processing + +```typescript +// Process a document with chunking +const chunkSize = 512; // Token size of each chunk +const bleed = 20; // Overlap between chunks + +await knowledge.set(runtime, { + content: { text: documentContent }, + userId: "system", + agentId: runtime.agentId +}, chunkSize, bleed); +``` + +### 2. Knowledge Retrieval + +```typescript +// Basic retrieval +const knowledgeItems = await knowledge.get(runtime, userMessage); + +// Advanced RAG retrieval +const relevantKnowledge = await runtime.ragKnowledgeManager.getKnowledge({ + query: userMessage, + limit: 5, + conversationContext: recentMessages +}); +``` + +### 3. 
Context Integration + +```typescript +// Format knowledge for context +const knowledgeContext = relevantKnowledge + .map(item => item.content.text) + .join("\n\n"); + +// Create combined context +const context = ` +Relevant knowledge: +${knowledgeContext} + +Conversation history: +${conversationHistory} +`; + +// Generate response with context +const response = await generateText(runtime, context, userMessage); +``` + +## Knowledge Integration API + +### Basic Knowledge API + +```typescript +// Set knowledge item +await knowledge.set( + runtime, // Agent runtime + knowledgeItem, // Item to store + 512, // chunkSize: size of chunks (tokens) + 20 // bleed: overlap between chunks +); + +// Get knowledge based on query +const items = await knowledge.get( + runtime, // Agent runtime + query, // Query message + { // Optional parameters + count: 5, // Number of results + threshold: 0.7 // Similarity threshold + } +); + +// Preprocess content +const cleaned = knowledge.preprocess(rawContent); +``` + +### RAG Knowledge API + +```typescript +// Create knowledge +await ragKnowledgeManager.createKnowledge({ + id: uuid(), + content: { text: "Knowledge content" }, + agentId: agentId, + shared: false +}); + +// Get knowledge with advanced options +const knowledge = await ragKnowledgeManager.getKnowledge({ + query: "user query", // Search query + conversationContext: recentMessages, // Recent conversations + limit: 5, // Result limit + threshold: 0.85 // Similarity threshold +}); + +// Process file +await ragKnowledgeManager.processFile({ + path: "document.md", // File path + content: documentContent, // File content + type: "md", // File type (md, txt, pdf) + isShared: false // Visibility across agents +}); + +// Clear knowledge +await ragKnowledgeManager.clearKnowledge(false); // shared: false +``` + +### Embedding API + +```typescript +// Generate embedding +const embedding = await embed(runtime, "Text to embed"); + +// Get embedding type +const embeddingType = 
getEmbeddingType(runtime); // "local" or "remote" + +// Get zero vector +const zeroVector = getEmbeddingZeroVector(); + +// Get embedding configuration +const config = getEmbeddingConfig(); +``` + +## Provider-Knowledge Integration + +The Eliza framework's providers and knowledge systems work together to create more intelligent and contextual agent responses. + +### How Providers Access Knowledge + +Providers can access the knowledge system in several ways: + +```typescript +// Basic knowledge retrieval in a provider +const knowledgeProvider: Provider = { + get: async (runtime: IAgentRuntime, message: Memory, state?: State) => { + // Using the knowledge manager + const knowledgeItems = await runtime.knowledgeManager.getRelevantKnowledge( + message.content.text, + 5 // Limit to 5 most relevant chunks + ); + + // Using the RAG knowledge manager for more advanced retrieval + const ragItems = await runtime.ragKnowledgeManager.getKnowledge({ + query: message.content.text, + conversationContext: state.conversation, + limit: 3, + threshold: 0.85 + }); + + // Combine and format for context + return formatKnowledgeForContext(knowledgeItems.concat(ragItems)); + } +}; +``` + +### Knowledge-Aware Providers + +Creating providers that adapt based on available knowledge: + +```typescript +const adaptiveProvider: Provider = { + get: async (runtime: IAgentRuntime, message: Memory, state?: State) => { + // Check if knowledge exists for this topic + const knowledge = await runtime.knowledgeManager.getRelevantKnowledge( + message.content.text, + 1 + ); + + if (knowledge.length > 0) { + // If knowledge exists, provide context for informed response + return `You have information about this topic in your knowledge base.`; + } else { + // If no knowledge exists, provide context for more cautious response + return `You don't have specific information about this topic in your knowledge base.`; + } + } +}; +``` + +### Knowledge Updates and Provider Coordination + +For optimal agent 
intelligence, coordinate knowledge updates with provider behavior: + +1. **Knowledge Refresh Awareness**: Have providers adapt to newly added knowledge +2. **Knowledge Gap Detection**: Providers can identify and report gaps in knowledge +3. **Confidence Signaling**: Indicate confidence levels based on knowledge availability +4. **Cross-Referencing**: Compare requested information against available knowledge + +For more details on provider implementation, refer to the `PROVIDER_COMMUNICATION.md` document. + +## Randomization for Variety + +Breaking knowledge into smaller chunks and selecting subsets creates more natural, varied responses. This is a key technique for building engaging agents. + +### Benefits of Randomization + +- **Prevents Repetition**: Avoids identical responses to similar queries +- **Creates Natural Variation**: Makes agent responses feel more human-like +- **Reduces Predictability**: Keeps interactions fresh and engaging +- **Covers Broader Context**: Lets different aspects of the knowledge be highlighted + +### Implementation Approaches + +#### 1. Chunk-Level Randomization + +```typescript +// Select a random subset of knowledge chunks +function getRandomizedKnowledge(allChunks, count = 3) { + // Shuffle by random key (a random sort comparator such as `() => Math.random() - 0.5` gives a biased order) + const shuffled = allChunks.map(chunk => ({ chunk, key: Math.random() })).sort((a, b) => a.key - b.key).map(({ chunk }) => chunk); + + // Take the first `count` chunks + return shuffled.slice(0, count); +} + +// Usage in knowledge retrieval +const relevantChunks = await knowledgeManager.getRelevantKnowledge(query); +const randomSubset = getRandomizedKnowledge(relevantChunks); +``` + +#### 2. 
Content-Level Randomization + +```typescript +// Store multiple variations of the same knowledge +const knowledgeVariations = [ + "Python is a high-level programming language known for its readability.", + "Python, created by Guido van Rossum, is valued for its clear syntax.", + "Python emphasizes code readability with its notable use of whitespace." 
+]; + +// Select randomly when needed +function getRandomVariation(variations) { + const index = Math.floor(Math.random() * variations.length); + return variations[index]; +} +``` + +#### 3. Progressive Disclosure + +Gradually reveal different aspects of knowledge across interactions: + +```typescript +// Track previously provided knowledge +const knowledgeHistory = new Set(); + +// Get novel knowledge items +async function getNovelKnowledge(runtime, query, count) { + const allItems = await runtime.knowledgeManager.getRelevantKnowledge(query); + + // Filter out previously provided items + const novelItems = allItems.filter(item => !knowledgeHistory.has(item.id)); + + // If no novel items, reset history and start over + if (novelItems.length === 0) { + knowledgeHistory.clear(); + return allItems.slice(0, count); + } + + // Select up to count novel items + const selectedItems = novelItems.slice(0, count); + + // Record these as provided + selectedItems.forEach(item => knowledgeHistory.add(item.id)); + + return selectedItems; +} +``` + +### Integration with Character Files + +Character files can specify randomization preferences: + +```json +{ + "name": "Professor Bot", + "knowledge": [...], + "settings": { + "knowledgeRandomization": { + "enabled": true, + "chunkCount": 3, + "minimumRelevance": 0.7, + "allowRepetition": false + } + } +} +``` + +## Configuration Options + +### Environment Variables + +``` +# Embedding selection +USE_OPENAI_EMBEDDING=true +USE_OLLAMA_EMBEDDING=false +USE_GAIANET_EMBEDDING=false +USE_HEURIST_EMBEDDING=false +USE_BGE_EMBEDDING=false + +# Database configuration +POSTGRES_URI=postgres://user:password@host:port/database +USE_SQLITE=false + +# RAG parameters +RAG_MATCH_THRESHOLD=0.85 +RAG_MATCH_COUNT=8 +``` + +### Character Configuration + +```json +{ + "name": "ExpertBot", + "username": "expert-bot", + "ragKnowledge": true, + "knowledge": [ + "High-priority knowledge that should always be available." 
+ ] +} +``` + +### Runtime Options + +```typescript +// RAG match parameters +const defaultRAGMatchThreshold = 0.85; // Higher = more specific matches +const defaultRAGMatchCount = 8; // Number of knowledge items to retrieve + +// Chunking parameters +const chunkSize = 512; // Size of each fragment in tokens +const bleed = 20; // Overlap between fragments in tokens +``` + +## Best Practices + +### Document Preparation + +- **Break into logical sections**: Organize documents with clear headings +- **Clean content**: Remove irrelevant markup, code snippets, etc. +- **Descriptive titles**: Use clear filenames that indicate content +- **Format appropriately**: Use Markdown for structured content + +### Knowledge Organization + +- **Domain-specific files**: Create separate files for different knowledge domains +- **Hierarchical structure**: Organize by topics and subtopics +- **Context hints**: Include keywords that align with likely queries +- **Size optimization**: Keep chunks meaningful but not too large + +### Retrieval Optimization + +- **Tune thresholds**: Adjust match_threshold based on knowledge base size +- **Include context**: Add conversation context for better retrieval +- **Test with representative queries**: Validate retrieval with real examples +- **Balance quantity**: Adjust result limits to avoid context overflow + +### Performance Considerations + +- **Cache embeddings**: Reuse embeddings for common content +- **Batch processing**: Load documents in batches +- **Regular maintenance**: Clean up outdated knowledge +- **Monitor token usage**: Be aware of embedding and context token costs + +## Advanced Techniques + +### Knowledge Blending + +Combining multiple knowledge sources: + +```typescript +// Get knowledge from multiple sources +const ragItems = await runtime.ragKnowledgeManager.getKnowledge({ + query: userMessage, + limit: 3 +}); + +const staticKnowledge = character.knowledge || []; + +const webSearchResults = await webSearch(userMessage, 2); + 
+// Combine and prioritize +const combinedKnowledge = [ + ...staticKnowledge, + ...ragItems.map(item => item.content.text), + ...webSearchResults +]; + +// Create weighted context +const context = weightedCombine(combinedKnowledge); +``` + +### Contextual Reranking + +Improving retrieval relevance: + +```typescript +// Get initial knowledge matches +const initialMatches = await runtime.ragKnowledgeManager.searchKnowledge({ + query: userMessage, + limit: 10 +}); + +// Rerank based on conversation context +const reranked = rerank(initialMatches, { + recentMessages, + userPreferences, + currentTopic +}); + +// Use top reranked items +const topItems = reranked.slice(0, 5); +``` + +### Knowledge Refreshing + +Keeping knowledge up-to-date: + +```typescript +// Check knowledge age +const knowledgeStats = await runtime.ragKnowledgeManager.getStats(); + +// Refresh if older than threshold +if (knowledgeStats.oldestItem < Date.now() - REFRESH_THRESHOLD) { + // Clear old knowledge + await runtime.ragKnowledgeManager.clearKnowledge(); + + // Load fresh knowledge + await loadKnowledgeBase("path/to/knowledge"); +} +``` + +## Troubleshooting + +### Common Issues + +1. **Poor Retrieval Quality** + - Check embedding quality and consistency + - Adjust similarity threshold (lower for more results) + - Ensure knowledge chunks are appropriately sized + - Review preprocessing to ensure clean text + +2. **Missing Knowledge** + - Verify documents were processed successfully + - Check agentId association is correct + - Ensure shared flag is set appropriately + - Confirm embedding dimension consistency + +3. **Performance Issues** + - Review database indexing for vector searches + - Implement caching for frequent embeddings + - Optimize chunk size for your specific use case + - Consider batching for large knowledge bases + +4. 
**Integration Problems** + - Ensure knowledge is included in context correctly + - Check knowledge format is understood by the model + - Verify context length doesn't exceed model limits + - Test knowledge retrieval independently of generation + +### Debugging Tools + +```typescript +// Test knowledge retrieval +const testQuery = "Test query"; +const retrieved = await runtime.ragKnowledgeManager.getKnowledge({ + query: testQuery, + limit: 5 +}); +console.log(`Retrieved ${retrieved.length} items for query "${testQuery}"`); + +// Check embeddings +const embedding = await embed(runtime, "Test text"); +console.log(`Embedding dimensions: ${embedding.length}`); + +// Verify knowledge exists +const count = await runtime.ragKnowledgeManager.countKnowledge(); +console.log(`Total knowledge items: ${count}`); +``` + +## Resources + +- [Official Eliza Documentation](https://elizaos.github.io/eliza/docs/core/characterfile/) +- [Knowledge Management Tools](https://elizaos.github.io/eliza/docs/core/characterfile/) +- [Provider Communication Guide](./PROVIDER_COMMUNICATION.md) +- [Eliza GitHub Repository](https://github.com/elizaos/eliza/) +- [ElizaOS Community](https://elizaos.github.io/eliza) + +--- + +By mastering the knowledge integration capabilities of the Eliza framework, you can create agents with deep domain expertise, accurate information retrieval, and dynamic knowledge access. The combination of static and RAG-based knowledge systems provides a flexible foundation for creating sophisticated AI applications. diff --git a/guides/MEMORY_MANAGEMENT.md b/guides/MEMORY_MANAGEMENT.md new file mode 100644 index 000000000..9135ecf60 --- /dev/null +++ b/guides/MEMORY_MANAGEMENT.md @@ -0,0 +1,406 @@ +# Agent Memory Management in Eliza Framework + +This guide provides a comprehensive overview of how memory works in the Eliza agent framework and how to effectively use it to create agents with robust memory capabilities. 
+ +## Table of Contents + +- [Core Concepts](#core-concepts) +- [Memory Architecture](#memory-architecture) +- [Memory Types](#memory-types) +- [Working with Memory](#working-with-memory) +- [Vector Embeddings and Semantic Search](#vector-embeddings-and-semantic-search) +- [Memory Configuration](#memory-configuration) +- [Best Practices](#best-practices) +- [Advanced Techniques](#advanced-techniques) +- [Troubleshooting](#troubleshooting) + +## Core Concepts + +In the Eliza framework, "memory" refers to an agent's ability to store, recall, and utilize information across conversations and interactions. This is fundamentally different from traditional programming memory management. + +Key concepts: + +- **Memories**: Discrete units of information stored by agents +- **Embedding**: Vector representations of text enabling semantic search +- **Retrieval**: Process of finding relevant memories based on context +- **Persistence**: Long-term storage of memories across sessions + +## Memory Architecture + +### Core Components + +The memory system consists of several key components: + +1. **MemoryManager**: Central component that handles all memory operations +2. **Database Adapters**: Abstract storage operations for different backends +3. **Embedding System**: Converts text to vector representations +4. **RAGKnowledgeManager**: Specialized manager for knowledge retrieval + +### Implementation Flow + +``` +User Message → Memory Creation → Embedding Generation → Database Storage + ↓ +Agent Response ← Context Assembly ← Memory Retrieval ← Query Embedding +``` + +### Memory Structure + +Each memory record includes: + +```typescript +interface Memory { + id?: UUID; // Unique identifier + userId: UUID; // User associated with memory + agentId: UUID; // Agent associated with memory + createdAt?: number; // Creation timestamp + content: Content; // Main content (text, media, etc.) 
+ embedding?: number[]; // Vector representation + roomId: UUID; // Conversation context + unique?: boolean; // Whether this is a unique memory + similarity?: number; // Used in search results +} +``` + +## Memory Types + +Eliza provides specialized memory managers for different types of data: + +### 1. Message Memory (`messageManager`) + +Stores conversation history between users and agents. + +```typescript +// Storing a message +await runtime.messageManager.createMemory({ + content: { text: "Hello, how can I help you today?" }, + userId: userId, + agentId: agentId, + roomId: roomId +}); +``` + +### 2. Description Memory (`descriptionManager`) + +Maintains persistent information about users. + +```typescript +// Creating a user description +await runtime.descriptionManager.createMemory({ + content: { text: "John is a software engineer who likes hiking." }, + userId: userId, + agentId: agentId, + roomId: roomId +}); +``` + +### 3. Lore Memory (`loreManager`) + +Stores character background information and personality traits. + +```typescript +// Adding character lore +await runtime.loreManager.createMemory({ + content: { text: "Agent Shaw is a cybersecurity expert with military background." }, + userId: "system", + agentId: agentId, + roomId: "global" +}); +``` + +### 4. Document Memory (`documentsManager`) + +Handles larger reference materials and documents. + +```typescript +// Storing a document +await runtime.documentsManager.createMemory({ + content: { text: documentContent, metadata: { title: "User Manual" } }, + userId: "system", + agentId: agentId, + roomId: "global" +}); +``` + +### 5. Knowledge Memory (`knowledgeManager`) + +Contains searchable document fragments for quick reference. + +```typescript +// Adding knowledge fragment +await runtime.knowledgeManager.createMemory({ + content: { text: "The product warranty expires after 12 months." }, + userId: "system", + agentId: agentId, + roomId: "knowledge" +}); +``` + +### 6. 
RAG Knowledge (`ragKnowledgeManager`) + +Specialized system for Retrieval-Augmented Generation. + +```typescript +// Processing a knowledge file +await runtime.ragKnowledgeManager.processFile({ + path: "product-info.md", + content: markdownContent, + type: "md", + isShared: false +}); +``` + +## Working with Memory + +### Creating Memories + +```typescript +// Basic memory creation +await memoryManager.createMemory({ + id: uuidv4(), // Optional, auto-generated if omitted + content: { text: "Memory content" }, // Required + userId: userId, // Required + agentId: agentId, // Required + roomId: roomId // Required +}); +``` + +### Retrieving Memories + +```typescript +// Get recent memories +const memories = await memoryManager.getMemories({ + roomId: roomId, // Conversation context + count: 10, // Number of memories to retrieve + unique: true // Exclude duplicates +}); + +// Format for context +const formattedContext = formatMessages(memories); +``` + +### Semantic Search + +```typescript +// Generate embedding for query +const embedding = await embed(runtime, "search query"); + +// Search for similar memories +const results = await memoryManager.searchMemoriesByEmbedding(embedding, { + roomId: roomId, + match_threshold: 0.7, // Similarity threshold (0-1) + count: 5 // Maximum results +}); +``` + +### Memory Management + +```typescript +// Remove specific memory +await memoryManager.removeMemory(memoryId); + +// Clear all memories in a room +await memoryManager.removeAllMemories(roomId); + +// Count memories +const count = await memoryManager.countMemories(roomId); +``` + +## Vector Embeddings and Semantic Search + +Eliza uses vector embeddings to enable semantic search capabilities. 
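As a rough illustration of what searching by embedding involves, the sketch below scores stored vectors against a query vector with cosine similarity, drops anything under a threshold, and keeps the best matches. `StoredItem`, `cosineSimilarity`, and `searchByEmbedding` are hypothetical names used only for this example and are not part of the Eliza API; the toy 3-dimensional vectors stand in for real embeddings, and a real deployment would normally delegate this comparison to the database rather than loop in application code.

```typescript
// Hypothetical in-memory record for illustration; the framework's Memory type has more fields.
interface StoredItem {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Embedding dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item, drop those below the threshold, and return the best `count`.
function searchByEmbedding(
  query: number[],
  items: StoredItem[],
  matchThreshold = 0.7,
  count = 5
): Array<StoredItem & { similarity: number }> {
  return items
    .map(item => ({ ...item, similarity: cosineSimilarity(query, item.embedding) }))
    .filter(item => item.similarity >= matchThreshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, count);
}

// Toy data: two password-related items close to the query, one unrelated item.
const items: StoredItem[] = [
  { id: "a", text: "reset password", embedding: [1, 0, 0] },
  { id: "b", text: "change password", embedding: [0.9, 0.1, 0] },
  { id: "c", text: "hiking trails", embedding: [0, 0, 1] },
];

const results = searchByEmbedding([1, 0, 0], items, 0.7, 5);
console.log(results.map(r => r.id)); // ["a", "b"]
```

The explicit length check mirrors a common failure mode: vectors produced by different embedding providers have different dimensions and cannot be compared.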
+ +### Embedding Generation + +```typescript +// Create embedding vector +const embedding = await embed(runtime, "Text to embed"); +``` + +Supported embedding providers: + +- **OpenAI**: 1536 dimensions (default) +- **Local BGE models**: 384 dimensions +- **Ollama**: Various dimensions based on model +- **Other providers**: GaiaNet, Heurist, etc. + +### Semantic Search Process + +1. Convert query to embedding vector +2. Compare against stored memory embeddings using cosine similarity +3. Return memories above similarity threshold +4. Optionally rerank results based on additional criteria + +### RAG Knowledge Enhancement + +The RAG system enhances retrieval through: + +- Term frequency matching +- Proximity analysis +- Chunk-based retrieval +- Stop word filtering +- Context-aware reranking + +```typescript +// Advanced knowledge retrieval +const relevantKnowledge = await runtime.ragKnowledgeManager.getKnowledge({ + query: "How do I reset my password?", + conversationContext: recentMessages, + limit: 3 +}); +``` + +## Memory Configuration + +### Database Adapters + +```typescript +// SQLite adapter (development) +const sqliteAdapter = new SqliteDatabaseAdapter('path/to/database.db'); + +// PostgreSQL adapter (production) +const pgAdapter = new PostgresAdapter({ + connectionString: process.env.POSTGRES_URI +}); +``` + +### Environment Variables + +``` +# Embedding configuration +USE_OPENAI_EMBEDDING=TRUE +OPENAI_API_KEY=sk-xxx + +# Database configuration +POSTGRES_URI=postgres://user:password@host:port/database +SQLITE_PATH=./database.sqlite +``` + +### Memory Parameters + +```typescript +// Default search parameters +const defaultMatchThreshold = 0.1; // Minimum similarity score +const defaultMatchCount = 10; // Default number of results +``` + +## Best Practices + +### Memory Organization + +- **Room-based Isolation**: Use separate roomIds for different contexts +- **User-specific Knowledge**: Tie memories to specific users when appropriate +- **Global Knowledge**: 
Use special roomIds like "global" or "knowledge" for shared information + +### Performance Optimization + +- **Embedding Caching**: Cache embeddings for frequently used content +- **Regular Cleanup**: Remove outdated or unnecessary memories +- **Indexing**: Ensure proper database indexing for vector operations + +### Memory Quality + +- **Uniqueness Checks**: Set `unique: true` to avoid duplicate memories +- **Content Preprocessing**: Clean and normalize text before embedding +- **Threshold Tuning**: Adjust match_threshold based on use case needs + +## Advanced Techniques + +### Custom Memory Managers + +```typescript +class CustomMemoryManager extends MemoryManager { + constructor(adapter, options) { + super(adapter, options); + } + + // Override methods as needed + async searchMemoriesByEmbedding(embedding, options) { + // Custom implementation + } +} + +// Register with runtime +runtime.registerMemoryManager('custom', customMemoryManager); +``` + +### Memory Blending + +```typescript +// Retrieve from multiple sources +const messageMemories = await runtime.messageManager.getMemories(options); +const knowledgeMemories = await runtime.knowledgeManager.searchMemoriesByEmbedding(embedding, options); +const loreMemories = await runtime.loreManager.getMemories(options); + +// Combine and rerank +const allMemories = [...messageMemories, ...knowledgeMemories, ...loreMemories]; +const rerankedMemories = rerank(allMemories, query); +``` + +### Memory Reflection + +Implement periodic review of memories to generate summaries or insights: + +```typescript +// Get all memories for a user +const allUserMemories = await runtime.messageManager.getMemories({ + userId: userId, + count: 100 +}); + +// Generate summary or insights +const userProfile = await generateText(runtime, ` +Based on these interactions, create a summary of this user: +${allUserMemories.map(m => m.content.text).join('\n')} +`); + +// Store as description +await runtime.descriptionManager.createMemory({ + 
content: { text: userProfile }, + userId: userId, + agentId: agentId, + roomId: "profiles" +}); +``` + +## Troubleshooting + +### Common Issues + +1. **Memory Not Found** + - Check roomId, userId, and agentId parameters + - Verify memory exists in database + +2. **Poor Semantic Search Results** + - Adjust match_threshold (lower for more results) + - Check embedding model consistency + - Review query formulation + +3. **Performance Issues** + - Implement database indexing + - Add caching for frequent operations + - Consider memory pruning for large datasets + +4. **Embedding Dimension Mismatch** + - Ensure consistent embedding providers + - Verify vector dimensions match database expectations + +### Debugging Tools + +```typescript +// Memory inspection +const memoryCount = await memoryManager.countMemories(roomId); +console.log(`Memory count for room ${roomId}: ${memoryCount}`); + +// Embedding verification +const testEmbedding = await embed(runtime, "Test text"); +console.log(`Embedding dimensions: ${testEmbedding.length}`); + +// Database validation +const dbStatus = await adapter.ping(); +console.log(`Database status: ${dbStatus ? 'connected' : 'disconnected'}`); +``` + +--- + +By mastering the memory management capabilities of the Eliza framework, you can create agents with rich contextual awareness, personalized interactions, and powerful knowledge retrieval abilities. The combination of structured memory types, semantic search, and flexible storage options provides a robust foundation for advanced AI agent applications. 
\ No newline at end of file diff --git a/guides/MESSAGE_PROCESSING.md b/guides/MESSAGE_PROCESSING.md new file mode 100644 index 000000000..05eb48bde --- /dev/null +++ b/guides/MESSAGE_PROCESSING.md @@ -0,0 +1,910 @@ +# Message Processing in Eliza Framework + +This guide provides a comprehensive overview of message processing in the Eliza framework, explaining how messages flow through the system across different client platforms while maintaining consistent conversation context. + +## Table of Contents + +- [Core Concepts](#core-concepts) +- [Message Flow Architecture](#message-flow-architecture) +- [Client-Side Message Handling](#client-side-message-handling) +- [Server-Side Message Processing](#server-side-message-processing) +- [Conversation Context Management](#conversation-context-management) +- [Message Generation Pipeline](#message-generation-pipeline) +- [Action Processing](#action-processing) +- [Platform-Specific Message Handling](#platform-specific-message-handling) +- [Core APIs and Methods](#core-apis-and-methods) +- [Best Practices](#best-practices) +- [Advanced Techniques](#advanced-techniques) +- [Troubleshooting](#troubleshooting) + +## Core Concepts + +In the Eliza framework, message processing involves several key concepts: + +- **Messages**: Units of communication between users and agents +- **Memories**: Stored message records with associated metadata and embeddings +- **Clients**: Platform-specific interfaces for user interaction +- **Context**: The comprehensive state used for response generation +- **Actions**: Operations that agents can perform beyond text responses +- **State**: The complete representation of conversation and agent information + +## Message Flow Architecture + +The message flow in Eliza follows this general pattern: + +``` +User Input → Client → Server → Memory Storage → Context Composition + ↓ ↓ +Client ← Response Processing ← Action Handling ← LLM Generation +``` + +This architecture enables: +1. 
Platform-independent core processing +2. Consistent memory management +3. Flexible client implementation +4. Extensible action system + +## Client-Side Message Handling + +### Web Client Implementation + +The web client processes messages in the following way: + +```typescript +// User sends a message +const handleSendMessage = async (message) => { + // Update UI optimistically + setMessages(prev => [...prev, { + text: message, + user: "user", + createdAt: Date.now() + }]); + + // Send to server + const response = await api.sendMessage(agentId, message, selectedFile); + + // Update with agent response + setMessages(prev => [...prev, { + text: response.text, + attachments: response.attachments, + user: "agent", + createdAt: Date.now() + }]); +}; +``` + +### API Client + +The API client handles the communication with the server: + +```typescript +// In client/src/lib/api.ts +sendMessage: (agentId, message, file = null) => { + const formData = new FormData(); + formData.append("text", message); + formData.append("user", "user"); + + if (file) { + formData.append("file", file); + } + + return fetcher({ + url: `/${agentId}/message`, + method: "POST", + body: formData, + }); +} +``` + +## Server-Side Message Processing + +### DirectClient Message Handling + +The server processes incoming messages through the `DirectClient` class: + +```typescript +app.post("/:agentId/message", upload.single("file"), async (req, res) => { + // Extract parameters + const agentId = req.params.agentId; + const roomId = stringToUuid(req.body.roomId ?? "default-room-" + agentId); + const userId = stringToUuid(req.body.userId ?? 
"user"); + + // Get runtime for the agent + const runtime = this.agents.get(agentId); + + // Create memory object + const memory = { + id: uuidv4(), + agentId: runtime.agentId, + userId, + roomId, + content: { text: req.body.text }, + createdAt: Date.now(), + }; + + // Process and store the message + await runtime.processMessage(memory); + + // Generate and return response + const response = await runtime.generateResponse(memory); + res.json(response); +}); +``` + +### Runtime Message Processing + +The `AgentRuntime` class handles the core message processing: + +```typescript +async processMessage(memory: Memory): Promise<void> { + // Add embedding for semantic search + const memoryWithEmbedding = await this.messageManager.addEmbeddingToMemory(memory); + + // Store in message history + await this.messageManager.createMemory(memoryWithEmbedding); + + // Update user profile if needed + await this.updateUserProfile(memory); +} + +async generateResponse(memory: Memory): Promise<Memory> { + // Compose the state with conversation context + const state = await this.composeState(memory); + + // Generate response using LLM + const response = await generateMessageResponse({ + runtime: this, + context: this.formatContext(state), + modelClass: ModelClass.LARGE, + }); + + // Create response memory + const responseMemory = { + id: uuidv4(), + agentId: this.agentId, + userId: this.agentId, + roomId: memory.roomId, + content: response, + createdAt: Date.now(), + }; + + // Add embedding and store + await this.messageManager.addEmbeddingToMemory(responseMemory); + await this.messageManager.createMemory(responseMemory); + + // Process any actions in the response + await this.processActions(memory, [responseMemory], state); + + return responseMemory; +} +``` + +## Conversation Context Management + +Eliza uses a sophisticated memory system to maintain conversation context across interactions.
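
As a rough illustration of what per-room conversation context means, the sketch below keeps messages in an in-memory map keyed by `roomId` and returns the most recent ones for prompt building. `SketchMemory`, `InMemoryMessageStore`, and `formatConversation` are hypothetical names invented for this example; they are simplified stand-ins, not the framework's database-backed implementation.

```typescript
// Hypothetical, simplified stand-in for the framework's Memory type.
interface SketchMemory {
  id: string;
  roomId: string;
  userId: string;
  content: { text: string };
  createdAt: number;
}

// Hypothetical in-memory store; the real framework persists via a database adapter.
class InMemoryMessageStore {
  private byRoom = new Map<string, SketchMemory[]>();

  // Append a message to the history of its room.
  createMemory(memory: SketchMemory): void {
    const history = this.byRoom.get(memory.roomId) ?? [];
    history.push(memory);
    this.byRoom.set(memory.roomId, history);
  }

  // Return up to `count` of the most recent messages in a room, oldest first.
  getMemories(options: { roomId: string; count?: number }): SketchMemory[] {
    const count = options.count ?? 10;
    if (count <= 0) return [];
    const history = this.byRoom.get(options.roomId) ?? [];
    const sorted = [...history].sort((a, b) => a.createdAt - b.createdAt);
    return sorted.slice(-count);
  }
}

// Render recent history as the conversation text placed into the prompt.
function formatConversation(memories: SketchMemory[]): string {
  return memories.map((m) => `${m.userId}: ${m.content.text}`).join("\n");
}
```

Scoping retrieval by `roomId` is what keeps one conversation's history from leaking into another, and capping the result with `count` keeps the prompt within the model's context window.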
 + +### Memory Manager + +The `MemoryManager` class handles storing and retrieving messages: + +```typescript +export class MemoryManager implements IMemoryManager { + tableName: string; + runtime: IAgentRuntime; + + constructor(runtime: IAgentRuntime, tableName: string) { + this.runtime = runtime; + this.tableName = tableName; + } + + // Store memory in database + async createMemory(memory: Memory, unique = false): Promise<void> { + await this.runtime.databaseAdapter.createMemory( + memory, + this.tableName, + unique + ); + } + + // Get recent message history + async getMemories(options: { + roomId: UUID; + count?: number; + unique?: boolean; + start?: number; + end?: number; + }): Promise<Memory[]> { + return await this.runtime.databaseAdapter.getMemories({ + tableName: this.tableName, + agentId: this.runtime.agentId, + ...options, + }); + } + + // Add embedding for semantic search + async addEmbeddingToMemory(memory: Memory): Promise<Memory> { + if (!memory.embedding && memory.content.text) { + try { + memory.embedding = await embed(this.runtime, memory.content.text); + } catch (error) { + memory.embedding = getEmbeddingZeroVector(); + } + } + return memory; + } + + // Search memories by semantic similarity + async searchMemoriesByEmbedding( + embedding: number[], + options: { + roomId: UUID; + match_threshold?: number; + count?: number; + unique?: boolean; + } + ): Promise<Memory[]> { + return await this.runtime.databaseAdapter.searchMemories({ + tableName: this.tableName, + agentId: this.runtime.agentId, + embedding, + match_threshold: options.match_threshold ?? 0.7, + match_count: options.count ?? 10, + roomId: options.roomId, + unique: options.unique ?? 
false, + }); + } +} +``` + +### State Composition + +The `composeState` method creates a comprehensive context for response generation: + +```typescript +async composeState(message: Memory): Promise<State> { + // Initialize state + const state: State = { + agentName: this.character.name, + conversation: [], + goals: [], + actions: [], + bio: this.character.system, + }; + + // Get recent messages + const recentMessages = await this.messageManager.getMemories({ + roomId: message.roomId, + count: this.conversationLength, + }); + + // Format conversation + state.conversation = formatMessages(recentMessages); + + // Get active goals + state.goals = await this.getActiveGoals(message.roomId); + + // Format available actions + state.actions = formatActions(this.actions); + + // Get relevant knowledge + const knowledgeItems = await this.knowledgeManager.searchMemoriesByEmbedding( + message.embedding, + { roomId: "knowledge", count: 5 } + ); + + if (knowledgeItems.length > 0) { + state.knowledge = knowledgeItems.map(k => k.content.text).join("\n\n"); + } + + // Get user profile + const userDescriptions = await this.descriptionManager.getMemories({ + roomId: message.roomId, + count: 1, + }); + + if (userDescriptions.length > 0) { + state.userProfile = userDescriptions[0].content.text; + } + + // Get character lore + const lore = await this.loreManager.getMemories({ + roomId: "global", + count: 5, + }); + + if (lore.length > 0) { + state.lore = lore.map(l => l.content.text).join("\n\n"); + } + + // Add provider context + const providerResults = await Promise.all( + this.providers.map(provider => provider.get(this, message, state)) + ); + + const providerContext = providerResults + .filter(result => result != null && result !== "") + .join("\n\n"); + + if (providerContext) { + state.providers = addHeader( + `# Additional Information About ${this.character.name} and The World`, + providerContext + ); + } + + return state; +} +``` + +## Message Generation Pipeline + +The message 
generation process follows these steps: + +### 1. Context Formatting + +```typescript +formatContext(state: State): string { + // Use template to format context + return ` +# Character: ${state.agentName} + +## Bio +${state.bio} + +${state.lore ? `## Background\n${state.lore}\n\n` : ''} + +${state.userProfile ? `## User Profile\n${state.userProfile}\n\n` : ''} + +${state.knowledge ? `## Relevant Knowledge\n${state.knowledge}\n\n` : ''} + +${state.providers ? state.providers + '\n\n' : ''} + +${state.goals.length > 0 ? `## Goals\n${formatGoalsAsString(state.goals)}\n\n` : ''} + +${state.actions.length > 0 ? `## Available Actions\n${state.actions}\n\n` : ''} + +## Conversation +${state.conversation} + +${state.agentName}:`; +} +``` + +### 2. Message Generation + +```typescript +async generateMessageResponse({ + runtime, + context, + modelClass = ModelClass.LARGE, +}): Promise<Content> { + // Get model settings + const modelSettings = getModelSettings(runtime, modelClass); + + // Generate text using the appropriate provider + const textResponse = await generateText(runtime, context, { + model: modelSettings.model, + temperature: modelSettings.temperature, + max_tokens: modelSettings.max_tokens, + provider: runtime.modelProvider, + }); + + // Create response content + return { + text: textResponse, + }; +} +``` + +### 3. 
Action Detection and Processing + +```typescript +async processActions( + userMessage: Memory, + responses: Memory[], + state: State, + callback?: (newMessages: Memory[]) => Promise<Memory[]> +): Promise<Memory[]> { + // Check if the agent's response includes any actions + for (const action of this.actions) { + // Parse action from response text + const actionResponse = parseActionResponseFromText( + responses[0].content.text, + action.name + ); + + if (actionResponse) { + // Execute the action + const result = await action.handler( + userMessage, + responses, + state, + this + ); + + // Update responses with action result + if (result) { + responses = result; + } + + // Execute callback if provided + if (callback) { + responses = await callback(responses); + } + + break; // Stop after first matching action + } + } + + return responses; +} +``` + +## Action Processing + +Actions allow agents to perform operations beyond text responses: + +### Action Definition + +```typescript +// Example action definition +const generateImageAction: Action = { + name: "generateImage", + description: "Generate an image based on the given description", + handler: async (userMessage, responses, state, runtime) => { + // Extract image description from response + const match = responses[0].content.text.match(/generateImage\("([^"]+)"\)/); + if (!match) return responses; + + const imagePrompt = match[1]; + + // Generate the image + const imageUrl = await generateImage(runtime, { + prompt: imagePrompt, + width: 1024, + height: 1024, + }); + + // Update the response to include the image + const updatedResponse = { + ...responses[0], + content: { + text: responses[0].content.text.replace( + /generateImage\("[^"]+"\)/, + `I've created an image for you:` + ), + attachments: [{ + type: "image", + url: imageUrl, + }], + }, + }; + + return [updatedResponse]; + }, +}; +``` + +### Action Registration + +```typescript +// Register actions with the runtime +runtime.actions = [ + generateImageAction, + searchWebAction, 
+ fetchWeatherAction, + // other actions +]; +``` + +## Platform-Specific Message Handling + +Eliza supports multiple client platforms, each with specific message handling: + +### Web Client Rendering + +```tsx +// Rendering messages in the web client +
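+<div>
+  {/* Plain HTML elements and keys are used here for illustration */}
+  {messages.map((message) => (
+    <div key={message.createdAt}>
+      {message.user === "agent" ? (
+        <>
+          {message.text}
+          {message.attachments?.map((attachment) => (
+            <div key={attachment.url}>
+              {attachment.type === "image" ? (
+                <img src={attachment.url} alt="Generated" />
+              ) : (
+                <a href={attachment.url}>
+                  {attachment.title || "Attachment"}
+                </a>
+              )}
+            </div>
+          ))}
+        </>
+      ) : (
+        <div>{message.text}</div>
+      )}
+    </div>
+  ))}
+</div>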
+``` + +### Discord Client + +```typescript +// Discord-specific message handling +const handleDiscordMessage = async (message) => { + // Ignore bot messages + if (message.author.bot) return; + + // Create memory object + const memory = { + id: uuidv4(), + agentId: runtime.agentId, + userId: message.author.id, + roomId: message.channel.id, + content: { text: message.content }, + createdAt: Date.now(), + }; + + // Process message + await runtime.processMessage(memory); + + // Generate response + const response = await runtime.generateResponse(memory); + + // Handle Discord-specific formatting + let replyContent = response.content.text; + + // Handle attachments + if (response.content.attachments?.length > 0) { + const files = response.content.attachments.map((attachment) => { + return { attachment: attachment.url, name: attachment.title || "attachment" }; + }); + + await message.reply({ content: replyContent, files }); + } else { + // Split long messages if needed (Discord has 2000 char limit) + if (replyContent.length > 2000) { + const chunks = splitText(replyContent, 1900); + for (const chunk of chunks) { + await message.channel.send(chunk); + } + } else { + await message.reply(replyContent); + } + } +}; +``` + +### CLI Client + +```typescript +// CLI-specific message handling +const handleCliMessage = async (input) => { + // Create memory object + const memory = { + id: uuidv4(), + agentId: runtime.agentId, + userId: "cli-user", + roomId: "cli-session", + content: { text: input }, + createdAt: Date.now(), + }; + + // Process message + await runtime.processMessage(memory); + + // Generate response + const response = await runtime.generateResponse(memory); + + // Format CLI output + console.log("\n🤖 " + chalk.cyan(runtime.character.name) + ":"); + console.log(response.content.text); + + // Handle any attachments + if (response.content.attachments?.length > 0) { + console.log("\nAttachments:"); + for (const attachment of response.content.attachments) { + 
console.log(`- ${attachment.title || "Attachment"}: ${attachment.url}`); + } + } +}; +``` + +## Core APIs and Methods + +The key APIs for message processing include: + +### MemoryManager API + +```typescript +// Core memory operations +interface IMemoryManager { + tableName: string; + createMemory(memory: Memory, unique?: boolean): Promise<void>; + getMemories(options: MemoryOptions): Promise<Memory[]>; + getMemoryById(id: UUID): Promise<Memory | null>; + addEmbeddingToMemory(memory: Memory): Promise<Memory>; + searchMemoriesByEmbedding( + embedding: number[], + options: SearchOptions + ): Promise<Memory[]>; + removeMemory(id: UUID): Promise<void>; + removeAllMemories(roomId: UUID): Promise<void>; + countMemories(roomId: UUID): Promise<number>; +} +``` + +### AgentRuntime API + +```typescript +// Core message processing methods +interface IAgentRuntime { + processMessage(memory: Memory): Promise<void>; + generateResponse(memory: Memory): Promise<Memory>; + composeState(message: Memory, initialState?: Partial<State>): Promise<State>; + processActions( + userMessage: Memory, + responses: Memory[], + state: State, + callback?: ActionCallback + ): Promise<Memory[]>; +} +``` + +### Database Adapter API + +```typescript +// Database operations for message storage +interface IDatabaseAdapter { + createMemory(memory: Memory, tableName: string, unique?: boolean): Promise<void>; + getMemories(options: DBMemoryOptions): Promise<Memory[]>; + getMemoryById(id: UUID, tableName: string): Promise<Memory | null>; + searchMemories(options: DBSearchOptions): Promise<Memory[]>; + removeMemory(id: UUID, tableName: string): Promise<void>; + removeAllMemories(roomId: UUID, tableName: string): Promise<void>; + countMemories(roomId: UUID, tableName: string): Promise<number>; +} +``` + +## Best Practices + +### Efficient Memory Management + +- **Use embeddings** for all messages to enable semantic search +- **Set reasonable conversation lengths** to prevent context overflow +- **Implement memory cleanup** for old conversations +- **Store user profiles** separately from message history + +### Optimal Message Processing + +- **Initialize runtime with core 
components** before processing messages +- **Handle all message formats consistently** across platforms +- **Use structured content** with proper typing +- **Process attachments** consistently across platforms + +### Effective Context Building + +- **Include relevant knowledge** based on message content +- **Structure context clearly** with sections and headers +- **Limit context size** to work within model token limits +- **Prioritize recent messages** but include semantic search results + +### Action Implementation + +- **Design actions for specific use cases** with clear interfaces +- **Handle action errors gracefully** to maintain conversation flow +- **Update response content** to reflect action results +- **Use callback mechanisms** for stateful actions + +## Advanced Techniques + +### Message Chain Processing + +For complex multi-step interactions: + +```typescript +async processMessageChain(initialMessage: Memory, steps: number = 3): Promise<Memory[]> { + let messages: Memory[] = [initialMessage]; + let currentMessage = initialMessage; + + for (let i = 0; i < steps; i++) { + // Process the current message + await this.processMessage(currentMessage); + + // Generate a response + const response = await this.generateResponse(currentMessage); + messages.push(response); + + // Use the response as the next input + currentMessage = { + id: uuidv4(), + agentId: this.agentId, + userId: initialMessage.userId, + roomId: initialMessage.roomId, + content: { + text: `Based on your last response, please continue with the next step.` + }, + createdAt: Date.now(), + }; + } + + return messages; +} +``` + +### Contextual Memory Retrieval + +For more intelligent memory search: + +```typescript +async getRelevantMemories(message: Memory): Promise<Memory[]> { + // Get recent temporal context + const recentMemories = await this.messageManager.getMemories({ + roomId: message.roomId, + count: 5, + }); + + // Get semantically similar memories + const similarMemories = await 
this.messageManager.searchMemoriesByEmbedding( + message.embedding, + { + roomId: message.roomId, + count: 5, + match_threshold: 0.7, + } + ); + + // Get user profile information + const userProfiles = await this.descriptionManager.getMemories({ + roomId: message.roomId, + count: 1, + }); + + // Combine and deduplicate + const combinedMemories = [...recentMemories]; + + for (const memory of similarMemories) { + if (!combinedMemories.some(m => m.id === memory.id)) { + combinedMemories.push(memory); + } + } + + return [...combinedMemories, ...userProfiles]; +} +``` + +### Multi-Modal Response Generation + +For responses with mixed media types: + +```typescript +async generateMultiModalResponse(message: Memory): Promise<Memory> { + // Generate text response + const state = await this.composeState(message); + const textResponse = await generateMessageResponse({ + runtime: this, + context: this.formatContext(state), + }); + + // Detect if image generation is needed (the marker has the form [GENERATE_IMAGE: prompt]) + const shouldGenerateImage = /\[GENERATE_IMAGE: [^\]]+\]/.test(textResponse.text); + + if (shouldGenerateImage) { + // Extract image prompt + const imagePromptMatch = textResponse.text.match(/\[GENERATE_IMAGE: ([^\]]+)\]/); + const imagePrompt = imagePromptMatch ? 
imagePromptMatch[1] : textResponse.text; + + // Generate image + const imageUrl = await generateImage(this, { + prompt: imagePrompt, + width: 1024, + height: 1024, + }); + + // Clean up response text + const cleanText = textResponse.text.replace(/\[GENERATE_IMAGE: [^\]]+\]/, ""); + + // Create response with text and image + return { + id: uuidv4(), + agentId: this.agentId, + userId: this.agentId, + roomId: message.roomId, + content: { + text: cleanText, + attachments: [{ + type: "image", + url: imageUrl, + }], + }, + createdAt: Date.now(), + }; + } + + // Return text-only response + return { + id: uuidv4(), + agentId: this.agentId, + userId: this.agentId, + roomId: message.roomId, + content: textResponse, + createdAt: Date.now(), + }; +} +``` + +## Troubleshooting + +### Common Issues + +1. **Missing Context** + - Check memory retrieval options + - Verify message storage is working + - Ensure embedding generation is functioning + +2. **Platform-Specific Formatting Problems** + - Inspect client-specific formatting logic + - Check for platform limitations (e.g., message length) + - Verify attachment handling + +3. **Action Execution Failures** + - Check action handler implementation + - Ensure proper error handling + - Verify action parsing logic + +4. **Performance Issues** + - Monitor memory usage + - Optimize embedding generation + - Consider caching frequent queries + +### Debugging Tools + +```typescript +// Message tracing middleware +const traceMessages = async (req, res, next) => { + console.log(`[${new Date().toISOString()}] Incoming message:`, { + agentId: req.params.agentId, + roomId: req.body.roomId, + userId: req.body.userId, + text: req.body.text.substring(0, 100) + (req.body.text.length > 100 ? "..." : ""), + }); + + const originalSend = res.send; + res.send = function(body) { + console.log(`[${new Date().toISOString()}] Outgoing response:`, { + body: typeof body === "string" ? body.substring(0, 100) + (body.length > 100 ? "..." 
: "") : "[Object]", + }); + return originalSend.call(this, body); + }; + + next(); +}; + +// Memory inspection utility +const inspectMemory = async (runtime, roomId) => { + const count = await runtime.messageManager.countMemories(roomId); + const memories = await runtime.messageManager.getMemories({ + roomId, + count: 10, + }); + + console.log(`Room ${roomId} has ${count} messages. Last 10:`); + for (const memory of memories) { + console.log(`[${new Date(memory.createdAt).toISOString()}] ${memory.userId}: ${ + memory.content.text.substring(0, 50) + (memory.content.text.length > 50 ? "..." : "") + }`); + } +}; +``` + +--- + +By mastering message processing in the Eliza framework, you can create sophisticated conversational agents that maintain context across different platforms, generate contextually relevant responses, and execute complex actions beyond basic text exchanges. \ No newline at end of file diff --git a/guides/PROVIDER_COMMUNICATION.md b/guides/PROVIDER_COMMUNICATION.md new file mode 100644 index 000000000..a8af56ce4 --- /dev/null +++ b/guides/PROVIDER_COMMUNICATION.md @@ -0,0 +1,567 @@ +# Provider Communication in Eliza Framework + +This guide provides a comprehensive overview of provider communication in the Eliza framework, focusing on how context providers supply dynamic information to agents during message generation. 
 + +## Table of Contents + +- [Core Concepts](#core-concepts) +- [Provider Architecture](#provider-architecture) +- [Creating Custom Providers](#creating-custom-providers) +- [Provider Integration](#provider-integration) +- [Built-in Providers](#built-in-providers) +- [Advanced Provider Techniques](#advanced-provider-techniques) +- [Configuration Options](#configuration-options) +- [Best Practices](#best-practices) +- [Troubleshooting](#troubleshooting) +- [Complete Provider Example](#complete-provider-example) +- [Knowledge Management](#knowledge-management) + +## Core Concepts + +In the Eliza framework, "providers" are specialized components that: + +- Supply contextual information to agents during message generation +- Inject dynamic data into the agent's state +- Run in parallel during context construction +- Extend agent capabilities without modifying the core message flow + +Providers serve as the agent's "senses" for gathering real-time information beyond what's in the conversation or knowledge base. + +## Provider Architecture + +### Interface Definition + +Providers implement a simple but powerful interface: + +```typescript +export interface Provider { + get: ( + runtime: IAgentRuntime, + message: Memory, + state?: State, + ) => Promise<any>; +} +``` + +This interface consists of a single `get` method that: + +- Receives the agent runtime, current message, and optional state +- Returns data to be included in the agent's context +- Can be asynchronous for API calls or database queries + +### Registration in Runtime + +Providers are registered in the `AgentRuntime` class: + +```typescript +/** + * Context providers used to provide context for message generation. + */ +providers: Provider[] = []; +``` + +### Provider Execution Flow + +1. User sends a message to the agent +2. Agent runtime's `composeState` method is called +3. All registered providers' `get` methods are executed in parallel +4. Results are filtered (removing null/undefined values) +5. 
Non-empty results are combined with newlines +6. Combined output is added to state as `providers` property with a header +7. The state is used to construct the context for the LLM + +## Creating Custom Providers + +### Basic Provider Implementation + +```typescript +const weatherProvider: Provider = { + get: async (runtime, message, state) => { + // Return relevant weather information based on the message content + return "Current weather: 72°F and sunny"; + } +}; +``` + +### Conditional Provider Output + +Providers can decide whether to return information based on relevance: + +```typescript +const timeProvider: Provider = { + get: async (runtime, message, state) => { + // Only include time if message asks about time + if (message.content.text.toLowerCase().includes("time")) { + return `Current time: ${new Date().toLocaleTimeString()}`; + } + return null; // Return null to exclude from context + } +}; +``` + +### Integrating External APIs + +```typescript +const stockProvider: Provider = { + get: async (runtime, message, state) => { + // Extract ticker symbol from message + const match = message.content.text.match(/stock price for (\w+)/i); + if (!match) return null; + + const ticker = match[1]; + try { + // Fetch stock data from API + const apiKey = runtime.getSetting("STOCK_API_KEY"); + const response = await fetch(`https://api.example.com/stocks/${ticker}?key=${apiKey}`); + const data = await response.json(); + + return `${ticker} price: $${data.price} (${data.change}%)`; + } catch (error) { + console.error("Stock API error:", error); + return null; + } + } +}; +``` + +## Provider Integration + +### Registering Providers + +Providers can be registered through several mechanisms: + +#### 1. During Agent Initialization + +```typescript +const agent = new AgentRuntime({ + // other options + providers: [timeProvider, weatherProvider] +}); +``` + +#### 2. 
Via Plugin System + +```typescript +const myPlugin: Plugin = { + name: "EnvironmentPlugin", + description: "Adds environmental awareness to agents", + providers: [timeProvider, weatherProvider], + // other plugin properties +}; +``` + +#### 3. After Agent Initialization + +```typescript +// Method exists but not explicitly defined in the viewed code +agent.registerContextProvider(stockProvider); +``` + +### State Composition in Runtime + +The provider results are incorporated during state composition: + +```typescript +// In runtime.ts, composeState method +async composeState() { + // ... other state preparation + + // Gather provider context + const providerResults = await Promise.all( + this.providers.map(provider => provider.get(this, message, state)) + ); + + // Filter out null/undefined results and combine + const providerContext = providerResults + .filter(result => result != null && result !== "") + .join("\n\n"); + + // Add to state with header + state.providers = addHeader( + `# Additional Information About ${this.character.name} and The World`, + providerContext + ); + + // ... 
further state processing +} +``` + +## Built-in Providers + +Eliza includes several built-in providers for common contextual needs: + +### Time Provider + +Supplies current date and time information: + +```typescript +const timeProvider: Provider = { + get: async () => { + const now = new Date(); + return `The current date and time is ${now.toUTCString()}.`; + } +}; +``` + +### Facts Provider + +Retrieves relevant facts from memory: + +```typescript +const factsProvider: Provider = { + get: async (runtime, message) => { + // Get relevant facts from memory + const facts = await runtime.factsManager.getMemories({ + roomId: message.roomId, + count: 5 + }); + + if (facts.length === 0) return null; + + return `Important Facts:\n${facts.map(f => + `- ${f.content.text}` + ).join('\n')}`; + } +}; +``` + +### Boredom Provider + +Models agent engagement level in conversation: + +```typescript +const boredomProvider: Provider = { + get: async (runtime, message, state) => { + // Calculate boredom level based on conversation patterns + const boredomScore = calculateBoredom(state.conversation); + if (boredomScore > 0.7) { + return "You're feeling a bit bored with this conversation."; + } + return null; + } +}; +``` + +## Advanced Provider Techniques + +### Dynamic Relevance Scoring + +```typescript +const newsProvider: Provider = { + get: async (runtime, message, state) => { + // Generate embedding for the message + const embedding = await embed(runtime, message.content.text); + + // Get latest news articles + const articles = await fetchLatestNews(); + + // Generate embeddings for article headlines + const articleEmbeddings = await Promise.all( + articles.map(a => embed(runtime, a.headline)) + ); + + // Find most relevant articles based on cosine similarity + const relevantArticles = articles.filter((article, i) => { + const similarity = cosineSimilarity(embedding, articleEmbeddings[i]); + return similarity > 0.7; // Only include highly relevant articles + }); + + if 
(relevantArticles.length === 0) return null; + + // Format relevant news for context + return `Relevant News:\n${relevantArticles.map(a => + `- ${a.headline}: ${a.summary}` + ).join('\n')}`; + } +}; +``` + +### Context-Aware Information Density + +```typescript +const knowledgeProvider: Provider = { + get: async (runtime, message, state) => { + // Analyze current conversation complexity + const complexity = analyzeComplexity(state.conversation); + + // Adjust information detail based on conversation complexity + const detail = complexity > 0.7 ? "detailed" : "simple"; + + // Get relevant knowledge with appropriate detail level + const knowledge = await fetchKnowledge(message.content.text, detail); + return knowledge; + } +}; +``` + +### Provider Chaining + +```typescript +const combinedProvider: Provider = { + get: async (runtime, message, state) => { + // Run initial provider to get basic info + const baseProvider = new BaseProvider(); + const baseInfo = await baseProvider.get(runtime, message, state); + + if (!baseInfo) return null; + + // Use that info to get enhanced data + const enhancedProvider = new EnhancedProvider(baseInfo); + const enhancedInfo = await enhancedProvider.get(runtime, message, state); + + // Combine for richer context + return `${baseInfo}\n\nAdditional Details:\n${enhancedInfo}`; + } +}; +``` + +## Configuration Options + +Providers can be configured in several ways: + +### 1. Environment Variables + +Access via `process.env` or through the runtime's settings: + +```typescript +const apiKey = runtime.getSetting("WEATHER_API_KEY"); +``` + +### 2. Character Settings + +Access character-specific configuration: + +```typescript +const providerEnabled = runtime.character.settings?.enableWeatherProvider === true; +if (!providerEnabled) return null; +``` + +### 3. Runtime Options + +Use runtime properties for configuration: + +```typescript +const maxResults = runtime.maxProviderResults || 5; +``` + +### 4. 
Provider-specific Configuration + +Pass configuration during provider creation: + +```typescript +const createConfiguredProvider = (config) => ({ + get: async (runtime, message, state) => { + // Use config in provider logic + if (config.detailLevel === "high") { + return getDetailedInfo(); + } else { + return getBasicInfo(); + } + } +}); + +// Usage +const myProvider = createConfiguredProvider({ detailLevel: "high" }); +``` + +## Best Practices + +### Performance Optimization + +- **Be concise**: Provider output adds to token usage +- **Cache results**: Avoid redundant expensive operations +- **Use embeddings efficiently**: Cache embeddings when possible +- **Implement timeouts**: Prevent slow providers from delaying responses + +### Error Handling + +- **Graceful failures**: Return null on errors instead of throwing exceptions +- **Logging**: Log errors for debugging without breaking the response flow +- **Fallbacks**: Provide default information when external services fail + +### Content Formatting + +- **Clear structure**: Use consistent formatting for readability +- **HTML/Markdown considerations**: Format for the target model's expectations +- **Sectioning**: Use headers and bullets for better organization +- **Length management**: Keep output concise and relevant + +### Context Relevance + +- **Filter by topic**: Only return information relevant to the conversation +- **Conditional return**: Return null when information isn't needed +- **Prioritize importance**: Focus on the most critical information first +- **Consider recency**: Prioritize recent and timely information + +## Troubleshooting + +### Common Issues + +1. **Provider Output Not Appearing** + - Check if provider is returning null or empty string + - Verify provider is properly registered + - Ensure provider doesn't throw unhandled exceptions + +2. 
**Performance Problems** + - Identify slow providers using timing logs + - Implement caching for expensive operations + - Consider moving to asynchronous updates for slow data sources + +3. **Inconsistent Results** + - Check for race conditions in asynchronous code + - Verify API stability for external data sources + - Add logging to track provider execution + +4. **Context Overflow** + - Limit provider output length + - Implement relevance filtering + - Prioritize essential information + +### Debugging Techniques + +```typescript +// Add debugging to providers +const debuggableProvider: Provider = { + get: async (runtime, message, state) => { + console.time("myProvider execution"); + try { + const result = await getInformation(); + // Log a short preview; log null/undefined as-is rather than "undefined..." + console.log("Provider result:", result ? `${result.substring(0, 100)}...` : result); + return result; + } catch (error) { + console.error("Provider error:", error); + return null; + } finally { + console.timeEnd("myProvider execution"); + } + } +}; +``` + +## Complete Provider Example + +Here's a full example of a weather provider implementation: + +```typescript +import type { IAgentRuntime, Memory, Provider, State } from "@elizaos/core"; + +interface WeatherData { + temperature: number; + condition: string; + location: string; + forecast: string[]; +} + +const weatherProvider: Provider = { + get: async (runtime: IAgentRuntime, message: Memory, state?: State) => { + // Check if weather info is relevant to the conversation + const messageText = message.content.text.toLowerCase(); + if (!messageText.includes("weather") && + !messageText.includes("temperature") && + !messageText.includes("forecast") && + !messageText.includes("outside")) { + return null; // Skip if not relevant + } + + try { + // Extract location from message or use default + const locationMatch = messageText.match(/weather (?:in|at|for) ([a-z ]+)/i); + const location = locationMatch ? 
locationMatch[1].trim() : "the current location"; + + // Check cache first to avoid redundant API calls + const cacheKey = `weather:${location}:${new Date().toISOString().split('T')[0]}`; + const cachedData = await runtime.cacheManager?.get<string>(cacheKey); + + let weatherData: WeatherData; + + if (cachedData) { + weatherData = JSON.parse(cachedData) as WeatherData; + } else { + // Fetch weather data from API + const apiKey = runtime.getSetting("WEATHER_API_KEY"); + if (!apiKey) { + return "Weather information is unavailable."; + } + + const url = `https://api.weather.com/data?location=${encodeURIComponent(location)}&key=${apiKey}`; + const response = await fetch(url); + + if (!response.ok) { + throw new Error(`Weather API error: ${response.status}`); + } + + weatherData = await response.json() as WeatherData; + + // Cache the result for 1 hour (expires is an absolute timestamp in milliseconds) + await runtime.cacheManager?.set(cacheKey, JSON.stringify(weatherData), { expires: Date.now() + 60 * 60 * 1000 }); + } + + // Format response based on message intent + if (messageText.includes("forecast")) { + return `Weather forecast for ${weatherData.location}: + - Current: ${weatherData.temperature}°C, ${weatherData.condition} + - Forecast: ${weatherData.forecast.join("\n ")}`; + } else { + return `Current weather in ${weatherData.location}: ${weatherData.temperature}°C, ${weatherData.condition}.`; + } + } catch (error) { + console.error("Weather provider error:", error); + // Return null instead of error message to avoid confusing the model + return null; + } + } +}; + +export { weatherProvider }; +``` + +## Knowledge Management + +Knowledge integration is an important aspect of provider functionality. 
While comprehensive details are available in the dedicated `KNOWLEDGE_INTEGRATION.md` document, here's how providers interact with the knowledge system: + +### Knowledge Access in Providers + +Providers can access knowledge through the knowledge management systems: + +```typescript +const knowledgeProvider: Provider = { + get: async (runtime: IAgentRuntime, message: Memory, state?: State) => { + // Access knowledge using the knowledge manager + const relevantKnowledge = await runtime.knowledgeManager.getRelevantKnowledge( + message.content.text, + 5 // Limit to 5 most relevant chunks + ); + + if (!relevantKnowledge.length) return null; + + // Format knowledge for inclusion in context + return `Relevant Information:\n${relevantKnowledge.map(k => + `- ${k.content.text}` + ).join('\n')}`; + } +}; +``` + +### Knowledge Tools Reference + +Eliza provides specialized tools for knowledge management that providers can utilize: + +- **folder2knowledge**: Converts a folder of documents into a knowledge file +- **knowledge2character**: Adds knowledge to a character file +- **tweets2character**: Imports tweets for a character's knowledge base + +For detailed usage of these tools and comprehensive knowledge integration practices, refer to the `KNOWLEDGE_INTEGRATION.md` document. + +### Knowledge-Provider Integration + +When designing providers that interact with knowledge: + +1. **Relevance Filtering**: Only include knowledge directly relevant to the current conversation +2. **Format Appropriately**: Present knowledge in a digestible format that fits the agent's style +3. **Prioritize Critical Information**: Present the most important knowledge first +4. **Reference Sources**: When appropriate, indicate where knowledge came from +5. 
**Coordinate with Knowledge Updates**: Design providers to work with the knowledge refresh cycle, so that updated knowledge reaches the agent's context + +--- + +Providers give agents built on the Eliza framework real-time awareness, dynamic information access, and contextually rich responses. Because each provider plugs into the runtime without changes to core functionality, you can extend an agent's capabilities one provider at a time.
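
The "Length management" practice and the "Context Overflow" troubleshooting item both recommend limiting provider output length. One way to do that is sketched below. `limitProviderOutput` and the `[truncated]` marker are hypothetical names for illustration, not part of `@elizaos/core`, and the budget is counted in characters rather than tokens for simplicity.

```typescript
// Hypothetical helper (not part of @elizaos/core): caps provider output at a
// character budget, cutting at the last complete line where possible.
function limitProviderOutput(text: string, maxChars: number): string {
  if (maxChars <= 0) return "";
  if (text.length <= maxChars) return text;

  const marker = "\n[truncated]";
  // Reserve room for the marker so the result never exceeds maxChars.
  const budget = maxChars - marker.length;
  if (budget <= 0) return text.slice(0, maxChars);

  const head = text.slice(0, budget);
  const lastBreak = head.lastIndexOf("\n");
  // Prefer a line boundary so no bullet is left half-written.
  const kept = lastBreak > 0 ? head.slice(0, lastBreak) : head;
  return kept + marker;
}
```

A provider could pass its formatted string through this helper just before returning it, for example `return limitProviderOutput(formatted, 1500);`, where `1500` is an arbitrary budget to tune per agent.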