feat(preloadmemorytool,tools): handle rich content types and add tool registry/discovery system #129

Open: wants to merge 24 commits into main


kshitizz36
Contributor

Summary

This PR enhances the PreloadMemoryTool class to support richer content types beyond plain text.

Changes

  • Added support for processing multiple content part types in memory events:
    • Function calls (with function name and arguments)
    • Function responses (with function name and response data)
    • Inline data (with MIME type information)
    • Text (existing functionality)
  • Implemented human-readable formatting for each content type
  • Ensured all content parts are properly joined when building memory context
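
For reviewers, the formatting described above could look roughly like the sketch below. It is not the code in this PR: the dataclasses are hypothetical stand-ins for the part types in google.genai.types (so the snippet runs without the SDK), and the bracketed labels and the space separator are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-ins for the part types in google.genai.types.
@dataclass
class FunctionCall:
  name: str
  args: dict[str, Any]


@dataclass
class FunctionResponse:
  name: str
  response: dict[str, Any]


@dataclass
class Blob:
  mime_type: str
  data: bytes


@dataclass
class Part:
  text: Optional[str] = None
  function_call: Optional[FunctionCall] = None
  function_response: Optional[FunctionResponse] = None
  inline_data: Optional[Blob] = None


def format_part(part: Part) -> Optional[str]:
  """Returns a human-readable string for one part, or None if it is empty."""
  if part.text:
    return part.text
  if part.function_call:
    call = part.function_call
    return f"[Function call] {call.name}({call.args})"
  if part.function_response:
    resp = part.function_response
    return f"[Function response] {resp.name} -> {resp.response}"
  if part.inline_data:
    # Only the MIME type is surfaced; the raw bytes are not useful as text.
    return f"[Inline data] MIME type: {part.inline_data.mime_type}"
  return None


def format_parts(parts: list[Part]) -> str:
  """Joins the formatted parts of one memory event, skipping empty parts."""
  formatted = (format_part(part) for part in parts)
  return " ".join(item for item in formatted if item)
```

Each part type gets its own branch so that a new type can be added without touching the others; parts that carry none of the four fields are dropped rather than rendered as empty strings.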

Benefits

  • Improves LLM context by including data about past interactions with functions and non-textual content
  • Maintains readability of the preloaded memory for the LLM
  • Ensures comprehensive representation of past user-system interactions

@kshitizz36
Contributor Author

@hangfei @boyangsvl PTAL.

@kshitizz36
Contributor Author

@Jacksunwei PTAL

@kshitizz36
Contributor Author

Add Tool Registry and Discovery System

Changes

  • Added tool_registry.py with a singleton ToolRegistry class for managing tool registrations
  • Updated base_tool.py to support automatic registration of tools
  • Added tool_discovery.py to demonstrate selecting appropriate tools based on context
  • Added support for categorizing tools for easier organization
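
To make the shape of the change easier to discuss, here is a minimal sketch of a singleton registry with category support and automatic registration. It is a simplified illustration, not the contents of tool_registry.py or base_tool.py: the method names (register, get, by_category), the class-level name and category attributes, and the use of __init_subclass__ are all assumptions.

```python
from __future__ import annotations

from collections import defaultdict
from typing import ClassVar, Optional


class ToolRegistry:
  """Process-wide registry mapping tool names to tool classes."""

  _instance: ClassVar[Optional[ToolRegistry]] = None

  def __new__(cls) -> ToolRegistry:
    # Create the shared instance on first use; return it on every later call.
    if cls._instance is None:
      instance = super().__new__(cls)
      instance._tools = {}
      instance._by_category = defaultdict(set)
      cls._instance = instance
    return cls._instance

  def register(self, tool_cls: type, category: str = "general") -> None:
    self._tools[tool_cls.name] = tool_cls
    self._by_category[category].add(tool_cls.name)

  def get(self, name: str) -> Optional[type]:
    return self._tools.get(name)

  def by_category(self, category: str) -> list[type]:
    names = sorted(self._by_category.get(category, ()))
    return [self._tools[name] for name in names]


class BaseTool:
  """Simplified stand-in for the real base class (hypothetical)."""

  name: ClassVar[str] = ""
  category: ClassVar[str] = "general"

  def __init_subclass__(cls, **kwargs) -> None:
    super().__init_subclass__(**kwargs)
    # Subclasses that declare a name register themselves when defined.
    if cls.name:
      ToolRegistry().register(cls, cls.category)


class CalculatorTool(BaseTool):
  name = "calculator"
  category = "math"
```

With this shape, defining CalculatorTool is enough for ToolRegistry().by_category("math") to return it; the trade-off is that registration becomes an import side effect, which is part of what makes this an API-level change.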

Benefits

  • Better organization of the growing tool ecosystem
  • Runtime flexibility in tool selection based on user needs and context
  • Easier extension of the framework with new tools
  • More intelligent agent capabilities through context-aware tool selection

Testing

The changes have been tested with the existing tools to ensure backward compatibility.

@kshitizz36
Contributor Author

Query

Would it be possible to create a new branch off the base branch main for the commit Add Tool Registry and Discovery System?
@hangfei

@Jacksunwei
Collaborator

Thanks for the PR!

For preload_memory_tool, IIUC, you're implementing the TODO. However, the TODO is meant to cover multiple text parts, not other types of parts.

For tool_discovery and tool_registry, this is a big change to the API and we need to evaluate it further. Unlike with LlmModel, we don't see a substantial gain from having a tool registry. Do you have a sample agent that benefits from this?

@kshitizz36
Contributor Author

Hi @Jacksunwei, thanks for clarifying the requirement for handling text parts specifically.

Based on your feedback, I've pushed an update. The logic in preload_memory_tool now explicitly loops through event.content.parts, checks if part.text:, collects these text strings, and then joins them. This addresses the multi-part TODO while correctly focusing only on the textual content.
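
In outline, the text-only logic is along these lines. This is a self-contained sketch rather than the exact diff: Part, Content, and Event are hypothetical dataclass stand-ins for the real types, and the space separator is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


# Hypothetical stand-ins for the real event and content types.
@dataclass
class Part:
  text: Optional[str] = None


@dataclass
class Content:
  parts: list[Part] = field(default_factory=list)


@dataclass
class Event:
  author: str
  content: Optional[Content] = None


def event_text(event: Event) -> str:
  """Collects the text of every text part in an event and joins it."""
  if not event.content or not event.content.parts:
    return ""
  # Non-text parts have no text set, so the truthiness check skips them.
  texts = [part.text for part in event.content.parts if part.text]
  return " ".join(texts)
```

An event with parts "a", a non-text part, and "b" yields "a b"; an event with no content yields an empty string.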

Let me know if this looks better!

@kshitizz36
Contributor Author

Thanks for the feedback on the tool registry implementation. I'd like to address your concerns about the value proposition and provide some concrete examples of how this feature enhances the framework.

Key Benefits with Concrete Examples

  1. Dynamic Tool Discovery:

    • Current approach: Agents need to explicitly import and instantiate specific tools.
    • With registry: Agents can discover relevant tools at runtime based on task descriptions.
    • Example: A chatbot agent can dynamically select calendar tools when the user mentions scheduling, or calculator tools when the user needs calculations.
  2. Simplified Agent Implementation:

    • Current approach: Agents maintain hardcoded lists of available tools.
    • With registry: Agents can query the registry for appropriate tools.
    • Example: An agent handling email tasks can query "email" category tools without knowing all tool implementations.
  3. Extensibility:

    • Current approach: Adding new tools requires modifying agent code.
    • With registry: New tools can be added independently and discovered automatically.
    • Example: Third-party developers can create plugins that register seamlessly with the system.
  4. Testing and Mocking:

    • Current approach: Mock tools require changing imports or dependency injection.
    • With registry: Mock tools can be registered temporarily during tests.
    • Example: Tests can register mock implementations that don't make actual API calls.

Sample Agent Implementation

I've included a sample TaskAssistantAgent implementation below that demonstrates the benefits of the tool registry. This agent selects appropriate tools based on user requests rather than having a fixed set of predefined tools.

Example: TaskAssistantAgent

# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""TaskAssistantAgent that dynamically selects tools based on user requests."""

from __future__ import annotations

from typing import List, Optional

from google.genai import types

from ..agents.base_agent import BaseAgent
from ..models.llm_request import LlmRequest
from ..tools.base_tool import BaseTool
from ..tools.tool_context import ToolContext
from ..tools.tool_discovery import ToolDiscovery


class TaskAssistantAgent(BaseAgent):
  """An agent that dynamically selects tools based on the user's task."""

  name: str = "task_assistant"
  description: str = "A helpful assistant that selects appropriate tools for your task."

  def __init__(
      self,
      name: str = "task_assistant",
      description: str = "A helpful assistant that selects appropriate tools for your task.",
      tool_categories: Optional[List[str]] = None,
      max_tools_per_task: int = 5,
  ):
    """Initialize the TaskAssistantAgent.
    
    Args:
      name: The name of the agent.
      description: The description of the agent.
      tool_categories: Optional list of tool categories to consider. If None, all
        registered tools are considered.
      max_tools_per_task: Maximum number of tools to select for a given task.
    """
    super().__init__(name=name, description=description)
    self._tool_categories = tool_categories
    self._max_tools_per_task = max_tools_per_task
    self._current_tools: List[BaseTool] = []

  async def process_user_request(
      self, user_message: str, tool_context: ToolContext
  ) -> LlmRequest:
    """Process a user request by selecting appropriate tools and preparing the LLM request.
    
    Args:
      user_message: The user's message.
      tool_context: The tool context.
      
    Returns:
      LLM request configured with appropriate tools.
    """
    # Select tools based on the user's message
    self._current_tools = ToolDiscovery.get_tools_for_task(
        task_description=user_message,
        categories=self._tool_categories,
        max_tools=self._max_tools_per_task,
    )
    
    # Create an LLM request with the selected tools
    llm_request = LlmRequest(
        config=types.GenerateContentConfig(
            temperature=0.2,
        ),
        contents=[
            types.Content(
                role="user",
                parts=[
                    types.Part(text=user_message),
                ],
            ),
        ],
        tools_dict={},
    )
    
    # Add the selected tools to the request
    for tool in self._current_tools:
      await tool.process_llm_request(
          tool_context=tool_context, llm_request=llm_request
      )
    
    # Log the tools that were selected
    tool_names = [tool.name for tool in self._current_tools]
    print(f"Selected tools for this task: {', '.join(tool_names)}")
    
    return llm_request
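
The agent above relies on ToolDiscovery.get_tools_for_task, which is not shown in this comment. As a rough idea of what such a selection step could do, here is a hypothetical keyword-overlap sketch; it is not the code in tool_discovery.py. ToolInfo, the explicit tools argument, and the scoring rule are assumptions made for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ToolInfo:
  """Hypothetical metadata record for a registered tool."""

  name: str
  description: str
  category: str


def _tokens(text: str) -> set[str]:
  """Lowercases text and splits it into alphanumeric word tokens."""
  return set(re.findall(r"[a-z0-9]+", text.lower()))


def get_tools_for_task(
    task_description: str,
    tools: Sequence[ToolInfo],
    categories: Optional[Sequence[str]] = None,
    max_tools: int = 5,
) -> list[ToolInfo]:
  """Ranks tools by how many words they share with the task description."""
  task_words = _tokens(task_description)
  scored = []
  for tool in tools:
    if categories is not None and tool.category not in categories:
      continue
    score = len(task_words & _tokens(f"{tool.name} {tool.description}"))
    if score > 0:
      scored.append((score, tool))
  # Highest score first; ties are broken by name so the order is stable.
  scored.sort(key=lambda pair: (-pair[0], pair[1].name))
  return [tool for _, tool in scored[:max_tools]]
```

Plain word overlap is crude: common words such as "a" or "the" count as matches, so a real implementation would likely need stop-word filtering or a semantic similarity measure.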
    

@kshitizz36
Contributor Author

Example of Using the TaskAssistantAgent

Here's a script that shows how to use the TaskAssistantAgent:

# Example script showing TaskAssistantAgent in action

import asyncio

from google.adk.agents.task_assistant_agent import TaskAssistantAgent
from google.adk.tools.calculator_tool import CalculatorTool
from google.adk.tools.weather_tool import WeatherTool
from google.adk.tools.calendar_tool import CalendarTool
from google.adk.tools.search_tool import SearchTool
from google.adk.tools.tool_context import ToolContext

# The tool imports above are kept for their side effect: with the
# auto-registration mechanism, importing a tool module is enough to register
# the tool so that the agent can discover it. In a real implementation, this
# could also happen as part of the package initialization.

async def main():
    # Create the agent
    agent = TaskAssistantAgent(
        name="personal_assistant",
        description="A personal assistant that helps with various tasks",
        tool_categories=None,  # Consider all tool categories
        max_tools_per_task=3,  # Select up to 3 tools per task
    )
    
    # Create a tool context
    tool_context = ToolContext()
    
    # Example 1: Weather-related query
    print("\n=== Example 1: Weather Query ===")
    user_message = "What's the weather like in San Francisco today?"
    llm_request = await agent.process_user_request(user_message, tool_context)
    # In a real implementation, you would send this to the LLM and process the response
    
    # Example 2: Calendar-related query
    print("\n=== Example 2: Calendar Query ===")
    user_message = "Schedule a meeting with John tomorrow at 2pm"
    llm_request = await agent.process_user_request(user_message, tool_context)
    
    # Example 3: Math calculation
    print("\n=== Example 3: Math Calculation ===")
    user_message = "What is the square root of 144 divided by 2?"
    llm_request = await agent.process_user_request(user_message, tool_context)
    
    # Example 4: General knowledge query
    print("\n=== Example 4: Knowledge Query ===")
    user_message = "Who was the first person to walk on the moon?"
    llm_request = await agent.process_user_request(user_message, tool_context)

if __name__ == "__main__":
    asyncio.run(main())

@kshitizz36 changed the title from "Enhance PreloadMemoryTool to handle rich content types" to "feat(preloadmemorytool,tools): handle rich content types and add tool registry/discovery system" on Apr 19, 2025
@kshitizz36
Contributor Author

@hangfei @Jacksunwei PTAL

2 participants