-
-
Notifications
You must be signed in to change notification settings - Fork 269
feat: mcp server semantic search and update repository tools #212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: mcp server semantic search and update repository tools #212
Conversation
Summary of ChangesHello @luxannaxul, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the Micro-Code-Processor (MCP) tools to enhance their adherence to project coding standards and improve maintainability. It introduces two key new functionalities: an update_repository tool for incremental graph database updates and a semantic_search tool for natural language-based code querying. These updates streamline tool integration and provide more flexible ways to interact with the codebase's knowledge graph. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively introduces semantic search capabilities and a repository update tool, enhancing the MCP server's functionality. The implementation, particularly the handling of optional dependencies for semantic search, is well-structured. I've identified one critical issue related to a schema type mismatch that could cause runtime errors, along with a few medium-severity suggestions to improve code consistency, error handling, and remove redundancy. Addressing these points will further solidify this valuable feature addition.
| cs.MCPParamName.TOP_K: MCPInputSchemaProperty( | ||
| type=cs.MCPSchemaType.INTEGER, | ||
| description=td.MCP_PARAM_TOP_K, | ||
| default="5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TOP_K parameter is defined with type INTEGER, but its default value is provided as a string "5". This type mismatch can lead to schema validation errors or unexpected behavior when the tool is used with default parameters. The default value should be an integer to match the specified type.
| default="5", | |
| default=5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class MCPInputSchemaProperty(TypedDict, total=False):
type: str
description: str
default: str
| except Exception as e: | ||
| logger.error(lg.MCP_ERROR_UPDATING.format(error=e)) | ||
| return cs.MCP_UPDATE_ERROR.format(error=e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching a broad Exception can obscure the underlying cause of an error and make debugging more difficult. It's better to catch more specific exceptions that you expect updater.run() to raise. This allows for more precise error logging and handling. If GraphUpdater can raise specific custom exceptions, they should be caught here.
| return te.ERROR_WRAPPER.format(message=e) | ||
|
|
||
| async def semantic_search(self, natural_language_query: str, top_k: int = 5) -> str: | ||
| if self._semantic_search_tool is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've introduced a _semantic_search_available flag in the constructor, which is a great way to track the availability of this optional feature. For consistency and clarity, it would be better to use this flag here instead of checking if _semantic_search_tool is None.
| if self._semantic_search_tool is None: | |
| if not self._semantic_search_available: |
| result = await self._semantic_search_tool.function( | ||
| query=natural_language_query, top_k=top_k | ||
| ) | ||
| return str(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile SummaryAdded two new MCP tools to enhance repository management and code discovery capabilities. Key Changes:
Architecture: The conditional registration pattern uses Minor Issue: One syntax error in the Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Client as MCP Client
participant Registry as MCPToolsRegistry
participant DepCheck as has_semantic_dependencies()
participant SemanticTool as create_semantic_search_tool()
participant Updater as GraphUpdater
participant Ingestor as MemgraphIngestor
participant VectorStore as Qdrant/Embeddings
Note over Registry: __init__ - Tool Registration
Registry->>DepCheck: Check if semantic deps installed
alt Semantic dependencies available
DepCheck-->>Registry: True
Registry->>SemanticTool: Import and create tool
SemanticTool-->>Registry: semantic_search_tool
Note over Registry: Register SEMANTIC_SEARCH tool
else Dependencies not available
DepCheck-->>Registry: False
Note over Registry: Log warning, skip registration
end
Note over Registry: Always register UPDATE_REPOSITORY
Note over Client,VectorStore: Tool Execution Flow
alt semantic_search (if available)
Client->>Registry: semantic_search(query, top_k)
Registry->>SemanticTool: Execute async function
SemanticTool->>VectorStore: Embed query & search
VectorStore-->>SemanticTool: Similar node IDs + scores
SemanticTool->>Ingestor: Query nodes by IDs
Ingestor-->>SemanticTool: Node metadata
SemanticTool-->>Registry: Formatted results string
Registry-->>Client: Search results
else semantic_search (unavailable)
Client->>Registry: semantic_search(query, top_k)
Registry-->>Client: Error: Install semantic extras
end
alt update_repository (new)
Client->>Registry: update_repository()
Registry->>Updater: GraphUpdater.run()
Note over Updater: NO database wipe
Updater->>Ingestor: Incremental updates
Ingestor-->>Updater: Success
Updater-->>Registry: Complete
Registry-->>Client: Update success message
end
alt index_repository (existing)
Client->>Registry: index_repository()
Registry->>Ingestor: clean_database()
Note over Ingestor: FULL database wipe
Ingestor-->>Registry: Cleared
Registry->>Updater: GraphUpdater.run()
Updater->>Ingestor: Full rebuild
Ingestor-->>Updater: Success
Updater-->>Registry: Complete
Registry-->>Client: Index success message
end
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (1)
-
codebase_rag/mcp/tools.py, line 240-244 (link)syntax: default value should be integer not string
The
defaultfield expects the actual default value, not a string representation. Sincetop_khas typeINTEGER, the default should be5(int) not"5"(string).
4 files reviewed, 1 comment
|
@vitali87 |
This PR is an updated version of PR #203, revised to align with the current project policies and coding standards.
The changes focus on improving MCP tool integration and consistency with the existing codebase. In particular, the implementation was adjusted to follow established conventions around tool registration, schema definitions, and centralized configuration.
As part of this update:
Tool names, descriptions, and user-facing text are now sourced from the appropriate shared modules.
MCP input schemas consistently use the existing typed schema definitions.
The overall structure of the MCP tools was aligned with the current architecture and style guidelines.
The behavior of existing tools remains unchanged, and the update is intended to be backward-compatible.
Feedback and suggestions are very welcome.