feat: mcp server semantic search and update repository tools #212

luxannaxul · 2026-01-02T11:18:17Z

This PR is an updated version of PR #203, revised to align with the current project policies and coding standards.

The changes focus on improving MCP tool integration and consistency with the existing codebase. In particular, the implementation was adjusted to follow established conventions around tool registration, schema definitions, and centralized configuration.

As part of this update:

Tool names, descriptions, and user-facing text are now sourced from the appropriate shared modules.

MCP input schemas consistently use the existing typed schema definitions.

The overall structure of the MCP tools was aligned with the current architecture and style guidelines.

The behavior of existing tools remains unchanged, and the update is intended to be backward-compatible.

Feedback and suggestions are very welcome.

gemini-code-assist · 2026-01-02T11:18:33Z

Summary of Changes

Hello @luxannaxul, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the Model Context Protocol (MCP) tools to bring them in line with project coding standards and improve maintainability. It introduces two new tools: an update_repository tool for incremental graph database updates and a semantic_search tool for natural language-based code querying. These updates streamline tool integration and provide more flexible ways to interact with the codebase's knowledge graph.

Highlights

MCP Tool Refactoring: The existing MCP tools have been refactored to align with current project policies and coding standards, ensuring consistency in tool registration, schema definitions, and centralized configuration.
New UPDATE_REPOSITORY Tool: A new tool, update_repository, has been introduced to allow incremental updates to the Memgraph knowledge graph without clearing existing data.
New SEMANTIC_SEARCH Tool: A new semantic_search tool has been added, enabling natural language queries for functions based on their purpose, with conditional availability based on installed dependencies.
Improved Tool Descriptions and Schemas: Tool names, descriptions, and user-facing text are now sourced from shared modules, and MCP input schemas consistently use existing typed schema definitions.
Backward Compatibility: The changes are designed to be backward-compatible, ensuring existing tool behavior remains unchanged.


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request effectively introduces semantic search capabilities and a repository update tool, enhancing the MCP server's functionality. The implementation, particularly the handling of optional dependencies for semantic search, is well-structured. I've identified one critical issue related to a schema type mismatch that could cause runtime errors, along with a few medium-severity suggestions to improve code consistency, error handling, and remove redundancy. Addressing these points will further solidify this valuable feature addition.

gemini-code-assist · 2026-01-02T11:19:41Z

codebase_rag/mcp/tools.py

+                        cs.MCPParamName.TOP_K: MCPInputSchemaProperty(
+                            type=cs.MCPSchemaType.INTEGER,
+                            description=td.MCP_PARAM_TOP_K,
+                            default="5",


The TOP_K parameter is defined with type INTEGER, but its default value is provided as a string "5". This type mismatch can lead to schema validation errors or unexpected behavior when the tool is used with default parameters. The default value should be an integer to match the specified type.

Suggested change

-                            default="5",
+                            default=5,

class MCPInputSchemaProperty(TypedDict, total=False):
    type: str
    description: str
    default: str

gemini-code-assist · 2026-01-02T11:19:41Z

codebase_rag/mcp/tools.py

+        except Exception as e:
+            logger.error(lg.MCP_ERROR_UPDATING.format(error=e))
+            return cs.MCP_UPDATE_ERROR.format(error=e)


Catching a broad Exception can obscure the underlying cause of an error and make debugging more difficult. It's better to catch more specific exceptions that you expect updater.run() to raise. This allows for more precise error logging and handling. If GraphUpdater can raise specific custom exceptions, they should be caught here.
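
A rough sketch of what narrower handling could look like. `GraphUpdateError`, `run_update`, and the message strings are hypothetical names for illustration; the exceptions that `updater.run()` actually raises are not shown in this PR.

```python
import logging

logger = logging.getLogger(__name__)


class GraphUpdateError(Exception):
    """Hypothetical domain error; the real GraphUpdater may raise something else."""


def run_update(run) -> str:
    """Call `run()` and convert only the expected failures into an error message."""
    try:
        run()
    except (GraphUpdateError, OSError) as e:
        # Expected failure modes are logged and reported to the caller.
        logger.error("Error updating repository: %s", e)
        return f"Error updating repository: {e}"
    # Anything else (for example a programming error) propagates unchanged.
    return "Repository updated"
```

The trade-off is that an unexpected exception now surfaces as a traceback instead of a formatted tool response, which may or may not be what an MCP client should see.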

gemini-code-assist · 2026-01-02T11:19:41Z

codebase_rag/mcp/tools.py

            return te.ERROR_WRAPPER.format(message=e)

+    async def semantic_search(self, natural_language_query: str, top_k: int = 5) -> str:
+        if self._semantic_search_tool is None:


You've introduced a _semantic_search_available flag in the constructor, which is a great way to track the availability of this optional feature. For consistency and clarity, it would be better to use this flag here instead of checking if _semantic_search_tool is None.

Suggested change

-        if self._semantic_search_tool is None:
+        if not self._semantic_search_available:
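
One consideration when weighing this suggestion: static type checkers such as mypy and pyright narrow an optional attribute when the attribute itself is compared with `None`, but they do not infer anything from a separate boolean flag. A minimal sketch, using a hypothetical `Registry` class rather than the real `MCPToolsRegistry`:

```python
from typing import Callable, Optional


class Registry:
    def __init__(self, tool: Optional[Callable[[str], str]]) -> None:
        self._semantic_search_tool = tool
        self._semantic_search_available = tool is not None

    def search(self, query: str) -> str:
        # Checking the attribute directly narrows Optional[...] to the callable,
        # so the call below type-checks. Guarding with the boolean flag instead
        # would leave the attribute typed as Optional at the call site.
        if self._semantic_search_tool is None:
            return "semantic search unavailable"
        return self._semantic_search_tool(query)
```

Both checks behave the same at runtime as long as the flag and the attribute are kept in sync; the difference only shows up in static analysis.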

gemini-code-assist · 2026-01-02T11:19:42Z

codebase_rag/mcp/tools.py

+        result = await self._semantic_search_tool.function(
+            query=natural_language_query, top_k=top_k
+        )
+        return str(result)


The _semantic_search_tool.function already returns a formatted string. The str() cast here is redundant and can be removed.

Suggested change

-        return str(result)
+        return result

greptile-apps · 2026-01-02T11:21:32Z

Greptile Summary

Added two new MCP tools to enhance repository management and code discovery capabilities.

Key Changes:

update_repository tool: Provides incremental repository updates without database wipe, complementing the existing index_repository tool which performs full rebuilds
semantic_search tool: Conditionally registered only when semantic dependencies (torch, transformers, qdrant_client) are installed, preventing tool exposure in environments lacking required packages
Centralized string management: Tool names, descriptions, parameters, log messages, and response strings consistently sourced from constants.py, logs.py, and tool_descriptions.py following project conventions
Formatting improvements: Multi-element tuples in constants.py split across lines for consistency with project style guidelines

Architecture:

The conditional registration pattern uses has_semantic_dependencies() at initialization to check for optional dependencies. When unavailable, the tool is not registered and a clear log message guides users to install via uv sync --extra semantic. The update_repository tool leverages existing GraphUpdater infrastructure but skips the destructive clean_database() call that index_repository performs.
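
The conditional registration described here could be sketched roughly as follows. The package names come from the summary above; the body of `has_semantic_dependencies()` and the `build_tool_names` helper are assumptions for illustration, not the project's actual code.

```python
from importlib.util import find_spec

# Package names as listed in the summary; the helper body is an assumption.
SEMANTIC_PACKAGES = ("torch", "transformers", "qdrant_client")


def has_semantic_dependencies(packages: tuple[str, ...] = SEMANTIC_PACKAGES) -> bool:
    """Return True only if every top-level package can be located."""
    return all(find_spec(name) is not None for name in packages)


def build_tool_names(semantic_available: bool) -> list[str]:
    """Register the always-on tools, then the optional one if possible."""
    names = ["index_repository", "update_repository"]
    if semantic_available:
        names.append("semantic_search")
    return names


tool_names = build_tool_names(has_semantic_dependencies())
```

Using `find_spec` avoids importing heavy packages such as torch just to test for their presence; the real helper may take a different approach.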

Minor Issue:

One type error in the top_k parameter default value (string "5" instead of integer 5).

Confidence Score: 4/5

Safe to merge after fixing the minor type error in default parameter value
The PR follows established project patterns for tool registration, centralized string management, and optional dependency handling. The implementation is well-structured with proper error handling and logging. One minor type issue exists (string default instead of integer for the top_k parameter) that should be corrected before merge. The changes are backward-compatible and don't modify existing tool behavior.
codebase_rag/mcp/tools.py requires correction to the top_k default value (line 243)

Important Files Changed

Filename	Overview
codebase_rag/constants.py	Formatting changes for consistency - splits multi-element tuples to multiple lines following project style guidelines
codebase_rag/logs.py	Adds log message templates for new MCP tools (semantic_search, update_repository)
codebase_rag/tools/tool_descriptions.py	Adds descriptions for new MCP tools and parameters, following centralized string management pattern
codebase_rag/mcp/tools.py	Adds optional semantic_search tool and update_repository tool with conditional registration; minor logic issue in dictionary access

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Registry as MCPToolsRegistry
    participant DepCheck as has_semantic_dependencies()
    participant SemanticTool as create_semantic_search_tool()
    participant Updater as GraphUpdater
    participant Ingestor as MemgraphIngestor
    participant VectorStore as Qdrant/Embeddings

    Note over Registry: __init__ - Tool Registration

    Registry->>DepCheck: Check if semantic deps installed
    alt Semantic dependencies available
        DepCheck-->>Registry: True
        Registry->>SemanticTool: Import and create tool
        SemanticTool-->>Registry: semantic_search_tool
        Note over Registry: Register SEMANTIC_SEARCH tool
    else Dependencies not available
        DepCheck-->>Registry: False
        Note over Registry: Log warning, skip registration
    end

    Note over Registry: Always register UPDATE_REPOSITORY

    Note over Client,VectorStore: Tool Execution Flow

    alt semantic_search (if available)
        Client->>Registry: semantic_search(query, top_k)
        Registry->>SemanticTool: Execute async function
        SemanticTool->>VectorStore: Embed query & search
        VectorStore-->>SemanticTool: Similar node IDs + scores
        SemanticTool->>Ingestor: Query nodes by IDs
        Ingestor-->>SemanticTool: Node metadata
        SemanticTool-->>Registry: Formatted results string
        Registry-->>Client: Search results
    else semantic_search (unavailable)
        Client->>Registry: semantic_search(query, top_k)
        Registry-->>Client: Error: Install semantic extras
    end

    alt update_repository (new)
        Client->>Registry: update_repository()
        Registry->>Updater: GraphUpdater.run()
        Note over Updater: NO database wipe
        Updater->>Ingestor: Incremental updates
        Ingestor-->>Updater: Success
        Updater-->>Registry: Complete
        Registry-->>Client: Update success message
    end

    alt index_repository (existing)
        Client->>Registry: index_repository()
        Registry->>Ingestor: clean_database()
        Note over Ingestor: FULL database wipe
        Ingestor-->>Registry: Cleared
        Registry->>Updater: GraphUpdater.run()
        Updater->>Ingestor: Full rebuild
        Ingestor-->>Updater: Success
        Updater-->>Registry: Complete
        Registry-->>Client: Index success message
    end
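
The difference between the last two flows in the diagram can be modeled with a toy in-memory store. `FakeIngestor` and both functions are hypothetical stand-ins; the real `GraphUpdater.run()` and `MemgraphIngestor` interfaces are not shown in this PR.

```python
class FakeIngestor:
    """In-memory stand-in for the graph database (hypothetical)."""

    def __init__(self) -> None:
        self.nodes: set[str] = set()

    def clean_database(self) -> None:
        self.nodes.clear()

    def ingest(self, names: list[str]) -> None:
        self.nodes.update(names)


def index_repository(ingestor: FakeIngestor, parsed: list[str]) -> None:
    # Full rebuild: wipe first, then ingest everything.
    ingestor.clean_database()
    ingestor.ingest(parsed)


def update_repository(ingestor: FakeIngestor, parsed: list[str]) -> None:
    # Incremental update: same ingest step, but no wipe beforehand.
    ingestor.ingest(parsed)


db = FakeIngestor()
index_repository(db, ["a", "b"])
update_repository(db, ["c"])
```

After these calls the store holds all three nodes; a later `index_repository` call would discard them and keep only what it ingests.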

greptile-apps

Additional Comments (1)

codebase_rag/mcp/tools.py, lines 240-244

syntax: default value should be integer not string

The default field expects the actual default value, not a string representation. Since top_k has type INTEGER, the default should be 5 (int) not "5" (string).

4 files reviewed, 1 comment


luxannaxul · 2026-01-07T07:16:48Z

@vitali87
Please review this PR.
I don't really understand why there should be a type error with the default value: my language server is fine with what I did and reports the opposite type error when I change the value as the AI suggested. The code seems to be OK, because the new semantic search tool works in the MCP server.

lux added 2 commits January 2, 2026 12:08

added update_repository and semantic search to mcp tools

fe6c02a

changed formatting

dc48e85

github-project-automation bot added this to @vitali87's graph code Jan 2, 2026

gemini-code-assist bot reviewed Jan 2, 2026


greptile-apps bot reviewed Jan 2, 2026
