feat(agent): add CodeGraph Navigator agent for dependency intelligence in complex codebases (#QodoAgentChallenge) #47

anuj123upadhyay · 2025-10-17T13:32:14Z

User description

CodeGraph Navigator Agent for Complex Codebase Analysis

User Description / Summary

Introducing the CodeGraph Navigator, a Microservice Dependency Intelligence Agent designed for complex enterprise codebases.

This agent automates high-effort workflows across multiple repositories and services, providing developers with detailed insights into service dependencies, criticality, and potential impact of changes.

Description of Changes

Added the CodeGraph Navigator agent with comprehensive repository scanning and knowledge graph construction.
Features include:
- Multi-language repository support: Go, JavaScript/TypeScript, Python, Java, C#
- Knowledge graph construction across microservices
- Criticality assessment with risk levels
- Impact analysis to predict downstream effects before changes
- Natural language querying for dependency questions
- Visual risk matrix and detailed reporting
- Incremental learning as repositories evolve
Updated documentation with setup instructions, CLI examples, and usage scenarios.
Defined output schemas for analyze_codebase, impact_analysis, and criticality_matrix commands.
Added system requirements and installation instructions for Python, Node.js, and Qodo CLI tools.
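
The impact analysis listed above amounts to a reverse traversal of the knowledge graph. The following is a minimal sketch, assuming the graph uses the `nodes`/`edges` layout with `source` and `target` keys that `graph_builder.py` writes. The function name `transitive_dependents` and the sample service names are hypothetical and not part of this PR.

```python
from collections import deque


def transitive_dependents(graph, target):
    """Return every node that directly or indirectly depends on target."""
    # Index edges by target so each lookup answers "who imports this node?"
    reverse = {}
    for edge in graph.get("edges", []):
        reverse.setdefault(edge["target"], []).append(edge["source"])

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, []):
            if dependent not in seen and dependent != target:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


# Hypothetical sample data, for illustration only.
sample_graph = {
    "nodes": [
        {"id": "frontend", "type": "service"},
        {"id": "checkoutservice", "type": "service"},
        {"id": "cartservice", "type": "service"},
        {"id": "paymentservice", "type": "service"},
    ],
    "edges": [
        {"source": "checkoutservice", "target": "paymentservice", "type": "imports"},
        {"source": "frontend", "target": "checkoutservice", "type": "imports"},
        {"source": "frontend", "target": "cartservice", "type": "imports"},
    ],
}

print(transitive_dependents(sample_graph, "paymentservice"))
# → ['checkoutservice', 'frontend']
```

A breadth-first walk over reversed edges also picks up indirect dependents, which a direct-dependents lookup alone would miss.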

Why This Change Is Needed

Managing large-scale microservice architectures is error-prone and time-consuming.
This agent helps developers:
- Understand dependencies between services quickly
- Assess risks before implementing changes
- Plan deployments effectively and safely
- Automate repetitive analysis tasks across repositories

Testing Performed

Ran the agent locally on a sample multi-service repository.
Validated knowledge graph generation, dependency analysis, and criticality scoring.
Tested CLI commands for analyze_codebase, impact_analysis, and criticality_matrix.
Verified reports and output schema accuracy.

Additional Notes

Future improvements could include:
- Integration with CI/CD pipelines for automated impact assessments
- Visual dashboards for cross-repo dependency maps
- Enhanced language/framework support

PR Type

Enhancement

Description

- Introduces CodeGraph Navigator agent for microservice dependency intelligence
- Implements multi-language repository scanning (Go, JavaScript/TypeScript, Python, Java, C#)
- Builds knowledge graph with criticality assessment and impact analysis
- Provides natural language querying for dependency relationships
- Includes comprehensive documentation and CLI examples

Diagram Walkthrough

flowchart LR
  A["Repository Scanner<br/>scanner.py"] -->|JSON output| B["Knowledge Graph Builder<br/>graph_builder.py"]
  B -->|Updates| C["Knowledge Graph<br/>knowledge_graph.json"]
  C -->|Queries| D["Query Engine<br/>query_engine.py"]
  D -->|Analyzes| E["Criticality Assessment<br/>Impact Analysis"]
  E -->|Generates| F["Reports & Recommendations"]
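
The scanner stage at the start of this flow can be illustrated with the gRPC client pattern that `scanner.py` matches in Go sources (the regex is quoted in the compliance review of this PR). This is a rough sketch only: the helper name `extract_service_imports`, the de-duplication step, and the sample Go snippet are assumptions added for illustration, not code from the PR.

```python
import re

# Pattern quoted from scanner.py: matches calls such as pb.NewPaymentServiceClient(conn)
CLIENT_PATTERN = re.compile(r'pb\.New(\w+?)ServiceClient')


def extract_service_imports(go_source):
    """Return service names referenced through generated gRPC clients."""
    names = []
    for match in CLIENT_PATTERN.findall(go_source):
        name = match.lower() + "service"  # "Payment" -> "paymentservice"
        if name not in names:  # de-duplication is an assumption, not in the PR
            names.append(name)
    return names


# Hypothetical Go snippet, for illustration only.
sample = '''
conn, _ := grpc.Dial(addr)
payment := pb.NewPaymentServiceClient(conn)
cart := pb.NewCartServiceClient(conn)
again := pb.NewPaymentServiceClient(conn)
'''

print(extract_service_imports(sample))
# → ['paymentservice', 'cartservice']
```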

File Walkthrough

Relevant files

Enhancement

3 files

scanner.py `Repository scanner for multi-language dependency extraction`	+66/-0
graph_builder.py `Knowledge graph construction and node/edge management`	+63/-0
query_engine.py `Natural language query processor for dependency analysis`	+56/-0

Configuration changes

4 files

agent.toml `Agent configuration with comprehensive command definitions`	+1058/-0
knowledge_graph.json `Initial knowledge graph with sample microservices data`	+333/-0
graph-builder.tool.json `Tool configuration for graph builder integration`	+21/-0
query-engine.tool.json `Tool configuration for query engine integration`	+20/-0

Documentation

2 files

README.md `Complete documentation with usage examples and architecture`	+461/-0
analysis_report_2025-10-17T00-00-00Z.txt `Sample analysis report demonstrating output format`	+75/-0

qodo-merge-for-open-source · 2025-10-17T13:32:52Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
⚪	Concurrent file write
Description: The code opens and truncates the global knowledge graph file at 'data/knowledge_graph.json' without any file locking or concurrency control, which can corrupt the graph if multiple processes update it concurrently.
graph_builder.py [13-51]
Referred Code
    with open(GRAPH_PATH, 'r+') as f:
        graph = json.load(f)
        nodes = graph.get('nodes', [])
        edges = graph.get('edges', [])

        # --- Node Management ---
        repo_name = scan_results['name']
        node_exists = any(node['id'] == repo_name for node in nodes)
        if not node_exists:
            nodes.append({
                "id": repo_name,
                "type": "service",
                "language": scan_results['language']
            })

        # --- Edge Management ---
        existing_edges = {(edge['source'], edge['target']) for edge in edges}
        for imp in scan_results.get('imports', []):
            target_node_exists = any(node['id'] == imp for node in nodes)
     ... (clipped 18 lines)
	Unvalidated input usage
Description: The query engine trusts the contents of 'data/knowledge_graph.json' without validation and prints interpolated values, enabling potential log/message injection if the graph is attacker-controlled.
query_engine.py [24-26]
Referred Code
    with open(GRAPH_PATH, 'r') as f:
        graph = json.load(f)
	Unrestricted path scan
Description: The repository scanner reads arbitrary files under a provided path and prints JSON to stdout without path allowlisting or sandboxing, which can be risky if paths are user-supplied; at minimum, path validation or restrictions should be added.
scanner.py [31-45]
Referred Code
    for file in files:
        if file.endswith('.go'):
            file_path = os.path.join(root, file)
            try:
                with open(file_path, 'r', errors='ignore') as f:
                    content = f.read()
                    # This regex finds patterns like pb.NewPaymentServiceClient(conn)
                    matches = re.findall(r'pb\.New(\w+?)ServiceClient', content)
                    for match in matches:
                        # Convert "Payment" to "paymentservice"
                        # THIS IS THE CORRECTED LINE:
                        service_name = match.lower() + "service"
                        results["imports"].append(service_name)
            except Exception:
                continue
Ticket Compliance
⚪	🎫 No ticket provided
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
⚪	No custom compliance provided Follow the guide to enable custom compliance check.

Compliance status legend

🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
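
One possible way to address the "Concurrent file write" finding above is to serialize writers with an advisory lock and replace the file atomically. The sketch below assumes a POSIX system (`fcntl` is not available on Windows). The helper name `locked_graph` and the sidecar `.lock` file are hypothetical and do not exist in this PR.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_graph(graph_path):
    """Yield the graph dict under an exclusive lock, then save it atomically."""
    directory = os.path.dirname(os.path.abspath(graph_path))
    os.makedirs(directory, exist_ok=True)
    lock_path = graph_path + ".lock"  # hypothetical sidecar lock file
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            try:
                with open(graph_path, "r") as f:
                    graph = json.load(f)
            except FileNotFoundError:
                graph = {"nodes": [], "edges": []}
            yield graph
            # Write to a temp file in the same directory, then swap it in,
            # so readers never observe a half-written graph.
            fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
            with os.fdopen(fd, "w") as tmp:
                json.dump(graph, tmp, indent=2)
            os.replace(tmp_path, graph_path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Usage in a throwaway directory (paths are illustrative only).
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "knowledge_graph.json")
    with locked_graph(path) as graph:
        graph["nodes"].append({"id": "paymentservice", "type": "service"})
    with open(path) as f:
        print(json.load(f)["nodes"])
```

If the body of the `with` block raises, the write step is skipped and the existing file is left untouched.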

qodo-merge-for-open-source · 2025-10-17T13:35:01Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
High-level	Implement core logic in tools	High
The agent's core analysis logic, like criticality calculation and report generation, is defined as instructions for an LLM in `agent.toml` instead of being implemented in the Python tool scripts. This logic should be moved into the Python tools for robustness and efficiency.
Examples:
agents/codegraph-navigator-agent/agent.toml [10-87]
    instructions = """
    You are CodeGraph Navigator, an intelligent software analysis agent designed to understand complex microservice architectures.

    Your mission is to help developers understand dependency relationships across multiple repositories by:
    1. Scanning repositories to discover their programming language and dependencies
    2. Building a central knowledge graph that maps all service relationships
    3. Analyzing criticality scores to assess risk of changes
    4. Answering natural language questions about the codebase architecture
    5. Generating comprehensive reports with impact analysis
     ... (clipped 68 lines)
agents/codegraph-navigator-agent/tools/query_engine.py [6-21]
    def find_dependents(target_service, graph):
        """Finds all services that depend on the target_service."""
        dependents = []
        for edge in graph['edges']:
            if edge['target'] == target_service:
                dependents.append(edge['source'])
        return dependents

    def find_dependencies(target_service, graph):
        """Finds all libraries/services that the target_service depends on."""
     ... (clipped 6 lines)
Solution Walkthrough:
Before:
    # agent.toml (instructions for LLM)
    """
    STEP 1: REPOSITORY SCANNING
    - Execute: python3 tools/scanner.py <repository_path>
    STEP 2: KNOWLEDGE GRAPH UPDATE
    - Execute: python3 tools/graph_builder.py and pipe the JSON to it
    STEP 3: CRITICALITY ANALYSIS
    - Calculate dependent count (how many services depend on this service)
    - Assign criticality level based on the CRITICALITY SCALE...
    STEP 4: REPORT GENERATION
    - Create: data/reports/analysis_report_<timestamp>.txt
    - Include: Executive summary, Services analyzed, etc.
    """

    # tools/query_engine.py
    def find_dependents(target_service, graph):
        # ... simple logic to find direct dependents ...
        return dependents
After:
    # agent.toml (simplified instructions for LLM)
    """
    When asked to analyze a repository, run the analysis tool.
    - Execute: python3 tools/analyzer.py analyze --repository_path <path>
    - Display the summary and report path from the tool's output.
    """

    # tools/analyzer.py (new or enhanced tool)
    class Analyzer:
        def analyze_repo(self, path):
            scan_results = self.scanner.scan(path)
            self.graph_builder.update(scan_results)
            # ...

        def calculate_criticality(self, service_name):
            # Implements logic to count dependents and assign risk level
            # ...
            return criticality_report

        def generate_report(self, analysis_data):
            # Implements logic to format and write the full report file
            # ...
            return report_path
Suggestion importance[1-10]: 10
Why: This suggestion correctly identifies a critical architectural flaw where core analysis logic is defined in prompts within `agent.toml` instead of being implemented in the Python tools, making the agent fragile, inefficient, and hard to maintain.
Possible issue	Validate repository path exists	Medium
Add validation to `scan_repository` to ensure the provided `repo_path` exists and is a directory before the scan begins, preventing errors with invalid paths.
agents/codegraph-navigator-agent/tools/scanner.py [8-12]
     def scan_repository(repo_path: str):
         """Scans a repository to find dependencies and other info."""
    +
    +    if not os.path.exists(repo_path):
    +        print(json.dumps({"error": f"Path does not exist: {repo_path}"}), file=sys.stderr)
    +        sys.exit(1)
    +
    +    if not os.path.isdir(repo_path):
    +        print(json.dumps({"error": f"Path is not a directory: {repo_path}"}), file=sys.stderr)
    +        sys.exit(1)

         clean_path = os.path.normpath(repo_path)
         repo_name = os.path.basename(clean_path)
Suggestion importance[1-10]: 8
Why: The suggestion adds crucial input validation that is missing from the script but explicitly required by the agent's instructions in `agent.toml`, making the tool more robust and user-friendly.
	Handle missing knowledge graph file	Medium
To prevent a `FileNotFoundError`, add a check to create `data/knowledge_graph.json` with a default empty structure if it does not already exist.
agents/codegraph-navigator-agent/tools/graph_builder.py [13-51]
    +import os
    +
    +# Ensure the file exists with a valid empty graph
    +if not os.path.exists(GRAPH_PATH):
    +    os.makedirs(os.path.dirname(GRAPH_PATH), exist_ok=True)
    +    with open(GRAPH_PATH, 'w') as f:
    +        json.dump({"nodes": [], "edges": []}, f, indent=2)
    +
     with open(GRAPH_PATH, 'r+') as f:
         graph = json.load(f)
         nodes = graph.get('nodes', [])
         edges = graph.get('edges', [])

         # ... node and edge updates ...

         # Write the updated graph back to the file
         f.seek(0)
         f.truncate()
         json.dump(graph, f, indent=2)
`[To ensure code accuracy, apply this suggestion manually]`
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies that the script will crash if `knowledge_graph.json` is missing and provides a robust solution to prevent this by creating the file if needed.
	Add error handling for missing file	Medium
Add error handling to `query_graph` to manage cases where `knowledge_graph.json` is missing or contains invalid JSON, preventing the script from crashing.
agents/codegraph-navigator-agent/tools/query_engine.py [22-25]
     def query_graph(query):
         """Processes a natural language query against the knowledge graph."""
    -    with open(GRAPH_PATH, 'r') as f:
    -        graph = json.load(f)
    +    if not os.path.exists(GRAPH_PATH):
    +        print(f"Error: Knowledge graph not found at {GRAPH_PATH}. Please run a scan first.")
    +        return
    +
    +    try:
    +        with open(GRAPH_PATH, 'r') as f:
    +            graph = json.load(f)
    +    except json.JSONDecodeError:
    +        print(f"Error: Knowledge graph file is corrupted. Please rebuild the graph.")
    +        return
Suggestion importance[1-10]: 7
Why: The suggestion correctly points out that the script will crash if the graph file is missing or corrupted and proposes adding necessary error handling for both scenarios.
	Handle invalid JSON input	Medium
Add a `try-except` block around `json.loads` in `update_graph` to gracefully handle potential `JSONDecodeError` from invalid input.
agents/codegraph-navigator-agent/tools/graph_builder.py [9-11]
     def update_graph(scan_results_json: str):
         """Updates the knowledge graph with new scan results."""
    -    scan_results = json.loads(scan_results_json)
    +    try:
    +        scan_results = json.loads(scan_results_json)
    +    except json.JSONDecodeError as e:
    +        print(f"Error: Invalid JSON input - {e}", file=sys.stderr)
    +        sys.exit(1)
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies that the script will crash on malformed JSON input and proposes adding a `try-except` block, which is a standard and necessary improvement for robustness.
General	Correctly categorize services vs libraries	Medium
Fix the node creation logic to correctly differentiate between "service" and "library" types for new dependencies, ensuring accurate categorization in the knowledge graph.
agents/codegraph-navigator-agent/tools/graph_builder.py [32-42]
     for imp in scan_results.get('imports', []):
         target_node_exists = any(node['id'] == imp for node in nodes)
         if not target_node_exists:
    -        nodes.append({"id": imp, "type": "library"})
    +        # Check if this is a service (ends with 'service') or a library
    +        node_type = "service" if imp.endswith('service') else "library"
    +        nodes.append({"id": imp, "type": node_type})

         if (repo_name, imp) not in existing_edges:
             edges.append({
                 "source": repo_name,
                 "target": imp,
                 "type": "imports"
             })
Suggestion importance[1-10]: 8
Why: This suggestion fixes a logic bug where all new dependencies are incorrectly typed as "library", leading to an inaccurate knowledge graph. The fix correctly categorizes nodes, which is critical for the agent's functionality.
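
To make the high-level suggestion concrete, the criticality step could live in a Python tool roughly as follows. This is a sketch under stated assumptions: the thresholds and level names below are hypothetical placeholders, since the actual CRITICALITY SCALE in `agent.toml` is clipped in this review, and the function names are illustrative rather than taken from the PR.

```python
def count_dependents(graph, service):
    """Count distinct services with an edge pointing at the given service."""
    return len({
        edge["source"]
        for edge in graph.get("edges", [])
        if edge["target"] == service
    })


# Hypothetical thresholds and labels; replace with the scale defined in agent.toml.
LEVELS = [(5, "CRITICAL"), (3, "HIGH"), (1, "MEDIUM"), (0, "LOW")]


def criticality(graph, service):
    """Map a service's dependent count onto the first level whose minimum it meets."""
    dependents = count_dependents(graph, service)
    for minimum, level in LEVELS:
        if dependents >= minimum:
            return {"service": service, "dependents": dependents, "level": level}


# Hypothetical sample data, for illustration only.
demo_graph = {
    "edges": [
        {"source": "frontend", "target": "paymentservice", "type": "imports"},
        {"source": "checkoutservice", "target": "paymentservice", "type": "imports"},
        {"source": "billing", "target": "paymentservice", "type": "imports"},
    ]
}

print(criticality(demo_graph, "paymentservice"))
# → {'service': 'paymentservice', 'dependents': 3, 'level': 'HIGH'}
```

Keeping the scale in one table makes the scoring deterministic and testable, which is the point of moving it out of the prompt.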

feat(agent): add CodeGraph Navigator agent for dependency intelligenc…

9555e5e

…e in complex codebases (#QodoAgentChallenge)

qodo-merge-for-open-source bot added the Review effort 3/5 label Oct 17, 2025
