Have you ever wished you could customize your AI coding assistant to work exactly the way you want? What if you could build your own version of Cursor—an AI-powered code editor—using Cursor itself? That's exactly what we're doing in this tutorial: creating a customizable, open-source AI coding agent that operates right within Cursor.
In this step-by-step guide, we'll dive deep into the code to show you how to build a powerful AI assistant that can:
- Navigate and understand codebases
- Implement code changes based on natural language instructions
- Make intelligent decisions about which files to inspect or modify
- Learn from its own history of operations
Let's dive in!
- Understanding the Architecture
- Setting Up Your Environment
- The Core: Building with Pocket Flow
- Implementing Decision Making
- File Operations: Reading and Writing Code
- Code Analysis and Planning
- Applying Code Changes
- Running Your Agent
- Advanced: Customizing Your Agent
- Conclusion and Next Steps
Before we write a single line of code, let's understand the architecture of our Cursor Agent. The system is built on a flow-based architecture using Pocket Flow, a minimalist 100-line LLM framework that enables agentic development.
Here's a high-level overview of our architecture:
flowchart TD
A[MainDecisionAgent] -->|read_file| B[ReadFileAction]
A -->|grep_search| C[GrepSearchAction]
A -->|list_dir| D[ListDirAction]
A -->|edit_file| E[EditFileNode]
A -->|delete_file| F[DeleteFileAction]
A -->|finish| G[FormatResponseNode]
E --> H[AnalyzeAndPlanNode]
H --> I[ApplyChangesNode]
I --> A
This architecture separates concerns into distinct nodes:
- Decision making (what operation to perform next)
- File operations (reading, writing, and searching)
- Code analysis (understanding and planning changes)
- Code modification (safely applying changes)
Let's get our environment ready:
# Clone the repository
git clone https://github.com/The-Pocket/Tutorial-Cursor
cd Tutorial-Cursor
# Install dependencies
pip install -r requirements.txt
Our agent is built on the Pocket Flow framework, which provides three core abstractions:
- Nodes: Individual units of computation that perform specific tasks
- Flows: Directed graphs of nodes that define the program's execution path
- Shared Store: A dictionary that all nodes can access to share data
Let's look at the core imports and setup:
# flow.py
from pocketflow import Node, Flow, BatchNode
import os
import yaml
import logging
from datetime import datetime
from typing import List, Dict, Any, Tuple
# Import utility functions
from utils.call_llm import call_llm
from utils.read_file import read_file
from utils.delete_file import delete_file
from utils.replace_file import replace_file
from utils.search_ops import grep_search
from utils.dir_ops import list_dir
This imports the core classes from Pocket Flow and our custom utility functions that handle file operations and LLM calls.
At the heart of our agent is the MainDecisionAgent
, which determines what action to take based on the user's request and the current state of the system.
Here's how it's implemented:
class MainDecisionAgent(Node):
def prep(self, shared: Dict[str, Any]) -> Tuple[str, List[Dict[str, Any]]]:
# Get user query and history
user_query = shared.get("user_query", "")
history = shared.get("history", [])
return user_query, history
def exec(self, inputs: Tuple[str, List[Dict[str, Any]]]) -> Dict[str, Any]:
user_query, history = inputs
# Format history for context
history_str = format_history_summary(history)
# Create prompt for the LLM
prompt = f"""You are a coding assistant that helps modify and navigate code. Given the following request,
decide which tool to use from the available options.
User request: {user_query}
Here are the actions you performed:
{history_str}
Available tools:
1. read_file: Read content from a file
- Parameters: target_file (path)
2. edit_file: Make changes to a file
- Parameters: target_file (path), instructions, code_edit
[... more tool descriptions ...]
Respond with a YAML object containing:
```yaml
tool: one of: read_file, edit_file, delete_file, grep_search, list_dir, finish
reason: |
detailed explanation of why you chose this tool and what you intend to do
params:
# parameters specific to the chosen tool
```"""
# Call LLM to decide action
response = call_llm(prompt)
# Parse YAML response
yaml_content = extract_yaml_from_response(response)
decision = yaml.safe_load(yaml_content)
# Validate the required fields
assert "tool" in decision, "Tool name is missing"
assert "reason" in decision, "Reason is missing"
return decision
def post(self, shared: Dict[str, Any], prep_res: Any, exec_res: Dict[str, Any]) -> str:
# Add the decision to history
shared.setdefault("history", []).append({
"tool": exec_res["tool"],
"reason": exec_res["reason"],
"params": exec_res.get("params", {}),
"timestamp": datetime.now().isoformat()
})
# Return the name of the tool to determine which node to execute next
return exec_res["tool"]
This node:
- Gathers the user's query and the history of previous actions
- Formats a prompt for the LLM with all available tools
- Calls the LLM to decide what action to take
- Parses the response and validates it
- Adds the decision to the history
- Returns the name of the selected tool, which determines the next node to execute
Let's look at how our agent reads files, which is a fundamental operation:
class ReadFileAction(Node):
def prep(self, shared: Dict[str, Any]) -> str:
# Get parameters from the last history entry
history = shared.get("history", [])
last_action = history[-1]
file_path = last_action["params"].get("target_file")
# Ensure path is relative to working directory
working_dir = shared.get("working_dir", "")
full_path = os.path.join(working_dir, file_path) if working_dir else file_path
return full_path
def exec(self, file_path: str) -> Tuple[str, bool]:
# Call read_file utility which returns a tuple of (content, success)
return read_file(file_path)
def post(self, shared: Dict[str, Any], prep_res: str, exec_res: Tuple[str, bool]) -> str:
# Unpack the tuple returned by read_file()
content, success = exec_res
# Update the result in the last history entry
history = shared.get("history", [])
if history:
history[-1]["result"] = {
"success": success,
"content": content
}
return "decision" # Go back to the decision node
The read_file
utility function itself is implemented like this:
def read_file(target_file: str) -> Tuple[str, bool]:
"""
Read content from a file with support for line ranges.
Prepends 1-based line numbers to each line in the output.
Returns:
Tuple of (file content with line numbers, success status)
"""
try:
if not os.path.exists(target_file):
return f"Error: File {target_file} does not exist", False
with open(target_file, 'r', encoding='utf-8') as f:
lines = f.readlines()
# Add line numbers to each line
numbered_lines = [f"{i+1}: {line}" for i, line in enumerate(lines)]
return ''.join(numbered_lines), True
except Exception as e:
return f"Error reading file: {str(e)}", False
This provides a clean, line-numbered view of the file content that makes it easier for the LLM to reference specific lines in its analysis.
When the agent needs to modify code, it first analyzes the code and plans the changes using AnalyzeAndPlanNode
:
class AnalyzeAndPlanNode(Node):
def prep(self, shared: Dict[str, Any]) -> Dict[str, Any]:
# Get history
history = shared.get("history", [])
last_action = history[-1]
# Get file content and edit instructions
file_content = last_action.get("file_content")
instructions = last_action["params"].get("instructions")
code_edit = last_action["params"].get("code_edit")
return {
"file_content": file_content,
"instructions": instructions,
"code_edit": code_edit
}
def exec(self, params: Dict[str, Any]) -> List[Dict[str, Any]]:
file_content = params["file_content"]
instructions = params["instructions"]
code_edit = params["code_edit"]
# Generate a prompt for the LLM to analyze the edit
prompt = f"""
As a code editing assistant, I need to convert the following code edit instruction
and code edit pattern into specific edit operations (start_line, end_line, replacement).
FILE CONTENT:
{file_content}
EDIT INSTRUCTIONS:
{instructions}
CODE EDIT PATTERN (markers like "// ... existing code ..." indicate unchanged code):
{code_edit}
Analyze the file content and the edit pattern to determine exactly where changes should be made.
Return a YAML object with your reasoning and an array of edit operations:
```yaml
reasoning: |
Explain your thinking process about how you're interpreting the edit pattern.
operations:
- start_line: 10
end_line: 15
replacement: |
# New code here
```"""
# Call LLM to analyze the edit
response = call_llm(prompt)
# Parse the response and extract edit operations
yaml_content = extract_yaml_from_response(response)
result = yaml.safe_load(yaml_content)
# Store reasoning in shared memory
shared["edit_reasoning"] = result.get("reasoning", "")
# Return the operations
return result.get("operations", [])
This node:
- Extracts the file content, instructions, and code edit pattern from the history
- Creates a prompt for the LLM to analyze the edit
- Calls the LLM to determine the exact line numbers and replacement text
- Parses the response to extract the edit operations
- Stores the reasoning in shared memory
- Returns the operations as a list of dictionaries
Once the agent has planned the changes, it applies them using ApplyChangesNode
:
class ApplyChangesNode(BatchNode):
def prep(self, shared: Dict[str, Any]) -> List[Dict[str, Any]]:
# Get edit operations
edit_operations = shared.get("edit_operations", [])
# Sort edit operations in descending order by start_line
# This ensures that line numbers remain valid as we edit from bottom to top
sorted_ops = sorted(edit_operations, key=lambda op: op["start_line"], reverse=True)
# Get target file from history
history = shared.get("history", [])
last_action = history[-1]
target_file = last_action["params"].get("target_file")
# Ensure path is relative to working directory
working_dir = shared.get("working_dir", "")
full_path = os.path.join(working_dir, target_file) if working_dir else target_file
# Attach file path to each operation
for op in sorted_ops:
op["target_file"] = full_path
return sorted_ops
def exec(self, op: Dict[str, Any]) -> Tuple[bool, str]:
# Call replace_file utility to replace content
return replace_file(
target_file=op["target_file"],
start_line=op["start_line"],
end_line=op["end_line"],
content=op["replacement"]
)
def post(self, shared: Dict[str, Any], prep_res: List[Dict[str, Any]], exec_res_list: List[Tuple[bool, str]]) -> str:
# Check if all operations were successful
all_successful = all(success for success, _ in exec_res_list)
# Update edit result in history
history = shared.get("history", [])
if history:
history[-1]["result"] = {
"success": all_successful,
"operations": len(exec_res_list),
"details": [{"success": s, "message": m} for s, m in exec_res_list],
"reasoning": shared.get("edit_reasoning", "")
}
return "decision" # Go back to the decision node
This node is a BatchNode
, which allows it to process multiple operations in a single run. It:
- Gets the edit operations from shared memory
- Sorts them in descending order by start line to ensure edits remain valid
- Attaches the target file path to each operation
- Executes each operation using the
replace_file
utility - Updates the history with the results
- Returns to the decision node
The replace_file
utility works by combining remove_file
and insert_file
:
def replace_file(target_file: str, start_line: int, end_line: int, content: str) -> Tuple[str, bool]:
try:
# First, remove the specified lines
remove_result, remove_success = remove_file(target_file, start_line, end_line)
if not remove_success:
return f"Error during remove step: {remove_result}", False
# Then, insert the new content at the start line
insert_result, insert_success = insert_file(target_file, content, start_line)
if not insert_success:
return f"Error during insert step: {insert_result}", False
return f"Successfully replaced lines {start_line} to {end_line}", True
except Exception as e:
return f"Error replacing content: {str(e)}", False
Now that we've implemented all the key components, let's put it all together in our main.py
:
import os
import argparse
import logging
from flow import coding_agent_flow
def main():
# Parse command-line arguments
parser = argparse.ArgumentParser(description='Coding Agent - AI-powered coding assistant')
parser.add_argument('--query', '-q', type=str, help='User query to process', required=False)
parser.add_argument('--working-dir', '-d', type=str, default=os.path.join(os.getcwd(), "project"),
help='Working directory for file operations')
args = parser.parse_args()
# If no query provided via command line, ask for it
user_query = args.query
if not user_query:
user_query = input("What would you like me to help you with? ")
# Initialize shared memory
shared = {
"user_query": user_query,
"working_dir": args.working_dir,
"history": [],
"response": None
}
# Run the flow
coding_agent_flow.run(shared)
if __name__ == "__main__":
main()
And finally, let's create the flow in flow.py
:
# Define the nodes
main_decision = MainDecisionAgent()
read_file_action = ReadFileAction()
grep_search_action = GrepSearchAction()
list_dir_action = ListDirAction()
delete_file_action = DeleteFileAction()
edit_file_node = EditFileNode()
analyze_plan_node = AnalyzeAndPlanNode()
apply_changes_node = ApplyChangesNode()
format_response_node = FormatResponseNode()
# Connect the nodes
main_decision - "read_file" >> read_file_action
main_decision - "grep_search" >> grep_search_action
main_decision - "list_dir" >> list_dir_action
main_decision - "delete_file" >> delete_file_action
main_decision - "edit_file" >> edit_file_node
main_decision - "finish" >> format_response_node
# Connect action nodes back to main decision
read_file_action - "decision" >> main_decision
grep_search_action - "decision" >> main_decision
list_dir_action - "decision" >> main_decision
delete_file_action - "decision" >> main_decision
# Connect edit flow
edit_file_node - "analyze" >> analyze_plan_node
analyze_plan_node - "apply" >> apply_changes_node
apply_changes_node - "decision" >> main_decision
# Create the flow
coding_agent_flow = Flow(start=main_decision)
Now you can run your agent with:
python main.py --query "List all Python files" --working-dir ./project
One of the most powerful aspects of this architecture is how easy it is to customize. Let's explore a few ways you can extend this agent:
To add a new tool, simply:
- Create a new action node class
- Add it to the
MainDecisionAgent
's prompt - Connect it to the flow
For example, to add a "run_tests" tool:
class RunTestsAction(Node):
def prep(self, shared):
# Get test directory from parameters
history = shared.get("history", [])
last_action = history[-1]
test_dir = last_action["params"].get("test_dir")
return test_dir
def exec(self, test_dir):
# Run tests and capture output
import subprocess
result = subprocess.run(
["pytest", test_dir],
capture_output=True,
text=True
)
return result.stdout, result.returncode == 0
def post(self, shared, prep_res, exec_res):
# Update history with test results
output, success = exec_res
history = shared.get("history", [])
if history:
history[-1]["result"] = {
"success": success,
"output": output
}
return "decision"
# Then add to your flow:
run_tests_action = RunTestsAction()
main_decision - "run_tests" >> run_tests_action
run_tests_action - "decision" >> main_decision
You can enhance the code analysis capabilities by modifying the prompts in AnalyzeAndPlanNode
:
# Add language-specific hints
language_hints = {
".py": "This is Python code. Look for function and class definitions.",
".js": "This is JavaScript code. Look for function declarations and exports.",
# Add more languages as needed
}
# Update the prompt with language-specific hints
file_ext = os.path.splitext(target_file)[1]
language_hint = language_hints.get(file_ext, "")
prompt += f"\n\nLANGUAGE HINT: {language_hint}"
To give your agent more context, you could add a vector database to store and retrieve relevant information:
class VectorDBNode(Node):
def prep(self, shared):
# Get text to store
history = shared.get("history", [])
context = ""
for action in history:
if action["tool"] == "read_file" and action.get("result", {}).get("success", False):
content = action["result"]["content"]
context += f"File: {action['params']['target_file']}\n{content}\n\n"
return context
def exec(self, context):
# Store in vector DB
embeddings = OpenAIEmbeddings()
vectordb = Chroma.from_texts(
texts=[context],
embedding=embeddings,
persist_directory="./db"
)
return vectordb
def post(self, shared, prep_res, exec_res):
shared["vectordb"] = exec_res
return "decision"
Congratulations! You've built a customizable AI coding agent that can help you navigate and modify code based on natural language instructions. This agent demonstrates the power of agentic development, where AI systems help build better AI systems.
The possibilities for extending this agent are endless:
- Add support for more programming languages
- Implement code refactoring capabilities
- Create specialized tools for specific frameworks
- Add security checks before making changes
- Implement static analysis to catch potential bugs
As LLM capabilities continue to improve, agents like this will become even more powerful tools in a developer's arsenal.
Want to learn more? Subscribe to our YouTube channel for a step-by-step video tutorial on building and extending this agent.
Happy coding!