⚡️ Speed up function `llm_qna_graph` by 70% #217

codeflash-ai · 2025-10-25T12:22:11Z

📄 70% (0.70x) speedup for `llm_qna_graph` in `backend/python/app/modules/agents/qna/graph.py`

⏱️ Runtime : 102 milliseconds → 60.1 milliseconds (best of 35 runs)

📝 Explanation and details

Optimizations Applied:

File: app/modules/agents/qna/nodes.py
- Final Response Node: Optimized chunk streaming: replaced string concatenations with a list to accumulate answer chunks, reducing memory copying overhead for large responses. Merged chunk writing directly with list extension for reduced overhead.
- Conditional Retrieval Node: Optimized deduplication loop by using a more efficient approach with dictionary/set lookups and list comprehensions.
- Tool Execution Node: Built tools_by_name using a dictionary comprehension for faster lookups.
- Prepare Clean Prompt Node: Optimized tool description collection using a list comprehension for better clarity and marginal speed improvement.
- General: Moved repeated imports out of frequently executed sections to the top of the file where possible.
- These optimizations maintain the original logic and code style, but should improve performance and reduce memory usage for large data or high-throughput scenarios.

No optimizations applied to app/modules/agents/qna/graph.py, as the provided line profiler data showed that the majority of time is spent in workflow compilation (workflow.compile()), which depends on external libraries.

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 22 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

🌀 Generated Regression Tests and Runtime

import asyncio
from typing import Any, Dict
from unittest.mock import MagicMock

# imports
import pytest
from app.modules.agents.qna.graph import llm_qna_graph

# --- Minimal stubs and mocks for dependencies ---

# Minimal ChatState type
class ChatState(dict):
    pass

# Minimal logger stub
class DummyLogger:
    def __init__(self):
        self.infos = []
        self.errors = []
        self.warnings = []
        self.debugs = []
    def info(self, msg): self.infos.append(msg)
    def error(self, msg, exc_info=None): self.errors.append((msg, exc_info))
    def warning(self, msg): self.warnings.append(msg)
    def debug(self, msg): self.debugs.append(msg)

# Minimal StreamWriter stub
class DummyWriter:
    def __init__(self):
        self.events = []
    def __call__(self, event):
        self.events.append(event)

# Minimal retrieval_service stub
class DummyRetrievalService:
    def __init__(self, results=None, status_code=200):
        self._results = results or []
        self._status_code = status_code
    async def search_with_filters(self, **kwargs):
        return {
            "searchResults": self._results,
            "status_code": self._status_code,
            "status": "ok",
            "message": "Success"
        }

# Minimal arango_service stub
class DummyArangoService:
    async def get_user_by_user_id(self, user_id):
        return {"fullName": "Test User", "designation": "QA"}
    async def get_document(self, org_id, collection):
        return {"name": "TestOrg", "accountType": "ENTERPRISE"}

# Minimal LLM stub
class DummyLLM:
    def __init__(self, response_content=None, tool_calls=None):
        self.response_content = response_content or "This is a test answer."
        self._tool_calls = tool_calls
        self._tools = []
    def bind_tools(self, tools):
        self._tools = tools
        return self
    async def ainvoke(self, messages):
        class DummyResponse:
            def __init__(self, content, tool_calls):
                self.content = content
                self.tool_calls = tool_calls
        return DummyResponse(self.response_content, self._tool_calls)
    def with_structured_output(self, schema):
        pass

class AIMessage:
    def __init__(self, content, tool_calls=None):
        self.content = content
        self.tool_calls = tool_calls or []

class HumanMessage:
    def __init__(self, content):
        self.content = content

class SystemMessage:
    def __init__(self, content):
        self.content = content
from app.modules.agents.qna.graph import llm_qna_graph

# --- Unit tests ---

@pytest.mark.asyncio
async def test_basic_graph_structure():
    # Test that llm_qna_graph returns the expected structure
    codeflash_output = llm_qna_graph(); graph = codeflash_output

@pytest.mark.asyncio















def test_should_continue_with_limit_execute_tools():
    # Should continue with execute_tools if pending and under limit
    state = ChatState({
        "pending_tool_calls": True,
        "all_tool_results": [{"tool_name": "tool1"}]*2,
        "logger": DummyLogger()
    })

def test_should_continue_with_limit_final():
    # Should return final if not pending or over limit
    state = ChatState({
        "pending_tool_calls": False,
        "all_tool_results": [{"tool_name": "tool1"}]*31,
        "logger": DummyLogger()
    })

def test_should_continue_with_limit_stuck_loop():
    # Should detect stuck loop of same tool
    state = ChatState({
        "pending_tool_calls": True,
        "all_tool_results": [{"tool_name": "tool1"}]*5,
        "logger": DummyLogger()
    })

# --- Edge Cases ---

@pytest.mark.asyncio






#------------------------------------------------
import asyncio
# Patch dependencies in the function under test
import sys
import types

# imports
import pytest
from app.modules.agents.qna.graph import llm_qna_graph

# --- Minimal stubs and mocks for dependencies ---

# Simulate langgraph StateGraph and END
class DummyStateGraph:
    def __init__(self, state_type):
        self.nodes = []
        self.edges = []
        self.cond_edges = []
        self.entry_point = None
        self.state_type = state_type

    def add_node(self, name, func):
        self.nodes.append((name, func))

    def set_entry_point(self, name):
        self.entry_point = name

    def add_edge(self, from_node, to_node):
        self.edges.append((from_node, to_node))

    def add_conditional_edges(self, from_node, func, mapping):
        self.cond_edges.append((from_node, func, mapping))

    def compile(self):
        # Returns self for test purposes
        return self

END = "END"

# Dummy ChatState (dict for simplicity)
class ChatState(dict):
    pass
from app.modules.agents.qna.graph import llm_qna_graph

# --- Unit tests for llm_qna_graph ---

# 1. BASIC TEST CASES



def test_graph_compile_returns_self():
    """
    Basic: The .compile() method should return the graph object itself (for test stub).
    """
    codeflash_output = llm_qna_graph(); graph = codeflash_output

# 2. EDGE TEST CASES

def test_graph_with_no_nodes():
    """
    Edge: If no nodes are added, the graph should have empty node list.
    """
    g = DummyStateGraph(ChatState)
    g.set_entry_point("analyze")
    compiled = g.compile()

def test_graph_with_missing_entry_point():
    """
    Edge: If entry point is not set, entry_point should be None.
    """
    g = DummyStateGraph(ChatState)
    compiled = g.compile()

def test_graph_conditional_edge_to_end():
    """
    Edge: Conditional edge mapping to END should be handled.
    """
    g = DummyStateGraph(ChatState)
    g.add_node("test", lambda s, w: s)
    g.add_conditional_edges("test", lambda s: "error", {"error": END})

def test_graph_duplicate_nodes():
    """
    Edge: Adding duplicate node names should result in both being present (since no deduplication).
    """
    g = DummyStateGraph(ChatState)
    g.add_node("foo", lambda s, w: s)
    g.add_node("foo", lambda s, w: s)

def test_graph_edge_cases_large_number_of_edges():
    """
    Edge: Graph handles a large number of edges (up to 1000).
    """
    g = DummyStateGraph(ChatState)
    for i in range(100):
        g.add_node(f"n{i}", lambda s, w: s)
    for i in range(99):
        g.add_edge(f"n{i}", f"n{i+1}")

def test_graph_conditional_edges_are_callable():
    """
    Edge: The function for conditional edges should be callable.
    """
    g = DummyStateGraph(ChatState)
    func = lambda s: "continue"
    g.add_conditional_edges("foo", func, {"continue": "bar"})

# 3. LARGE SCALE TEST CASES

def test_graph_large_scale_nodes_and_edges():
    """
    Large scale: Graph with 500 nodes and 499 edges.
    """
    g = DummyStateGraph(ChatState)
    for i in range(500):
        g.add_node(f"node_{i}", lambda s, w: s)
    for i in range(499):
        g.add_edge(f"node_{i}", f"node_{i+1}")
    compiled = g.compile()

def test_graph_large_scale_conditional_edges():
    """
    Large scale: Graph with 100 conditional edges.
    """
    g = DummyStateGraph(ChatState)
    for i in range(100):
        g.add_node(f"cond_{i}", lambda s, w: s)
        g.add_conditional_edges(f"cond_{i}", lambda s: "continue", {"continue": f"cond_{(i+1)%100}"})
    compiled = g.compile()

def test_graph_large_scale_entry_point():
    """
    Large scale: Entry point set on a large graph.
    """
    g = DummyStateGraph(ChatState)
    for i in range(100):
        g.add_node(f"n{i}", lambda s, w: s)
    g.set_entry_point("n0")
    compiled = g.compile()

def test_graph_large_scale_duplicate_nodes():
    """
    Large scale: Adding 1000 duplicate nodes.
    """
    g = DummyStateGraph(ChatState)
    for _ in range(1000):
        g.add_node("dup", lambda s, w: s)
    compiled = g.compile()

# 4. FUNCTIONALITY/BEHAVIORAL TESTS


def test_llm_qna_graph_edges_mutation_detection():
    """
    Mutation: If the edges are changed, the test should fail.
    """
    codeflash_output = llm_qna_graph(); graph = codeflash_output
    # These are required edges for the workflow
    required_edges = [
        ("get_user", "prepare_prompt"),
        ("execute_tools", "agent"),
        ("final", END)
    ]
    for edge in required_edges:
        pass


def test_graph_determinism():
    """
    Determinism: Multiple calls to llm_qna_graph return equivalent graphs.
    """
    codeflash_output = llm_qna_graph(); g1 = codeflash_output
    codeflash_output = llm_qna_graph(); g2 = codeflash_output

# 6. READABILITY/MAINTAINABILITY TEST


def test_graph_with_no_edges():
    """
    Edge: If no edges are added, edges list should be empty.
    """
    g = DummyStateGraph(ChatState)
    g.add_node("n1", lambda s, w: s)
    g.add_node("n2", lambda s, w: s)
    compiled = g.compile()

# 8. EDGE: Graph with only conditional edges

def test_graph_with_only_conditional_edges():
    """
    Edge: If only conditional edges are added, edges list should be empty but cond_edges not.
    """
    g = DummyStateGraph(ChatState)
    g.add_node("n1", lambda s, w: s)
    g.add_conditional_edges("n1", lambda s: "continue", {"continue": "n2"})
    compiled = g.compile()

# 9. EDGE: Graph with edge to itself

def test_graph_with_self_loop():
    """
    Edge: Graph supports an edge from a node to itself.
    """
    g = DummyStateGraph(ChatState)
    g.add_node("self", lambda s, w: s)
    g.add_edge("self", "self")
    compiled = g.compile()

# 10. EDGE: Graph with conditional edge to itself

def test_graph_with_conditional_self_loop():
    """
    Edge: Graph supports a conditional edge from a node to itself.
    """
    g = DummyStateGraph(ChatState)
    g.add_node("self", lambda s, w: s)
    g.add_conditional_edges("self", lambda s: "loop", {"loop": "self"})
    compiled = g.compile()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-llm_qna_graph-mh691zbs and push.

**Optimizations Applied:** - **File: app/modules/agents/qna/nodes.py** - **Final Response Node:** Optimized chunk streaming: replaced string concatenations with a list to accumulate answer chunks, reducing memory copying overhead for large responses. Merged chunk writing directly with list extension for reduced overhead. - **Conditional Retrieval Node:** Optimized deduplication loop by using a more efficient approach with dictionary/set lookups and list comprehensions. - **Tool Execution Node:** Built `tools_by_name` using a dictionary comprehension for faster lookups. - **Prepare Clean Prompt Node:** Optimized tool description collection using a list comprehension for better clarity and marginal speed improvement. - **General:** Moved repeated imports out of frequently executed sections to the top of the file where possible. - These optimizations maintain the original logic and code style, but should improve performance and reduce memory usage for large data or high-throughput scenarios. No optimizations applied to `app/modules/agents/qna/graph.py`, as the provided line profiler data showed that the majority of time is spent in workflow compilation (`workflow.compile()`), which depends on external libraries. ---

codeflash-ai bot requested a review from mashraf-222 October 25, 2025 12:22

codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 25, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

⚡️ Speed up function `llm_qna_graph` by 70% #217

⚡️ Speed up function `llm_qna_graph` by 70% #217

codeflash-ai bot commented Oct 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

⚡️ Speed up function llm_qna_graph by 70% #217

Are you sure you want to change the base?

⚡️ Speed up function llm_qna_graph by 70% #217

Conversation

codeflash-ai bot commented Oct 25, 2025

📄 70% (0.70x) speedup for llm_qna_graph in backend/python/app/modules/agents/qna/graph.py

📝 Explanation and details

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

⚡️ Speed up function `llm_qna_graph` by 70% #217

⚡️ Speed up function `llm_qna_graph` by 70% #217

📄 70% (0.70x) speedup for `llm_qna_graph` in `backend/python/app/modules/agents/qna/graph.py`