Skip to content
This repository was archived by the owner on Apr 12, 2026. It is now read-only.

Extract valid nested JSON in tool args#125

Open
sfc-gh-jsummer wants to merge 1 commit into
tuannvm:mainfrom
sfc-gh-jsummer:pass-nested-tool-args
Open

Extract valid nested JSON in tool args#125
sfc-gh-jsummer wants to merge 1 commit into
tuannvm:mainfrom
sfc-gh-jsummer:pass-nested-tool-args

Conversation

@sfc-gh-jsummer
Copy link
Copy Markdown

@sfc-gh-jsummer sfc-gh-jsummer commented Sep 10, 2025

Tool arguments that contain nested structures create malformed JSON and falls back to flattening the arguments. This causes a tool error. For example,

{
  "tool": "snowflake_create_object",
  "args": {
    "object_type": "database",
    "target_object": {
      "name": "abcxyz"
    }
  }
}

becomes

{
    "object_type": "database",
    "target_object": {
      "name": "abcxyz"

The nested tool arguments is common if fastMCP tools use Pydantic schema interpretation which my MCP server does.

Summary by CodeRabbit

  • Bug Fixes

    • Improved reliability of tool-call detection in responses, correctly handling nested and complex JSON structures to reduce parsing errors and missed calls.
  • Refactor

    • Updated parsing flow to attempt a more robust extraction method before regex, with the previous approach retained as a fallback. No changes to the public API.
  • Chores

    • Enhanced logging around tool-call extraction steps and outcomes to aid troubleshooting.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Sep 10, 2025

Walkthrough

Introduces a balanced-brace JSON extraction path in LLMMCPBridge. tryRegexJSONExtraction now first attempts tryBalancedBraceExtraction to detect nested JSON tool calls, validating and returning on success; otherwise, it falls back to the existing regex-based extraction. Logging added for balanced parsing steps. No exported API signatures changed.

Changes

Cohort / File(s) Summary
LLM MCP bridge parsing
internal/handlers/llm_mcp_bridge.go
Added tryBalancedBraceExtraction for brace-balanced JSON detection with string/escape handling; updated tryRegexJSONExtraction to prefer balanced parsing then fall back to regex; added logging and validation via isValidToolCall; retained existing constructToolCall path.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant R as LLM Response
  participant B as LLMMCPBridge
  participant BB as tryBalancedBraceExtraction
  participant RX as Regex Extraction
  participant V as isValidToolCall
  participant C as constructToolCall

  R->>B: tryRegexJSONExtraction(response)
  activate B
  Note over B: Attempt balanced-brace parse first
  B->>BB: parse balanced JSON object
  activate BB
  BB-->>B: ToolCall? / nil
  deactivate BB

  alt Balanced ToolCall found
    B->>V: validate ToolCall
    V-->>B: valid / invalid
    alt valid
      B-->>R: return ToolCall (balanced path)
    else invalid
      Note over B: Fallback to regex path
      B->>RX: extract via regex
      activate RX
      RX-->>B: raw JSON / nil
      deactivate RX
      opt JSON found
        B->>C: constructToolCall(raw)
        C-->>B: ToolCall / error
        B-->>R: return ToolCall or nil
      end
    end
  else No balanced match
    B->>RX: extract via regex
    activate RX
    RX-->>B: raw JSON / nil
    deactivate RX
    opt JSON found
      B->>C: constructToolCall(raw)
      C-->>B: ToolCall / error
      B-->>R: return ToolCall or nil
    end
  end

  Note over B: Logs steps and outcomes
  deactivate B
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Extract valid nested JSON in tool args” succinctly captures the core enhancement of adding balanced-brace parsing to correctly extract nested JSON structures in tool arguments, directly reflecting the change in extraction logic described in the PR summary.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Poem

I nibbled braces, one by one,
Balanced bites ’til parsing’s done.
If nests are deep, I’m not perplexed—
I hop to regex as my next.
With twitching ears and logs in tow,
I find the tools beneath the snow. 🐇✨

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/handlers/llm_mcp_bridge.go (1)

360-366: Good fallback order; consider trying balanced extraction earlier.

Preferring balanced-brace extraction before the regex is the right call. Consider moving this check earlier than code-block parsing (or calling it directly from detectSpecificJSONToolCall) to avoid an extra regex pass and catch nested JSON inside fenced blocks without relying on the lazy code-block regex first.

Would you like me to wire tryBalancedBraceExtraction right after tryDirectJSONParsing and before tryCodeBlockJSONParsing?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1790fe4 and 9ca6458.

📒 Files selected for processing (1)
  • internal/handlers/llm_mcp_bridge.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
internal/handlers/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Tool handlers and LLM-MCP bridge should be implemented in internal/handlers/

Files:

  • internal/handlers/llm_mcp_bridge.go
🧠 Learnings (1)
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
PR: tuannvm/slack-mcp-client#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/handlers/**/*.go : Tool handlers and LLM-MCP bridge should be implemented in `internal/handlers/`

Applied to files:

  • internal/handlers/llm_mcp_bridge.go
🔇 Additional comments (1)
internal/handlers/llm_mcp_bridge.go (1)

385-454: Avoid logging raw JSON and maintain numeric fidelity

  • Change the log from
    b.logger.DebugKV("Found balanced JSON object", "tool", toolName, "json", jsonStr)
    to
    b.logger.DebugKV("Found balanced JSON object", "tool", toolName, "bytes", len(jsonStr))
  • Replace json.Unmarshal with a json.Decoder using UseNumber(), ensuring you import "strings" if it isn’t already:
    - var toolCall ToolCall
    - if err := json.Unmarshal([]byte(jsonStr), &toolCall); err == nil {
    + var toolCall ToolCall
    + dec := json.NewDecoder(strings.NewReader(jsonStr))
    + dec.UseNumber()
    + if err := dec.Decode(&toolCall); err == nil {
  • Optional: precompile the toolPattern regex at file scope.
  • Verify that "strings" is imported and that the regex still matches when "tool" isn’t the first field.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant