fix: Implement configurable timeout settings for MCP client and bridge operations#151
Conversation
WalkthroughThread global configuration through MCP initialization and replace multiple hardcoded timeouts with configurable durations (MCP init, bridge operations, Slack tool processing). Update timeout parameter types to time.Duration, propagate durations, and add timeout-specific error mapping and logging. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CLI as cmd/main
participant MCP as MCP Init
participant Client as mcp.Client
CLI->>MCP: processSingleMCPServer(cfg, serverConf)
MCP->>Client: initializeMCPClientInstance(timeout: time.Duration)
Note right of Client `#e6f4ea`: uses context.WithTimeout(timeout)
Client-->>MCP: success / context.DeadlineExceeded
MCP-->>CLI: register client or record failed server
sequenceDiagram
autonumber
participant Slack as Slack client
participant Bridge as LLM Bridge
participant LLM as LLM/Agent
Slack->>Bridge: CallLLM / CallLLMAgent (bridgeTimeout)
Bridge->>LLM: request (with bridgeTimeout)
alt success
LLM-->>Bridge: response
Bridge-->>Slack: processed response
else timeout
LLM-->>Bridge: context.DeadlineExceeded
Bridge-->>Slack: error(code: bridge_operation_timeout)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (6)
internal/slack/client.go (2)
670-676: Configurable tool processing timeout is correct; consider guarding invalid/edge values.
ToolProcessingTimeoutis parsed and falls back cleanly to 1 minute on parse error, which matches the PR intent. You might optionally harden this by:
- Logging when
time.ParseDurationfails so misconfigured values are visible at runtime.- Treating
timeout <= 0as invalid and falling back to the default to avoid accidental “instant timeout” configs.
705-710: Timeout‑specific tool error message works; consider more robust timeout detection.The new branch that surfaces a clearer “processing the tool took too long…increase the toolProcessingTimeout” message is helpful. Since you’re matching on
strings.Contains(err.Error(), "context deadline exceeded"), consider (now or later) usingerrors.Is(err, context.DeadlineExceeded)first, then falling back to the string check for wrapped errors, to make the detection less brittle.internal/handlers/llm_mcp_bridge.go (2)
542-549: Bridge timeout parsing and defaults look good; consider centralizing and validating.Using
BridgeOperationTimeoutwith a 3‑minute fallback in bothCallLLMAgentandCallLLMmatches the desired behavior. To reduce duplication and edge‑case risk, you could:
- Factor this into a small helper (e.g.,
b.getBridgeTimeout()), including the default.- Log when parsing fails, and optionally treat non‑positive durations as invalid and fall back to the default.
This would keep bridge timeout behavior consistent as config evolves.
Also applies to: 590-596
575-583: Timeout‑specific error codes/messages are a nice improvement; consider using typed timeout checks.Mapping deadline‑exceeded errors to
bridge_operation_timeoutwith user‑facing guidance aboutbridgeOperationTimeoutis a solid UX improvement. Similar to the Slack client, you might want to:
- Prefer
errors.Is(err, context.DeadlineExceeded)before falling back to a substring check on"context deadline exceeded"so wrapped errors are handled more reliably.- Optionally add a brief comment or test to ensure these codes are what downstream Slack handlers expect, since they’re now domain
Slackerrors (WrapSlackError).Also applies to: 663-671
cmd/main.go (2)
303-316: MCP init timeout selection is correct; consider logging invalid MCPInitTimeout values.The precedence (per‑server
InitializeTimeoutSeconds→ globalTimeouts.MCPInitTimeout→ 5s default) is sensible and matches the PR description. Two small robustness tweaks to consider:
- If
cfg.Timeouts.MCPInitTimeoutis set buttime.ParseDurationfails, log a warning so mis‑typed config doesn’t silently fall back to 5s.- Optionally treat negative or zero values (from either source) as invalid and revert to the default to avoid accidental “instant timeout” configs.
This would also help when “checking server initialization timeouts when debugging MCP issues.” Based on learnings.
Also applies to: 319-319
514-518: initializeMCPClientInstance behavior looks good; pointer timeout could be passed by value.Using
context.WithTimeoutaroundclient.Initializeand annotating deadline‑exceeded vs “file already closed” errors preserves existing behavior while working with the newtime.Duration‑based timeout. Given you never mutate the timeout, you could simplify the signature to taketimeout time.Durationand pass it directly (no pointer), which would slightly clean up call sites and logging.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cmd/main.go(4 hunks)internal/handlers/llm_mcp_bridge.go(3 hunks)internal/slack/client.go(2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
internal/slack/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Slack client implementation and message formatting should be in
internal/slack/
Files:
internal/slack/client.go
internal/handlers/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Tool handlers and LLM-MCP bridge should be implemented in
internal/handlers/
Files:
internal/handlers/llm_mcp_bridge.go
cmd/main.go
📄 CodeRabbit inference engine (CLAUDE.md)
Application entry point and initialization should be in
cmd/main.go
Files:
cmd/main.go
🧠 Learnings (13)
📓 Common learnings
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Check server initialization timeouts when debugging MCP issues
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Graceful degradation should occur when MCP servers fail
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Check server initialization timeouts when debugging MCP issues
Applied to files:
internal/slack/client.gointernal/handlers/llm_mcp_bridge.gocmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/handlers/**/*.go : Tool handlers and LLM-MCP bridge should be implemented in `internal/handlers/`
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : Update provider constants when adding a new LLM provider
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : LangChain gateway should provide a unified interface for LLM providers
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : Support both native tools and system prompt-based tools in LLM provider code
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Component-specific loggers should be used for MCP servers
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : MCP client implementations (SSE, HTTP, stdio) should be in `internal/mcp/`
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Enable `--mcpdebug` for MCP client logs when debugging MCP issues
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*_test.go : Integration tests should be written for MCP client functionality
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Failed servers should be logged but must not crash the application
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Tool discovery should happen at startup with allow/block lists
Applied to files:
cmd/main.go
📚 Learning: 2025-07-10T21:48:54.979Z
Learnt from: AdamShannag
Repo: tuannvm/slack-mcp-client PR: 64
File: internal/mcp/sseClient.go:56-58
Timestamp: 2025-07-10T21:48:54.979Z
Learning: The mcp-go package handles client initialization internally when Start() is called on an SSE client. Explicit initialization calls are not needed in wrapper Start() methods - only when creating new client instances like in reconnect scenarios.
Applied to files:
cmd/main.go
🧬 Code graph analysis (2)
internal/handlers/llm_mcp_bridge.go (1)
internal/common/errors/app_errors.go (1)
WrapSlackError(213-215)
cmd/main.go (3)
internal/config/config.go (1)
Config(25-38)internal/slack/client.go (1)
Client(29-40)internal/mcp/client.go (1)
Client(29-38)
🔇 Additional comments (1)
cmd/main.go (1)
224-235: Threading cfg into processSingleMCPServer keeps MCP init consistent with global timeouts.Passing
cfgintoprocessSingleMCPServerso each server can honor both per‑server and global timeout settings aligns well with the timeout configuration model and maintains graceful degradation when some servers fail to initialize. Based on learnings.Also applies to: 265-265
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
internal/handlers/llm_mcp_bridge.go (1)
577-585: Timeout‑specific error mapping is good; consider centralizing timeout detectionThe new mapping of deadline errors to
bridge_operation_timeoutwith user‑actionable messages (“Consider increasing the bridgeOperationTimeout.”) is a nice improvement in both agent and chat paths.Given the duplication:
- The condition
errors.Is(err, context.DeadlineExceeded) || strings.Contains(err.Error(), "context deadline exceeded")appears twice; you could extract a small helper likeisBridgeTimeout(err error) boolon the bridge to keep this logic in one place.- That helper could also be reused elsewhere (e.g. Slack client) if you want consistent timeout detection across components.
Functionally this is fine as‑is; the suggestion is mainly about DRY and consistency.
Also applies to: 666-673
cmd/main.go (2)
514-523: NewinitializeMCPClientInstancesignature is cleaner; minor doc comment nitSwitching
initializeMCPClientInstanceto accept atime.Durationis a nice simplification and aligns well with the new timeout calculation logic. The logging oftimeout.Seconds()also makes the effective timeout easy to understand in logs.One small nit: the function header comments above this definition are duplicated; you can safely remove the extra pair to avoid confusion.
304-318: Tighten timeout validation to prevent context cancellation surprisesThe timeout precedence (server-specific
InitializeTimeoutSeconds→ globalTimeouts.MCPInitTimeout→ 5s default) is sound. However, three robustness improvements would prevent edge cases:
Empty MCPInitTimeout: If
cfg.Timeouts.MCPInitTimeoutis empty,time.ParseDurationerrors and the warning message says "Invalid" when empty is simply unset. Skip the parse call and silently use the default when the string is empty.Guard against ≤0 durations: After obtaining
InitializeTimeoutSeconds(line 309) andmcpInitTimeout(line 313), validate that both are > 0. Per Go docs,context.WithTimeoutwith d ≤ 0 creates an already-cancelled context, which can be surprising if misconfigured.Parse global timeout once: The global
MCPInitTimeoutis parsed inside the server loop. Parse it once before the loop and reuse it to avoid repeatedtime.ParseDurationcalls per server.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cmd/main.go(4 hunks)internal/handlers/llm_mcp_bridge.go(4 hunks)internal/slack/client.go(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/slack/client.go
🧰 Additional context used
📓 Path-based instructions (2)
internal/handlers/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Tool handlers and LLM-MCP bridge should be implemented in
internal/handlers/
Files:
internal/handlers/llm_mcp_bridge.go
cmd/main.go
📄 CodeRabbit inference engine (CLAUDE.md)
Application entry point and initialization should be in
cmd/main.go
Files:
cmd/main.go
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Check server initialization timeouts when debugging MCP issues
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/handlers/**/*.go : Tool handlers and LLM-MCP bridge should be implemented in `internal/handlers/`
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Check server initialization timeouts when debugging MCP issues
Applied to files:
internal/handlers/llm_mcp_bridge.gocmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : Update provider constants when adding a new LLM provider
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : LangChain gateway should provide a unified interface for LLM providers
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : Support both native tools and system prompt-based tools in LLM provider code
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Component-specific loggers should be used for MCP servers
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Enable `--mcpdebug` for MCP client logs when debugging MCP issues
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : MCP client implementations (SSE, HTTP, stdio) should be in `internal/mcp/`
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Failed servers should be logged but must not crash the application
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*_test.go : Integration tests should be written for MCP client functionality
Applied to files:
cmd/main.go
📚 Learning: 2025-07-10T21:48:54.979Z
Learnt from: AdamShannag
Repo: tuannvm/slack-mcp-client PR: 64
File: internal/mcp/sseClient.go:56-58
Timestamp: 2025-07-10T21:48:54.979Z
Learning: The mcp-go package handles client initialization internally when Start() is called on an SSE client. Explicit initialization calls are not needed in wrapper Start() methods - only when creating new client instances like in reconnect scenarios.
Applied to files:
cmd/main.go
🧬 Code graph analysis (2)
internal/handlers/llm_mcp_bridge.go (2)
internal/common/errors/errors.go (1)
Is(145-147)internal/common/errors/app_errors.go (1)
WrapSlackError(213-215)
cmd/main.go (2)
internal/config/config.go (1)
Config(25-38)internal/mcp/client.go (1)
Client(29-38)
🔇 Additional comments (2)
internal/handlers/llm_mcp_bridge.go (1)
5-15: Use oferrorsimport for timeout classification looks appropriateThe new
errorsimport is used to classifycontext.DeadlineExceededin the bridge methods; this is consistent with Go idioms and keeps timeout handling explicit. No issues here.cmd/main.go (1)
217-235: Config propagation intoprocessSingleMCPServerlooks consistentPassing
cfgintoprocessSingleMCPServerand threading it through the call ininitializeMCPClientsis straightforward and keeps timeout logic close to per‑server config. No functional concerns here.Also applies to: 257-266
| func (b *LLMMCPBridge) CallLLMAgent(userDisplayName, systemPrompt, prompt, contextHistory string, callbackHandler callbacks.Handler) (string, error) { | ||
| // Create a context with an appropriate timeout | ||
| ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) | ||
| // Create a context with the configured bridgeTimeout | ||
| bridgeTimeout, err := time.ParseDuration(b.cfg.Timeouts.BridgeOperationTimeout) | ||
| if err != nil { | ||
| b.logger.WarnKV("Invalid bridge operation timeout, using default of 3 minutes", "configured_value", b.cfg.Timeouts.BridgeOperationTimeout, "error", err) | ||
| bridgeTimeout = 3 * time.Minute // fallback to default if parsing fails | ||
| } | ||
| ctx, cancel := context.WithTimeout(context.Background(), bridgeTimeout) | ||
| defer cancel() |
There was a problem hiding this comment.
🧩 Analysis chain
Bridge timeout parsing is correct but could be more robust and DRY
The BridgeOperationTimeout parsing with a 3m fallback is functionally sound in both CallLLMAgent and CallLLM. A couple of small improvements to consider:
- Treat an empty
BridgeOperationTimeoutas “not configured” and silently use the default instead of logging a warning on every call when it’s unset. - After parsing, guard against non‑positive durations (e.g.
0s, negative values) and fall back to the default to avoid creating immediately‑expired contexts. - Factor this parsing into a small helper (e.g.
getBridgeTimeout()onLLMMCPBridge) soCallLLMAgentandCallLLMshare identical logic and logging.
These don’t block correctness but would reduce log noise and configuration footguns.
Please double‑check the desired behavior for zero/negative timeouts.bridgeOperationTimeout in your configuration docs and time.ParseDuration semantics.
Also applies to: 592-599
🌐 Web query:
In Go, how does `context.WithTimeout` behave when given a non-positive timeout, and what error does it return on expiry?
💡 Result:
Short answer: WithTimeout calls WithDeadline(parent, time.Now().Add(timeout)). If timeout is non‑positive the deadline is already past, so the returned child context is already expired (Done is closed). When it expires its Err() is context.DeadlineExceeded (i.e. "context deadline exceeded"). [1][2]
Sources:
- pkg: context.WithTimeout == WithDeadline(parent, time.Now().Add(timeout)). [1]
- example showing ctx.Done() / ctx.Err() prints "context deadline exceeded". [2]
🏁 Script executed:
#!/bin/bash
# Examine both timeout parsing locations
echo "=== Lines 543-551 (CallLLMAgent) ==="
sed -n '543,551p' internal/handlers/llm_mcp_bridge.go
echo -e "\n=== Lines 592-599 (CallLLM - implied) ==="
sed -n '592,599p' internal/handlers/llm_mcp_bridge.go
echo -e "\n=== Check for existing timeout helper functions ==="
rg -n "BridgeOperationTimeout|bridgeTimeout|getTimeout|ParseDuration" internal/handlers/llm_mcp_bridge.go -C2
echo -e "\n=== Config structure search ==="
rg -n "type.*Timeouts|BridgeOperationTimeout" internal/ -C3Length of output: 449
🏁 Script executed:
#!/bin/bash
# Examine both timeout parsing locations
echo "=== Lines 543-551 (CallLLMAgent) ==="
sed -n '543,551p' internal/handlers/llm_mcp_bridge.go
echo -e "\n=== Lines 592-599 (CallLLM - implied) ==="
sed -n '592,599p' internal/handlers/llm_mcp_bridge.go
echo -e "\n=== Check for existing timeout helper functions ==="
rg -n "BridgeOperationTimeout|bridgeTimeout|getTimeout|ParseDuration" internal/handlers/llm_mcp_bridge.go -C2
echo -e "\n=== Config structure search ==="
rg -n "type.*Timeouts|BridgeOperationTimeout" internal/ -C3
echo -e "\n=== Count CallLLMAgent and CallLLM invocations ==="
rg -n "\.CallLLMAgent\(|\.CallLLM\(" --type=go | head -20Length of output: 6394
Extract timeout parsing to helper method and guard against non-positive durations
The identical parsing logic at both callsites (543–551, 592–599) should be extracted into a helper (e.g., getBridgeTimeout() on LLMMCPBridge) to follow DRY and ensure consistent behavior. Additionally, guard against non-positive durations—if timeout is non‑positive the deadline is already past, so the returned child context is already expired. Since time.ParseDuration("0s") and time.ParseDuration("-5s") parse successfully without error, the current code does not catch these misconfigurations. The config layer already defaults BridgeOperationTimeout to "3m" when empty (internal/config/config.go:361–362), so the warning on parse failure is only triggered by user misconfiguration; silently use the fallback in that case and add a duration > 0 check to prevent misconfigured timeouts from creating immediately-expired contexts.
🤖 Prompt for AI Agents
In internal/handlers/llm_mcp_bridge.go around lines 543–551 (and the similar
block at 592–599), extract the timeout parsing/fallback logic into a new method
on LLMMCPBridge (e.g., func (b *LLMMCPBridge) getBridgeTimeout() time.Duration)
that: parses b.cfg.Timeouts.BridgeOperationTimeout, logs a warning and returns
the hardcoded 3*time.Minute fallback on parse error, and additionally checks
that the parsed duration is > 0 (log a warning and return the fallback if not);
then replace both callsites to call getBridgeTimeout() and create the context
with context.WithTimeout(context.Background(), timeout). Ensure logging includes
the configured value and error when parse fails and do not return an
immediately-expired duration.
There was a problem hiding this comment.
@tuannvm for me it was not super clear, which timeout parameter is supposed to define which timeout. I used the bridge operation parameter here, because it's both in the bridge file. But I think the non-agent completion could also be the response processing Timeout
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
…e operations Signed-off-by: Nicolas Wyrwa <nicolas.wyrwa@celebrate.company>
Signed-off-by: Nicolas Wyrwa <nicolas.wyrwa@celebrate.company>
Signed-off-by: Nicolas Wyrwa <nicolas.wyrwa@celebrate.company>
d4efcde to
3e2a055
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
internal/handlers/llm_mcp_bridge.go (1)
543-551: Extract duplicate timeout parsing to helper method.The identical timeout parsing logic appears in both
CallLLMAgent(lines 543–551) andCallLLM(lines 592–599). This should be extracted into a helper method likegetBridgeTimeout()onLLMMCPBridgeto eliminate duplication and ensure consistent behavior. Additionally, the helper should guard against non-positive durations (see next comment).Apply this refactor:
+// getBridgeTimeout parses and returns the bridge operation timeout from config +// with validation and fallback to 3 minutes +func (b *LLMMCPBridge) getBridgeTimeout() time.Duration { + timeout, err := time.ParseDuration(b.cfg.Timeouts.BridgeOperationTimeout) + if err != nil || timeout <= 0 { + if err != nil { + b.logger.WarnKV("Invalid bridge operation timeout, using default of 3 minutes", "configured_value", b.cfg.Timeouts.BridgeOperationTimeout, "error", err) + } else { + b.logger.WarnKV("Non-positive bridge operation timeout, using default of 3 minutes", "configured_value", b.cfg.Timeouts.BridgeOperationTimeout) + } + return 3 * time.Minute + } + return timeout +} + func (b *LLMMCPBridge) CallLLMAgent(userDisplayName, systemPrompt, prompt, contextHistory string, callbackHandler callbacks.Handler) (string, error) { // Create a context with the configured bridgeTimeout - bridgeTimeout, err := time.ParseDuration(b.cfg.Timeouts.BridgeOperationTimeout) - if err != nil { - b.logger.WarnKV("Invalid bridge operation timeout, using default of 3 minutes", "configured_value", b.cfg.Timeouts.BridgeOperationTimeout, "error", err) - bridgeTimeout = 3 * time.Minute // fallback to default if parsing fails - } - ctx, cancel := context.WithTimeout(context.Background(), bridgeTimeout) + ctx, cancel := context.WithTimeout(context.Background(), b.getBridgeTimeout()) defer cancel()Apply the same simplification to
CallLLMat lines 592–599.Also applies to: 592-599
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cmd/main.go(4 hunks)internal/handlers/llm_mcp_bridge.go(4 hunks)internal/slack/client.go(3 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
internal/slack/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Slack client implementation and message formatting should be in
internal/slack/
Files:
internal/slack/client.go
internal/handlers/**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Tool handlers and LLM-MCP bridge should be implemented in
internal/handlers/
Files:
internal/handlers/llm_mcp_bridge.go
cmd/main.go
📄 CodeRabbit inference engine (CLAUDE.md)
Application entry point and initialization should be in
cmd/main.go
Files:
cmd/main.go
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Check server initialization timeouts when debugging MCP issues
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Check server initialization timeouts when debugging MCP issues
Applied to files:
internal/slack/client.gointernal/handlers/llm_mcp_bridge.gocmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/monitoring/**/*.go : Prometheus metrics should include tool invocation counters with error rates and LLM token usage histograms by model and type
Applied to files:
internal/slack/client.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/handlers/**/*.go : Tool handlers and LLM-MCP bridge should be implemented in `internal/handlers/`
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : LangChain gateway should provide a unified interface for LLM providers
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/llm/**/*.go : Support both native tools and system prompt-based tools in LLM provider code
Applied to files:
internal/handlers/llm_mcp_bridge.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Component-specific loggers should be used for MCP servers
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : MCP client implementations (SSE, HTTP, stdio) should be in `internal/mcp/`
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Enable `--mcpdebug` for MCP client logs when debugging MCP issues
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*_test.go : Integration tests should be written for MCP client functionality
Applied to files:
cmd/main.go
📚 Learning: 2025-07-20T02:54:03.119Z
Learnt from: CR
Repo: tuannvm/slack-mcp-client PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-20T02:54:03.119Z
Learning: Applies to internal/mcp/**/*.go : Failed servers should be logged but must not crash the application
Applied to files:
cmd/main.go
📚 Learning: 2025-07-10T21:48:54.979Z
Learnt from: AdamShannag
Repo: tuannvm/slack-mcp-client PR: 64
File: internal/mcp/sseClient.go:56-58
Timestamp: 2025-07-10T21:48:54.979Z
Learning: The mcp-go package handles client initialization internally when Start() is called on an SSE client. Explicit initialization calls are not needed in wrapper Start() methods - only when creating new client instances like in reconnect scenarios.
Applied to files:
cmd/main.go
🧬 Code graph analysis (2)
internal/handlers/llm_mcp_bridge.go (1)
internal/common/errors/app_errors.go (1)
WrapSlackError(213-215)
cmd/main.go (3)
internal/config/config.go (1)
Config(25-38)internal/slack/client.go (1)
Client(30-41)internal/mcp/client.go (1)
Client(29-38)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Security Scan
🔇 Additional comments (3)
internal/slack/client.go (1)
707-712: Timeout error handling is comprehensive and user-friendly.The dual check using both
errors.Is(err, context.DeadlineExceeded)and string matching is appropriate for catching wrapped errors and different error representations. The specific error message clearly indicates a timeout and suggests the relevant configuration parameter.internal/handlers/llm_mcp_bridge.go (1)
577-584: Timeout-specific error handling is well-implemented.The timeout detection and error mapping in both
CallLLMAgentandCallLLMis comprehensive—using botherrors.Isand string matching to catch wrapped errors. The specialized error codebridge_operation_timeoutand the actionable error messages provide clear guidance to users.Also applies to: 666-673
cmd/main.go (1)
519-552: Refactored signature and error handling improve clarity.The change from
timeoutSeconds *inttotimeout time.Durationis a clean improvement that aligns with Go idioms and simplifies the context creation. The enhanced error handling at lines 535–544 provides helpful structured data for specific failure modes (timeout exceeded, process exited), making debugging easier.
| // Determine timeout | ||
| initTimeout := time.Duration(5) * time.Second // fallback if no configuration found | ||
| if serverConf.InitializeTimeoutSeconds != nil { | ||
| // Prefer server-specific timeout if set | ||
| initTimeout = time.Duration(*serverConf.InitializeTimeoutSeconds) * time.Second | ||
| } else { | ||
| // Parse global timeout from config if set | ||
| mcpInitTimeout, err := time.ParseDuration(cfg.Timeouts.MCPInitTimeout) | ||
| if err == nil { | ||
| // Use parsed value | ||
| initTimeout = mcpInitTimeout | ||
| } else { | ||
| serverLogger.WarnKV("Invalid global MCP init timeout, using default of 5 seconds", "configured_value", cfg.Timeouts.MCPInitTimeout, "error", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
Guard against non-positive global timeout duration.
The timeout resolution logic is clear and follows a sensible precedence (server-specific → global → default). However, after successfully parsing the global cfg.Timeouts.MCPInitTimeout, the code doesn't verify that the duration is positive. A misconfigured value like "0s" or "-5s" would parse successfully but create an immediately‑expired context.
Apply this guard:
} else {
// Parse global timeout from config if set
mcpInitTimeout, err := time.ParseDuration(cfg.Timeouts.MCPInitTimeout)
- if err == nil {
+ if err == nil && mcpInitTimeout > 0 {
// Use parsed value
initTimeout = mcpInitTimeout
} else {
- serverLogger.WarnKV("Invalid global MCP init timeout, using default of 5 seconds", "configured_value", cfg.Timeouts.MCPInitTimeout, "error", err)
+ if err != nil {
+ serverLogger.WarnKV("Invalid global MCP init timeout, using default of 5 seconds", "configured_value", cfg.Timeouts.MCPInitTimeout, "error", err)
+ } else {
+ serverLogger.WarnKV("Non-positive global MCP init timeout, using default of 5 seconds", "configured_value", cfg.Timeouts.MCPInitTimeout)
+ }
}
}🤖 Prompt for AI Agents
In cmd/main.go around lines 304 to 318, after parsing
cfg.Timeouts.MCPInitTimeout you must guard against non-positive durations: if
time.ParseDuration succeeds, check that the parsed mcpInitTimeout is > 0 before
assigning it to initTimeout; if it is <= 0, log a warning (include the
configured value and that it is non-positive) and continue using the existing
default (or server-specific value if set), rather than assigning the
non-positive duration which would immediately expire contexts.
| // Create a context with the configured timeout for tool processing | ||
| timeout, err := time.ParseDuration(c.cfg.Timeouts.ToolProcessingTimeout) | ||
| if err != nil { | ||
| c.logger.WarnKV("Invalid tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout, "error", err) | ||
| timeout = 1 * time.Minute // fallback to default if parsing fails | ||
| } | ||
| toolCtx, cancel := context.WithTimeout(ctx, timeout) |
There was a problem hiding this comment.
Guard against non-positive timeout durations.
The current code doesn't validate that the parsed duration is positive. time.ParseDuration("0s") and time.ParseDuration("-5s") parse successfully but would create an immediately‑expired context. Additionally, the warning is logged on every tool processing attempt when the config value is invalid, which could be noisy in production.
Apply this fix to validate the parsed duration:
// Create a context with the configured timeout for tool processing
timeout, err := time.ParseDuration(c.cfg.Timeouts.ToolProcessingTimeout)
- if err != nil {
+ if err != nil || timeout <= 0 {
- c.logger.WarnKV("Invalid tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout, "error", err)
+ if err != nil {
+ c.logger.WarnKV("Invalid tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout, "error", err)
+ } else {
+ c.logger.WarnKV("Non-positive tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout)
+ }
timeout = 1 * time.Minute // fallback to default if parsing fails
}
toolCtx, cancel := context.WithTimeout(ctx, timeout)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Create a context with the configured timeout for tool processing | |
| timeout, err := time.ParseDuration(c.cfg.Timeouts.ToolProcessingTimeout) | |
| if err != nil { | |
| c.logger.WarnKV("Invalid tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout, "error", err) | |
| timeout = 1 * time.Minute // fallback to default if parsing fails | |
| } | |
| toolCtx, cancel := context.WithTimeout(ctx, timeout) | |
| // Create a context with the configured timeout for tool processing | |
| timeout, err := time.ParseDuration(c.cfg.Timeouts.ToolProcessingTimeout) | |
| if err != nil || timeout <= 0 { | |
| if err != nil { | |
| c.logger.WarnKV("Invalid tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout, "error", err) | |
| } else { | |
| c.logger.WarnKV("Non-positive tool processing timeout, using default of 1 minute", "configured_value", c.cfg.Timeouts.ToolProcessingTimeout) | |
| } | |
| timeout = 1 * time.Minute // fallback to default if parsing fails | |
| } | |
| toolCtx, cancel := context.WithTimeout(ctx, timeout) |
🤖 Prompt for AI Agents
internal/slack/client.go around lines 671 to 677: validate that the parsed
duration is positive and avoid noisy per-call warnings by falling back to the
default when the value is non-positive; specifically, after parsing
c.cfg.Timeouts.ToolProcessingTimeout, check if err != nil OR timeout <= 0 and in
either case log a single warning and set timeout = 1*time.Minute; additionally
consider moving/parsing this config at client initialization (or using
sync.Once) so the warning is emitted at startup rather than on every tool
processing attempt.
|
@tuannvm Do you think it's feasible to merge this? |
This pull request improves configuration-driven timeout handling and error reporting throughout the MCP client initialization, LLM bridge, and Slack client tool processing. The changes ensure that timeouts are consistently read from configuration, provide more specific error messages on timeout, and fall back to sensible defaults if configuration is missing or invalid.
Timeout configuration improvements:
cmd/main.gonow determines the initialization timeout by checking for a server-specific value, then falling back to a global configuration value (cfg.Timeouts.MCPInitTimeout), and finally to a hardcoded default if neither is set. The timeout is now passed as atime.Durationfor clarity and precision. [1] [2] [3] [4]internal/handlers/llm_mcp_bridge.go, bothCallLLMAgentandCallLLMnow use a configurableBridgeOperationTimeoutfrom the config, with a default fallback if parsing fails. [1] [2]internal/slack/client.go, the tool processing timeout is now configurable viaToolProcessingTimeoutin the config, with a fallback to a default if parsing fails.Improved error handling and messaging:
Summary by CodeRabbit