@zzstoatzz
Collaborator

Summary

  • Adds a FastMCP server in examples/slack_search/ for searching Prefect Slack community thread summaries
  • Stores thread summaries in Turso with Voyage AI embeddings for semantic search
  • Dynamic Literal types for topic/channel filters populated at startup
  • get_thread_messages tool fetches actual Slack conversation content via API
  • Includes migration script for normalizing topic labels
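The "dynamic Literal types" item can be illustrated with a small sketch. It assumes the filter values are fetched once at startup and then turned into a `Literal` type that a tool signature can use. The helper name `make_literal` and the topic labels are hypothetical and do not come from the PR.

```python
from typing import Literal, get_args


def make_literal(values: list[str]):
    """Build a Literal type from values that are only known at runtime."""
    if not values:
        # With no known values there is nothing to enumerate, so fall back to str.
        return str
    # Subscripting Literal with a tuple is equivalent to Literal["a", "b", ...].
    return Literal[tuple(sorted(set(values)))]


# Hypothetical topic labels; the real server loads its categories from Turso.
TopicFilter = make_literal(["deployments", "flows", "workers", "flows"])
print(get_args(TopicFilter))
```

Because the type is built before the tools are registered, a client inspecting the tool schema would see the allowed values as an enum rather than a free-form string.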

Tools

  • search(query, topic?, channel?) - text search across thread summaries
  • similar(query, topic?, channel?) - semantic search using embeddings
  • get_thread(key) - get full AI summary + metadata
  • get_thread_messages(key) - fetch actual Slack messages from thread
  • get_stats() - index statistics
  • list_topics() / list_channels() - available filters

Test plan

  • Verified MCP server starts and loads categories from Turso
  • Tested text search and semantic search return results with URLs
  • Tested get_thread_messages fetches actual Slack content
  • Ran topic normalization migration on production data

🤖 Generated with Claude Code

zzstoatzz and others added 11 commits January 9, 2026 01:16
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
TURSO_TOKEN had surrounding quotes causing JWT Base64 decode error

pydantic-settings only strips quotes from .env files, not from
env vars set directly in the deployment environment
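A minimal sketch of the kind of cleanup this commit describes, assuming the fix strips one pair of matching quotes from a value read directly from the environment. The function name `clean_env_value` is hypothetical; in a pydantic-settings model, similar logic could plausibly live in a `field_validator` with `mode="before"`, but the PR's actual implementation is not shown here.

```python
def clean_env_value(raw: str) -> str:
    """Strip whitespace and one pair of matching surrounding quotes."""
    value = raw.strip()
    # Only strip when the first and last characters are the same quote character.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    return value


# A value as a deployment environment might inject it, quotes included.
print(clean_env_value('"eyJhbGciOi.example.token"'))
```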

computed field constructs clickable URL from key parts

- adds SLACK_API_TOKEN env var
- new tool fetches real messages from Slack API via conversations.replies
- returns structured ThreadContent with messages, url, timestamps
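For readers unfamiliar with `conversations.replies`, here is a rough standard-library sketch of the request and response handling. The helper names (`build_replies_request`, `parse_replies`, `fetch_thread`) are hypothetical and this is not the PR's `slack_get_thread`. It assumes a token with permission to read the channel, and it ignores cursor pagination, which long threads would need.

```python
import json
import urllib.parse
import urllib.request

SLACK_REPLIES_URL = "https://slack.com/api/conversations.replies"


def build_replies_request(token: str, channel: str, thread_ts: str) -> urllib.request.Request:
    """Build a GET request for one thread; `ts` is the parent message timestamp."""
    query = urllib.parse.urlencode({"channel": channel, "ts": thread_ts})
    return urllib.request.Request(
        f"{SLACK_REPLIES_URL}?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def parse_replies(payload: dict) -> list[dict]:
    """Slack reports failures in the body via ok=false, so check that flag."""
    if not payload.get("ok"):
        raise RuntimeError(f"Slack API error: {payload.get('error', 'unknown')}")
    return [
        {"user": m.get("user"), "ts": m.get("ts"), "text": m.get("text", "")}
        for m in payload.get("messages", [])
    ]


def fetch_thread(token: str, channel: str, thread_ts: str) -> list[dict]:
    """Perform the network call (not executed in this sketch)."""
    request = build_replies_request(token, channel, thread_ts)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_replies(json.load(resp))
```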

Copilot AI review requested due to automatic review settings January 9, 2026 08:15
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a FastMCP server for searching Prefect Slack community thread summaries with semantic and text search capabilities. The implementation refactors configuration management to use pydantic-settings, adds a new tool to fetch actual Slack messages, and includes a migration script for normalizing topic labels.

Key changes:

  • Introduced pydantic-settings for centralized configuration management with environment variable support
  • Added get_thread_messages tool to fetch raw Slack conversation content via the Slack API
  • Fixed URL and key parsing logic for ThreadSummary (corrected array indices)

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Added pydantic-settings>=2.0 dependency |
| uv.lock | Updated lock file with pydantic-settings dependency |
| client.py | Refactored to use Settings class with pydantic-settings; added slack_get_thread function |
| server.py | Refactored _load_categories to use Settings; added get_thread_messages tool |
| _types.py | Fixed key parsing indices; added URL computed fields and new SlackMessage/ThreadContent types |
| normalize_topics.py | New migration script for normalizing topic labels in database |

    )
    resp.raise_for_status()
    if resp.status_code >= 400:
        raise RuntimeError(f"Turso HTTP {resp.status_code}: {resp.text}")

Copilot AI Jan 9, 2026

The error message includes the full response text which could potentially expose sensitive information like authentication tokens or internal server details. Consider sanitizing the error message or logging it separately while showing a generic error to users.

Suggested change
-        raise RuntimeError(f"Turso HTTP {resp.status_code}: {resp.text}")
+        raise RuntimeError(f"Turso HTTP {resp.status_code}")

Copilot uses AI. Check for mistakes.
        timeout=30,
    )
    if response.status_code >= 400:
        raise RuntimeError(f"Turso HTTP {response.status_code} for {url}: {response.text}")

Copilot AI Jan 9, 2026

The error message includes the full response text which could potentially expose sensitive information like authentication tokens or internal server details. Consider sanitizing the error message or logging it separately while showing a generic error to users.

Suggested change
-        raise RuntimeError(f"Turso HTTP {response.status_code} for {url}: {response.text}")
+        raise RuntimeError(f"Turso HTTP {response.status_code} for {url}")
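The two comments above mention logging the response body separately while raising a generic error. One way that might look is sketched below, using only the standard library. The logger name, the `TursoError` class, and `check_turso_response` are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger("slack_search.turso")  # hypothetical logger name


class TursoError(RuntimeError):
    """Raised with a short message that never includes the response body."""


def check_turso_response(status_code: int, body: str) -> None:
    """Log details for operators, but surface only the status code to callers."""
    if status_code < 400:
        return
    # %.500s truncates the body so a large or sensitive payload is not logged whole.
    logger.error("Turso HTTP %s, body: %.500s", status_code, body)
    raise TursoError(f"Turso HTTP {status_code}")
```

The trade-off is that debugging then depends on access to the server logs, which is usually acceptable for an MCP tool whose error strings are shown to end users.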

    turso_url = os.environ.get("TURSO_URL", "").strip().strip('"')
    turso_token = os.environ.get("TURSO_TOKEN", "").strip().strip('"')
    if turso_url.startswith("libsql://"):
        turso_url = turso_url[len("libsql://") :]

Copilot AI Jan 9, 2026

The function doesn't validate that host and token are non-empty before returning. If environment variables are not set, this will return empty strings and cause cryptic errors later. Consider adding validation to raise a clear error message if credentials are missing.

Suggested change
-        turso_url = turso_url[len("libsql://") :]
+        turso_url = turso_url[len("libsql://") :]
+    missing = []
+    if not turso_url:
+        missing.append("TURSO_URL")
+    if not turso_token:
+        missing.append("TURSO_TOKEN")
+    if missing:
+        raise RuntimeError(
+            f"Missing required environment variable(s): {', '.join(missing)}. "
+            "Please set them before running this script."
+        )


@mcp.tool
async def get_thread_messages(key: str) -> ThreadContent | None:
    """fetch actual messages from a Slack thread.

Copilot AI Jan 9, 2026

The docstring style is inconsistent with the get_thread function above. The first line should start with a capital letter and end with a period to match the established convention (e.g., "Fetch actual messages from a Slack thread.").

Suggested change
-    """fetch actual messages from a Slack thread.
+    """Fetch actual messages from a Slack thread.

Comment on lines +315 to +324
    """fetch actual messages from a Slack thread.

    retrieves the full conversation content from Slack's API,
    not just the AI summary. requires SLACK_API_TOKEN.

    args:
        key: the thread key (from search/similar results)

    returns:
        thread content with all messages, or None if not found

Copilot AI Jan 9, 2026

The docstring style is inconsistent with the get_thread function above. Lines should start with capital letters to match the established convention (e.g., "Retrieves the full conversation..." and "Args:" and "Returns:").

Suggested change
-    """fetch actual messages from a Slack thread.
-
-    retrieves the full conversation content from Slack's API,
-    not just the AI summary. requires SLACK_API_TOKEN.
-
-    args:
-        key: the thread key (from search/similar results)
-
-    returns:
-        thread content with all messages, or None if not found
+    """Fetch actual messages from a Slack thread.
+
+    Retrieves the full conversation content from Slack's API,
+    not just the AI summary. Requires SLACK_API_TOKEN.
+
+    Args:
+        key: The thread key (from search/similar results).
+
+    Returns:
+        Thread content with all messages, or None if not found.

Comment on lines +54 to +62
    response = httpx.post(
        url,
        headers={
            "Authorization": f"Bearer {settings.turso_token}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=30,
    )

Copilot AI Jan 9, 2026

The turso_query function is declared as async but uses synchronous httpx.post instead of an async client. This should either use httpx.AsyncClient or the function should not be async. The current implementation will block the event loop.

Suggested change
-    response = httpx.post(
-        url,
-        headers={
-            "Authorization": f"Bearer {settings.turso_token}",
-            "Content-Type": "application/json",
-        },
-        json=payload,
-        timeout=30,
-    )
+    async with httpx.AsyncClient() as client:
+        response = await client.post(
+            url,
+            headers={
+                "Authorization": f"Bearer {settings.turso_token}",
+                "Content-Type": "application/json",
+            },
+            json=payload,
+            timeout=30,
+        )
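To make the "blocks the event loop" point concrete, the following standard-library sketch stands in for the HTTP call with a sleep. `time.sleep` plays the role of a synchronous `httpx.post`, and `asyncio.sleep` plays the role of an awaited async client call. All function names here are hypothetical.

```python
import asyncio
import time


async def blocking_worker(name: str, events: list[str]) -> None:
    events.append(f"{name} start")
    time.sleep(0.05)  # synchronous wait: the event loop cannot switch tasks here
    events.append(f"{name} end")


async def cooperative_worker(name: str, events: list[str]) -> None:
    events.append(f"{name} start")
    await asyncio.sleep(0.05)  # yields control to the event loop while waiting
    events.append(f"{name} end")


async def run_pair(worker) -> list[str]:
    """Run two workers concurrently and record the order of their events."""
    events: list[str] = []
    await asyncio.gather(worker("a", events), worker("b", events))
    return events


blocking_order = asyncio.run(run_pair(blocking_worker))
cooperative_order = asyncio.run(run_pair(cooperative_worker))
print(blocking_order)     # "a" finishes before "b" even starts
print(cooperative_order)  # both workers start before either one ends
```

With the blocking version, concurrent tool calls on the same server would be served one after another, which is the behavior the review comment warns about.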

Comment on lines +128 to +131
        from datetime import datetime

        try:
            dt = datetime.fromtimestamp(float(self.ts))

Copilot AI Jan 9, 2026

Using datetime.fromtimestamp without specifying a timezone will use the local timezone, which can lead to inconsistent behavior across different environments. Consider using datetime.fromtimestamp with timezone awareness (e.g., tz=timezone.utc) or datetime.utcfromtimestamp for consistent UTC timestamps.

Suggested change
-        from datetime import datetime
-
-        try:
-            dt = datetime.fromtimestamp(float(self.ts))
+        from datetime import datetime, timezone
+
+        try:
+            dt = datetime.fromtimestamp(float(self.ts), tz=timezone.utc)
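As a quick check of the suggested form, the sketch below converts a Slack-style timestamp string to an aware UTC datetime. The helper name `slack_ts_to_utc` is hypothetical. Of the two options the comment mentions, `tz=timezone.utc` is probably the safer one: `datetime.utcfromtimestamp` returns a naive datetime and has been deprecated since Python 3.12.

```python
from datetime import datetime, timezone


def slack_ts_to_utc(ts: str) -> datetime:
    """Convert a Slack timestamp such as '1700000000.000100' to an aware UTC datetime."""
    return datetime.fromtimestamp(float(ts), tz=timezone.utc)


dt = slack_ts_to_utc("1700000000.000100")
# The result is the same regardless of the host machine's local timezone.
print(dt.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2023-11-14 22:13:20 UTC
```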

channel = parts[6]
thread_ts = parts[7]

messages = await slack_get_thread(channel, thread_ts)

Copilot AI Jan 9, 2026

The function doesn't handle the case where slack_get_thread might raise an exception (e.g., authentication failure, network errors, or invalid channel/thread_ts). This could cause the tool to fail without a meaningful error message. Consider wrapping the slack_get_thread call in a try-except block to provide better error messages.

Suggested change
-    messages = await slack_get_thread(channel, thread_ts)
+    try:
+        messages = await slack_get_thread(channel, thread_ts)
+    except Exception as e:
+        # Log a meaningful error message instead of letting the exception crash the tool.
+        print(f"error: failed to fetch Slack thread {channel}/{thread_ts}: {e}")
+        return None

        if not meta_raw:
            continue

        meta = json.loads(meta_raw)

Copilot AI Jan 9, 2026

The json.loads call could raise a JSONDecodeError if the metadata contains invalid JSON. This would cause the migration to fail completely. Consider wrapping this in a try-except block to skip malformed metadata and log a warning instead.
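No suggested change accompanies this comment, so here is one possible shape for it: a minimal sketch that skips malformed rows and logs a warning. The `(key, meta_raw)` row layout, the logger name, and the `parse_metadata` helper are assumptions for illustration, not code from normalize_topics.py.

```python
import json
import logging

logger = logging.getLogger("normalize_topics")  # hypothetical logger name


def parse_metadata(rows):
    """Yield (key, meta) pairs, skipping rows whose metadata is empty or malformed."""
    for key, meta_raw in rows:
        if not meta_raw:
            continue
        try:
            meta = json.loads(meta_raw)
        except json.JSONDecodeError as exc:
            # One bad row should not abort the whole migration.
            logger.warning("skipping %s: invalid metadata JSON (%s)", key, exc)
            continue
        yield key, meta


rows = [("t1", '{"topic": "Deployments"}'), ("t2", "{not json"), ("t3", "")]
print(list(parse_metadata(rows)))  # [('t1', {'topic': 'Deployments'})]
```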

Comment on lines +338 to +339
ts_clean = thread_ts.replace(".", "")
url = f"https://{workspace}.slack.com/archives/{channel}/p{ts_clean}"

Copilot AI Jan 9, 2026

The URL construction logic here duplicates the logic in ThreadSummary.url (lines 37-44 in _types.py). Consider extracting this into a shared utility function or reusing the computed field logic to maintain consistency and reduce code duplication.
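A shared helper along the lines this comment proposes might look like the sketch below. The function name `slack_thread_url` and the sample workspace and channel values are hypothetical; the URL format itself is taken from the two lines under review.

```python
def slack_thread_url(workspace: str, channel: str, thread_ts: str) -> str:
    """Build a Slack permalink; the timestamp's dot is dropped and prefixed with 'p'."""
    ts_clean = thread_ts.replace(".", "")
    return f"https://{workspace}.slack.com/archives/{channel}/p{ts_clean}"


print(slack_thread_url("example-workspace", "C0123456789", "1700000000.000100"))
# https://example-workspace.slack.com/archives/C0123456789/p1700000000000100
```

Both `ThreadSummary.url` and `get_thread_messages` could then call the same function, so a change to the URL format would only need to be made once.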
- Add RETENTION_DAYS constant (90 days)
- Filter search/similar queries by thread_ts > cutoff
- Uses json_extract to filter on metadata.thread_ts