feat: add x_read_article tool for reading full X Article content #14

Closed
nj-io wants to merge 1 commit into nirholas:main from nj-io:feat/read-article

nj-io commented Apr 2, 2026

Summary

New scrapeArticle() scraper and x_read_article MCP tool for reading the full content of X Articles (long-form posts).

What it does

  • Accepts a tweet URL or direct article URL
  • For tweet URLs, discovers the /article/ link from the page
  • For quote tweets where the article belongs to the quoted author (not the tweet author), falls back to clicking article-cover-image and following the navigation
  • Scrolls through the article to load lazy content
  • Returns cleaned text with header/footer noise stripped
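
The URL handling described above could be sketched roughly as follows. This is an illustration only, not the code from this PR: `classifyXUrl` and `findArticleHref` are hypothetical names, the URL shapes (`x.com/user/status/ID` and `x.com/user/article/ID`) are the two the PR says it accepts, and the sketch assumes the hrefs collected from the page are absolute.

```javascript
// Hypothetical helpers, not taken from the PR diff.

// Classify an input as a direct article URL, a tweet URL, or neither.
function classifyXUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return { kind: 'invalid' };
  }
  if (url.hostname.replace(/^www\./, '') !== 'x.com') return { kind: 'invalid' };

  const match = url.pathname.match(/^\/([^/]+)\/(status|article)\/(\d+)/);
  if (!match) return { kind: 'invalid' };

  const [, handle, segment, id] = match;
  return { kind: segment === 'article' ? 'article' : 'tweet', handle, id };
}

// Given absolute hrefs collected from a tweet page, return the first /article/
// link, or null to signal that the cover-image click fallback is needed.
function findArticleHref(hrefs) {
  const hit = hrefs.find((href) => classifyXUrl(href).kind === 'article');
  return hit === undefined ? null : hit;
}
```

Returning `null` rather than throwing keeps the quote-tweet case (no `/article/` link on the page) as an ordinary branch that the caller can handle by clicking `article-cover-image`.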

Return shape

{
  "title": "From Hierarchy to Intelligence",
  "author": "jack",
  "handle": "jack",
  "text": "At Sequoia, we see that speed is the best predictor...",
  "images": ["https://pbs.twimg.com/media/..."],
  "url": "https://x.com/jack/article/2039003879841362278"
}
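
A consumer might want to check a result against this shape before using it. The validator below is a hypothetical sketch: the field names come from the example above, while the assumption that every field is always present and string-typed is inferred from that single example.

```javascript
// Hypothetical validator; field names are taken from the return shape above.
function isArticleResult(value) {
  if (value === null || typeof value !== 'object') return false;
  const stringFields = ['title', 'author', 'handle', 'text', 'url'];
  const stringsOk = stringFields.every((key) => typeof value[key] === 'string');
  // Assumption: images is always an array of URL strings (possibly empty).
  const imagesOk =
    Array.isArray(value.images) && value.images.every((src) => typeof src === 'string');
  return stringsOk && imagesOk;
}
```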

Architecture

Follows the standard scraper pattern:

  • scrapeArticle() in src/scrapers/twitter/index.js
  • Re-exported through src/scrapers/index.js
  • Wrapped as x_read_article() in src/mcp/local-tools.js
  • Tool definition in src/mcp/server.js (Articles section)
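
To make the layering concrete, here is a rough sketch of what the tool definition and wrapper might look like. The `name`/`description`/`inputSchema` layout follows the general MCP tool definition format; the description text, the single `url` parameter, and the `readArticleTool` wrapper are assumptions, since the PR does not show the real signatures.

```javascript
// Sketch only: the description text and the single `url` parameter are assumptions.
const readArticleToolDefinition = {
  name: 'x_read_article',
  description: 'Read the full content of an X Article from a tweet URL or article URL.',
  inputSchema: {
    type: 'object',
    properties: {
      url: { type: 'string', description: 'Tweet URL or direct article URL' },
    },
    required: ['url'],
  },
};

// Hypothetical wrapper: `scrape` stands in for scrapeArticle(), whose real
// signature is not shown in this PR. Injecting it keeps the sketch testable.
async function readArticleTool(args, scrape) {
  if (!args || typeof args.url !== 'string' || args.url.length === 0) {
    throw new Error('x_read_article requires a "url" string');
  }
  return scrape(args.url);
}
```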

Pairs with

This tool pairs well with x_get_likes (PR #13) — when liked tweets contain articles, the article metadata includes a tweetUrl that can be passed directly to x_read_article to fetch the full content.
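
A caller chaining the two tools might first collect the `tweetUrl` values and then pass each one to x_read_article. The sketch below assumes a shape for the x_get_likes output (an array of tweets, each with an optional `article: { tweetUrl }` object); that shape is a guess for illustration and is not confirmed by either PR.

```javascript
// Assumed input shape: [{ ..., article: { tweetUrl: '...' } }, ...].
// Tweets without article metadata are skipped.
function collectArticleTweetUrls(likedTweets) {
  const urls = likedTweets
    .map((tweet) => tweet.article && tweet.article.tweetUrl)
    .filter((url) => typeof url === 'string' && url.length > 0);
  return [...new Set(urls)]; // de-duplicate while keeping first-seen order
}
```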

Test plan

  • x_read_article with direct article URL returns full content
  • x_read_article with tweet URL discovers and reads the article
  • x_read_article with quote-tweet URL (article from quoted author) clicks through correctly
  • Article text is clean — no author metadata, timestamps, or engagement numbers in output
  • Profile images filtered from returned images array
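
The last check could be exercised with a small filter like the one below. This is a sketch under an assumption about X's CDN layout: avatars are served from paths beginning with `/profile_images/`, while content images use `/media/` as in the return-shape example. `filterContentImages` is a hypothetical name, not the function in this PR.

```javascript
// Hypothetical filter: drops avatar URLs and anything that is not a valid URL.
// Assumption: profile pictures live under /profile_images/ on the image CDN.
function filterContentImages(urls) {
  return urls.filter((src) => {
    try {
      return !new URL(src).pathname.startsWith('/profile_images/');
    } catch {
      return false; // not a parseable absolute URL
    }
  });
}
```

A deny-list is used here so that content images served from other paths are kept; an allow-list on `/media/` would be stricter but could drop legitimate images.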

🤖 Generated with Claude Code

New scrapeArticle() scraper and x_read_article MCP tool that reads the
full content of X Articles (long-form posts).

Accepts either:
- Direct article URL (x.com/user/article/ID)
- Tweet URL (x.com/user/status/ID) — discovers the article link

For quote tweets where the article belongs to the quoted author, falls
back to clicking the article-cover-image element to navigate to the
real article page.

Returns: title, author, handle, cleaned text (header/footer stripped),
content images (profile pics filtered out), and article URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

nj-io requested a review from nirholas as a code owner April 2, 2026 09:21

vercel bot commented Apr 2, 2026

@nj-io is attempting to deploy a commit to the kaivocmenirehtacgmailcom's projects Team on Vercel.

A member of the Team first needs to authorize it.

nj-io commented Apr 7, 2026

Superseded — resubmitting as clean PRs from current codebase.
