
feat: add x_read_post tool for full rich tweet reading with recursive quote tweets#18

Closed
nj-io wants to merge 3 commits into nirholas:main from nj-io:feat/read-post-tool

Conversation


@nj-io nj-io commented Apr 5, 2026

Summary

New scrapePost() function and x_read_post MCP tool that reads any tweet URL with full rich data and recursive quote tweet resolution.

What it does

Give it any tweet URL. It returns:

  • Thread detection — if the author replied to themselves, all thread tweets are returned (not replies to others)
  • Rich data per tweet — text (full, including note_tweets >280 chars), media (images + best-quality video MP4 URL), X Articles (title + cover + URL), cards (link previews with title/description/URL), external URLs (Substack, GitHub, etc. — filtered from entities), engagement stats
  • Recursive quote tweets — if a tweet quotes another tweet, that quoted tweet is fetched as its own focal tweet (to get its full data, thread, and its own quote tweets). Recurses up to 5 levels deep.
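
The recursion described above can be sketched as follows. This is a minimal runnable illustration, not the real implementation: `resolvePost` and the in-memory tweet map stand in for the actual GraphQL-backed fetch inside `scrapePost`, and only the depth cap mirrors the 5-level limit:

```javascript
// Sketch of recursive quote-tweet resolution with a depth cap.
// Tweets are read from an in-memory map so the sketch is runnable;
// the real tool fetches each quoted tweet as its own focal tweet.
const MAX_QUOTE_DEPTH = 5;

function resolvePost(tweets, id, depth = 0) {
  const tweet = tweets[id];
  if (!tweet) return null;
  const result = { id, text: tweet.text };
  // If this tweet quotes another, resolve the quoted tweet too,
  // but stop recursing past MAX_QUOTE_DEPTH levels.
  if (tweet.quotes && depth < MAX_QUOTE_DEPTH) {
    result.quotedPost = resolvePost(tweets, tweet.quotes, depth + 1);
  }
  return result;
}

// Mock data mirroring the 3-level example below: brian → pitdesi → sarthakgh.
const mockTweets = {
  brian: { text: 'Medvi marketing post', quotes: 'pitdesi' },
  pitdesi: { text: '4-tweet skepticism thread', quotes: 'sarthakgh' },
  sarthakgh: { text: 'NYT article quote' },
};
const post = resolvePost(mockTweets, 'brian');
```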

Example: 3-level deep recursive resolution

@brian_blum1 — single post about Medvi marketing
  └─ quotes @pitdesi — 4-tweet THREAD
       Tweet 1: skepticism + quotes @sarthakgh
       Tweet 2: "said I wanted to be 60 lbs" + photo
       Tweet 3: "FDA compliance issue" + 2 photos
       Tweet 4: "800+ fake doctors" + photo
       └─ quotes @sarthakgh — NYT article quote
            └─ card: nytimes.com article link

Return shape

{
  "thread": [
    {
      "id": "...", "author": "...", "text": "...",
      "timestamp": "...", "url": "...",
      "media": [{ "type": "photo", "url": "..." }],
      "article": { "id": "...", "title": "...", "coverImage": "...", "url": "..." },
      "card": { "title": "...", "description": "...", "url": "...", "image": "..." },
      "urls": [{ "url": "https://open.substack.com/...", "display": "..." }],
      "replies": 14, "retweets": 5, "likes": 228, "views": "40642",
      "quotedPost": {
        "thread": [ ...same shape, recursively... ]
      }
    }
  ]
}
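
Because `quotedPost` nests the same `{ "thread": [...] }` shape recursively, consumers can walk the whole result with a short recursive visitor. A sketch (field names taken from the shape above; the visitor itself is not part of this PR):

```javascript
// Visit every tweet in a scrapePost-style result, including tweets
// nested inside quotedPost structures, in depth-first order.
function walkThread(result, visit) {
  for (const tweet of result.thread || []) {
    visit(tweet);
    if (tweet.quotedPost) walkThread(tweet.quotedPost, visit);
  }
}

// Usage: collect tweet ids across all nesting levels.
const sample = {
  thread: [
    { id: '1', quotedPost: { thread: [{ id: '2' }, { id: '3' }] } },
  ],
};
const ids = [];
walkThread(sample, (t) => ids.push(t.id));
```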

Architecture

  • scrapePost() in src/scrapers/twitter/index.js
  • Uses shared helpers from scrapeThread rewrite: fetchTweetDetail, parseTweetResult, parseThreadFromEntries
  • Wired through src/scrapers/index.js → src/mcp/local-tools.js → src/mcp/server.js

Depends on

  • nirholas#17 (scrapeThread GraphQL rewrite with shared helpers)

Test plan

  • Single post with quote tweet (Brian/pitdesi/sarthakgh) — 3 levels deep, cards, photos
  • Thread with quote tweet (Aakash — 2 tweets + Grummz QT with photos)
  • Thread quoting an X Article (MetaMorpehus — article title + Substack URL in thread tweet 2)
  • 3-level nesting with videos (PeptideList → a16z → latentspacepod)
  • Complex nesting: thread quoting thread quoting thread (neural_avb → Prince_Canuma → MaziyarPanahi)
  • Single tweet with card (Karpathy follow-up — llm-wiki GitHub link + quotes original)
  • 15-tweet thread (Breedlove — peptides thread with photos and video)

🤖 Generated with Claude Code

Replace DOM-based thread scraping with direct GraphQL API calls.
X doesn't render self-reply threads as article elements in the DOM,
causing empty results — especially for high-engagement tweets.

The new approach:
- Calls TweetDetail GraphQL API from the page context using session cookies
- Gets full_text (no truncation, no "Show more" needed)
- note_tweet support for long-form posts
- Filters to self-reply chain only (author replying to themselves)
- Chronological sorting

Also introduces shared helpers for future use by scrapePost:
- fetchTweetDetail() — GraphQL API caller
- parseTweetResult() — rich data extraction (text, media, article,
  card, external URLs, engagement stats)
- parseThreadFromEntries() — thread chain detection
- extractEntries(), unwrapResult(), getScreenName()
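
The self-reply chain detection can be sketched like this. Note this is an illustrative simplification, not the actual `parseThreadFromEntries`: the flat tweet objects stand in for parsed GraphQL entries, and field names (`author`, `inReplyTo`, `timestamp`) are placeholders for the real parsed fields:

```javascript
// Sketch of self-reply chain filtering: keep only the focal author's
// tweets that either start the thread or reply to another of the
// author's own tweets, sorted chronologically. Replies from other
// users, and the author's replies to other users, are dropped.
function selfReplyChain(tweets, author) {
  const own = tweets.filter((t) => t.author === author);
  const ownIds = new Set(own.map((t) => t.id));
  return own
    .filter((t) => !t.inReplyTo || ownIds.has(t.inReplyTo))
    .sort((a, b) => a.timestamp - b.timestamp);
}

const entries = [
  { id: '3', author: 'a', inReplyTo: '2', timestamp: 3 },
  { id: '1', author: 'a', timestamp: 1 },
  { id: 'x', author: 'b', inReplyTo: '1', timestamp: 2 }, // someone else's reply
  { id: '2', author: 'a', inReplyTo: '1', timestamp: 2 },
  { id: '4', author: 'a', inReplyTo: 'x', timestamp: 4 }, // author replying to another user
];
const chain = selfReplyChain(entries, 'a');
```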

Fixes:
- screen_name moved from user.legacy to user.core in X's GraphQL schema
- Self-replies missing from API response for viral tweets (2000+ replies)
  now handled gracefully (returns available tweets)
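
The schema fix can be sketched as a fallback lookup. This is an assumption-labeled sketch of what a `getScreenName` helper might look like, based only on the statement above that `screen_name` moved from `user.legacy` to `user.core`:

```javascript
// Read screen_name from either GraphQL schema variant: newer
// responses nest it under user.core, older ones under user.legacy.
function getScreenName(userResult) {
  return (
    userResult?.core?.screen_name ??
    userResult?.legacy?.screen_name ??
    null
  );
}
```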

Supersedes nirholas#12 which patches the DOM approach — this replaces it entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@nj-io nj-io requested a review from nirholas as a code owner April 5, 2026 09:08

vercel bot commented Apr 5, 2026

@nj-io is attempting to deploy a commit to the kaivocmenirehtacgmailcom's projects Team on Vercel.

A member of the Team first needs to authorize it.

nj-io and others added 2 commits April 5, 2026 10:31
- Replace uniform randomDelay (1-3s) with log-normal distribution
  (2-7s base + 8% distraction spikes of 8-20s)
- Add checkAuth() guard after page navigation — fails fast on expired cookies
- Add randomDelay before each fetchTweetDetail API call to simulate
  human browsing between tweet reads
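
A delay of that shape can be sketched as below. The distribution parameters (median, sigma, clamp bounds) are illustrative assumptions chosen only to roughly match the ranges stated in the commit message, not values taken from the code:

```javascript
// Log-normal "human" delay with occasional distraction spikes:
// most draws land in a 2-7s base band, ~8% are 8-20s spikes.
function humanDelayMs() {
  if (Math.random() < 0.08) {
    // Distraction spike: uniform 8-20 seconds.
    return 8000 + Math.random() * 12000;
  }
  // Box-Muller standard normal (1 - random avoids log(0)).
  const normal =
    Math.sqrt(-2 * Math.log(1 - Math.random())) *
    Math.cos(2 * Math.PI * Math.random());
  // Log-normal around a ~3.5s median, clamped to the 2-7s base band.
  const ms = Math.exp(Math.log(3500) + 0.35 * normal);
  return Math.min(7000, Math.max(2000, ms));
}
```
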
New scrapePost() function and x_read_post MCP tool that reads any
tweet URL with full rich data and recursive quote tweet resolution.

Features:
- Single tweets or threads (auto-detected via self-reply chain)
- Rich data per tweet: text, media (images + best-quality video URL),
  X Articles (title + cover image + URL), cards (link previews),
  external URLs (Substack, GitHub, etc.), engagement stats
- Recursive quote tweet resolution — if a quoted tweet is itself a
  thread, or contains its own quote tweet, those are fetched too
  (up to 5 levels deep)
- Human-like delays between API calls (inherited from fetchTweetDetail)
- Auth check on navigation (inherited from shared helpers)

Depends on: nirholas#17 (scrapeThread GraphQL rewrite with shared helpers)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@nj-io nj-io force-pushed the feat/read-post-tool branch from faba5c9 to 80890dd on April 5, 2026 10:35
nj-io added a commit to nj-io/XActions that referenced this pull request Apr 6, 2026
…likes

Two new scrapers and MCP tools:

scrapeLikedTweets — GraphQL-based likes scraper:
- Cursor pagination via Likes API (50 tweets in 14s, 200 in 49s)
- JSONL output to ~/.xactions/exports/
- from/to timestamp filtering with early exit
- Rich data via parseTweetResult

discoverLikes — interleaved fetch + deep read:
- Fetches likes via API, deep-reads each via scrapePost
- Human-like pacing: 3-8s between pages, 2-5s before reads, 5-15s after
- Produces two JSONL files: likes index + deep reads
- ~38s per tweet average (5 likes = 190s)
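
The cursor pagination with timestamp early exit can be sketched as follows. `makeFetcher` is a hypothetical in-memory stand-in for the real Likes GraphQL call so the sketch is runnable; field and parameter names are illustrative:

```javascript
// In-memory stand-in for a cursor-paginated Likes API.
function makeFetcher(allLikes, pageSize) {
  return (cursor = 0) => ({
    tweets: allLikes.slice(cursor, cursor + pageSize),
    nextCursor: cursor + pageSize < allLikes.length ? cursor + pageSize : null,
  });
}

// Page through likes (newest-first) and stop as soon as a tweet
// older than `from` appears — everything after it is older too.
function scrapeLikes(fetchLikesPage, { from } = {}) {
  const out = [];
  let cursor;
  for (;;) {
    const { tweets, nextCursor } = fetchLikesPage(cursor);
    for (const t of tweets) {
      if (from && t.timestamp < from) return out; // early exit
      out.push(t);
    }
    if (nextCursor == null) return out;
    cursor = nextCursor;
  }
}

const likes = [
  { id: 'a', timestamp: 50 },
  { id: 'b', timestamp: 40 },
  { id: 'c', timestamp: 30 },
  { id: 'd', timestamp: 20 },
];
const recent = scrapeLikes(makeFetcher(likes, 2), { from: 30 });
```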

Both remove x_get_likes from xeepyTools and delete the old DOM handler.

Depends on: nirholas#17 (shared helpers), nirholas#18 (scrapePost)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

nj-io commented Apr 7, 2026

Superseded — resubmitting as clean PRs from current codebase.

