feat: enhanced scrapeLikedTweets with rich data and timestamp filtering by nj-io · Pull Request #19 · nirholas/XActions

nj-io · 2026-04-05T11:01:19Z

Summary

Replaces the broken xeepy-based x_get_likes handler with a proper scrapeLikedTweets() scraper, following the same pattern as scrapeBookmarks(). Supersedes #13 with timestamp filtering and auth checks added.

Rich data per tweet

| Field | Source |
| --- | --- |
| `text` | Full text with "Show more" expansion |
| `author`, `handle` | `User-Name` + first `a[href]` |
| `timestamp`, `link` | `time[datetime]`, first `/status/` link |
| `images` | `a[href*="/photo/"]`, attributed to the correct author by handle matching |
| `quotedTweet` | Detected via multiple `UserAvatar-Container-*` elements |
| `article` | `article-cover-image` + `nextElementSibling` for title/description |
| `card` | `card.wrapper` for link previews |
| `replies`, `retweets`, `likes`, `views` | Parsed from the `aria-label` of the `role="group"` element |
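
As a rough illustration of the last row, engagement counts could be pulled out of an `aria-label` string along these lines. This is a sketch only: `parseEngagement` is a hypothetical name, and the label wording (for example "12 replies, 3 reposts, 45 likes, 1,200 views") is an assumption about what X renders, not something this PR documents.

```javascript
// Hypothetical helper: parses an aria-label such as
// "12 replies, 3 reposts, 45 likes, 1,200 views" into numeric counts.
// The label wording is an assumption; missing fields default to 0.
function parseEngagement(ariaLabel) {
  const stats = { replies: 0, retweets: 0, likes: 0, views: 0 };
  // Each pattern accepts singular or plural wording and thousands separators.
  const fields = [
    [/([\d,]+)\s+repl(?:y|ies)/i, 'replies'],
    [/([\d,]+)\s+(?:reposts?|retweets?)/i, 'retweets'],
    [/([\d,]+)\s+likes?/i, 'likes'],
    [/([\d,]+)\s+views?/i, 'views'],
  ];
  for (const [pattern, key] of fields) {
    const match = pattern.exec(ariaLabel || '');
    if (match) stats[key] = Number(match[1].replace(/,/g, ''));
  }
  return stats;
}
```

Defaulting to zero rather than throwing keeps one malformed label from aborting a whole scrape; whether the actual scraper does this is not stated in the PR.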

Timestamp filtering (new)

| Param | Effect |
| --- | --- |
| `from` | Only include likes from this date onward. Stops scrolling early when older tweets are reached (reverse chronological optimization). |
| `to` | Only include likes up to this date. Skips newer tweets but keeps scrolling to reach the target window. |
| `limit` | Works in conjunction with `from`/`to`; caps total results. |

Accepts any format new Date() understands: "2026-03-01", "March 1, 2026", ISO timestamps, etc.
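
The interaction of `from`, `to`, and `limit` can be sketched as a filter over tweets in reverse chronological order. This is a minimal illustration, not the PR's code: `parseBound`, `classify`, and `filterByWindow` are hypothetical names, and the sketch works on an in-memory array where the real scraper would be scrolling a page.

```javascript
// Hypothetical helper: turn a user-supplied value into a Date, or throw.
function parseBound(value, name) {
  if (value === undefined || value === null) return null;
  const date = new Date(value);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid "${name}" date: ${value}`);
  }
  return date;
}

// Decide what to do with one tweet, given the (possibly null) bounds.
function classify(timestamp, from, to) {
  const t = new Date(timestamp);
  if (to && t > to) return 'skip';     // newer than the window: keep scrolling
  if (from && t < from) return 'stop'; // older than the window: stop early
  return 'keep';
}

// Assumes `tweets` is ordered newest first, as the likes timeline is.
function filterByWindow(tweets, { from, to, limit = Infinity } = {}) {
  const fromDate = parseBound(from, 'from');
  const toDate = parseBound(to, 'to');
  const results = [];
  for (const tweet of tweets) {
    const verdict = classify(tweet.timestamp, fromDate, toDate);
    if (verdict === 'stop') break;
    if (verdict === 'skip') continue;
    results.push(tweet);
    if (results.length >= limit) break; // limit caps the total
  }
  return results;
}
```

Note that JavaScript parses a date-only ISO string such as "2026-03-01" as midnight UTC, while free-form strings such as "March 1, 2026" are parsed in an implementation-dependent way, typically as local time.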

Architecture

- `scrapeLikedTweets()` in `src/scrapers/twitter/index.js`
- Exported through `src/scrapers/index.js`
- Wrapped as `x_get_likes()` in `src/mcp/local-tools.js`
- Removed from the `xeepyTools` array; old `executeXeepyTool` handler deleted

Bug fixes

- "Show more" clicks one at a time: X re-renders the DOM after each click
- Auth check after navigation: fails fast on expired cookies
- Article URL construction: only for direct articles, not quoted tweet articles
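
One way an auth check after navigation might look is to inspect the URL the browser ended up on and fail if it is a login page. The PR does not say how its check works; the function name `assertAuthenticated` and the list of login paths below are assumptions made for this sketch.

```javascript
// Hypothetical check: treats a redirect to a login path as an expired session.
// The path list is an assumption, not taken from the PR.
const LOGIN_PATHS = ['/login', '/i/flow/login'];

function assertAuthenticated(currentUrl) {
  const { pathname } = new URL(currentUrl);
  // Match the exact path or a sub-path, so a handle like "loginfan" is not flagged.
  const onLoginPage = LOGIN_PATHS.some(
    (p) => pathname === p || pathname.startsWith(p + '/')
  );
  if (onLoginPage) {
    throw new Error('Not authenticated: session cookie is missing or expired');
  }
}
```

Throwing here, before any scrolling starts, is what turns an expired cookie into a clear error rather than an empty result set.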

Relation to other PRs

- Supersedes the closed #13 ("feat: enhanced scrapeLikedTweets scraper with rich data extraction").
- Overlaps with the scrapeLikedTweets portion of #7 ("Encrypted DM reader, batch profiles, liked tweets, more human-like delays"). This PR is more comprehensive (rich data, timestamp filtering, auth check).

Test plan

- `x_get_likes` returns rich data with quote tweets, articles, cards, engagement stats
- `from` param stops scrolling early when passing the target date
- `to` param skips newer tweets but keeps scrolling
- `limit` + `from` work together
- Invalid date string throws a clear error
- Expired cookie throws an auth error instead of scraping empty pages
- "Show more" expansion works for truncated tweets

🤖 Generated with Claude Code

Replace the broken xeepy-based x_get_likes handler with a proper scraper in src/scrapers/twitter/index.js, following the same pattern as scrapeBookmarks.

Rich data per tweet:
- text (with "Show more" expansion), author, handle, timestamp, link
- images (attributed to correct author by handle matching)
- quoted tweets (detected via multiple UserAvatar-Container elements)
- X Articles (title, description, cover image via article-cover-image)
- link cards (via card.wrapper)
- engagement stats (replies, retweets, likes, views from role="group")

Timestamp filtering:
- from: only include likes from this date onward, stops scrolling early when older tweets are reached (reverse chronological optimization)
- to: only include likes up to this date, skips newer but keeps scrolling
- Works in conjunction with limit

Bug fixes:
- "Show more" clicks one at a time (X re-renders DOM after each click)
- Auth check after navigation: fails fast on expired cookies
- Scroll-based pagination with deduplication
- Removes x_get_likes from xeepyTools, routes through local-tools.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
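
The "scroll-based pagination with deduplication" item in this commit message can be pictured as an accumulator that ignores tweets it has already seen, since each scroll pass re-reads tweets still present in the DOM. The sketch below is illustrative only: `createCollector` is a hypothetical name, and keying on the tweet's `link` is an assumption about which field is unique.

```javascript
// Hypothetical accumulator: keeps the first copy of each tweet, keyed by link.
function createCollector() {
  const seen = new Map(); // a Map preserves insertion order
  return {
    // Adds a batch from one scroll pass and returns how many tweets were new.
    add(batch) {
      let added = 0;
      for (const tweet of batch) {
        if (!tweet.link || seen.has(tweet.link)) continue;
        seen.set(tweet.link, tweet);
        added += 1;
      }
      return added;
    },
    // All unique tweets collected so far, in the order first seen.
    values() {
      return [...seen.values()];
    },
  };
}
```

Returning the count of new tweets from `add` gives the scroll loop a direct signal for an "empty scroll": a pass that adds zero tweets.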

vercel · 2026-04-05T11:01:25Z

@nj-io is attempting to deploy a commit to the kaivocmenirehtacgmailcom's projects Team on Vercel.

A member of the Team first needs to authorize it.

- Expand viewport to 2400px height before scrolling (default 800px only fits ~1 tweet, causing X's virtualization to never render more)
- Restore viewport to 800px after scraping
- Wait for initial tweet selector before entering scroll loop
- Scroll by window.innerHeight instead of fixed 1200px
- MutationObserver-based wait for DOM changes after each scroll
- Progressive backoff on empty scrolls (2-4s base + 1-1.5s per miss)
- Increase empty scroll tolerance from 5 to 8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
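
The progressive backoff described in this commit can be sketched as a delay that grows with the number of consecutive empty scrolls. This is one reading of "2-4s base + 1-1.5s per miss", not the PR's code: `backoffDelay` and `shouldGiveUp` are hypothetical names, and giving up once the miss count reaches 8 (rather than exceeds it) is an assumption.

```javascript
// Tolerance named in the commit message; the exact comparison is an assumption.
const MAX_EMPTY_SCROLLS = 8;

// Hypothetical helper: milliseconds to wait before the next scroll, given the
// number of consecutive scrolls that produced no new tweets.
// `random` is injectable so the jitter can be pinned in tests.
function backoffDelay(misses, random = Math.random) {
  const base = 2000 + random() * 2000;   // 2-4 s base
  const perMiss = 1000 + random() * 500; // 1-1.5 s added per miss
  return Math.round(base + misses * perMiss);
}

// Stop scrolling once the miss count reaches the tolerance.
function shouldGiveUp(misses) {
  return misses >= MAX_EMPTY_SCROLLS;
}
```

With these numbers the wait before the final tolerated retry (7 misses) falls between 9 and 14.5 seconds, which is the cost of the higher tolerance on a timeline that has genuinely ended.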

nj-io · 2026-04-06T01:46:49Z

Superseded — rewrote to use GraphQL API instead of DOM scraping. See new PR.

nj-io requested a review from nirholas as a code owner April 5, 2026 11:01

nj-io closed this Apr 6, 2026

nj-io mentioned this pull request Apr 6, 2026

feat: GraphQL-based scrapeLikedTweets with JSONL output #20

Closed

6 tasks

nj-io wants to merge 2 commits into nirholas:main from nj-io:feat/enhanced-liked-tweets-v2
