CLI tool to collect JavaScript chunks from web pages during loading. Uses Playwright to capture actual network responses.
Note: This tool is intended for use on websites you own or have explicit permission to analyze. Please respect terms of service and copyright laws.

Install dependencies and the Chromium browser, then build:

    npm install
    npx playwright install chromium
    npm run build

Collect all JavaScript files loaded by a page:

    npm run grab -- --url https://example.com

Collect only scripts from the same domain:

    npm run grab -- --url https://example.com --same-origin

Include only specific patterns:

    npm run grab -- --url https://example.com --include "chunk.*\.js"

Exclude certain patterns:

    npm run grab -- --url https://example.com --exclude "analytics|tracking"

Trigger dynamic imports by scrolling and clicking:

    npm run grab -- --url https://example.com --scroll 3 --click ".tab-button" --click ".load-more"

Navigate through multiple pages on the same site:

    npm run grab -- --url https://example.com --route /about --route /products

| Option | Default | Description |
|---|---|---|
| `--url <url>` | (required) | URL to collect JavaScript from |
| `--out <dir>` | `./out` | Output directory |
| `--same-origin` | `false` | Only collect same-origin scripts |
| `--include <regex>` | - | Include URLs matching regex |
| `--exclude <regex>` | - | Exclude URLs matching regex |
| `--wait <ms>` | `1500` | Wait time for network idle |
| `--timeout <ms>` | `45000` | Page load timeout |
| `--scroll <n>` | `0` | Number of scroll actions |
| `--click <selector>` | - | Click selector (repeatable) |
| `--hover <selector>` | - | Hover selector (repeatable) |
| `--route <path>` | - | Additional route (repeatable) |
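
The three filtering options (`--same-origin`, `--include`, `--exclude`) can be pictured as a single predicate applied to each response URL. The following is a minimal sketch, not the tool's actual source: the `FilterOptions` shape, the `shouldCollect` name, and the order in which the checks are combined are all assumptions made for illustration.

```typescript
// Hypothetical option shape; the field names mirror the CLI flags but are assumptions.
interface FilterOptions {
  sameOrigin: boolean;
  include?: RegExp;
  exclude?: RegExp;
}

// Returns true when a response URL passes all configured filters.
// Assumed order: same-origin check, then include, then exclude.
function shouldCollect(resourceUrl: string, pageUrl: string, opts: FilterOptions): boolean {
  if (opts.sameOrigin && new URL(resourceUrl).origin !== new URL(pageUrl).origin) {
    return false;
  }
  if (opts.include && !opts.include.test(resourceUrl)) {
    return false;
  }
  if (opts.exclude && opts.exclude.test(resourceUrl)) {
    return false;
  }
  return true;
}

// Equivalent of: --same-origin --include "chunk.*\.js" --exclude "analytics|tracking"
const filterOpts: FilterOptions = {
  sameOrigin: true,
  include: /chunk.*\.js/,
  exclude: /analytics|tracking/,
};

console.log(shouldCollect("https://example.com/static/chunk-abc123.js", "https://example.com", filterOpts)); // true
console.log(shouldCollect("https://cdn.example.net/chunk-1.js", "https://example.com", filterOpts)); // false
console.log(shouldCollect("https://example.com/analytics-chunk.js", "https://example.com", filterOpts)); // false
```

In this sketch an exclude match always wins over an include match; the real tool may resolve that conflict differently.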

Collected files are written to the output directory:

    out/
      example.com/
        main.js
        chunk-abc123.js
        static/
          vendor.js
      manifest.json

`manifest.json` contains metadata about collected files:

    {
      "version": "1.0",
      "baseUrl": "https://example.com",
      "collectedAt": "2024-01-15T10:30:00.000Z",
      "actions": ["scroll 1/3", "click: .tab-button"],
      "files": [
        {
          "url": "https://example.com/main.js",
          "savedAs": "example.com/main.js",
          "bytes": 125000,
          "contentType": "application/javascript",
          "timestamp": "2024-01-15T10:30:01.000Z"
        }
      ]
    }

- Files are saved under `out/<host>/<pathname>`
- URLs with query strings get a hash suffix: `file__q_a1b2c3d4.js`
- Files without a `.js` extension get it added automatically
- Duplicate URLs are saved only once
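
The naming rules above can be sketched as a small URL-to-path mapping. This is an illustration under stated assumptions, not the tool's implementation: the function names are hypothetical, the hash is a 32-bit FNV-1a chosen only because it yields 8 hex characters (the tool's real hash is not documented here), and mapping a trailing `/` to `index` is a guess.

```typescript
// 32-bit FNV-1a rendered as 8 hex characters.
// Assumption: stands in for whatever hash the tool really uses.
function shortHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Maps a resource URL to a path relative to the output directory.
function savedPathFor(resourceUrl: string): string {
  const u = new URL(resourceUrl);
  let pathname = u.pathname;
  if (pathname.endsWith("/")) {
    pathname += "index"; // assumption: directory-style URLs need a file name
  }
  // Strip an existing .js so the query hash lands before the extension.
  let stem = pathname.endsWith(".js") ? pathname.slice(0, -3) : pathname;
  if (u.search) {
    stem += `__q_${shortHash(u.search)}`;
  }
  return `${u.host}${stem}.js`;
}

// Builds a url -> path plan, keeping only the first occurrence of each URL.
function planSaves(urls: string[]): Map<string, string> {
  const plan = new Map<string, string>();
  for (const url of urls) {
    if (!plan.has(url)) {
      plan.set(url, savedPathFor(url));
    }
  }
  return plan;
}

console.log(savedPathFor("https://example.com/main.js")); // example.com/main.js
console.log(savedPathFor("https://example.com/static/vendor")); // example.com/static/vendor.js
console.log(savedPathFor("https://example.com/file.js?v=2")); // example.com/file__q_<8 hex chars>.js
```

Hashing the query string keeps two URLs that differ only in their parameters from overwriting each other on disk.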

License: MIT