js-chunk-collector

A CLI tool that collects the JavaScript chunks a web page requests while it loads. It drives a browser with Playwright and saves the actual network responses.

Note: This tool is intended for use on websites you own or have explicit permission to analyze. Please respect terms of service and copyright laws.

Installation

npm install
npx playwright install chromium
npm run build

Usage

Basic Usage

Collect all JavaScript files loaded by a page:

npm run grab -- --url https://example.com
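
The README does not say how a response is recognized as JavaScript. The following is a minimal sketch, assuming the decision is based on the request's resource type or the `Content-Type` header. `ResponseLike`, `JS_MIME` and `isJavaScriptResponse` are hypothetical names; the interface lists only methods that Playwright's `Response` provides (`url()`, `headers()`, `request().resourceType()`).

```typescript
// Hypothetical structural subset of Playwright's Response. A real Response
// delivered to a "response" event handler satisfies this interface.
interface ResponseLike {
  url(): string;
  headers(): Record<string, string>;
  request(): { resourceType(): string };
}

// Assumed list of JavaScript MIME types; the tool's actual list is not documented.
const JS_MIME = /^(application|text)\/(x-)?(javascript|ecmascript)\b/i;

function isJavaScriptResponse(res: ResponseLike): boolean {
  // Script tags and dynamic imports are reported with resource type "script".
  if (res.request().resourceType() === "script") return true;
  // Fall back to the header; Playwright lower-cases header names.
  const contentType = res.headers()["content-type"] ?? "";
  return JS_MIME.test(contentType.trim());
}
```

With Playwright, a predicate like this would typically be called from a `page.on("response", ...)` handler before reading `response.body()`.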

Same-Origin Only

Collect only scripts served from the same origin as the page:

npm run grab -- --url https://example.com --same-origin

With Filtering

Include only URLs that match a regular expression:

npm run grab -- --url https://example.com --include "chunk.*\.js"

Exclude URLs that match a regular expression:

npm run grab -- --url https://example.com --exclude "analytics|tracking"
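
How the three filters combine is not spelled out, so here is a sketch under stated assumptions: the regexes are tested against the full URL, `--exclude` wins over `--include`, and `--same-origin` compares scheme, host and port. `FilterOptions` and `shouldCollect` are hypothetical names, not the tool's API.

```typescript
interface FilterOptions {
  sameOrigin?: boolean;
  include?: string; // regex source, as passed to --include
  exclude?: string; // regex source, as passed to --exclude
}

// Decide whether a script URL should be saved (assumed semantics, see lead-in).
function shouldCollect(url: string, pageUrl: string, opts: FilterOptions): boolean {
  if (opts.sameOrigin && new URL(url).origin !== new URL(pageUrl).origin) {
    return false;
  }
  if (opts.include && !new RegExp(opts.include).test(url)) return false;
  if (opts.exclude && new RegExp(opts.exclude).test(url)) return false;
  return true;
}
```

For example, `shouldCollect("https://example.com/chunk.1.js", "https://example.com/", { include: "chunk.*\\.js" })` returns `true`, while the same call with `main.js` in the URL returns `false`.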

Lazy Chunk Collection

Trigger dynamic imports by scrolling and clicking:

npm run grab -- --url https://example.com --scroll 3 --click ".tab-button" --click ".load-more"
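
The interaction step can be sketched as a small action runner. This is an illustration, not the tool's code: the order (scroll, then hover, then click), the scroll distance, and the `hover:` label are assumptions, and `PageLike`, `planActions` and `runActions` are hypothetical names. `PageLike` lists only methods that Playwright's `Page` provides, and the `scroll` and `click` labels follow the `actions` format shown in the manifest example.

```typescript
interface ActionOptions {
  scroll: number;  // --scroll <n>
  click: string[]; // --click <selector>, repeatable
  hover: string[]; // --hover <selector>, repeatable
}

// Structural subset of Playwright's Page used by this sketch.
interface PageLike {
  mouse: { wheel(deltaX: number, deltaY: number): Promise<void> };
  click(selector: string): Promise<void>;
  hover(selector: string): Promise<void>;
  waitForTimeout(ms: number): Promise<void>;
}

type Step = { label: string; run: (page: PageLike) => Promise<void> };

// Turn CLI options into an ordered list of labeled steps.
function planActions(opts: ActionOptions): Step[] {
  const steps: Step[] = [];
  for (let i = 1; i <= opts.scroll; i++) {
    // 2000px per scroll is an arbitrary choice for this sketch.
    steps.push({ label: `scroll ${i}/${opts.scroll}`, run: (p) => p.mouse.wheel(0, 2000) });
  }
  for (const sel of opts.hover) {
    steps.push({ label: `hover: ${sel}`, run: (p) => p.hover(sel) });
  }
  for (const sel of opts.click) {
    steps.push({ label: `click: ${sel}`, run: (p) => p.click(sel) });
  }
  return steps;
}

// Run each step, pausing afterwards so triggered chunks have time to load.
async function runActions(page: PageLike, opts: ActionOptions, waitMs = 1500): Promise<string[]> {
  const log: string[] = [];
  for (const step of planActions(opts)) {
    try {
      await step.run(page);
      log.push(step.label);
    } catch {
      // Assumed behavior: a missing selector should not abort the whole run.
    }
    await page.waitForTimeout(waitMs);
  }
  return log;
}
```

Keeping the plan separate from execution makes the step order easy to inspect without launching a browser.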

Multiple Routes

Navigate through multiple pages on the same site:

npm run grab -- --url https://example.com --route /about --route /products
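
A sketch of how `--route` values might be turned into absolute URLs, assuming each route is resolved against `--url` and that the base page is visited first. `resolveRoutes` is a hypothetical helper name.

```typescript
// Resolve each route against the base URL and drop duplicates, keeping order.
function resolveRoutes(baseUrl: string, routes: string[]): string[] {
  const seen = new Set<string>();
  const targets: string[] = [];
  for (const target of [baseUrl, ...routes]) {
    const href = new URL(target, baseUrl).href;
    if (!seen.has(href)) {
      seen.add(href);
      targets.push(href);
    }
  }
  return targets;
}

const targets = resolveRoutes("https://example.com", ["/about", "/products"]);
// targets: ["https://example.com/", "https://example.com/about", "https://example.com/products"]
```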

CLI Options

Option	Default	Description
`--url <url>`	(required)	URL to collect JavaScript from
`--out <dir>`	`./out`	Output directory
`--same-origin`	`false`	Only collect same-origin scripts
`--include <regex>`	-	Include URLs matching regex
`--exclude <regex>`	-	Exclude URLs matching regex
`--wait <ms>`	`1500`	Wait time for network idle
`--timeout <ms>`	`45000`	Page load timeout
`--scroll <n>`	`0`	Number of scroll actions
`--click <selector>`	-	Click selector (repeatable)
`--hover <selector>`	-	Hover selector (repeatable)
`--route <path>`	-	Additional route (repeatable)

Output Structure

out/
  example.com/
    main.js
    chunk-abc123.js
    static/
      vendor.js
  manifest.json

manifest.json

Contains metadata about the run and about each collected file:

{
  "version": "1.0",
  "baseUrl": "https://example.com",
  "collectedAt": "2024-01-15T10:30:00.000Z",
  "actions": ["scroll 1/3", "click: .tab-button"],
  "files": [
    {
      "url": "https://example.com/main.js",
      "savedAs": "example.com/main.js",
      "bytes": 125000,
      "contentType": "application/javascript",
      "timestamp": "2024-01-15T10:30:01.000Z"
    }
  ]
}
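
For scripts that consume the manifest, the shape above can be written down as TypeScript types. The field names come from the example; the `Manifest`, `ManifestFile` and `summarize` names are hypothetical and not exported by the tool.

```typescript
interface ManifestFile {
  url: string;
  savedAs: string;     // path relative to the output directory
  bytes: number;
  contentType: string;
  timestamp: string;   // ISO 8601
}

interface Manifest {
  version: string;
  baseUrl: string;
  collectedAt: string; // ISO 8601
  actions: string[];
  files: ManifestFile[];
}

// Report how many files were collected, their total size, and the largest one.
function summarize(manifest: Manifest): { count: number; totalBytes: number; largest: string | null } {
  let totalBytes = 0;
  let largest: ManifestFile | null = null;
  for (const file of manifest.files) {
    totalBytes += file.bytes;
    if (largest === null || file.bytes > largest.bytes) largest = file;
  }
  return { count: manifest.files.length, totalBytes, largest: largest ? largest.savedAs : null };
}
```

A caller would read `out/manifest.json`, pass the result of `JSON.parse` through these types, and call `summarize` to get a quick overview of a run.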

File Naming Rules

Files are saved under out/<host>/<pathname>
URLs with query strings get a hash suffix: file__q_a1b2c3d4.js
Files without a .js extension get one added automatically
Duplicate URLs are saved only once
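
The rules above can be sketched as a URL-to-path function. The hash algorithm is not documented, so this sketch substitutes a simple FNV-1a-style hash to produce eight hex characters; the `index` fallback for empty paths is also an assumption. `fnv1aHex` and `savedPathFor` are hypothetical names.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units, as 8 hex characters.
// Stand-in only: the tool's real hash function is not documented.
function fnv1aHex(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Map a script URL to a path relative to the output directory.
function savedPathFor(rawUrl: string): string {
  const url = new URL(rawUrl);
  let path = url.pathname.replace(/^\/+/, "");
  // Assumption: directory-like paths need a file name.
  if (path === "" || path.endsWith("/")) path += "index";
  if (url.search) {
    // Insert the query hash before the extension: file__q_<hash>.js
    const ext = path.endsWith(".js") ? ".js" : "";
    const stem = ext ? path.slice(0, -ext.length) : path;
    path = `${stem}__q_${fnv1aHex(url.search)}${ext}`;
  }
  if (!path.endsWith(".js")) path += ".js";
  return `${url.host}/${path}`;
}
```

Because the path is derived only from the URL, saving each URL once (for example by tracking seen URLs in a `Set`) also guarantees that no saved file is overwritten by a different query-string variant.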

License

MIT
