Site-Shooter

Playwright-powered crawler that visits a site from a start URL, accepts common cookie banners, forces lazy-loaded images to load, optionally stabilizes sticky headers, and saves full-page screenshots. Output is mirrored to a local folder by host and path.
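
The capture step described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in crawl-shoot.js: `captureFullPage` and `settleMs` are hypothetical names, the scroll-based approach to triggering lazy-loaded images is one common technique, and `page` is expected to be a Playwright `Page`.

```javascript
// Sketch only: captureFullPage and settleMs are hypothetical names,
// not taken from crawl-shoot.js. `page` is a Playwright Page.
async function captureFullPage(page, url, outFile, settleMs = 400) {
  // Load the page and wait for the load event.
  await page.goto(url, { waitUntil: 'load' });

  // Scroll down one viewport at a time so lazy-loaded images enter view,
  // then return to the top before taking the screenshot.
  await page.evaluate(async () => {
    const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
    for (let y = 0; y < document.body.scrollHeight; y += window.innerHeight) {
      window.scrollTo(0, y);
      await pause(100);
    }
    window.scrollTo(0, 0);
  });

  // Give late-loading content a moment to settle.
  await page.waitForTimeout(settleMs);

  // fullPage captures the whole scrollable page, not just the viewport.
  await page.screenshot({ path: outFile, fullPage: true });
  return outFile;
}
```

With Playwright installed, a page for this function could come from `chromium.launch()` followed by `browser.newPage()`.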

Prerequisites

- Node.js 18+ recommended
- npm (comes with Node)

Setup

Install dependencies:
```
npm install
```
Install the Playwright browser binaries (Chromium):
```
npx playwright install chromium
```

Usage

Basic run (headless, default options):

node crawl-shoot.js https://example.com

Screenshots are written under shots/<host>/<path>/.../*.png by default.

Options

Pass flags as --name=value after the start URL:

- --limit (default: 20): Maximum number of pages to capture.
- --out (default: shots): Output directory root.
- --delay (default: 400): Milliseconds to wait between page visits.
- --ignoreQuery (default: true): Strip query strings when de-duplicating URLs.
- --width (default: 1366): Viewport width in pixels.
- --forceStickyTop (default: true): Temporarily pin a likely header to the top for the shot.
- --stickySelector (default: empty): Comma-separated CSS selectors that explicitly target a header (overrides auto-detection), e.g. "header, .site-header".
- --stickyWaitMs (default: 400): Extra settle time in milliseconds after forcing the sticky header.
- --subtreeOnly (default: true): Only crawl within the start URL's origin and path subtree.
- --normalizeGallery (default: true): Normalize common horizontal galleries so they don't add blank space.
- --headed (default: false): Run Chromium in headed mode (useful for debugging or for sites that block headless browsers).
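
As an illustration of the --name=value convention, here is a minimal parsing sketch. The defaults are copied from the list above; `parseFlags` is a hypothetical helper, and how crawl-shoot.js actually parses its arguments (including how it treats unknown or malformed flags) is an assumption.

```javascript
// Defaults copied from the option list; parseFlags is a hypothetical helper.
const DEFAULTS = {
  limit: 20, out: 'shots', delay: 400, ignoreQuery: true, width: 1366,
  forceStickyTop: true, stickySelector: '', stickyWaitMs: 400,
  subtreeOnly: true, normalizeGallery: true, headed: false,
};

function parseFlags(args, defaults = DEFAULTS) {
  const options = { ...defaults };
  for (const arg of args) {
    const match = /^--([^=]+)=(.*)$/.exec(arg);
    // Skip the start URL and any flag that is not in the defaults table.
    if (!match || !Object.hasOwn(defaults, match[1])) continue;
    const [, name, raw] = match;
    const fallback = defaults[name];
    if (typeof fallback === 'boolean') {
      options[name] = raw === 'true';
    } else if (typeof fallback === 'number') {
      // Keep the default when the value is empty or not a number.
      const n = Number(raw);
      options[name] = raw !== '' && Number.isFinite(n) ? n : fallback;
    } else {
      options[name] = raw;
    }
  }
  return options;
}
```

For example, `parseFlags(process.argv.slice(2))` would pick up `--limit=100 --width=1440` and leave every other option at its default.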

Examples

Crawl up to 100 pages, save to shots, 1440px width:

node crawl-shoot.js https://example.com --limit=100 --width=1440

Force a specific header selector and wait longer for sticky stabilization:

node crawl-shoot.js https://example.com \
  --forceStickyTop=true \
  --stickySelector="header, .site-header" \
  --stickyWaitMs=800

Run with UI (headed) and write to a custom folder:

node crawl-shoot.js https://example.com --headed=true --out=shots/example

Output structure

Files are mirrored by host and path under the output directory. Example:
- shots/example.com/index.png
- shots/example.com/products/index.png
- shots/example.com/products/item-123.png
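
The mapping from URL to file path can be sketched like this. `shotPathFor` is a hypothetical name, and the rules are inferred from the examples above rather than taken from crawl-shoot.js: the site root and URLs ending in a slash are assumed to become index.png, and the host is taken without its port.

```javascript
// Hypothetical helper; mapping rules are inferred from the examples above.
function shotPathFor(pageUrl, outRoot = 'shots') {
  const { hostname, pathname } = new URL(pageUrl);
  // Split the path into non-empty segments: "/products/item-123" -> ["products", "item-123"].
  const segments = pathname.split('/').filter(Boolean);
  // Assumption: the root and directory-like URLs (trailing slash) map to "index".
  if (segments.length === 0 || pathname.endsWith('/')) segments.push('index');
  return [outRoot, hostname, ...segments].join('/') + '.png';
}

console.log(shotPathFor('https://example.com/products/item-123'));
```

The query string does not appear in the result because `URL.pathname` excludes it.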

Tips

- If a site shows elements only after interaction, try --headed=true to observe the behavior and adjust flags.
- Cookie banners are auto-accepted when possible. If a banner still blocks scrolling (for example, on a site with a custom consent management platform), re-run with --headed=true to see what is happening.
- Keep --ignoreQuery=true to reduce duplicates when query strings don’t change content.
- Set --subtreeOnly=false to crawl the entire origin instead of only the start path subtree.
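
To make the effect of --ignoreQuery and --subtreeOnly concrete, here is a rough sketch of the two checks. `dedupeKey` and `inScope` are hypothetical names; dropping the URL fragment and treating the start path as a directory prefix are assumptions, not confirmed behavior of crawl-shoot.js.

```javascript
// Hypothetical helpers illustrating de-duplication and crawl scope.
function dedupeKey(pageUrl, ignoreQuery = true) {
  const u = new URL(pageUrl);
  u.hash = '';                     // assumption: fragments never identify a new page
  if (ignoreQuery) u.search = '';  // drop "?a=1" so variants collapse to one key
  return u.href;
}

function inScope(candidateUrl, startUrl, subtreeOnly = true) {
  const candidate = new URL(candidateUrl);
  const start = new URL(startUrl);
  if (candidate.origin !== start.origin) return false;
  if (!subtreeOnly) return true;
  // Treat the start path as a directory prefix so that "/docs" matches
  // "/docs" and "/docs/x" but not "/docs-old".
  const base = start.pathname.endsWith('/') ? start.pathname : start.pathname + '/';
  return (candidate.pathname + '/').startsWith(base);
}

const seen = new Set();
for (const url of ['https://example.com/docs/a?x=1', 'https://example.com/docs/a?x=2']) {
  seen.add(dedupeKey(url));
}
console.log(seen.size);
```

With --ignoreQuery=true both URLs in the loop collapse to a single key, so the page would be captured once.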

Troubleshooting

- If Chromium is missing, run: npx playwright install chromium.
- Some pages with heavy client rendering may need more time; increase --delay and/or --stickyWaitMs.
- If the detected header is wrong, supply an explicit --stickySelector.
