Spider


Website | Guides | API Docs | Chat

The fastest web crawler written in Rust. Primitives for data curation workloads at scale.

Why Spider?

  • Fast by Design: Concurrent crawling with streaming responses at scale
  • Flexible Rendering: HTTP, Chrome DevTools Protocol (CDP), or WebDriver (Selenium/remote browsers)
  • Production Ready: Battle-tested with anti-bot mitigation, caching, and distributed crawling

Features

Core

  • Concurrent & streaming crawls
  • Decentralized crawling for horizontal scaling
  • Caching (memory, disk, or hybrid)
  • Proxy support with rotation
  • Cron job scheduling
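
As a rough sketch of combining these options, the example below configures a crawl with rotating proxies and a page limit. It assumes the with_proxies and with_limit builder methods on Website; check the API docs for the exact signatures in your spider version.

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    // Assumed builder options; consult the API docs for your version.
    let mut website = Website::new("https://spider.cloud")
        // Rotate across the listed proxies while crawling.
        .with_proxies(Some(vec!["http://localhost:8888".into()]))
        // Stop after 500 pages.
        .with_limit(500)
        .build()
        .unwrap();

    website.crawl().await;
}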

Browser Automation

  • Chrome DevTools Protocol (CDP) for local Chrome
  • WebDriver support for Selenium Grid, remote browsers, and cross-browser testing
  • AI-powered automation workflows
  • Web challenge solving (deterministic + AI built-in)

Data Processing

Security & Control

AI Agent

  • spider_agent - Concurrent-safe multimodal agent for web automation and research
  • Multiple LLM providers (OpenAI, OpenAI-compatible APIs)
  • Multiple search providers (Serper, Brave, Bing, Tavily)
  • HTML extraction and research synthesis

Quick Start

The fastest way to get started is with Spider Cloud - no infrastructure to manage. Pay-per-use at $1/GB data transfer, designed to keep crawling costs low.

For local development:

[dependencies]
spider = "2"
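
For a quick sanity check, a minimal crawl can print every URL discovered once the run completes; get_links returns the set of links collected during the crawl.

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website = Website::new("https://spider.cloud");
    website.crawl().await;

    // Print every URL discovered during the crawl.
    for link in website.get_links() {
        println!("{}", link.as_ref());
    }
}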

Streaming Pages

Process pages as they're crawled with real-time subscriptions:

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website = Website::new("https://spider.cloud");
    // Subscribe before crawling; 0 uses the default channel capacity.
    let mut rx = website.subscribe(0).unwrap();

    // Handle pages concurrently as the crawler streams them in.
    tokio::spawn(async move {
        while let Ok(page) = rx.recv().await {
            println!("- {}", page.get_url());
        }
    });

    website.crawl().await;
    // Drop the channel so the receiver task ends when the crawl completes.
    website.unsubscribe();
}
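
The argument to subscribe is the broadcast channel capacity (0 falls back to a default), and unsubscribe drops the channel so the spawned receiver task exits once the crawl finishes.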

Chrome (CDP)

Render JavaScript-heavy pages with stealth mode and request interception:

[dependencies]
spider = { version = "2", features = ["chrome"] }

use spider::features::chrome_common::RequestInterceptConfiguration;
use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website = Website::new("https://spider.cloud")
        // Intercept network requests so unneeded resources can be skipped.
        .with_chrome_intercept(RequestInterceptConfiguration::new(true))
        // Reduce the headless-browser fingerprint.
        .with_stealth(true)
        .build()
        .unwrap();

    website.crawl().await;
}

WebDriver (Selenium Grid)

Connect to remote browsers, Selenium Grid, or any W3C WebDriver-compatible service:

[dependencies]
spider = { version = "2", features = ["webdriver"] }

use spider::features::webdriver_common::{WebDriverConfig, WebDriverBrowser};
use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website = Website::new("https://spider.cloud")
        .with_webdriver(
            // Point at any W3C WebDriver endpoint, e.g. a local Selenium Grid.
            WebDriverConfig::new()
                .with_server_url("http://localhost:4444")
                .with_browser(WebDriverBrowser::Chrome)
                .with_headless(true),
        )
        .build()
        .unwrap();

    website.crawl().await;
}

Get Spider

Method          Best For
Spider Cloud    Production workloads, no setup required
spider          Rust applications
spider_agent    AI-powered web automation and research
spider_cli      Command-line usage
spider-nodejs   Node.js projects
spider-py       Python projects

Resources

License

MIT

Contributing

See CONTRIBUTING.