orma-unsch/pagespeed-insights-webpage-analyzer

Pagespeed Insights Webpage Analyzer Scraper

Analyze webpage performance at scale using automated Lighthouse audits and real-user Chrome UX metrics. This Pagespeed Insights scraper streamlines multi-page evaluations, providing actionable insights for developers, SEO specialists, and site owners looking to optimize speed and user experience. With aggregated scoring, flexible configuration, and detailed reporting, it turns complex performance diagnostics into simple, structured results.

Created by Bitbash, built to showcase our approach to Scraping and Automation!

Introduction

This tool evaluates the performance, accessibility, SEO, and best-practices health of any set of webpages. It solves the challenge of running repeated manual audits by providing automated, parallelized analysis. Engineers, marketers, and technical SEO teams benefit from reliable, repeatable performance insights backed by real-user data and Lighthouse reports.

How It Enhances Your Web Audits

  • Automatically runs Lighthouse audits for any number of URLs.
  • Uses real-user Chrome UX data for accurate field measurements.
  • Supports device-specific testing (mobile/desktop).
  • Can crawl an entire website using a sitemap.
  • Generates aggregated performance statistics across all analyzed pages.
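The parallel auditing described above can be sketched with a thread pool. This is a minimal illustration, not the project's actual runner: `audit_all` and the `audit` callable are hypothetical names, and the real code in `src/runner.py` may be organized differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, Iterable, List, Tuple


def audit_all(
    urls: Iterable[str],
    audit: Callable[[str], Any],  # hypothetical: runs one audit, returns its result
    max_workers: int = 4,
) -> Tuple[Dict[str, Any], List[str]]:
    """Run `audit` for every URL in parallel; return (results, failed_urls)."""
    finished: Dict[str, Any] = {}
    failed: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the URL it was submitted for.
        futures = {pool.submit(audit, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                finished[url] = future.result()
            except Exception:
                # One failing page should not abort the whole batch.
                failed.append(url)
    return finished, failed
```

Threads suit this workload because each audit mostly waits on network I/O. The lengths of the two return values correspond to the `requestsFinished` and `requestsFailed` counts listed in the output fields.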

Features

| Feature | Description |
| --- | --- |
| Multi-page analysis | Test unlimited URLs in parallel for rapid insights. |
| Full-site crawling | Optionally generate & scan all URLs from a sitemap. |
| Chrome UX metrics | Get real-world performance data based on global users. |
| Lighthouse categories | Analyze performance, accessibility, SEO, PWA, and best practices. |
| Device strategies | Switch between mobile or desktop testing modes. |
| Detailed or compact reports | Output full Lighthouse reports or summarized metrics. |
| URL filtering | Use regex-based filters to target or exclude URLs. |
| Aggregated scoring | Compute mean scores and identify failing pages. |
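To make the device strategy and category options concrete, the sketch below builds a request URL for the public PageSpeed Insights API v5 (`runPagespeed`). It assumes the tool talks to that endpoint, which this README does not state explicitly. `build_psi_url` is a hypothetical helper, and the API key is optional.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Public PageSpeed Insights v5 endpoint (assumed to be what the tool calls).
PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(
    page_url: str,
    strategy: str = "mobile",
    categories: Iterable[str] = ("performance",),
    api_key: Optional[str] = None,
) -> str:
    """Return a runPagespeed request URL for one page (hypothetical helper)."""
    if strategy.lower() not in ("mobile", "desktop"):
        raise ValueError("strategy must be 'mobile' or 'desktop'")
    params = [("url", page_url), ("strategy", strategy.upper())]
    # The API takes one `category` parameter per requested category,
    # written as enum names such as PERFORMANCE or BEST_PRACTICES.
    for name in categories:
        params.append(("category", name.upper().replace("-", "_")))
    if api_key:
        params.append(("key", api_key))
    return f"{PSI_ENDPOINT}?{urlencode(params)}"
```

For example, `build_psi_url("https://example.com/", "desktop", ["performance", "best-practices"])` yields a URL with `strategy=DESKTOP` and two `category` parameters.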

What Data This Scraper Extracts

| Field Name | Field Description |
| --- | --- |
| url | The webpage that was analyzed. |
| crux_loading_experience | Real-user loading metrics and category ratings for the page. |
| crux_origin_loading_experience | Origin-level Chrome UX metrics. |
| lighthouse_result | Full Lighthouse audit results for the page. |
| categories | Scores for performance, accessibility, SEO, etc. |
| failedPages | Pages that returned errors or incomplete data. |
| requestsFinished | Count of successfully processed URLs. |
| requestsFailed | Count of URLs that could not be analyzed. |

Example Output

[
  {
    "url": "https://apify.com/store",
    "crux_loading_experience": {
      "id": "https://apify.com/store",
      "metrics": { ... },
      "overall_category": "AVERAGE",
      "initial_url": "https://apify.com/store"
    },
    "crux_origin_loading_experience": {
      "id": "https://apify.com",
      "metrics": { ... },
      "overall_category": "AVERAGE",
      "initial_url": "https://apify.com/store"
    },
    "lighthouse_result": {
      "requestedUrl": "https://apify.com/store",
      "finalUrl": "https://apify.com/store",
      "lighthouseVersion": "12.0.0",
      "categories": {
        "performance": { "score": 0.4 },
        "accessibility": { "score": 0.87 },
        "best-practices": { "score": 0.75 }
      }
    }
  }
]
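Records in this shape can be reduced to the aggregated statistics mentioned under Features. The following is a rough sketch rather than the contents of `outputs/aggregator.py`: `summarize` is a hypothetical function, and the 0.5 performance threshold used to flag pages is an assumption chosen for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List


def summarize(records: List[Dict[str, Any]], threshold: float = 0.5) -> Dict[str, Any]:
    """Compute the mean score per category and list pages below `threshold`."""
    scores: Dict[str, List[float]] = defaultdict(list)
    below: List[str] = []
    for record in records:
        categories = record.get("lighthouse_result", {}).get("categories", {})
        for name, body in categories.items():
            score = body.get("score")
            if score is not None:  # skip categories that returned no score
                scores[name].append(score)
        performance = categories.get("performance", {}).get("score")
        if performance is not None and performance < threshold:
            below.append(record["url"])
    return {
        "mean_scores": {name: mean(values) for name, values in scores.items()},
        "below_threshold": below,
    }
```

Applied to the single record above, the performance score of 0.4 would place `https://apify.com/store` in `below_threshold` under the assumed cutoff.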

Directory Structure Tree

Pagespeed Insights Webpage Analyzer Scraper/
├── src/
│   ├── runner.py
│   ├── analyzers/
│   │   ├── lighthouse_processor.py
│   │   └── crux_parser.py
│   ├── utils/
│   │   ├── sitemap_loader.py
│   │   └── regex_filter.py
│   ├── outputs/
│   │   └── aggregator.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── urls.sample.txt
│   └── example_output.json
├── requirements.txt
└── README.md

Use Cases

  • SEO specialists use it to audit performance and uncover issues affecting rankings, improving overall search visibility.
  • Developers use it to benchmark updates, ensuring new deployments don’t degrade website quality.
  • Agencies use it to automate client website audits for faster reporting and improved workflows.
  • Product teams use it to measure real-world performance across global users, ensuring consistent UX.
  • Site owners use it to detect underperforming pages and prioritize optimization efforts.

FAQs

Q: Can I analyze an entire website automatically? Yes. With sitemap crawling enabled, the tool collects the URLs listed in your site's sitemap and evaluates them without manual input.
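As an illustration of sitemap-based discovery, the sketch below pulls page URLs out of a standard `urlset` sitemap using only the Python standard library. `urls_from_sitemap` is a hypothetical name; the project's `utils/sitemap_loader.py` may work differently, and sitemap index files are not handled here.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> List[str]:
    """Return every <loc> value found under <url> entries of a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text  # ignore empty <loc/> elements
    ]
```

The returned list can then be passed through any URL filters before auditing.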

Q: Does it provide mobile and desktop scores? Yes. You can run audits with either the mobile or the desktop strategy, so scores can be compared across platforms.

Q: Are detailed Lighthouse reports included? If you enable the detailed report option, you receive screenshots, long-form audit text, and advanced diagnostic data.

Q: How does URL filtering work? You can specify regex patterns to include or exclude specific URLs during the evaluation process.
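A minimal sketch of include/exclude filtering, assuming one optional pattern of each kind. `filter_urls` and its parameter names are hypothetical and may not match the tool's actual configuration keys.

```python
import re
from typing import Iterable, List, Optional


def filter_urls(
    urls: Iterable[str],
    include: Optional[str] = None,
    exclude: Optional[str] = None,
) -> List[str]:
    """Keep URLs that match `include` (if given) and do not match `exclude`."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    kept = []
    for url in urls:
        if include_re and not include_re.search(url):
            continue  # does not match the include pattern
        if exclude_re and exclude_re.search(url):
            continue  # explicitly excluded
        kept.append(url)
    return kept
```

For instance, `filter_urls(urls, include=r"/blog/", exclude=r"\?page=")` keeps blog pages while dropping paginated variants.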


Performance Benchmarks and Results

Primary Metric: Processes an average of 5–10 URLs per minute depending on report detail level and network conditions.

Reliability Metric: Maintains a 98%+ successful audit rate across large batches, with automatic handling of intermittent API failures.
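One common way to handle intermittent API failures is to retry with exponential backoff. The sketch below shows that general technique; it is not confirmed to be what this tool does internally. `call_with_retries` is a hypothetical helper, and the attempt count and delays are illustrative.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable for testing
) -> T:
    """Call `fn`, retrying on any exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A production version would likely retry only on specific errors, such as timeouts or HTTP 429/5xx responses, instead of every exception.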

Efficiency Metric: Aggregated results reduce post-processing time by up to 60%, enabling faster technical audits.

Quality Metric: Combines Lighthouse lab scores with real-user Chrome UX metrics for balanced, data-rich performance insights.

Review 1

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

Review 2

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

Review 3

"Exceptional results, clear communication, and flawless delivery. Bitbash nailed it."

Syed
Digital Strategist
★★★★★
