
A high-performance web scraping engine built with Python and Playwright, designed to automatically aggregate job listings from various platforms into structured JSON data.


Ahmed-Yusuf-1/Job_board_engine


Job Board Engine 🚀

A robust, automated tool for scraping and aggregating job listings from multiple online platforms. Built with Python and Playwright, this engine handles dynamic web content to extract key job details and export them into a structured format.


✨ Features

  • Dynamic Content Handling: Uses Playwright to scrape modern websites with heavy JavaScript loading.
  • Structured Data Output: Automatically generates a jobs.json file containing title, company, location, and direct links.
  • Headless Operation: Designed to run efficiently in the background without a GUI.
  • Extensible Scraper Logic: Easily adaptable to different job board structures.
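As a rough illustration of how the features above could fit together, here is a minimal sketch using Playwright's synchronous API. It is not the repository's actual code: the `BOARDS` entry, its URL and CSS selectors, and the names `Job`, `build_job`, `scrape_board`, and `save_jobs` are all hypothetical, and the `jobs.json` layout (a list of objects) is an assumption.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from urllib.parse import urljoin


@dataclass
class Job:
    title: str
    company: str
    location: str
    link: str


# Hypothetical per-board configuration. Under this design, supporting a new
# job board means adding an entry with that site's URL and CSS selectors.
BOARDS = {
    "example": {
        "url": "https://jobs.example.com/search",  # placeholder URL
        "card": "div.job-card",
        "title": "h2",
        "company": ".company",
        "location": ".location",
        "link": "a",
    },
}


def build_job(base_url: str, title: str, company: str, location: str, href: str) -> Job:
    """Trim whitespace and resolve a relative href against the board URL."""
    return Job(
        title=title.strip(),
        company=company.strip(),
        location=location.strip(),
        link=urljoin(base_url, href.strip()),
    )


def scrape_board(config: dict) -> list[Job]:
    """Load one board in headless Chromium and read every job card."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    jobs = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(config["url"])
        # Wait for the JavaScript-rendered cards before reading them.
        page.wait_for_selector(config["card"])
        for card in page.locator(config["card"]).all():
            jobs.append(build_job(
                config["url"],
                card.locator(config["title"]).first.inner_text(),
                card.locator(config["company"]).first.inner_text(),
                card.locator(config["location"]).first.inner_text(),
                card.locator(config["link"]).first.get_attribute("href") or "",
            ))
        browser.close()
    return jobs


def save_jobs(jobs: list[Job], path: str = "jobs.json") -> None:
    """Write the jobs as a JSON list of objects."""
    data = [asdict(job) for job in jobs]
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")
```

With this layout, a run would look like `save_jobs(scrape_board(BOARDS["example"]))`. Keeping the selectors in data rather than in code is one way to make the scraper adaptable; the real project may organise this differently.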

🛠 Tech Stack

  • Language: Python 3.11+
  • Automation: Playwright (Chromium/WebKit/Firefox)
  • Testing: Pytest
  • Data Format: JSON

🚀 Getting Started

Prerequisites

  • Python 3.11 or higher installed.
  • pip for package management.

Installation

  1. Clone the repository:

    git clone https://github.com/Ahmed-Yusuf-1/job_board_engine.git
    cd job_board_engine
  2. Create and activate a virtual environment:

    python -m venv venv
    # Linux/MacOS
    source venv/bin/activate
    # Windows
    .\venv\Scripts\activate
  3. Install dependencies:

    pip install playwright pytest
    playwright install

📂 Usage

To run the scraper and update the job database:

python test_scraper.py
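After a run, the generated jobs.json can be inspected with a few lines of Python. The sketch below assumes the file is a JSON list of objects with a `company` key, which matches the fields listed under Features but is not confirmed by the source; `load_jobs` and `count_by_company` are hypothetical helper names.

```python
import json
from collections import Counter
from pathlib import Path


def load_jobs(path: str = "jobs.json") -> list[dict]:
    """Read the scraper output (assumed to be a JSON list of job objects)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_by_company(jobs: list[dict]) -> Counter:
    """Count listings per company, grouping entries without one as 'unknown'."""
    return Counter(job.get("company", "unknown") for job in jobs)


if __name__ == "__main__":
    # Only report when the scraper has already produced its output file.
    if Path("jobs.json").exists():
        jobs = load_jobs()
        print(f"{len(jobs)} listings loaded")
        for company, total in count_by_company(jobs).most_common(5):
            print(f"{company}: {total}")
```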
