Fixed scrapers by lalalaurentiu · Pull Request #677 · peviitor-ro/based_scraper_py

lalalaurentiu · 2026-02-09T22:17:30Z

This pull request updates several company-specific job scraper scripts to use more modern and robust APIs, improves consistency in request formatting, and simplifies the extraction logic. The changes primarily focus on switching to JSON-based POST requests, updating endpoints, and cleaning up or refactoring the parsing logic for job listings.

API and Endpoint Updates:

sites/atkinsrealis.py: Switched from scraping HTML pages to using a JSON-based POST API (https://slihrms.wd3.myworkdayjobs.com/wday/cxs/slihrms/Careers/jobs). The script now paginates using the API's offset and limit fields and extracts job data directly from the JSON response, simplifying the code and improving reliability.
sites/hcltechnologies.py: Migrated from scraping HTML to using the official recruiting API endpoint (https://careers.hcltech.com/services/recruiting/v1/jobs) with JSON payloads. The code now fetches and paginates jobs using the API, removing complex HTML parsing and city/county translation logic.
sites/hm.py: Updated the job search request to use a JSON POST with correct headers and a more targeted payload for Romania jobs, improving accuracy and reliability.
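
The offset/limit pagination described for sites/atkinsrealis.py can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `jobPostings` response key, the page size, and the `iter_jobs` / `fetch_page` names are assumptions, and the page fetcher is passed in as a callable so the loop can be exercised without network access.

```python
from typing import Callable, Dict, Iterator


def iter_jobs(
    fetch_page: Callable[[int, int], Dict],
    limit: int = 20,
) -> Iterator[Dict]:
    """Yield job records from an offset/limit paginated JSON API.

    fetch_page(offset, limit) is a hypothetical callable that performs the
    POST request and returns the decoded JSON response as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        # "jobPostings" is an assumed key name for the list of jobs.
        jobs = page.get("jobPostings", [])
        if not jobs:
            break  # an empty page means there is nothing left to read
        yield from jobs
        if len(jobs) < limit:
            break  # a short page is treated as the last page
        offset += limit
```

In a real scraper, `fetch_page` would wrap the JSON POST to the endpoint above and send `offset` and `limit` in the request body.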

Request Formatting and Consistency:

sites/goodyear.py: Adjusted the order of location IDs in the post_data payload for consistency and corrected the logic for extracting the remote field from job data.
sites/hm.py: Added proper request headers (Content-Type and User-Agent) for the POST request to ensure compatibility with the API.
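
As a rough sketch of what a JSON POST with explicit Content-Type and User-Agent headers looks like, the example below builds such a request using only the standard library. It is not the code from sites/hm.py: the URL, the payload, the User-Agent string, and the `build_json_post` helper are all hypothetical placeholders, and the actual scraper may use a different HTTP client.

```python
import json
from urllib import request


def build_json_post(url: str, payload: dict, user_agent: str = "Mozilla/5.0") -> request.Request:
    """Build (but do not send) a POST request with a JSON body."""
    headers = {
        "Content-Type": "application/json",  # tells the API the body is JSON
        "User-Agent": user_agent,            # some endpoints reject the default agent
    }
    body = json.dumps(payload).encode("utf-8")
    return request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical endpoint and payload, for illustration only.
req = build_json_post("https://example.com/api/jobs", {"country": "RO"})
```

Passing `req` to `urllib.request.urlopen` would send it; building the request separately makes the headers and body easy to inspect first.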

Code Simplification and Cleanup:

sites/atkinsrealis.py & sites/hcltechnologies.py: Removed unused imports and legacy code related to HTML parsing, city/county translation, and manual pagination, resulting in cleaner and more maintainable scripts.

These changes collectively modernize the scrapers, making them more robust against website changes and easier to maintain.


lalalaurentiu added 4 commits February 7, 2026 18:30

Refactor AtkinsRealis scraper to use new API endpoint and improve job data extraction

890a65e

Fix formatting issues in Goodyear scraper URL and job data extraction

9bae1c5

Refactor HCL Technologies scraper to use new API endpoint and improve job data extraction

3b320e9

Refactor HM scraper to update job data extraction parameters and improve request handling

75fd8e2

lalalaurentiu merged commit cd3c31c into peviitor-ro:main Feb 9, 2026

lalalaurentiu merged 4 commits into peviitor-ro:main from lalalaurentiu:main
