Refactor SiemensHealthineers scraper to use updated job listing URL and improve job data extraction by lalalaurentiu · Pull Request #660 · peviitor-ro/based_scraper_py

lalalaurentiu · 2025-11-04T11:25:16Z

This pull request refactors the Siemens Healthineers job scraper in sites/siemenshealthineers.py to adapt to changes in the target website's structure and improve data extraction. The main changes involve switching from a JSON API to HTML parsing, updating pagination logic, and modifying how job details are collected.

Adaptation to new website structure:

Changed the URL and data-fetching logic from a JSON API endpoint to an HTML page, and updated the scraper to parse HTML elements instead of JSON objects.
Updated job extraction to find job elements using find_all("article", class_="article") and extract job details (title, link, city) from HTML tags.
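
As a rough illustration of the extraction step described above, the sketch below pulls the title, link, and city out of each job article. The PR itself relies on a find_all("article", class_="article") call; this sketch swaps in the standard library's html.parser so that it runs without extra dependencies. The sample markup, the JobParser class, and the choice of the first link and first span inside each article are assumptions, not taken from the real page.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real page structure may differ.
SAMPLE_HTML = """
<article class="article"><h3><a href="/job/1">Service Engineer</a></h3><span>Cluj-Napoca</span></article>
<article class="article"><h3><a href="/job/2">Data Analyst</a></h3><span>Brasov</span></article>
"""


class JobParser(HTMLParser):
    """Collects title, link and city from <article class="article"> elements."""

    def __init__(self):
        super().__init__()
        self.jobs = []
        self._job = None    # job currently being built, if any
        self._field = None  # field that incoming text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "article" and "article" in classes:
            self._job = {"title": "", "link": "", "city": ""}
        elif self._job is not None and tag == "a" and not self._job["link"]:
            # Assumption: the first link in the article is the job title.
            self._job["link"] = attrs.get("href", "")
            self._field = "title"
        elif self._job is not None and tag == "span" and not self._job["city"]:
            # Assumption: the first span in the article holds the city.
            self._field = "city"

    def handle_data(self, data):
        if self._job is not None and self._field:
            self._job[self._field] += data

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "article" and self._job is not None:
            self._job["title"] = self._job["title"].strip()
            self._job["city"] = self._job["city"].strip()
            self.jobs.append(self._job)
            self._job = None


def extract_jobs(html):
    parser = JobParser()
    parser.feed(html)
    return parser.jobs


jobs = extract_jobs(SAMPLE_HTML)
```

With a library that offers find_all, the same loop collapses to a few lines; the state machine here only exists to keep the sketch dependency-free.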

Pagination and job count calculation:

Modified how the total number of jobs is calculated, now parsing the count from an HTML element (div.list-controls__text__legend) instead of a JSON field.
Adjusted pagination logic to match the new page size (6 jobs per page) and updated the URL for fetching subsequent pages with the correct offset parameter.
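
The pagination arithmetic above can be sketched as follows. Only the page size of 6 and the existence of an offset parameter come from the description; the legend text format, the base URL, the parameter name "offset", and the assumption that the offset counts jobs rather than pages are all hypothetical.

```python
import re
from urllib.parse import urlencode

PAGE_SIZE = 6                          # from the PR description
BASE_URL = "https://example.com/jobs"  # hypothetical placeholder URL


def parse_total(legend_text):
    """Return the job count from the legend text.

    Assumption: the total is the last integer in the text,
    e.g. "Showing 1-6 of 23 jobs".
    """
    numbers = re.findall(r"\d+", legend_text)
    return int(numbers[-1]) if numbers else 0


def page_offsets(total, page_size=PAGE_SIZE):
    """Offsets of the first job on each page: 0, 6, 12, ..."""
    return list(range(0, total, page_size))


def page_urls(total):
    # "offset" is an assumed parameter name, not confirmed by the PR.
    return [f"{BASE_URL}?{urlencode({'offset': o})}" for o in page_offsets(total)]


total = parse_total("Showing 1-6 of 23 jobs")
offsets = page_offsets(total)
urls = page_urls(total)
```

Stepping a range by the page size avoids a separate page-count calculation and handles a partial last page without special cases.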

Job location and county extraction:

Changed city extraction to parse from a single HTML span per job, and updated county lookup to match the new city extraction method.
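
One way a county lookup keyed on the extracted city text might look is sketched below. The CITY_TO_COUNTY table, the normalise and county_for helpers, and the diacritic handling are illustrative assumptions; the repository's actual lookup helper is not shown in this PR description.

```python
import unicodedata

# Hypothetical lookup table; a real scraper would use a complete source.
CITY_TO_COUNTY = {
    "bucuresti": "Bucuresti",
    "cluj-napoca": "Cluj",
    "brasov": "Brasov",
}


def normalise(city):
    """Trim whitespace, drop diacritics and lowercase the city name."""
    decomposed = unicodedata.normalize("NFKD", city.strip())
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def county_for(city):
    """Return the county for a city, or None when the city is unknown."""
    return CITY_TO_COUNTY.get(normalise(city))
```

Normalising before the lookup means text scraped from a span, which may carry stray whitespace or diacritics, still matches the table keys.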

Commit 3dabdd2: Refactor SiemensHealthineers scraper to use updated job listing URL and improve job data extraction

lalalaurentiu merged commit da6ed9b into peviitor-ro:main Nov 4, 2025
1 check passed

lalalaurentiu merged 1 commit into peviitor-ro:main from lalalaurentiu:main
