GitHub - djibril-marega/oddsportal_scraper

🧾 README

Overview

This Python project scrapes both the match history and the upcoming matches of a team or a competition from OddsPortal.com.

It provides detailed information for a given sport, team, or competition during a specific season, according to user-defined parameters.

📊 Collected Data

For a competition

In addition to the general information provided by the user (season, bookmaker, market, region, competition, sport), the script collects for each match:

Match score
Home team
Away team
Opening and closing odds (value, date, and time) for outcome 1, X, and 2 in the 1X2 market

Depending on the typegame and spread options, the scraper adapts its behavior:

| Key | Description | Possible Values | Default |
| --- | --- | --- | --- |
| typegame | Defines whether to scrape upcoming or historical matches | `"upcoming"` or `"historical"` | `"historical"` |
| spread | Defines the scraping scope for competitions | `"none"` (only the selected competition), `"team"` (also fetches team histories), or `"completly"` (fetches all teams and all competitions they played in) | `"none"` |

For example:

typegame="historical", spread="team" → fetches the competition’s match history + all participating teams’ histories.
typegame="upcoming" → fetches only upcoming matches of the competition.
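
The defaults and allowed values from the table can be checked before a run. The following is a minimal sketch, not code from the project: the helper name `normalize_options` is hypothetical, and only the option names, values, and defaults come from the table above.

```python
# Allowed values as documented in the options table.
VALID_TYPEGAME = {"upcoming", "historical"}
VALID_SPREAD = {"none", "team", "completly"}  # spelling as used by the project


def normalize_options(entry):
    """Return a copy of a config entry with documented defaults applied.

    Hypothetical helper for illustration; the project may handle
    defaults differently.
    """
    result = dict(entry)
    result.setdefault("typegame", "historical")
    result.setdefault("spread", "none")
    if result["typegame"] not in VALID_TYPEGAME:
        raise ValueError(f"unknown typegame: {result['typegame']!r}")
    if result["spread"] not in VALID_SPREAD:
        raise ValueError(f"unknown spread: {result['spread']!r}")
    return result


if __name__ == "__main__":
    print(normalize_options({"competition": "Ligue 1", "typegame": "upcoming"}))
```

Validating early catches a misspelled value (for example `"completely"` instead of the project's `"completly"`) before any browser session starts.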

When spread is "team" or "completly", teams and competitions that already have data in scraped_data/ are skipped to avoid redundant scraping and save time.
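
The skip logic can be pictured as a filter over the pending entries. This is a sketch under stated assumptions, not the project's implementation: the file-naming scheme, the `.json` extension, and the helper names `output_path` and `pending_entries` are all hypothetical, since the README does not describe how files in scraped_data/ are named.

```python
from pathlib import Path


def output_path(entry, data_dir):
    """Map a config entry to an output file (ASSUMED naming scheme)."""
    subject = entry.get("team") or entry.get("competition")
    season = entry["season"].replace("/", "-")  # "/" is not valid in file names
    name = f"{entry['sport']}_{subject}_{season}.json".replace(" ", "_")
    return Path(data_dir) / name


def pending_entries(entries, data_dir):
    """Keep only the entries whose output file does not exist yet."""
    return [e for e in entries if not output_path(e, data_dir).exists()]
```

Because the season is part of the assumed file name, the same team scraped for a different season would not be skipped.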

For a team

The script collects the same information as for competitions, plus:

The region where the competition takes place
The competition name

⚠️ The odds are retrieved according to the specified bookmaker.

⚙️ Requirements

Python 3.x
Playwright

💻 Installation

git clone https://github.com/djibril-marega/oddsportal_scraper.git
cd oddsportal_scraper
pip install -r requirements.txt

🚀 Usage

Method 1 — Recommended (modular and reproducible)

Create a .json configuration file containing a list of team or competition entries. The repository includes a test_configs_example.json file.

Example for a team:

[
  {
    "sport": "Football",
    "region": "France",
    "team": "PSG",
    "teamid": "CjhkPw0k",
    "season": "2024/2025",
    "bookmaker": "Betclic"
  }
]

Example for a competition (historical data):

[
  {
    "sport": "Football",
    "region": "England",
    "competition": "Premier League",
    "season": "2024/2025",
    "bookmaker": "Betclic"
  }
]

Example for upcoming matches of a competition:

[
  {
    "sport": "Football",
    "region": "France",
    "competition": "Ligue 1",
    "season": "2025/2026",
    "bookmaker": "Betclic",
    "typegame": "upcoming"
  }
]
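
Configuration files like the ones above can also be generated from Python, which helps when many teams or competitions are involved. The following is a minimal sketch, not part of the project: the helper names `make_entry` and `dump_config` are hypothetical, and the rule that each entry names either a team or a competition is an assumption inferred from the examples.

```python
import json


def make_entry(sport, region, season, bookmaker, **extra):
    """Build one configuration entry (hypothetical helper)."""
    entry = {"sport": sport, "region": region, "season": season, "bookmaker": bookmaker}
    entry.update(extra)
    # Assumption inferred from the examples: an entry targets a team or a competition.
    if "team" not in entry and "competition" not in entry:
        raise ValueError("entry needs a 'team' or a 'competition' key")
    return entry


def dump_config(entries):
    """Serialize the entries as a JSON list, matching the examples above."""
    return json.dumps(entries, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    entries = [
        make_entry("Football", "France", "2024/2025", "Betclic",
                   team="PSG", teamid="CjhkPw0k"),
        make_entry("Football", "France", "2025/2026", "Betclic",
                   competition="Ligue 1", typegame="upcoming"),
    ]
    print(dump_config(entries))
```

The printed text can be saved to a .json file and used as the configuration for Method 1.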

To execute the script:

python .\run_parallel_tests.py

Logs are stored in the logs/ directory: one file per team or competition, plus a global summary log.
Scraped data is saved in scraped_data/, with one file per team or competition.

To enable verbose output:
```
python .\run_parallel_tests.py -v
```

Method 2 — Quick test (less reproducible)

Run directly with pytest:

pytest test_oddsportal.py --sport=Football --region=France --competition="Ligue 1" --season=2025/2026 --bookmaker=Betclic --typegame=upcoming -v --tb=short

Options:

--typegame: "upcoming" or "historical"
--spread: "none", "team", or "completly"
-v: verbose mode
--tb=short: concise traceback
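
The same pytest command can be assembled from a configuration entry, which keeps Method 2 consistent with a Method 1 file. This is a sketch only: `build_pytest_command` is a hypothetical helper, and it uses just the flags shown in this README (`--sport`, `--region`, `--competition`, `--season`, `--bookmaker`, `--typegame`, `--spread`).

```python
import shlex

# Only the command-line flags that appear in this README.
FLAG_KEYS = ("sport", "region", "competition", "season",
             "bookmaker", "typegame", "spread")


def build_pytest_command(entry):
    """Turn a config entry into a pytest argument list (hypothetical helper)."""
    cmd = ["pytest", "test_oddsportal.py"]
    for key in FLAG_KEYS:
        if key in entry:
            cmd.append(f"--{key}={entry[key]}")
    cmd += ["-v", "--tb=short"]
    return cmd


if __name__ == "__main__":
    entry = {"sport": "Football", "region": "France", "competition": "Ligue 1",
             "season": "2025/2026", "bookmaker": "Betclic", "typegame": "upcoming"}
    # shlex.join quotes arguments containing spaces for a POSIX shell.
    print(shlex.join(build_pytest_command(entry)))
```

Passing the list to `subprocess.run` avoids shell quoting issues with values such as "Ligue 1".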

📁 Output Structure

project/
│
├── scraped_data/        # One file per team or competition
├── logs/                # Detailed and summary logs
├── run_parallel_tests.py
└── test_oddsportal.py

🧠 Notes

The scraper relies on Playwright, which needs its browser binaries. If the first run does not download them automatically, install them with `playwright install`.
Using Method 1 is recommended for scalability and reproducibility.
Already collected teams or competitions (for the same season) are automatically skipped to avoid redundant scraping.
typegame and spread allow flexible control of scraping scope — from a single competition’s upcoming games to a full seasonal network of related competitions and teams.
